Opened 7 years ago
#87 new defect
Inconsistent Assertion Ordering
Reported by: | Rob Schluntz | Owned by: | |
---|---|---|---|
Priority: | minor | Component: | cfa-cc |
Version: | 1.0 | Keywords: | assertion |
Cc: |
Description
Currently we use this comparator to sort assertions in AssertionSet
s.
struct AssertCompare { bool operator()( DeclarationWithType * d1, DeclarationWithType * d2 ) const { int cmp = d1->get_name().compare( d2->get_name() ); return cmp < 0 || ( cmp == 0 && d1->get_type() < d2->get_type() ); } }; typedef std::map< DeclarationWithType*, AssertionSetValue, AssertCompare > AssertionSet;
This leads to potentially large changes in compilation time from seemingly minor changes in the input file, since assertions are partially sorted by type pointer, wherein changes to the input can change the memory layout of the program. The ordering of assertions in AssertionSet
s is important in inferRecursive, where the order determines how many candidates are checked.
One idea is to attempt to order assertions so that the assertions with the fewest number of candidates come first. Here is an example implementation:
size_t Indexer::estimateId( const std::string &id ) const { size_t count = 0; std::unordered_set< std::string > foundMangleNames; Indexer::Impl *searchTables = tables; while ( searchTables ) { IdTable::const_iterator decls = searchTables->idTable.find( id ); if ( decls != searchTables->idTable.end() ) { const MangleTable &mangleTable = decls->second; for ( MangleTable::const_iterator decl = mangleTable.begin(); decl != mangleTable.end(); ++decl ) { // mark the mangled name as found, skipping this insertion if a declaration for that name has already been found if ( foundMangleNames.insert( decl->first ).second == false ) continue; ++count; } } // get declarations from base indexers searchTables = searchTables->base.tables; } return count; } struct AssertCompare2 { AssertCompare compare; const SymTab::Indexer & decls; AssertCompare2( const SymTab::Indexer & decls ) : decls( decls ) {} bool operator()( DeclarationWithType * d1, DeclarationWithType * d2 ) { auto n1 = decls.estimateId( d1->name ); auto n2 = decls.estimateId( d2->name ); if ( n1 == n2 ) { return compare( d1, d2 ); } else { return n1 < n2; } } }; typedef std::map< DeclarationWithType*, AssertionSetValue, AssertCompare2 > AssertionSet2;
Then replace calls to inferRecursive
with
AssertCompare2 comp( decls ); AssertionSet2 ordered( newNeed.begin(), newNeed.end(), comp ); inferRecursive( ordered.begin(), ordered.end(), newAlt, openVars, decls, newerNeed, /*needParents,*/ level+1, indexer, out );
This exact setup leads to the results below:
Here's a few data points from full runs of the test suite:
origin/master
real 22m23.644s
user 179m19.648s
sys 1m55.240s
origin/master + reordering
real 41m27.435s
user 374m52.536s
sys 2m2.412s
origin/master without reordering, but with calls to calculate number of candidates per assertion (to try and measure the overhead of that operation, and the potential for caching)
real 30m59.777s
user 267m18.048s
sys 2m22.136s
origin/master without reordering, without candidate calculations, but with the extra copies required to make new ordered maps where necessary (to measure the overhead of the copies)
real 23m27.677s
user 183m15.980s
sys 1m56.080s
This seems to indicate to me that
1) this ordering is still worse than the somewhat 'random' ordering that we currently have, for at least some inputs
2) if we did take such an approach, there is a noticeable overhead introduced by repeatedly counting the number of candidates at each comparison, so tracking the number of overloads for a name in the Indexer might alleviate some of this
3) There is relatively little overhead to the extra copies needed to take this approach
So I'm not ready to give up on the idea of a consistent ordering, but this specific ordering probably isn't the right approach. I'm not sure offhand how to make it better, so it's probably best to leave it out for now but keep it in mind as a known defect.