Opened 6 years ago

#87 new defect

Inconsistent Assertion Ordering

Reported by: Rob Schluntz Owned by:
Priority: minor Component: cfa-cc
Version: 1.0 Keywords: assertion
Cc:

Description

Currently we use this comparator to sort assertions in AssertionSets.

struct AssertCompare {
  bool operator()( DeclarationWithType * d1, DeclarationWithType * d2 ) const {
    int cmp = d1->get_name().compare( d2->get_name() );
    return cmp < 0 ||
      ( cmp == 0 && d1->get_type() < d2->get_type() );
  }
};

typedef std::map< DeclarationWithType*, AssertionSetValue, AssertCompare > AssertionSet;

This leads to potentially large changes in compilation time from seemingly minor changes in the input file, since assertions are partially sorted by type pointer, wherein changes to the input can change the memory layout of the program. The ordering of assertions in AssertionSets is important in inferRecursive, where the order determines how many candidates are checked.

One idea is to attempt to order assertions so that the assertions with the fewest number of candidates come first. Here is an example implementation:

size_t Indexer::estimateId( const std::string &id ) const {
  size_t count = 0;
  std::unordered_set< std::string > foundMangleNames;
  Indexer::Impl *searchTables = tables;
  while ( searchTables ) {
    IdTable::const_iterator decls = searchTables->idTable.find( id );
    if ( decls != searchTables->idTable.end() ) {
      const MangleTable &mangleTable = decls->second;
      for ( MangleTable::const_iterator decl = mangleTable.begin(); decl != mangleTable.end(); ++decl ) {
      // mark the mangled name as found, skipping this insertion if a declaration for that name has already been found
      if ( foundMangleNames.insert( decl->first ).second == false ) continue;
      ++count;
      }
    }
    // get declarations from base indexers
    searchTables = searchTables->base.tables;
  }
  return count;
}

struct AssertCompare2 {
  AssertCompare compare;
  const SymTab::Indexer & decls;
  AssertCompare2( const SymTab::Indexer & decls ) : decls( decls ) {}
  bool operator()( DeclarationWithType * d1, DeclarationWithType * d2 ) {
    auto n1 = decls.estimateId( d1->name );
    auto n2 = decls.estimateId( d2->name );
    if ( n1 == n2 ) {
      return compare( d1, d2 );
    } else {
      return n1 < n2;
    }
  }
};
typedef std::map< DeclarationWithType*, AssertionSetValue, AssertCompare2 > AssertionSet2;

Then replace calls to inferRecursive with

AssertCompare2 comp( decls );
AssertionSet2 ordered( newNeed.begin(), newNeed.end(), comp );
inferRecursive( ordered.begin(), ordered.end(), newAlt, openVars, decls, newerNeed, /*needParents,*/ level+1, indexer, out );

This exact setup leads to the results below:

Here's a few data points from full runs of the test suite:

origin/master
real 22m23.644s
user 179m19.648s
sys 1m55.240s

origin/master + reordering
real 41m27.435s
user 374m52.536s
sys 2m2.412s

origin/master without reordering, but with calls to calculate number of candidates per assertion (to try and measure the overhead of that operation, and the potential for caching)
real 30m59.777s
user 267m18.048s
sys 2m22.136s

origin/master without reordering, without candidate calculations, but with the extra copies required to make new ordered maps where necessary (to measure the overhead of the copies)
real 23m27.677s
user 183m15.980s
sys 1m56.080s

This seems to indicate to me that
1) this ordering is still worse than the somewhat 'random' ordering that we currently have, for at least some inputs
2) if we did take such an approach, there is a noticeable overhead introduced by repeatedly counting the number of candidates at each comparison, so tracking the number of overloads for a name in the Indexer might alleviate some of this
3) There is relatively little overhead to the extra copies needed to take this approach

So I'm not ready to give up on the idea of a consistent ordering, but this specific ordering probably isn't the right approach. I'm not sure offhand how to make it better, so it's probably best to leave it out for now but keep it in mind as a known defect.

Change History (0)

Note: See TracTickets for help on using tickets.