id	summary	reporter	owner	description	type	status	priority	component	version	resolution	keywords	cc
87	Inconsistent Assertion Ordering	Rob Schluntz		"Currently we use this comparator to sort assertions in `AssertionSet`s.
{{{
struct AssertCompare {
  bool operator()( DeclarationWithType * d1, DeclarationWithType * d2 ) const {
    int cmp = d1->get_name().compare( d2->get_name() );
    return cmp < 0 ||
      ( cmp == 0 && d1->get_type() < d2->get_type() );
  }
};

typedef std::map< DeclarationWithType*, AssertionSetValue, AssertCompare > AssertionSet;
}}}

This leads to potentially large changes in compilation time from seemingly minor changes in the input file, since assertions are partially sorted by type pointer, wherein changes to the input can change the memory layout of the program. The ordering of assertions in `AssertionSet`s is important in inferRecursive, where the order determines how many candidates are checked. 

One idea is to attempt to order assertions so that the assertions with the fewest number of candidates come first. Here is an example implementation:

{{{
size_t Indexer::estimateId( const std::string &id ) const {
  size_t count = 0;
  std::unordered_set< std::string > foundMangleNames;
  Indexer::Impl *searchTables = tables;
  while ( searchTables ) {
    IdTable::const_iterator decls = searchTables->idTable.find( id );
    if ( decls != searchTables->idTable.end() ) {
      const MangleTable &mangleTable = decls->second;
      for ( MangleTable::const_iterator decl = mangleTable.begin(); decl != mangleTable.end(); ++decl ) {
      // mark the mangled name as found, skipping this insertion if a declaration for that name has already been found
      if ( foundMangleNames.insert( decl->first ).second == false ) continue;
      ++count;
      }
    }
    // get declarations from base indexers
    searchTables = searchTables->base.tables;
  }
  return count;
}

struct AssertCompare2 {
  AssertCompare compare;
  const SymTab::Indexer & decls;
  AssertCompare2( const SymTab::Indexer & decls ) : decls( decls ) {}
  bool operator()( DeclarationWithType * d1, DeclarationWithType * d2 ) {
    auto n1 = decls.estimateId( d1->name );
    auto n2 = decls.estimateId( d2->name );
    if ( n1 == n2 ) {
      return compare( d1, d2 );
    } else {
      return n1 < n2;
    }
  }
};
typedef std::map< DeclarationWithType*, AssertionSetValue, AssertCompare2 > AssertionSet2;
}}}

Then replace calls to `inferRecursive` with
{{{
AssertCompare2 comp( decls );
AssertionSet2 ordered( newNeed.begin(), newNeed.end(), comp );
inferRecursive( ordered.begin(), ordered.end(), newAlt, openVars, decls, newerNeed, /*needParents,*/ level+1, indexer, out );
}}}

This exact setup leads to the results below:
> Here's a few data points from full runs of the test suite:
> 
> origin/master
> real  22m23.644s
> user  179m19.648s
> sys 1m55.240s
> 
> origin/master + reordering
> real  41m27.435s
> user  374m52.536s
> sys 2m2.412s
> 
> origin/master without reordering, but with calls to calculate number of candidates per assertion (to try and measure the overhead of that operation, and the potential for caching)
> real  30m59.777s
> user  267m18.048s
> sys 2m22.136s
> 
> origin/master without reordering, without candidate calculations, but with the extra copies required to make new ordered maps where necessary (to measure the overhead of the copies)
> real  23m27.677s
> user  183m15.980s
> sys 1m56.080s
> 
> This seems to indicate to me that
> 1) this ordering is still worse than the somewhat 'random' ordering that we currently have, for at least some inputs
> 2) if we did take such an approach, there is a noticeable overhead introduced by repeatedly counting the number of candidates at each comparison, so tracking the number of overloads for a name in the Indexer might alleviate some of this
> 3) There is relatively little overhead to the extra copies needed to take this approach
> 
> So I'm not ready to give up on the idea of a consistent ordering, but this specific ordering probably isn't the right approach. I'm not sure offhand how to make it better, so it's probably best to leave it out for now but keep it in mind as a known defect. "	defect	new	minor	cfa-cc	1.0		assertion