Changeset eb0d9b7 for libcfa

Timestamp:

Dec 20, 2025, 4:52:54 AM

Author:

Michael Brooks <mlbrooks@…>

Branches:

master, stuck-waitfor-destruct

Children:

80e83b6c

Parents:

0210a543

Message:

Improve libcfa-array's bound-check removal and write that thesis section.

The libcfa change adds a more performant alternative for a subset of multidimensional indexing cases that were already functionally correct.
That the new alternative is more performant is not shown in the test suite.
There is an associated new high-performance option for passing an array-or-slice to a function.
The added test cases cover those options.

The added in-thesis demos rely on the new more-performant alternative for multidimensional indexing.

File:

1 edited

libcfa/src/collections/array.hfa (modified) (1 diff)

libcfa/src/collections/array.hfa

-              r0210a543
+              reb0d9b7
+}
+// Further form of -[-,-,-] that avoids using the trait system.
+// Above overloads work for any type with (recursively valid) subscript operator,
+// provided said subscript is passed as an assertion.
+// Below works only on arpk variations but never passes its subscript through an assertion.
+//
+// When arpk implements the trait used above,
+// the critical assertion is backed by a nontrivial thunk.
+// There is no "thunk problem" (lifetime issue) when used as shown in the test suite.
+// But the optimizer has shown difficulty removing these thunks in cases where "it should,"
+// i.e. when all user code is in one compilation unit.
+// Not that every attempt at removing such a thunk fails; cases have been found going both ways.
+// Cases have been found with unnecessary bound-checks removed successfully,
+// on user code written against the overloads below,
+// but where these bound checks (which occur within `call`ed thunks) are not removed,
+// on user code written against the overloads above.
+//
+// The overloads below provide specializations of the above
+// that are a little harder to use than the ones above,
+// but where array API erasure has been seen to be more effective.
+// Note that the style below does not appeal to a case where thunk inlining is more effective;
+// rather, it simply does not rely on thunks in the first place.
+//
+// Both usage styles are shown in test array-md-sbscr-cases#numSubscrTypeCompatibility,
+// with the more general one above being "high abstraction,"
+// and the more performant one below being "mid abstraction" and "low abstraction."
+//
+// A breadth of index types is not given here (providing -[size_t,size_t,...] only)
+// because these declarations are not feeding a trait, so safe implicit arithmetic conversion kicks in.
+// Even so, there may still be an un-met need for accepting
+// either ptrdiff_t or size_t (signed or unsigned)
+// because Mike has seen the optimizer resist removing bound checks when sign-conversion is in play.
+// "Only size_t" is meeting today's need
+// and no solution is known that avoids 2^D overloads for D dimensions
+// while offering multiple subscript types and staying assertion-free.
+//
+// This approach, of avoiding traits entirely, is likely incompatible with the original desire
+// to have one recursive multidimensional subscript operator (TRY_BROKEN_DESIRED_MD_SUBSCRIPT).
+// To make a single declaration work,
+// we would probably have to get better at coaxing the optimizer into inlining thunks.
+forall( [N2], S2*, [N1], S1*, Timmed1, Tbase )
+static inline Timmed1 & ?[?]( arpk( N2, S2, arpk( N1, S1, Timmed1, Tbase ), Tbase ) & this, size_t ix2, size_t ix1 ) {
+        return this[ix2][ix1];
+}
+forall( [N3], S3*, [N2], S2*, [N1], S1*, Timmed1, Tbase )
+static inline Timmed1 & ?[?]( arpk( N3, S3, arpk( N2, S2, arpk( N1, S1, Timmed1, Tbase ), Tbase ), Tbase ) & this, size_t ix3, size_t ix2, size_t ix1 ) {
+        return this[ix3][ix2][ix1];
+}
+forall( [N4], S4*, [N3], S3*, [N2], S2*, [N1], S1*, Timmed1, Tbase )
+static inline Timmed1 & ?[?]( arpk( N4, S4, arpk( N3, S3, arpk( N2, S2, arpk( N1, S1, Timmed1, Tbase ), Tbase ), Tbase ), Tbase ) & this, size_t ix4, size_t ix3, size_t ix2, size_t ix1 ) {
+        return this[ix4][ix3][ix2][ix1];
+}
 #endif
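
For readers less familiar with Cforall, the shape of the added overloads can be sketched by analogy in C++. This is an illustration only, not the libcfa implementation: `Dim` is a hypothetical stand-in for `arpk`, and `sub` is a hypothetical free function standing in for the multi-index `?[?]` overloads. The point it tries to show is the one the comment makes: each overload names the nested array type directly in its parameter and delegates to chained single subscripts, so the inner subscript is never passed in as a separate callable.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for arpk: one dimension of N elements
// with a bound-checked single subscript.
template <typename T, std::size_t N>
struct Dim {
    T elems[N];
    T& operator[](std::size_t i) {
        if (i >= N) throw std::out_of_range("subscript out of bounds");
        return elems[i];
    }
};

// Two-dimensional form: the parameter type spells out the nesting,
// so the inner subscript is resolved directly, not supplied by the caller.
template <typename T, std::size_t N2, std::size_t N1>
inline T& sub(Dim<Dim<T, N1>, N2>& a, std::size_t i2, std::size_t i1) {
    return a[i2][i1];
}

// Three-dimensional form: one more level of nesting, one more index.
template <typename T, std::size_t N3, std::size_t N2, std::size_t N1>
inline T& sub(Dim<Dim<Dim<T, N1>, N2>, N3>& a,
              std::size_t i3, std::size_t i2, std::size_t i1) {
    return a[i3][i2][i1];
}
```

As in the diff, only `size_t` indices are declared and ordinary implicit conversion handles other integer arguments. Supporting both a signed and an unsigned index type at every position without a generic mechanism would need one overload per combination, which is the 2^D growth the comment mentions.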
