Changeset e1107ec for doc


Ignore:
Timestamp:
Nov 7, 2024, 10:11:31 AM (6 weeks ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
04c6340
Parents:
174a11a
Message:

proofread section Array lifecycle

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/array.tex

    r174a11a re1107ec  
    10911091
    10921092
    1093 
    10941093\section{Array lifecycle}
    10951094
    1096 An array is an aggregate, like a struct.
    1097 Both wrap subordinate objects.
    1098 An arbitrary other type, like @string@, can be used as a struct member or an array element.
    1099 When it is, that type's lifecycle assumptions must be respected.
    1100 In the case of @string@, it would be disasterous to start calling string-management functions on uninitialized storage.
    1101 The @string@ type provides its constructors and destructor;
    1102 \CFA is supposed to provide a reasonable assurance that they will be called, even if the string is wrapped in something else, even if the user-programmer isn't thinking about it.
    1103 
    1104 Preexisting \CFA mechanisms achieve this requirement in correctness, but not with reasonable performance.
    1105 Furthermore, advanced users of arrays need an exception to the bluntly-stated requirement, which does not come up when working with other aggregates.
    1106 So the \CFA array has subtleties in its support for element lifecycle.
    1107 
    1108 The preexisting \CFA support for contained-element lifecycle is based on a recursive implementation of the object-type (otype) pseudo-trait.
    1109 A type is an otype if it provides a parameterless constructor, copy constructor, assignment operator and destructor.
    1110 When declaring a struct whose members are all otypes, the compiler generates implementations of the four otype functions for the outer struct.
    1111 The generated parameterless constructor for the outer struct calls the parameterless constructor for each memeber, and the other otype functions work similarly.
    1112 For a member that is a C array, these calls occur in loops, which work correctly for dynamic bounds.
    1113 This logic works the same, whether the member is a concrete type (that happens to be an otype) or if the member is a polymorphic type variable asserted to be an otype (which is implicit in the default syntax, @forall(T)@).
     1095An array is an aggregate, like a structure;
     1096both are containers wrapping subordinate objects.
     1097Any arbitrary object type, like @string@, can be an array element or structure member.
     1098A consequence is that the lifetime of the container must match with its subordinate objects.
     1099For example, all elements and members must be initialized/uninitialized implicitly as part of the container's allocation/deallocation.
     1100Modern programming languages implicitly perform these operations via a type's constructor and destructor.
     1101Therefore, \CFA must assure these lifetime operations are called, regardless of the lexical nesting depth of containers and their components.
     1102
     1103Preexisting \CFA mechanisms achieve this requirement, but with poor performance (discussed shortly).
     1104Furthermore, advanced array users need an exception to the basic mechanism, which does not occur with other aggregates.
     1105Hence, there are subtleties in supporting an element's lifecycle with arrays.
     1106
     1107The preexisting \CFA support for contained-element lifecycle is based on a recursive implementation of the object-type (@otype@) pseudo-trait.
     1108A type is an @otype@, if it provides a parameterless (default) constructor, copy constructor, assignment operator, and destructor (like \CC).
     1109When declaring a structure with @otype@ members, the compiler implicitly generates implementations of the four @otype@ functions for the outer structure.
     1110Then the generated default constructor for the outer structure calls the default constructor for each member, and the other @otype@ functions work similarly.
     1111For a member that is a C array, these calls occur in a loop for each array element, which even for VLAs.
     1112This logic works the same, whether the member is a concrete type (that happens to be an @otype@) or if the member is a polymorphic type asserted to be an @otype@ (which is implicit in the syntax, @forall(T)@).
    11141113The \CFA array has the simplified form (similar to one seen before):
    11151114\begin{cfa}
    1116 forall( T* )   // non-otype element, no lifecycle functions
     1115forall( T * )   // non-otype element, no lifecycle functions
    11171116// forall( T )  // otype element, lifecycle functions asserted
    11181117struct array5 {
     
    11201119};
    11211120\end{cfa}
    1122 Being a struct with a C-array member, using the otype-form declaration for @T@ causes @array5(float)@ to implement otype too.
    1123 
    1124 But this otype-recursion pattern has a performance issue.
    1125 A cube of @float@ is, in the running simplification:
     1121Being a structure with a C-array member, using the otype-form declaration for @T@ causes @array5(float)@ to implement @otype@ too.
     1122
     1123But this @otype@-recursion pattern has a performance issue.
     1124For example, in a cube of @float@:
    11261125\begin{cfa}
    11271126array5( array5( array5( float ) ) )
    11281127\end{cfa}
    1129 Its lifecycle functions, under the otype-recursion pattern, would be:
    1130 \begin{cfa}
    1131 TODO: rework this crude exposition into a readable figure
    1132 
     1128its lifecycle functions, under the @otype@-recursion pattern, is shown in \VRef[Figure]{}, where
     1129\begin{cfa}
     1130array5(array5(array5(float)))
     1131\end{cfa}
     1132has 256 leaves.
     1133
     1134%array5(T) offers
     1135%1 parameterless ctor, which asks for T to have
     1136%       1 parameterless ctor
     1137%       2 copy ctor
     1138%       3 asgt
     1139%       4 dtor
     1140%2 copy ctor, which asks for T to have
     1141%       1 parameterless ctor
     1142%       2 copy ctor
     1143%       3 asgt
     1144%       4 dtor
     1145%3 asgt, which asks for T to have
     1146%       1 parameterless ctor
     1147%       2 copy ctor
     1148%       3 asgt
     1149%       4 dtor
     1150%4 dtor, which asks for T to have
     1151%       1 parameterless ctor
     1152%       2 copy ctor
     1153%       3 asgt
     1154%       4 dtor
     1155
     1156\begin{figure}
     1157\centering
     1158\setlength{\tabcolsep}{15pt}
     1159\begin{tabular}{@{}lll@{}}
     1160\begin{cfa}[deletekeywords={default}]
    11331161float offers
    1134 1 parameterless ctor
     11621 default ctor
    113511632 copy ctor
    113611643 asgt
    113711654 dtor
    11381166
    1139 array5(T) offers
    1140 1 parameterless ctor, which asks for T to have
    1141         1 parameterless ctor
     1167
     1168
     1169
     1170
     1171
     1172
     1173
     1174
     1175
     1176
     1177
     1178
     1179
     1180
     1181
     1182
     1183
     1184
     1185
     1186
     1187
     1188
     1189
     1190\end{cfa}
     1191&
     1192\begin{cfa}[deletekeywords={default}]
     1193array5(float) has
     11941 default ctor, using float's
     1195        1 default ctor
    11421196        2 copy ctor
    11431197        3 asgt
    11441198        4 dtor
    1145 2 copy ctor, which asks for T to have
    1146         1 parameterless ctor
     11992 copy ctor, using float's
     1200        1 default ctor
    11471201        2 copy ctor
    11481202        3 asgt
    11491203        4 dtor
    1150 3 asgt, which asks for T to have
    1151         1 parameterless ctor
     12043 asgt, using float's
     1205        1 default ctor
    11521206        2 copy ctor
    11531207        3 asgt
    11541208        4 dtor
    1155 4 dtor, which asks for T to have
    1156         1 parameterless ctor
     12094 dtor, using float's
     1210        1 default ctor
    11571211        2 copy ctor
    11581212        3 asgt
    11591213        4 dtor
    11601214
    1161 array5(float) has
    1162 1 parameterless ctor, which gets float's
    1163         1 parameterless ctor
    1164         2 copy ctor
    1165         3 asgt
    1166         4 dtor
    1167 2 copy ctor, which gets float's
    1168         1 parameterless ctor
    1169         2 copy ctor
    1170         3 asgt
    1171         4 dtor
    1172 3 asgt, which gets float's
    1173         1 parameterless ctor
    1174         2 copy ctor
    1175         3 asgt
    1176         4 dtor
    1177 4 dtor, which gets float's
    1178         1 parameterless ctor
    1179         2 copy ctor
    1180         3 asgt
    1181         4 dtor
    1182 
     1215
     1216
     1217
     1218
     1219
     1220
     1221
     1222\end{cfa}
     1223&
     1224\begin{cfa}[deletekeywords={default}]
    11831225array5(array5(float)) has
    1184 1 parameterless ctor, which gets array5(float)'s
    1185         1 parameterless ctor, which gets float's
    1186                 1 parameterless ctor
     12261 default ctor, using array5(float)'s
     1227        1 default ctor, using float's
     1228                1 default ctor
    11871229                2 copy ctor
    11881230                3 asgt
    11891231                4 dtor
    1190         2 copy ctor, which gets float's
    1191                 1 parameterless ctor
     1232        2 copy ctor, using float's
     1233                1 default ctor
    11921234                2 copy ctor
    11931235                3 asgt
    11941236                4 dtor
    1195         3 asgt, which gets float's
    1196                 1 parameterless ctor
     1237        3 asgt, using float's
     1238                1 default ctor
    11971239                2 copy ctor
    11981240                3 asgt
    11991241                4 dtor
    1200         4 dtor, which gets float's
    1201                 1 parameterless ctor
     1242        4 dtor, using float's
     1243                1 default ctor
    12021244                2 copy ctor
    12031245                3 asgt
    12041246                4 dtor
    1205 2 copy ctor, which gets array5(float)'s
     12472 copy ctor, using array5(float)'s
    12061248        ... 4 children, 16 leaves
    1207 3 asgt, which gets array5(float)'s
     12493 asgt, using array5(float)'s
    12081250        ... 4 children, 16 leaves
    1209 4 dtor, which gets array5(float)'s
     12514 dtor, using array5(float)'s
    12101252        ... 4 children, 16 leaves
    12111253(64 leaves)
    1212 
    1213 array5(array5(array5(float))) has
    1214 ...(256 leaves)
    1215 \end{cfa}
     1254\end{cfa}
     1255\end{tabular}
     1256\caption{write a caption}
     1257\end{figure}
    12161258
    12171259Each xxx corresponds to a generated thunk.
    1218 So the otype-recursion pattern generates a quantity of thunks that is exponential in the number of dimensions.
    1219 Anecdotal experience with this solutuion found the resulting compile times annoying at three dimensions, and unusable at four.
    1220 
    1221 But the otype-recursion pattern uses more assertions than are really needed.
    1222 Consider how @array5(float)@'s parameterless constructor is getting all four lifecycle assertions about the element type, @float@.
    1223 It only needs @float@'s parameterless constructor; it is getting a destructor (among others) and never using it.
    1224 The same idea applies to the copy constructor and the destructor.
    1225 For the assignment operator, it turns out that today's RAII pattern has it using a copy constructor, assignment operator and destructor.
    1226 Concurrent work on the \CFA team aims to improve this whole situation.
    1227 A workaround is needed for now.
    1228 
    1229 The workaround is to provide parameterless constructor, copy constructor and destructor, defined with lean, bespoke assertions:
    1230 
    1231 \noindent
     1260So the @otype@-recursion pattern generates a quantity of thunks that is exponential in the number of dimensions.
     1261Anecdotal experience with this solution found the resulting compile times annoyingly slow at three dimensions, and unusable at four.
     1262
     1263The issue is that the otype-recursion pattern uses more assertions than needed.
     1264Consider how @array5(float)@'s default constructor is getting all four lifecycle assertions about the element type, @float@.
     1265It only needs @float@'s default constructor;
     1266all the operations are never used.
     1267%For the assignment operator, it turns out that today's RAII pattern has it using a copy constructor, assignment operator and destructor.
     1268Current by the \CFA team aims to improve this situation.
     1269Therefore, a workaround is needed for now.
     1270
     1271The workaround is to provide a default constructor, copy constructor and destructor, defined with lean, bespoke assertions:
     1272\begin{cquote}
    12321273\begin{tabular}{@{}l@{\hspace{0.5in}}l@{}}
    12331274\begin{cfa}
     
    12631304\end{cfa}
    12641305\end{tabular}
     1306\end{cquote}
    12651307Moreover, the assignment operator is skipped, because array assignment is not a common operation.
    12661308This way, the critical lifecycle functions are available, with no growth in thunk creation.
    12671309
    1268 Finally, the intuition that a programmer using an array always wants element constructors called \emph{automatically} is simplistic.
     1310Finally, the intuition that a programmer using an array always wants element's default constructor called \emph{automatically} is simplistic.
    12691311Arrays exist to store different values at each position.
    1270 So, an array initialization, in general, needs to let the prorammer provide different constructor arguments to each element.
    1271 \begin{cfa}
    1272 thread worker;
    1273 void ?{}( worker & ) = void;
    1274 void ?{}( worker &, int worker_num );
    1275 array(worker, 5) ws @= ???@;
    1276 for (i; 5) ( ws[i] ){ @i@ };
    1277 \end{cfa}
     1312So, array initialization needs to let the programmer provide different constructor arguments to each element.
     1313\begin{cfa}
     1314thread worker { int id; };
     1315void ?{}( worker & ) = void; // remove default constructor
     1316void ?{}( worker &, int id );
     1317array( worker, 5 ) ws = {}; // no initialization
     1318for (i; 5) (ws[i]){ @i@ }; // explicitly initialize each thread's id
     1319\end{cfa}
     1320Note the ability to explicitly call a constructor in \CFA, unlike in \CC.
    12781321
    12791322Two \CFA features may, at first, seem viable for initializing the array @ws@, but on closer inspection, they are not.
    12801323\begin{itemize}
    12811324\item
    1282         Empty initializer / default constructor: \lstinline{array(worker, 5) ws;}.
    1283         This option does not work with the semantics of a thread: it has forked as soon as the constructor returns.
    1284         The type does not provide a parameterless constructor because it is not possible to fork the thread until the @worker_num@ parameter is received.
    1285 \item
    1286         C initialization: \lstinline{array(worker, 5) ws AT EQUALS OPEN CLOSE;} (TODO: typeset at-equals).
     1325        Empty initializer / elided default constructor.
     1326        The problem is that a thread type needs to fork a thread at the \emph{end} of each constructor and join with a thread at the \emph{start} of each destructor.
     1327        Therefore, there is a conflict between the implicit actions by a builtin @thread@ type and a user's desire to remove actions with respect to allocation/deallocation.
     1328\item
     1329        C initialization: \lstinline|array(worker, 5) ws @= {};|, which ignores all \CFA initialization and uses C empty initialization.
    12871330        This option does achieve the desired semantics on the construction side.
    1288         But destruction side, the desired semantics is for implicit destructor calls to continue, because with threads, the destructor is a join operation.
    1289         C initialization disables \emph{all} implicit lifecycle management; here, the goal is to disable only the implicit construction.
     1331        But on destruction side, the desired semantics is for implicit destructor calls to continue for the join operation.
     1332        C initialization disables \emph{all} implicit lifecycle management; whereas, the goal is to disable only the implicit construction.
    12901333\end{itemize}
    12911334
    1292 So, I enhanced the \CFA standard library to provide the missing semantics, available in either form:
    1293 \begin{cfa}
    1294 array(@uninit@(worker), 5) ws1;
    1295 array(worker, 5) ws2 = { @delay_init@ };
     1335To fix this problem, I enhanced the \CFA standard library to provide the missing semantics, available in either form:
     1336\begin{cfa}
     1337array( @uninit@(worker), 5 ) ws1;
     1338array( worker, 5) ws2 = { @delay_init@ };
    12961339\end{cfa}
    12971340Both cause the @ws@-construction-time implicit call chain to stop before reaching a @worker@ constructor, while leaving the implicit destruction calls intact.
    12981341Two forms are available, to parallel the development of this feature in \uCpp.
    1299 Originally \uCpp offered only the @ws1@ form, there with the class-tempate @uNoCtor@ replacing \CFA's @uninit@.
    1300 More recently, \uCpp got a declaration-helping macro, @uArray@, whose usage similar to the @ws2@ case.
    1301 Based on results of piloting @uArray@ (as a replacement of @uNoCtor@), future work may choose to remove one of the options.
     1342Originally \uCpp offered only the @ws1@ form, using the class-template @uNoCtor@ equivalent to \CFA's @uninit@.
     1343More recently, \uCpp was extended with declaration macro, @uArray@, with usage similar to the @ws2@ case.
     1344Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it might be possible to remove the first option.
    13021345
    13031346% note to Mike, I have more fragments on some details available in push24\fragments\uNoCtor.txt
Note: See TracChangeset for help on using the changeset viewer.