Changeset 7493339 for doc/rob_thesis/ctordtor.tex
- Timestamp:
- Apr 3, 2017, 7:04:30 PM
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
- Children:
- fbd7ad6
- Parents:
- ae6cc8b
- File:
- 1 edited
doc/rob_thesis/ctordtor.tex
rae6cc8b r7493339
  2 2 \chapter{Constructors and Destructors}
  3 3 %======================================================================
- 4
- 5 % TODO: discuss move semantics; they haven't been implemented, but could be. Currently looking at alternative models. (future work)
  6 4
  7 5 % TODO: as an experiment, implement Andrei Alexandrescu's ScopeGuard http://www.drdobbs.com/cpp/generic-change-the-way-you-write-excepti/184403758?pgno=2
… …
  553 551 % // and so on
  554 552
- 555
- 556
- 557 % TODO: talk somewhere about compound literals?
- 558
  559 553 Since \CFA is a true systems language, it does not provide a garbage collector.
- 560 As well, \CFA is not an object-oriented programming language, i.e. structures cannot have routine members.
+ 554 As well, \CFA is not an object-oriented programming language, i.e., structures cannot have routine members.
  561 555 Nevertheless, one important goal is to reduce programming complexity and increase safety.
  562 556 To that end, \CFA provides support for implicit pre/post-execution of routines for objects, via constructors and destructors.
- 563
- 564 % TODO: this is old. remove or refactor
- 565 % Manual resource management is difficult.
- 566 % Part of the difficulty results from not having any guarantees about the current state of an object.
- 567 % Objects can be internally composed of pointers that may reference resources which may or may not need to be manually released, and keeping track of that state for each object can be difficult for the end user.
- 568
- 569 % Constructors and destructors provide a mechanism to bookend the lifetime of an object, allowing the designer of a type to establish invariants for objects of that type.
- 570 % Constructors guarantee that object initialization code is run before the object can be used, while destructors provide a mechanism that is guaranteed to be run immediately before an object's lifetime ends.
- 571 % Constructors and destructors can help to simplify resource management when used in a disciplined way.
- 572 % In particular, when all resources are acquired in a constructor, and all resources are released in a destructor, no resource leaks are possible.
- 573 % This pattern is a popular idiom in several languages, such as \CC, known as RAII (Resource Acquisition Is Initialization).
  574 557
  575 558 This chapter details the design of constructors and destructors in \CFA, along with their current implementation in the translator.
… …
  592 575 Next, @x@ is assigned the value of @y@.
  593 576 In the last line, @z@ is implicitly initialized to 0 since it is marked @static@.
- 594 The key difference between assignment and initialization being that assignment occurs on a live object (i.e. an object that contains data).
+ 577 The key difference between assignment and initialization is that assignment occurs on a live object (i.e., an object that contains data).
  595 578 It is important to note that this means @x@ could have been used uninitialized prior to being assigned, while @y@ could not be used uninitialized.
- 596 Use of uninitialized variables yields undefined behaviour, which is a common source of errors in C programs. % TODO: *citation*
- 597
- 598 Declaration initialization is insufficient, because it permits uninitialized variables to exist and because it does not allow for the insertion of arbitrary code before the variable is live.
- 599 Many C compilers give good warnings most of the time, but they cannot in all cases.
- 600 \begin{cfacode}
- 601 int f(int *); // never reads the parameter, only writes
- 602 int g(int *); // reads the parameter - expects an initialized variable
+ 579 Use of uninitialized variables yields undefined behaviour, which is a common source of errors in C programs.
+ 580
+ 581 Declaration initialization is insufficient, because it permits uninitialized variables to exist and because it does not allow for the insertion of arbitrary code before a variable is live.
+ 582 Many C compilers give good warnings for uninitialized variables most of the time, but they cannot in all cases.
+ 583 \begin{cfacode}
+ 584 int f(int *); // output parameter: never reads, only writes
+ 585 int g(int *); // input parameter: never writes, only reads,
+ 586 // so requires initialized variable
  603 587
  604 588 int x, y;
  605 589 f(&x); // okay - only writes to x
- 606 g(&y); // will use y uninitialized
- 607 \end{cfacode}
- 608 Other languages are able to give errors in the case of uninitialized variable use, but due to backwards compatibility concerns, this cannot be the case in \CFA.
+ 590 g(&y); // uses y uninitialized
+ 591 \end{cfacode}
+ 592 Other languages are able to give errors in the case of uninitialized variable use, but due to backwards compatibility concerns, this is not the case in \CFA.
  609 593
  610 594 In C, constructors and destructors are often mimicked by providing routines that create and tear down objects, where the teardown function is typically only necessary if the type modifies the execution environment.
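The create/teardown idiom described in the last sentence above can be sketched in plain C. This is a minimal illustration, not code from the thesis: the type `int_buffer` and the routines `int_buffer_create`, `int_buffer_destroy`, and `int_buffer_demo` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource-owning type (illustration only). */
struct int_buffer {
    int *data;
    size_t len;
};

/* "Constructor" routine: returns a fully initialized object by value. */
struct int_buffer int_buffer_create(size_t len) {
    struct int_buffer b = { calloc(len, sizeof(int)), len };
    if (b.data == NULL) b.len = 0;      /* allocation failed: empty buffer */
    return b;
}

/* "Destructor" routine: releases the resource and resets the fields. */
void int_buffer_destroy(struct int_buffer *b) {
    free(b->data);
    b->data = NULL;
    b->len = 0;
}

/* Usage: the language does not force the two calls to be paired. */
int int_buffer_demo(void) {
    struct int_buffer b = int_buffer_create(4);
    int ok = (b.len == 4 && b.data[3] == 0);
    int_buffer_destroy(&b);             /* omitting this call leaks memory */
    return ok && b.data == NULL;
}
```

Nothing in C ties `int_buffer_destroy` to the end of the object's lifetime, which is the gap that implicit constructor and destructor calls are meant to close.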
… …
  614 598 };
  615 599 struct array_int create_array(int sz) {
- 616 return (struct array_int) { malloc(sizeof(int)*sz) };
+ 600 return (struct array_int) { calloc(sz, sizeof(int)) };
  617 601 }
  618 602 void destroy_rh(struct resource_holder * rh) {
… …
  639 623
  640 624 In \CFA, a constructor is a function with the name @?{}@.
+ 625 Like other operators in \CFA, the name represents the syntax used to call the constructor, e.g., @struct S s = { ... };@.
  641 626 Every constructor must have a return type of @void@ and at least one parameter, the first of which is colloquially referred to as the \emph{this} parameter, as in many object-oriented programming-languages (however, a programmer can give it an arbitrary name).
  642 627 The @this@ parameter must have a pointer type, whose base type is the type of object that the function constructs.
… …
  655 640
  656 641 In C, if the user creates an @Array@ object, the fields @data@ and @len@ are uninitialized, unless an explicit initializer list is present.
- 657 It is the user's responsibility to remember to initialize both of the fields to sensible values.
+ 642 It is the user's responsibility to remember to initialize both of the fields to sensible values, since there are no implicit checks for invalid values or reasonable defaults.
  658 643 In \CFA, the user can define a constructor to handle initialization of @Array@ objects.
  659 644
… …
  671 656 This constructor initializes @x@ so that its @length@ field has the value 10, and its @data@ field holds a pointer to a block of memory large enough to hold 10 @int@s, and sets the value of each element of the array to 0.
  672 657 This particular form of constructor is called the \emph{default constructor}, because it is called on an object defined without an initializer.
- 673 In other words, a default constructor is a constructor that takes a single argument, the @this@ parameter.
+ 658 In other words, a default constructor is a constructor that takes a single argument: the @this@ parameter.
  674 659
  675 660 In \CFA, a destructor is a function much like a constructor, except that its name is \lstinline!^?{}!.
… …
  680 665 }
  681 666 \end{cfacode}
- 682 Since the destructor is automatically called at deallocation for all objects of type @Array@, the memory associated with an @Array@ is automatically freed when the object's lifetime ends.
+ 667 The destructor is automatically called at deallocation for all objects of type @Array@.
+ 668 Hence, the memory associated with an @Array@ is automatically freed when the object's lifetime ends.
  683 669 The exact guarantees made by \CFA with respect to the calling of destructors are discussed in section \ref{sub:implicit_dtor}.
  684 670
… …
  691 677 \end{cfacode}
  692 678 By the previous definition of the default constructor for @Array@, @x@ and @y@ are initialized to valid arrays of length 10 after their respective definitions.
- 693 On line 3, @z@ is initialized with the value of @x@, while on line @4@, @y@ is assigned the value of @x@.
+ 679 On line 2, @z@ is initialized with the value of @x@, while on line 3, @y@ is assigned the value of @x@.
  694 680 The key distinction between initialization and assignment is that a value to be initialized does not hold any meaningful values, whereas an object to be assigned might.
  695 681 In particular, these cases cannot be handled the same way because in the former case @z@ does not currently own an array, while @y@ does.
… …
  712 698 The first function is called a \emph{copy constructor}, because it constructs its argument by copying the values from another object of the same type.
  713 699 The second function is the standard copy-assignment operator.
- 714 These four functions are special in that they control the state of most objects.
+ 700 The four functions (default constructor, destructor, copy constructor, and assignment operator) are special in that they safely control the state of most objects.
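To make the distinction between copy construction and assignment concrete, the four operations can be written by hand in plain C. This is only a sketch under stated assumptions: the routine names (`array_ctor`, `array_dtor`, `array_copy_ctor`, `array_assign`, `array_demo`) and the helper `xalloc` are hypothetical, and the calls that \CFA would insert implicitly are written out explicitly here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Array { int *data; int len; };

/* Hypothetical helper: zeroed allocation that aborts on failure. */
static int *xalloc(int n) {
    int *p = calloc((size_t)n, sizeof(int));
    if (p == NULL) abort();
    return p;
}

/* Default constructor: the target holds no meaningful value on entry. */
void array_ctor(struct Array *a) { a->len = 10; a->data = xalloc(a->len); }

/* Destructor: releases the storage owned by the object. */
void array_dtor(struct Array *a) { free(a->data); a->data = NULL; a->len = 0; }

/* Copy constructor: the target owns nothing yet, so only allocate and copy. */
void array_copy_ctor(struct Array *a, const struct Array *other) {
    a->len = other->len;
    a->data = xalloc(a->len);
    memcpy(a->data, other->data, sizeof(int) * (size_t)a->len);
}

/* Assignment: the target is live, so its old storage is released first. */
void array_assign(struct Array *a, const struct Array *other) {
    if (a == other) return;             /* guard against self-assignment */
    array_dtor(a);
    array_copy_ctor(a, other);
}

int array_demo(void) {
    struct Array x, y, z;
    array_ctor(&x);
    array_ctor(&y);
    x.data[0] = 42;
    array_copy_ctor(&z, &x);            /* corresponds to: Array z = x; */
    array_assign(&y, &x);               /* corresponds to: y = x;       */
    int ok = z.data != x.data && y.data != x.data
          && z.data[0] == 42 && y.data[0] == 42;
    array_dtor(&z);
    array_dtor(&y);
    array_dtor(&x);
    return ok;
}
```

Skipping the `array_dtor` call inside `array_assign` would leak the array previously owned by `y`, while calling it inside `array_copy_ctor` would free an uninitialized pointer; this is why the two cases cannot share one implementation.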
  715 701
  716 702 It is possible to define a constructor that takes any combination of parameters to provide additional initialization options.
… …
  729 715 Array x, y = { 20, 0xdeadbeef }, z = y;
  730 716 \end{cfacode}
+ 717
  731 718 In \CFA, constructor calls look just like C initializers, which allows them to be inserted into legacy C code with minimal code changes, and also provides a very simple syntax that veteran C programmers are familiar with.
  732 719 One downside of reusing C initialization syntax is that it isn't possible to determine whether an object is constructed just by looking at its declaration, since that requires knowledge of whether the type is managed at that point.
… …
  748 735 Destructors are implicitly called in reverse declaration-order so that objects with dependencies are destructed before the objects they are dependent on.
  749 736
- 750 \subsection{Syntax}
- 751 \label{sub:syntax} % TODO: finish this section
+ 737 \subsection{Calling Syntax}
+ 738 \label{sub:syntax}
  752 739 There are several ways to construct an object in \CFA.
  753 740 As previously introduced, every variable is automatically constructed at its definition, which is the most natural way to construct an object.
… …
  773 760 A * y = malloc(); // copy construct: ?{}(&y, malloc())
  774 761
- 775 ?{}(&x); // explicit construct x
- 776 ?{}(y, x); // explit construct y from x
- 777 ^?{}(&x); // explicit destroy x
+ 762 ?{}(&x); // explicit construct x, second construction
+ 763 ?{}(y, x); // explicit construct y from x, second construction
+ 764 ^?{}(&x); // explicit destroy x, in different order
  778 765 ^?{}(y); // explicit destroy y
  779 766
… …
  781 768 // implicit ^?{}(&x);
  782 769 \end{cfacode}
- 783 Calling a constructor or destructor directly is a flexible feature that allows complete control over the management of a piece of storage.
+ 770 Calling a constructor or destructor directly is a flexible feature that allows complete control over the management of storage.
  784 771 In particular, constructors double as a placement syntax.
  785 772 \begin{cfacode}
… …
  804 791 Finally, constructors and destructors support \emph{operator syntax}.
  805 792 Like other operators in \CFA, the function name mirrors the use-case, in that the first $N$ arguments fill in the place of the question mark.
+ 793 This syntactic form is similar to the new initialization syntax in \CCeleven, except that it is used in expression contexts, rather than declaration contexts.
  806 794 \begin{cfacode}
  807 795 struct A { ... };
… …
  822 810 Destructor operator syntax is actually a statement, and requires parentheses for symmetry with constructor syntax.
  823 811
+ 812 One of these three syntactic forms should appeal to either C or \CC programmers using \CFA.
+ 813
  824 814 \subsection{Function Generation}
  825 815 In \CFA, every type is defined to have the core set of four functions described previously.
… …
  833 823 There are several options for user-defined types: structures, unions, and enumerations.
  834 824 To aid in ease of use, the standard set of four functions is automatically generated for a user-defined type after its definition is completed.
- 835 By auto-generating these functions, it is ensured that legacy C code will continue to work correctly in every context where \CFA expects these functions to exist, since they are generated for every complete type.
+ 825 By auto-generating these functions, it is ensured that legacy C code continues to work correctly in every context where \CFA expects these functions to exist, since they are generated for every complete type.
  836 826
  837 827 The generated functions for enumerations are the simplest.
  838 828 Since enumerations in C are essentially just another integral type, the generated functions behave in the same way that the builtin functions for the basic types work.
- 839 % TODO: examples for enums
  840 829 For example, given the enumeration
  841 830 \begin{cfacode}
… …
  860 849 \end{cfacode}
  861 850 In the future, \CFA will introduce strongly-typed enumerations, like those in \CC.
- 862 The existing generated routines will be sufficient to express this restriction, since they are currently set up to take in values of that enumeration type.
+ 851 The existing generated routines are sufficient to express this restriction, since they are currently set up to take in values of that enumeration type.
  863 852 Changes related to this feature only need to affect the expression resolution phase, where more strict rules will be applied to prevent implicit conversions from integral types to enumeration types, but should continue to permit conversions from enumeration types to @int@.
- 864 In this way, it will still be possible to add an @int@ to an enumeration, but the resulting value will be an @int@, meaning that it won't be possible to reassign the value into an enumeration without a cast.
+ 853 In this way, it is still possible to add an @int@ to an enumeration, but the resulting value is an @int@, meaning it cannot be reassigned to an enumeration without a cast.
  865 854
  866 855 For structures, the situation is more complicated.
- 867 For a structure @S@ with members @M$_0$@, @M$_1$@, ... @M$_{N-1}$@, each function @f@ in the standard set calls \lstinline{f(s->M$_i$, ...)} for each @$i$@.
- 868 That is, a default constructor for @S@ default constructs the members of @S@, the copy constructor with copy construct them, and so on.
- 869 For example given the struct definition
+ 856 Given a structure @S@ with members @M$_0$@, @M$_1$@, ... @M$_{N-1}$@, each function @f@ in the standard set calls \lstinline{f(s->M$_i$, ...)} for each @$i$@.
+ 857 That is, a default constructor for @S@ default constructs the members of @S@, the copy constructor copy constructs them, and so on.
+ 858 For example, given the structure definition
  870 859 \begin{cfacode}
  871 860 struct A {
… …
  893 882 }
  894 883 \end{cfacode}
- 895 It is important to note that the destructors are called in reverse declaration order to resolve conflicts in the event there are dependencies among members.
+ 884 It is important to note that the destructors are called in reverse declaration order to prevent conflicts in the event there are dependencies among members.
  896 885
  897 886 In addition to the standard set, a set of \emph{field constructors} is also generated for structures.
- 898 The field constructors are constructors that consume a prefix of the struct's member list.
+ 887 The field constructors are constructors that consume a prefix of the structure's member-list.
  899 888 That is, $N$ constructors are built of the form @void ?{}(S *, T$_{\text{M}_0}$)@, @void ?{}(S *, T$_{\text{M}_0}$, T$_{\text{M}_1}$)@, ..., @void ?{}(S *, T$_{\text{M}_0}$, T$_{\text{M}_1}$, ..., T$_{\text{M}_{N-1}}$)@, where members are copy constructed if they have a corresponding positional argument and are default constructed otherwise.
- 900 The addition of field constructors allows structs in \CFA to be used naturally in the same ways that they could be used in C (i.e. to initialize any prefix of the struct), e.g., @A a0 = { b }, a1 = { b, c }@.
+ 889 The addition of field constructors allows structures in \CFA to be used naturally in the same ways as used in C (i.e., to initialize any prefix of the structure), e.g., @A a0 = { b }, a1 = { b, c }@.
  901 890 Extending the previous example, the following constructors are implicitly generated for @A@.
  902 891 \begin{cfacode}
… …
  911 900 \end{cfacode}
  912 901
- 913 For unions, the default constructor and destructor do nothing, as it is not obvious which member if any should be constructed.
+ 902 For unions, the default constructor and destructor do nothing, as it is not obvious which member, if any, should be constructed.
  914 903 For copy constructor and assignment operations, a bitwise @memcpy@ is applied.
  915 904 In standard C, a union can also be initialized using a value of the same type as its first member, and so a corresponding field constructor is generated to perform a bitwise @memcpy@ of the object.
… …
  947 936
  948 937 % This feature works in the \CFA model, since constructors are simply special functions and can be called explicitly, unlike in \CC. % this sentence isn't really true => placement new
- 949 In \CCeleven, this restriction has been loosened to allow unions with managed members, with the caveat that any if there are any members with a user-defined operation, then that operation is not implicitly defined, forcing the user to define the operation if necessary.
+ 938 In \CCeleven, unions may have managed members, with the caveat that if there are any members with a user-defined operation, then that operation is not implicitly defined, forcing the user to define the operation if necessary.
  950 939 This restriction could easily be added into \CFA once \emph{deleted} functions are added.
  951 940
… …
  970 959 Here, @&s@ and @&s2@ are cast to unqualified pointer types.
  971 960 This mechanism allows the same constructors and destructors to be used for qualified objects as for unqualified objects.
- 972 Since this applies only to implicitly generated constructor calls, the language does not allow qualified objects to be re-initialized with a constructor without an explicit cast.
+ 961 This applies only to implicitly generated constructor calls.
+ 962 Hence, explicitly re-initializing qualified objects with a constructor requires an explicit cast.
+ 963
+ 964 As discussed in Section \ref{sub:c_background}, compound literals create unnamed objects.
+ 965 This mechanism can continue to be used seamlessly in \CFA with managed types to create temporary objects.
+ 966 The object created by a compound literal is constructed using the provided brace-enclosed initializer-list, and is destructed at the end of the scope it is used in.
+ 967 For example,
+ 968 \begin{cfacode}
+ 969 struct A { int x; };
+ 970 void ?{}(A *, int, int);
+ 971 {
+ 972 int x = (A){ 10, 20 }.x;
+ 973 }
+ 974 \end{cfacode}
+ 975 is equivalent to
+ 976 \begin{cfacode}
+ 977 struct A { int x, y; };
+ 978 void ?{}(A *, int, int);
+ 979 {
+ 980 A _tmp;
+ 981 ?{}(&_tmp, 10, 20);
+ 982 int x = _tmp.x;
+ 983 ^?{}(&_tmp);
+ 984 }
+ 985 \end{cfacode}
  973 986
  974 987 Unlike \CC, \CFA provides an escape hatch that allows a user to decide at an object's definition whether it should be managed or not.
… …
  984 997 A a2 @= { 0 }; // unmanaged
  985 998 \end{cfacode}
- 986 In this example, @a1@ is a managed object, and thus is default constructed and destructed at the end of @a1@'s lifetime, while @a2@ is an unmanaged object and is not implicitly constructed or destructed.
- 987 Instead, @a2->x@ is initialized to @0@ as if it were a C object, due to the explicit initializer.
- 988 Existing constructors are ignored when \ateq is used, so that any valid C initializer is able to initialize the object.
+ 999 In this example, @a1@ is a managed object, and thus is default constructed and destructed at the start/end of @a1@'s lifetime, while @a2@ is an unmanaged object and is not implicitly constructed or destructed.
+ 1000 Instead, @a2.x@ is initialized to @0@ as if it were a C object, because of the explicit initializer.
  989 1001
  990 1002 In addition to freedom, \ateq provides a simple path to migrating legacy C code to Cforall, in that objects can be moved from C-style initialization to \CFA gradually and individually.
… …
  992 1004 It is recommended that most objects be managed by sensible constructors and destructors, except where absolutely necessary.
  993 1005
- 994 When the user declares any constructor or destructor, the corresponding intrinsic/generated function and all field constructors for that type are hidden, so that they will not be found during expression resolution unless the user-defined function goes out of scope.
- 995 Furthermore, if the user declares any constructor, then the intrinsic/generated default constructor is also hidden, making it so that objects of a type may not be default constructable.
- 996 This closely mirrors the rule for implicit declaration of constructors in \CC, wherein the default constructor is implicitly declared if there is no user-declared constructor. % TODO: cite C++98 page 186??
+ 1006 When a user declares any constructor or destructor, the corresponding intrinsic/generated function and all field constructors for that type are hidden, so that they are not found during expression resolution until the user-defined function goes out of scope.
+ 1007 Furthermore, if the user declares any constructor, then the intrinsic/generated default constructor is also hidden, precluding default construction.
+ 1008 These semantics closely mirror the rule for implicit declaration of constructors in \CC, wherein the default constructor is implicitly declared if there is no user-declared constructor \cite[p.~186]{ANSI98:C++}.
  997 1009 \begin{cfacode}
  998 1010 struct S { int x, y; };
… …
  1001 1013 S s0, s1 = { 0 }, s2 = { 0, 2 }, s3 = s2; // okay
  1002 1014 {
- 1003 void ?{}(S * s, int i) { s->x = i*2; }
+ 1015 void ?{}(S * s, int i) { s->x = i*2; } // locally hide autogen constructors
  1004 1016 S s4; // error
  1005 1017 S s5 = { 3 }; // okay
… …
  1058 1070 } // z, y, w implicitly destructed, in this order
  1059 1071 \end{cfacode}
- 1060 If at any point, the @this@ parameter is passed directly as the target of another constructor, then it is assumed that constructor handles the initialization of all of the object's members and no implicit constructor calls are added. % TODO: confirm that this is correct. It might be possible to get subtle errors if you initialize some members then call another constructor... -- in fact, this is basically always wrong. if anything, I should check that such a constructor does not initialize any members, otherwise it'll always initialize the member twice (once locally, once by the called constructor).
+ 1072 If at any point, the @this@ parameter is passed directly as the target of another constructor, then it is assumed that constructor handles the initialization of all of the object's members and no implicit constructor calls are added. % TODO: this is basically always wrong. if anything, I should check that such a constructor does not initialize any members, otherwise it'll always initialize the member twice (once locally, once by the called constructor). This might be okay in some situations, but it deserves a warning at the very least.
  1061 1073 To override this rule, \ateq can be used to force the translator to trust the programmer's discretion.
  1062 1074 This form of \ateq is not yet implemented.
… …
  1064 1076 Despite great effort, some forms of C syntax do not work well with constructors in \CFA.
  1065 1077 In particular, constructor calls cannot contain designations (see \ref{sub:c_background}), since this is equivalent to allowing designations on the arguments to arbitrary function calls.
- 1066 In C, function prototypes are permitted to have arbitrary parameter names, including no names at all, which may have no connection to the actual names used at function definition.
- 1067 Furthermore, a function prototype can be repeated an arbitrary number of times, each time using different names.
  1068 1078 \begin{cfacode}
  1069 1079 // all legal forward declarations in C
… …
  1076 1086 f(b:10, a:20, c:30); // which parameter is which?
  1077 1087 \end{cfacode}
+ 1088 In C, function prototypes are permitted to have arbitrary parameter names, including no names at all, which may have no connection to the actual names used at function definition.
+ 1089 Furthermore, a function prototype can be repeated an arbitrary number of times, each time using different names.
  1078 1090 As a result, it was decided that any attempt to resolve designated function calls with C's function prototype rules would be brittle, and thus it is not sensible to allow designations in constructor calls.
- 1079 % Many other languages do allow named arguments, such as Python and Scala, but they do not allow multiple arbitrarily named forward declarations of a function.
- 1080
- 1081 In addition, constructor calls cannot have a nesting depth greater than the number of array components in the type of the initialized object, plus one.
+ 1091
+ 1092 In addition, constructor calls do not support unnamed nesting.
+ 1093 \begin{cfacode}
+ 1094 struct B { int x; };
+ 1095 struct C { int y; };
+ 1096 struct A { B b; C c; };
+ 1097 void ?{}(A *, B);
+ 1098 void ?{}(A *, C);
+ 1099
+ 1100 A a = {
+ 1101 { 10 }, // construct B? - invalid
+ 1102 };
+ 1103 \end{cfacode}
+ 1104 In C, nesting initializers means that the programmer intends to initialize subobjects with the nested initializers.
+ 1105 The reason for this omission is to both simplify the mental model for using constructors, and to make initialization simpler for the expression resolver.
+ 1106 If this were allowed, it would be necessary for the expression resolver to decide whether each argument to the constructor call could initialize to some argument in one of the available constructors, making the problem highly recursive and potentially much more expensive.
+ 1107 That is, in the previous example the line marked as an error could mean construct using @?{}(A *, B)@ or with @?{}(A *, C)@, since the inner initializer @{ 10 }@ could be taken as an intermediate object of type @B@ or @C@.
+ 1108 In practice, however, there could be many objects that can be constructed from a given @int@ (or, indeed, any arbitrary parameter list), and thus a complete solution to this problem would require fully exploring all possibilities.
+ 1109
+ 1110 More precisely, constructor calls cannot have a nesting depth greater than the number of array components in the type of the initialized object, plus one.
  1082 1111 For example,
  1083 1112 \begin{cfacode}
… …
  1098 1127 % TODO: in CFA if the array dimension is empty, no object constructors are added -- need to fix this.
  1099 1128 The body of @A@ has been omitted, since only the constructor interfaces are important.
- 1100 In C, having a greater nesting depth means that the programmer intends to initialize subobjects with the nested initializer.
- 1101 The reason for this omission is to both simplify the mental model for using constructors, and to make initialization simpler for the expression resolver.
- 1102 If this were allowed, it would be necessary for the expression resolver to decide whether each argument to the constructor call could initialize to some argument in one of the available constructors, making the problem highly recursive and potentially much more expensive.
- 1103 That is, in the previous example the line marked as an error could mean construct using @?{}(A *, A, A)@, since the inner initializer @{ 11 }@ could be taken as an intermediate object of type @A@ constructed with @?{}(A *, int)@.
- 1104 In practice, however, there could be many objects that can be constructed from a given @int@ (or, indeed, any arbitrary parameter list), and thus a complete solution to this problem would require fully exploring all possibilities.
+ 1129
  1105 1130 It should be noted that unmanaged objects can still make use of designations and nested initializers in \CFA.
+ 1131 It is simple to overcome this limitation for managed objects by making use of compound literals, so that the arguments to the constructor call are explicitly typed.
  1106 1132
  1107 1133 \subsection{Implicit Destructors}
… …
  1130 1156 \end{cfacode}
  1131 1157
- 1132 %% having this feels excessive, but it's here if necessary
- 1133 % This procedure generates the following code.
- 1134 % \begin{cfacode}
- 1135 % void f(int i){
- 1136 % struct A x;
- 1137 % ?{}(&x);
- 1138 % {
- 1139 % struct A y;
- 1140 % ?{}(&y);
- 1141 % {
- 1142 % struct A z;
- 1143 % ?{}(&z);
- 1144 % {
- 1145 % if ((i==0)!=0) {
- 1146 % ^?{}(&z);
- 1147 % ^?{}(&y);
- 1148 % ^?{}(&x);
- 1149 % return;
- 1150 % }
- 1151 % }
- 1152 % if (((i==1)!=0) {
- 1153 % ^?{}(&z);
- 1154 % ^?{}(&y);
- 1155 % ^?{}(&x);
- 1156 % return ;
- 1157 % }
- 1158 % ^?{}(&z);
- 1159 % }
- 1160
- 1161 % if ((i==2)!=0) {
- 1162 % ^?{}(&y);
- 1163 % ^?{}(&x);
- 1164 % return;
- 1165 % }
- 1166 % ^?{}(&y);
- 1167 % }
- 1168
- 1169 % ^?{}(&x);
- 1170 % }
- 1171 % \end{cfacode}
- 1172
  1173 1158 The next example illustrates the use of simple continue and break statements and the manner that they interact with implicit destructors.
  1174 1159 \begin{cfacode}
… …
  1183 1168 \end{cfacode}
  1184 1169 Since a destructor call is automatically inserted at the end of the block, nothing special needs to happen to destruct @x@ in the case where control reaches the end of the loop.
- 1185 In the case where @i@ is @2@, the continue statement runs the loop update expression and attemps to begin the next iteration of the loop.
+ 1170 In the case where @i@ is @2@, the continue statement runs the loop update expression and attempts to begin the next iteration of the loop.
  1186 1171 Since continue is a C statement, which does not understand destructors, a destructor call is added just before the continue statement to ensure that @x@ is destructed.
  1187 1172 When @i@ is @3@, the break statement moves control to just past the end of the loop.
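The placement of destructor calls around `continue` and `break` described above can be mimicked in plain C by writing the calls by hand. This is a sketch only: `A_ctor`, `A_dtor`, `loop_demo`, and the two counters are hypothetical stand-ins for what the translator would insert, and the loop body is a simplified version of the case analysis in the text (continue when `i` is 2, break when `i` is 3).

```c
#include <assert.h>

static int live = 0;        /* objects constructed but not yet destructed */
static int destructed = 0;  /* total number of destructor calls */

struct A { int id; };

/* Hypothetical stand-ins for the constructor and destructor of A. */
void A_ctor(struct A *a, int id) { a->id = id; live++; }
void A_dtor(struct A *a) { (void)a; live--; destructed++; }

int loop_demo(void) {
    live = 0;
    destructed = 0;
    for (int i = 0; i < 10; i++) {
        struct A x;
        A_ctor(&x, i);
        if (i == 2) {
            A_dtor(&x);     /* inserted just before continue */
            continue;
        }
        if (i == 3) {
            A_dtor(&x);     /* inserted just before break */
            break;
        }
        A_dtor(&x);         /* inserted at the normal end of the block */
    }
    /* iterations 0..3 each construct and destruct exactly one object */
    return live == 0 && destructed == 4;
}
```

Every path out of the block passes through exactly one destructor call, which is the invariant the inserted calls are meant to preserve.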
… …
  1193 1178 L1: for (int i = 0; i < 10; i++) {
  1194 1179 A x;
- 1195 L2: for (int j = 0; j < 10; j++) {
+ 1180 for (int j = 0; j < 10; j++) {
  1196 1181 A y;
- 1197 if (j == 0) {
- 1198 continue; // destruct y
- 1199 } else if (j == 1) {
- 1200 break; // destruct y
- 1201 } else if (i == 1) {
+ 1182 if (i == 1) {
  1202 1183 continue L1; // destruct y
  1203 1184 } else if (i == 2) {
… …
  1209 1190 The statement @continue L1@ begins the next iteration of the outer for-loop.
  1210 1191 Since the semantics of continue require the loop update expression to execute, control branches to the \emph{end} of the outer for loop, meaning that the block destructor for @x@ can be reused, and it is only necessary to generate the destructor for @y@.
+ 1192 % TODO: "why not do this all the time? fix or justify"
  1211 1193 Break, on the other hand, requires jumping out of the loop, so the destructors for both @x@ and @y@ are generated and inserted before the @break L1@ statement.
  1212 1194
… …
  1277 1259 Exempt from these rules are intrinsic and builtin functions.
  1278 1260 It should be noted that unmanaged objects are subject to copy constructor calls when passed as arguments to a function or when returned from a function, since they are not the \emph{target} of the copy constructor call.
+ 1261 That is, since the parameter is not marked as an unmanaged object using \ateq, it will be copy constructed if it is returned by value or passed as an argument to another function, so to guarantee consistent behaviour, unmanaged objects must be copy constructed when passed as arguments.
  1279 1262 This is an important detail to bear in mind when using unmanaged objects, and could produce unexpected results when mixed with objects that are explicitly constructed.
1280 1263 \begin{cfacode} … … 1284 1267 void ^?{}(A *); 1285 1268 1286 A f(A x) {1287 return x; 1269 A identity(A x) { // pass by value => need local copy 1270 return x; // return by value => make call-site copy 1288 1271 } 1289 1272 1290 1273 A y, z @= {}; 1291 identity(y); 1292 identity(z); 1274 identity(y); // copy construct y into x 1275 identity(z); // copy construct z into x 1293 1276 \end{cfacode} 1294 1277 Note that @z@ is copy constructed into a temporary variable to be passed as an argument, which is also destructed after the call. 1295 A special syntactic form, such as a variant of \ateq, could be implemented to specify at the call site that an argument should not be copy constructed, to regain some control for the C programmer.1296 1278 1297 1279 This generates the following 1298 1280 \begin{cfacode} 1299 1281 struct A f(struct A x){ 1300 struct A _retval_f; 1301 ?{}((&_retval_f), x); 1282 struct A _retval_f; // return value 1283 ?{}((&_retval_f), x); // copy construct return value 1302 1284 return _retval_f; 1303 1285 } 1304 1286 1305 1287 struct A y; 1306 ?{}(&y); 1307 struct A z = { 0 }; 1308 1309 struct A _tmp_cp1; // argument 1 1310 struct A _tmp_cp_ret0; // return value 1311 _tmp_cp_ret0=f((?{}(&_tmp_cp1, y) , _tmp_cp1)), _tmp_cp_ret0; 1312 ^?{}(&_tmp_cp_ret0); // return value 1313 ^?{}(&_tmp_cp1); // argument 1 1314 1315 struct A _tmp_cp2; // argument 1 1316 struct A _tmp_cp_ret1; // return value 1317 _tmp_cp_ret1=f((?{}(&_tmp_cp2, z), _tmp_cp2)), _tmp_cp_ret1; 1318 ^?{}(&_tmp_cp_ret1); // return value 1319 ^?{}(&_tmp_cp2); // argument 1 1288 ?{}(&y); // default construct 1289 struct A z = { 0 }; // C default 1290 1291 struct A _tmp_cp1; // argument 1 1292 struct A _tmp_cp_ret0; // return value 1293 _tmp_cp_ret0=f( 1294 (?{}(&_tmp_cp1, y) , _tmp_cp1) // argument is a comma expression 1295 ), _tmp_cp_ret0; // return value for cascading 1296 ^?{}(&_tmp_cp_ret0); // destruct return value 1297 ^?{}(&_tmp_cp1); // destruct argument 1 1298 1299 struct 
A _tmp_cp2; // argument 1 1300 struct A _tmp_cp_ret1; // return value 1301 _tmp_cp_ret1=f( 1302 (?{}(&_tmp_cp2, z), _tmp_cp2) // argument is a comma expression 1303 ), _tmp_cp_ret1; // return value for cascading 1304 ^?{}(&_tmp_cp_ret1); // destruct return value 1305 ^?{}(&_tmp_cp2); // destruct argument 1 1320 1306 ^?{}(&y); 1321 1307 \end{cfacode} 1308 1309 A special syntactic form, such as a variant of \ateq, can be implemented to specify at the call site that an argument should not be copy constructed, to regain some control for the C programmer. 1310 \begin{cfacode} 1311 identity(z@); // do not copy construct argument 1312 // - will copy construct/destruct return value 1313 A@ identity_nocopy(A @ x) { // argument not copy constructed or destructed 1314 return x; // not copy constructed 1315 // return type marked @ => not destructed 1316 } 1317 \end{cfacode} 1318 It should be noted that reference types will allow specifying that a value does not need to be copied; however, reference types do not provide a means of preventing implicit copy construction from uses of the reference, so the problem is still present when passing or returning the reference by value. 1322 1319 1323 1320 A known issue with this implementation is that the return value of a function is not guaranteed to have the same address for its entire lifetime. 1324 1321 Specifically, since @_retval_f@ is allocated and constructed in @f@ then returned by value, the internal data is bitwise copied into the caller's stack frame. 1325 1322 This approach works out most of the time, because typically destructors need to only access the fields of the object and recursively destroy.
1326 It is currently the case that constructors and destructors which use the @this@ pointer as a unique identifier to store data externally will not work correctly for return value objects. 1327 Thus is it not safe to rely on an object's @this@ pointer to remain constant throughout execution of the program. 1323 It is currently the case that constructors and destructors that use the @this@ pointer as a unique identifier to store data externally do not work correctly for return value objects. 1324 Thus, it is not safe to rely on an object's @this@ pointer to remain constant throughout execution of the program. 1328 1325 \begin{cfacode} 1329 1326 A * external_data[32]; … … 1341 1338 } 1342 1339 } 1340 1341 A makeA() { 1342 A x; // stores &x in external_data 1343 return x; 1344 } 1345 makeA(); // return temporary has a different address than x 1346 // equivalent to: 1347 // A _tmp; 1348 // _tmp = makeA(), _tmp; 1349 // ^?{}(&_tmp); 1343 1350 \end{cfacode} 1344 1351 In the above example, a global array of pointers is used to keep track of all of the allocated @A@ objects. 1345 Due to copying on return, the current object being destructed will not exist in the array if an @A@ object is ever returned by value from a function. 1346 1347 This problem could be solved in the translator by mutating the function signatures so that the return value is moved into the parameter list. 1352 Due to copying on return, the current object being destructed does not exist in the array if an @A@ object is ever returned by value from a function. 1353 1354 This problem could be solved in the translator by changing the function signatures so that the return value is moved into the parameter list. 1348 1355 For example, the translator could restructure the code like so 1349 1356 \begin{cfacode} … … 1363 1370 \end{cfacode} 1364 1371 This transformation provides @f@ with the address of the return variable so that it can be constructed into directly.
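The effect of moving the return value into the parameter list can be sketched in plain C. This is only an illustration under stated assumptions, not the translator's output: @construct_A@, @makeA_into@, @last_constructed@ and @address_is_stable@ are hypothetical names, and the single pointer @last_constructed@ stands in for the @external_data@ array.

```c
#include <assert.h>
#include <stddef.h>

struct A { int v; };

/* Hypothetical registry standing in for external_data: it remembers the
   address of the most recently constructed object. */
struct A *last_constructed = NULL;

/* Stand-in for the constructor ?{}(A *): initializes the object and
   records its address externally. */
void construct_A(struct A *self) {
    self->v = 42;
    last_constructed = self;
}

/* Rewritten signature: the caller supplies the address of the return
   slot, so the object is constructed directly in its final location
   and is never bitwise copied afterwards. */
void makeA_into(struct A *ret) {
    construct_A(ret);
}

/* Returns 1 when the address recorded by the constructor is the same
   address the caller uses for the returned object. */
int address_is_stable(void) {
    struct A result;
    makeA_into(&result);
    return last_constructed == &result && result.v == 42;
}
```

Because the callee never owns a separate local copy, the address stored by the constructor is the one a later destructor call would see.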
1365 It is worth pointing out that this kind of signature rewriting already occurs in polymorphic functions which return by value, as discussed in \cite{Bilson03}. 1372 It is worth pointing out that this kind of signature rewriting already occurs in polymorphic functions that return by value, as discussed in \cite{Bilson03}. 1366 1373 A key difference in this case is that every function would need to be rewritten like this, since types can switch between managed and unmanaged at different scope levels, e.g. 1367 1374 \begin{cfacode} 1368 1375 struct A { int v; }; 1369 A x; // unmanaged 1376 A x; // unmanaged, since only trivial constructors are available 1370 1377 { 1371 1378 void ?{}(A * a) { ... } … … 1375 1382 A z; // unmanaged 1376 1383 \end{cfacode} 1377 Hence there is not enough information to determine at function declaration to determine whether a type is managed or not, and thus it is the case that all signatures have to be rewritten to account for possible copy constructor and destructor calls. 1384 Hence there is not enough information to determine at function declaration whether a type is managed or not, and thus it is the case that all signatures have to be rewritten to account for possible copy constructor and destructor calls. 1378 1385 Even with this change, it would still be possible to declare backwards compatible function prototypes with an @extern "C"@ block, which allows for the definition of C-compatible functions within \CFA code, however this would require actual changes to the way code inside of an @extern "C"@ function is generated as compared with normal code generation.
1379 Furthermore, it isn't possible to overload C functions, so using @extern "C"@ to declare functions is of limited use. 1380 1381 It would be possible to regain some control by adding an attribute to structs which specifies whether they can be managed or not (perhaps \emph{manageable} or \emph{unmanageable}), and to emit an error in the case that a constructor or destructor is declared for an unmanageable type. 1386 Furthermore, it is not possible to overload C functions, so using @extern "C"@ to declare functions is of limited use. 1387 1388 It would be possible to regain some control by adding an attribute to structs that specifies whether they can be managed or not (perhaps \emph{manageable} or \emph{unmanageable}), and to emit an error in the case that a constructor or destructor is declared for an unmanageable type. 1382 1389 Ideally, structs should be manageable by default, since otherwise the default case becomes more verbose. 1383 1390 This means that in general, function signatures would have to be rewritten, and in a select few cases the signatures would not be rewritten. … … 1408 1415 \section{Implementation} 1409 1416 \subsection{Array Initialization} 1410 Arrays are a special case in the C type 1417 Arrays are a special case in the C type-system. 1411 1418 C arrays do not carry around their size, making it impossible to write a standalone \CFA function that constructs or destructs an array while maintaining the standard interface for constructors and destructors. 1412 1419 Instead, \CFA defines the initialization and destruction of an array recursively. … … 1525 1532 By default, objects within a translation unit are constructed in declaration order, and destructed in the reverse order. 1526 1533 The default order of construction of objects amongst translation units is unspecified.
1527 % TODO: not yet implemented, but g++ provides attribute init_priority, which allows specifying the order of global construction on a per object basis 1528 % https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Attributes.html#C_002b_002b-Attributes 1529 % suggestion: implement this in CFA by picking objects with a specified priority and pulling them into their own init functions (could even group them by priority level -> map<int, list<ObjectDecl*>>) and pull init_priority forward into constructor and destructor attributes with the same priority level 1530 1534 It is, however, guaranteed that any global objects in the standard library are initialized prior to the initialization of any object in the user program. 1531 1535 1532 This feature is implemented in the \CFA translator by grouping every global constructor call into a function with the GCC attribute \emph{constructor}, which performs most of the heavy lifting. % CITE: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes 1536 This feature is implemented in the \CFA translator by grouping every global constructor call into a function with the GCC attribute \emph{constructor}, which performs most of the heavy lifting. % TODO: CITE: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes 1533 1537 A similar function is generated with the \emph{destructor} attribute, which handles all global destructor calls. 1534 1538 At the time of writing, initialization routines in the library are specified with priority \emph{101}, which is the highest priority level that GCC allows, whereas initialization routines in the user's code are implicitly given the default priority level, which ensures they have a lower priority than any code with a specified priority level.
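The priority scheme described above can be sketched in plain C, assuming GCC or Clang on a platform that honours constructor priorities (for example, ELF). The names @library_init@, @user_init@, @init_order@, @init_count@ and @library_ran_first@ are hypothetical and only illustrate the ordering; they are not the routines the translator emits.

```c
#include <assert.h>

/* Records the order in which the initialization routines ran. */
int init_order[2];
int init_count = 0;

/* Stand-in for the user's global constructor calls: no explicit
   priority, so it runs after every routine that has one, even though
   it appears first in the source. */
__attribute__((constructor)) void user_init(void) {
    init_order[init_count++] = 0;
}

/* Stand-in for a library initialization routine: 101 is the
   smallest priority number available outside the reserved range,
   so this routine runs before user_init. */
__attribute__((constructor(101))) void library_init(void) {
    init_order[init_count++] = 101;
}

/* Returns 1 when the library routine ran before the user routine. */
int library_ran_first(void) {
    return init_count == 2 && init_order[0] == 101 && init_order[1] == 0;
}
```

Both routines run before @main@ is entered, so a call to @library_ran_first@ from @main@ observes the completed ordering.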
… … 1559 1563 \end{cfacode} 1560 1564 1565 % https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Attributes.html#C_002b_002b-Attributes 1566 % suggestion: implement this in CFA by picking objects with a specified priority and pulling them into their own init functions (could even group them by priority level -> map<int, list<ObjectDecl*>>) and pull init_priority forward into constructor and destructor attributes with the same priority level 1567 GCC provides an attribute @init_priority@, which allows specifying the relative priority for initialization of global objects on a per-object basis in \CC. 1568 A similar attribute can be implemented in \CFA by pulling marked objects into global constructor/destructor-attribute functions with the specified priority. 1569 For example, 1570 \begin{cfacode} 1571 struct A { ... }; 1572 void ?{}(A *, int); 1573 void ^?{}(A *); 1574 __attribute__((init_priority(200))) A x = { 123 }; 1575 \end{cfacode} 1576 would generate 1577 \begin{cfacode} 1578 A x; 1579 __attribute__((constructor(200))) __init_x() { 1580 ?{}(&x, 123); // construct x with priority 200 1581 } 1582 __attribute__((destructor(200))) __destroy_x() { 1583 ^?{}(&x); // destruct x with priority 200 1584 } 1585 \end{cfacode} 1586 1561 1587 \subsection{Static Local Variables} 1562 1588 In standard C, it is possible to mark variables that are local to a function with the @static@ storage class. 1563 1589 Unlike normal local variables, a @static@ local variable is defined to live for the entire duration of the program, so that each call to the function has access to the same variable with the same address and value as it had in the previous call to the function. % TODO: mention dynamic loading caveat??
1564 Much like global variables, in C @static@ variables must be initialized to a \emph{compile-time constant value} so that a compiler is able to create storage for the variable and initialize it before the program begins running. 1590 Much like global variables, in C @static@ variables can only be initialized to a \emph{compile-time constant value} so that a compiler is able to create storage for the variable and initialize it at compile-time. 1565 1591 1566 1592 Yet again, this rule is too restrictive for a language with constructors and destructors. … … 1573 1599 Construction of @static@ local objects is implemented via an accompanying @static bool@ variable, which records whether the variable has already been constructed. 1574 1600 A conditional branch checks the value of the companion @bool@, and if the variable has not yet been constructed then the object is constructed. 1575 The object's destructor is scheduled to be run when the program terminates using @atexit@, and the companion @bool@'s value is set so that subsequent invocations of the function will not reconstruct the object. 1601 The object's destructor is scheduled to be run when the program terminates using @atexit@, and the companion @bool@'s value is set so that subsequent invocations of the function do not reconstruct the object. 1576 1602 Since the parameter to @atexit@ is a parameter-less function, some additional tweaking is required. 1577 1603 First, the @static@ variable must be hoisted up to global scope and uniquely renamed to prevent name clashes with other global objects. … … 1630 1656 \end{cfacode} 1631 1657 1658 % TODO: move this section forward?? maybe just after constructor syntax? would need to remove _tmp_cp_ret0, since copy constructors are not discussed yet, but this might not be a big issue. 1632 1659 \subsection{Constructor Expressions} 1633 1660 In \CFA, it is possible to use a constructor as an expression.
1634 1661 Like other operators, the function name @?{}@ matches its operator syntax. 1635 1662 For example, @(&x){}@ calls the default constructor on the variable @x@, and produces @&x@ as a result. 1636 The significance of constructors as expressions rather than as statements is that the result of a constructor expression can be used as part of a larger expression. 1637 A key example is the use of constructor expressions to initialize the result of a call to standard C routine @malloc@. 1663 A key example for this capability is the use of constructor expressions to initialize the result of a call to standard C routine @malloc@. 1638 1664 \begin{cfacode} 1639 1665 struct X { ... };