Changes in doc/papers/general/Paper.tex [271326e:48786bc8]
doc/papers/general/Paper.tex (modified) (31 diffs)
doc/papers/general/Paper.tex
r271326e r48786bc8 4 4 \usepackage{epic,eepic} 5 5 \usepackage{xspace,calc,comment} 6 \usepackage{upquote} % switch curled `'" to straight 7 \usepackage{listings} % format program code 8 \usepackage{enumitem} 9 \usepackage[flushmargin]{footmisc} % support label/reference in footnote 6 \usepackage{upquote} % switch curled `'" to straight 7 \usepackage{listings} % format program code 10 8 \usepackage{rotating} 11 9 \usepackage[usenames]{color} 12 \usepackage{pslatex} % reduce size of san serif font10 \usepackage{pslatex} % reduce size of san serif font 13 11 \usepackage[plainpages=false,pdfpagelabels,pdfpagemode=UseNone,pagebackref=true,breaklinks=true,colorlinks=true,linkcolor=blue,citecolor=blue,urlcolor=blue]{hyperref} 14 12 15 13 \setlength{\textheight}{9in} 16 14 %\oddsidemargin 0.0in 17 \renewcommand{\topfraction}{0.8} % float must be greater than X of the page before it is forced onto its own page18 \renewcommand{\bottomfraction}{0.8} % float must be greater than X of the page before it is forced onto its own page19 \renewcommand{\floatpagefraction}{0.8} % float must be greater than X of the page before it is forced onto its own page20 \renewcommand{\textfraction}{0.0} % the entire page maybe devoted to floats with no text on the page at all21 22 \lefthyphenmin=4 % hyphen only after 4 characters15 \renewcommand{\topfraction}{0.8} % float must be greater than X of the page before it is forced onto its own page 16 \renewcommand{\bottomfraction}{0.8} % float must be greater than X of the page before it is forced onto its own page 17 \renewcommand{\floatpagefraction}{0.8} % float must be greater than X of the page before it is forced onto its own page 18 \renewcommand{\textfraction}{0.0} % the entire page maybe devoted to floats with no text on the page at all 19 20 \lefthyphenmin=4 % hyphen only after 4 characters 23 21 \righthyphenmin=4 24 22 … … 26 24 27 25 \newcommand{\CFAIcon}{\textsf{C}\raisebox{\depth}{\rotatebox{180}{\textsf{A}}}\xspace} % Cforall symbolic 
name 28 \newcommand{\CFA}{\protect\CFAIcon} % safe for section/caption29 \newcommand{\CFL}{\textrm{Cforall}\xspace} % Cforall symbolic name30 \newcommand{\Celeven}{\textrm{C11}\xspace} % C11 symbolic name26 \newcommand{\CFA}{\protect\CFAIcon} % safe for section/caption 27 \newcommand{\CFL}{\textrm{Cforall}\xspace} % Cforall symbolic name 28 \newcommand{\Celeven}{\textrm{C11}\xspace} % C11 symbolic name 31 29 \newcommand{\CC}{\textrm{C}\kern-.1em\hbox{+\kern-.25em+}\xspace} % C++ symbolic name 32 30 \newcommand{\CCeleven}{\textrm{C}\kern-.1em\hbox{+\kern-.25em+}11\xspace} % C++11 symbolic name … … 58 56 \newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}} 59 57 60 \newlength{\gcolumnposn} % temporary hack because lstlisting does not handle tabs correctly58 \newlength{\gcolumnposn} % temporary hack because lstlisting does not handle tabs correctly 61 59 \newlength{\columnposn} 62 60 \setlength{\gcolumnposn}{2.75in} … … 74 72 75 73 % Latin abbreviation 76 \newcommand{\abbrevFont}{\textit} % set empty for no italics74 \newcommand{\abbrevFont}{\textit} % set empty for no italics 77 75 \newcommand{\EG}{\abbrevFont{e}.\abbrevFont{g}.} 78 76 \newcommand*{\eg}{% … … 105 103 106 104 \newenvironment{cquote}{% 107 \list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=4pt\parsep=0pt\leftmargin=\parindent lnth\rightmargin\leftmargin}%105 \list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=4pt\parsep=0pt\leftmargin=\parindent\rightmargin\leftmargin}% 108 106 \item\relax 109 107 }{% … … 189 187 190 188 191 \section{Introduction }189 \section{Introduction and Background} 192 190 193 191 The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from commercial operating-systems to hobby projects. 
… … 195 193 The TIOBE~\cite{TIOBE} ranks the top 5 most popular programming languages as: Java 16\%, \Textbf{C 7\%}, \Textbf{\CC 5\%}, \Csharp 4\%, Python 4\% = 36\%, where the next 50 languages are less than 3\% each with a long tail. 196 194 The top 3 rankings over the past 30 years are: 195 \lstDeleteShortInline@% 197 196 \begin{center} 198 197 \setlength{\tabcolsep}{10pt} 199 \lstDeleteShortInline@%200 198 \begin{tabular}{@{}rccccccc@{}} 201 199 & 2017 & 2012 & 2007 & 2002 & 1997 & 1992 & 1987 \\ \hline … … 204 202 \CC & 3 & 3 & 3 & 3 & 2 & 2 & 4 \\ 205 203 \end{tabular} 204 \end{center} 206 205 \lstMakeShortInline@% 207 \end{center}208 206 Love it or hate it, C is extremely popular, highly used, and one of the few systems languages. 209 207 In many cases, \CC is often used solely as a better C. … … 226 224 The new constructs are empirically compared with both standard C and \CC; the results show the new design is comparable in performance. 227 225 228 \section{Polymorphic Functions} 229 230 \CFA introduces both ad-hoc and parametric polymorphism to C, with a design originally formalized by Ditchfield~\cite{Ditchfield92}, and first implemented by Bilson~\cite{Bilson03}. 231 232 \subsection{Name Overloading} 233 234 C already has a limited form of ad-hoc polymorphism in the form of its basic arithmetic operators, which apply to a variety of different types using identical syntax. 235 \CFA extends the built-in operator overloading by allowing users to define overloads for any function, not just operators, and even any variable; Section~\ref{sec:libraries} includes a number of examples of how this overloading simplifies \CFA programming relative to C. 236 Code generation for these overloaded functions and variables is implemented by the usual approach of mangling the identifier names to include a representation of their type, while \CFA decides which overload to apply based on the same ``usual arithmetic conversions'' used in C to disambiguate operator overloads. 
237 As an example: 238 239 \begin{cfa} 240 int max(int a, int b) { return a < b ? b : a; } // (1) 241 double max(double a, double b) { return a < b ? b : a; } // (2) 242 243 int max = INT_MAX; // (3) 244 double max = DBL_MAX; // (4) 245 246 max(7, -max); $\C{// uses (1) and (3), by matching int from constant 7}$ 247 max(max, 3.14); $\C{// uses (2) and (4), by matching double from constant 3.14}$ 248 249 //max(max, -max); $\C{// ERROR: ambiguous}$ 250 int m = max(max, -max); $\C{// uses (1) once and (3) twice, by matching return type}$ 251 \end{cfa} 252 253 \Celeven did add @_Generic@ expressions, which can be used in preprocessor macros to provide a form of ad-hoc polymorphism; however, this polymorphism is both functionally and ergonomically inferior to \CFA name overloading. 254 The macro wrapping the generic expression imposes some limitations; as an example, it could not implement the example above, because the variables @max@ would collide with the functions @max@. 255 Ergonomic limitations of @_Generic@ include the necessity to put a fixed list of supported types in a single place and manually dispatch to appropriate overloads, as well as possible namespace pollution from the functions dispatched to, which must all have distinct names. 256 257 \subsection{\texorpdfstring{\LstKeywordStyle{forall} Functions}{forall Functions}} 226 227 \subsection{Polymorphic Functions} 258 228 \label{sec:poly-fns} 259 229 230 \CFA{}\hspace{1pt}'s polymorphism was originally formalized by Ditchfield~\cite{Ditchfield92}, and first implemented by Bilson~\cite{Bilson03}. 260 231 The signature feature of \CFA is parametric-polymorphic functions~\cite{forceone:impl,Cormack90,Duggan96} with functions generalized using a @forall@ clause (giving the language its name): 261 232 \begin{lstlisting} … … 286 257 Crucial to the design of a new programming language are the libraries to access thousands of external software features. 
287 258 Like \CC, \CFA inherits a massive compatible library-base, where other programming languages must rewrite or provide fragile inter-language communication with C. 288 A simple example is leveraging the existing type-unsafe (@void *@) C @bsearch@ to binary search a sorted float array:259 A simple example is leveraging the existing type-unsafe (@void *@) C @bsearch@ to binary search a sorted floating-point array: 289 260 \begin{lstlisting} 290 261 void * bsearch( const void * key, const void * base, size_t nmemb, size_t size, … … 292 263 int comp( const void * t1, const void * t2 ) { return *(double *)t1 < *(double *)t2 ? -1 : 293 264 *(double *)t2 < *(double *)t1 ? 1 : 0; } 294 double key = 5.0, vals[10] = { /* 10 sorted float values */ };265 double key = 5.0, vals[10] = { /* 10 sorted floating-point values */ }; 295 266 double * val = (double *)bsearch( &key, vals, 10, sizeof(vals[0]), comp ); $\C{// search sorted array}$ 296 267 \end{lstlisting} … … 333 304 Hence, programmers can easily form local environments, adding and modifying appropriate functions, to maximize reuse of other existing functions and types. 334 305 335 %% Redundant with Section~\ref{sec:libraries} %% 336 337 % Finally, \CFA allows variable overloading: 338 % \begin{lstlisting} 339 % short int MAX = ...; int MAX = ...; double MAX = ...; 340 % short int s = MAX; int i = MAX; double d = MAX; $\C{// select correct MAX}$ 341 % \end{lstlisting} 342 % Here, the single name @MAX@ replaces all the C type-specific names: @SHRT_MAX@, @INT_MAX@, @DBL_MAX@. 306 Finally, \CFA allows variable overloading: 307 \begin{lstlisting} 308 short int MAX = ...; int MAX = ...; double MAX = ...; 309 short int s = MAX; int i = MAX; double d = MAX; $\C{// select correct MAX}$ 310 \end{lstlisting} 311 Here, the single name @MAX@ replaces all the C type-specific names: @SHRT_MAX@, @INT_MAX@, @DBL_MAX@. 
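For comparison, the closest standard \Celeven approximation of the overloaded @MAX@ above is a @_Generic@ macro. The following is only a sketch under stated assumptions: the names @MAX_OF@ and @limits_selected@ are hypothetical and belong to neither C nor \CFA, and, unlike the \CFA version, the macro needs an argument expression to select on and a fixed list of supported types.

```c
#include <assert.h>   /* static_assert (C11) */
#include <float.h>    /* DBL_MAX */
#include <limits.h>   /* SHRT_MAX, INT_MAX */

/* Hypothetical helper, not part of C or Cforall: picks the type-specific
   limit from the static type of x; x itself is never evaluated. */
#define MAX_OF(x) _Generic((x), \
        short:  SHRT_MAX,       \
        int:    INT_MAX,        \
        double: DBL_MAX)

/* Selection happens at compile time, so the integer cases can be checked statically. */
static_assert(MAX_OF((short)0) == SHRT_MAX, "short selects SHRT_MAX");
static_assert(MAX_OF(0) == INT_MAX, "int selects INT_MAX");

/* Mirrors the Cforall initializations above; returns 1 when every
   variable received the limit of its own type. */
int limits_selected(void) {
    short s = 0;
    int i = 0;
    double d = 0.0;
    s = MAX_OF(s);
    i = MAX_OF(i);
    d = MAX_OF(d);
    return s == SHRT_MAX && i == INT_MAX && d == DBL_MAX;
}
```

An argument whose type is missing from the association list, \eg @long@, is rejected at compile time, so every newly supported type requires editing the macro, whereas a \CFA overload of @MAX@ can be added in a separate declaration.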
343 312 344 313 \subsection{Traits} … … 536 505 In many languages, functions can return at most one value; 537 506 however, many operations have multiple outcomes, some exceptional. 538 Consider C's @div@ and @remquo@ functions, which return the quotient and remainder for a division of integer and float values, respectively.507 Consider C's @div@ and @remquo@ functions, which return the quotient and remainder for a division of integer and floating-point values, respectively. 539 508 \begin{lstlisting} 540 509 typedef struct { int quo, rem; } div_t; $\C{// from include stdlib.h}$ … … 967 936 968 937 \section{Control Structures} 969 970 \CFA identifies missing and problematic control structures in C, and extends and modifies these control structures to increase functionality and safety.971 938 972 939 … … 1077 1044 The implicit targets of the current @continue@ and @break@, \ie the closest enclosing loop or @switch@, change as certain constructs are added or removed. 1078 1045 1079 1080 1046 \subsection{\texorpdfstring{Enhanced \LstKeywordStyle{switch} Statement}{Enhanced switch Statement}} 1081 1047 1082 There are a number of deficiencies with the C @switch@ statements: enumerating @case@ lists, placement of @case@ clauses, scope of the switch body, and fall through between case clauses. 1083 1084 C has no shorthand for specifying a list of case values, whether the list is non-contiguous or contiguous\footnote{C provides this mechanism via fall through.}. 
1085 \CFA provides a shorthand for a non-contiguous list: 1086 \begin{cquote} 1087 \lstDeleteShortInline@% 1088 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}} 1089 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1090 \begin{cfa} 1091 case 2, 10, 34, 42: 1092 \end{cfa} 1093 & 1094 \begin{cfa} 1095 case 2: case 10: case 34: case 42: 1096 \end{cfa} 1097 \end{tabular} 1098 \lstMakeShortInline@% 1099 \end{cquote} 1100 for a contiguous list:\footnote{gcc provides the same mechanism with awkward syntax, \lstinline@2 ... 42@, where spaces are required around the ellipsis.} 1101 \begin{cquote} 1102 \lstDeleteShortInline@% 1103 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}} 1104 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1105 \begin{cfa} 1106 case 2~42: 1107 \end{cfa} 1108 & 1109 \begin{cfa} 1110 case 2: case 3: ... case 41: case 42: 1111 \end{cfa} 1112 \end{tabular} 1113 \lstMakeShortInline@% 1114 \end{cquote} 1115 and a combination: 1116 \begin{cfa} 1117 case -12~-4, -1~5, 14~21, 34~42: 1118 \end{cfa} 1119 1120 C allows placement of @case@ clauses \emph{within} statements nested in the @switch@ body (see Duff's device~\cite{Duff83}); 1121 \begin{cfa} 1122 switch ( i ) { 1123 case 0: 1124 for ( int i = 0; i < 10; i += 1 ) { 1125 ... 1126 `case 1:` // no initialization of loop index 1127 ... 1128 } 1129 } 1130 \end{cfa} 1131 \CFA precludes this form of transfer into a control structure because it causes undefined behaviour, especially with respect to missed initialization, and provides very limited functionality. 1132 1133 C allows placement of declarations within the @switch@ body and unreachable code at the start, resulting in undefined behaviour: 1134 \begin{cfa} 1135 switch ( x ) { 1136 `int y = 1;` $\C{// unreachable initialization}$ 1137 `x = 7;` $\C{// unreachable code without label/branch}$ 1138 case 0: 1139 ... 
1140 `int z = 0;` $\C{// unreachable initialization, cannot appear after case}$ 1141 z = 2; 1142 case 1: 1143 `x = z;` $\C{// without fall through, z is undefined}$ 1144 } 1145 \end{cfa} 1146 \CFA allows the declaration of local variables, \eg @y@, at the start of the @switch@ with scope across the entire @switch@ body, \ie all @case@ clauses, but no statements. 1147 \CFA disallows the declaration of a local variable, \eg @z@, directly within the @switch@ body, because a declaration cannot occur immediately after a @case@ since a label can only be attached to a statement, and the use of @z@ is undefined in @case 1@ as neither storage allocation nor initialization may have occurred. 1148 1149 C @switch@ provides multiple entry points into the statement body, but once an entry point is selected, control continues across \emph{all} @case@ clauses until the end of the @switch@ body, called \newterm{fall through}; 1150 @case@ clauses are made disjoint by the @break@ statement. 1151 While the ability to fall through \emph{is} a useful form of control flow, it does not match well with programmer intuition, resulting in many errors from missing @break@ statements. 
1152 \CFA provides a new control structure, @choose@, which mimics @switch@, but reverses the meaning of fall through: 1153 \begin{cquote} 1154 \lstDeleteShortInline@% 1155 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}} 1156 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1157 \begin{cfa} 1158 `choose` ( day ) { 1159 case Mon~Thu: 1160 // program 1161 1162 case Fri: 1163 // program 1164 wallet += pay; 1165 `fallthrough;` 1166 case Sat: 1167 // party 1168 wallet -= party; 1169 1170 case Sun: 1171 // rest 1172 1173 default: 1174 // error 1175 } 1176 \end{cfa} 1177 & 1178 \begin{cfa} 1179 switch ( day ) { 1180 case Mon: case Tue: case Wed: case Thu: 1181 // program 1182 `break;` 1183 case Fri: 1184 // program 1185 wallet += pay; 1186 1187 case Sat: 1188 // party 1189 wallet -= party; 1190 `break;` 1191 case Sun: 1192 // rest 1193 `break;` 1194 default: 1195 // error 1196 } 1197 \end{cfa} 1198 \end{tabular} 1199 \lstMakeShortInline@% 1200 \end{cquote} 1201 Collectively, these enhancements reduce programmer burden and increase readability and safety. 1202 1203 \begin{comment} 1048 \CFA also fixes a number of ergonomic deficits in the @switch@ statement of standard C. 1049 C can specify a number of equivalent cases by using the default ``fall-through'' semantics of @case@ clauses, \eg @case 1: case 2: case 3:@ -- this syntax is cluttered, however, so \CFA includes a more concise list syntax, @case 1, 2, 3:@. 1050 For contiguous ranges, \CFA provides an even more concise range syntax as well, @case 1~3:@; lists of ranges are also allowed in case selectors. 1051 1204 1052 Forgotten @break@ statements at the end of @switch@ cases are a persistent sort of programmer error in C, and the @break@ statements themselves introduce visual clutter and an un-C-like keyword-based block delimiter. 
1205 1053 \CFA addresses this error by introducing a @choose@ statement, which works identically to a @switch@ except that its default end-of-case behaviour is to break rather than to fall through for all non-empty cases. … … 1222 1070 } 1223 1071 \end{cfa} 1224 \end{comment}1225 1226 1072 1227 1073 \subsection{\texorpdfstring{\LstKeywordStyle{with} Clause / Statement}{with Clause / Statement}} … … 1360 1206 \end{cfa} 1361 1207 1362 % \subsection{Exception Handling ???} 1208 1209 \subsection{Exception Handling ???} 1210 1363 1211 1364 1212 \section{Declarations} … … 1407 1255 \lstDeleteShortInline@% 1408 1256 \lstset{moredelim=**[is][\color{blue}]{+}{+}} 1409 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{}}1410 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1257 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 1258 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1411 1259 \begin{cfa} 1412 1260 +[5] *+ `int` x1; 1413 1261 +* [5]+ `int` x2; 1414 `[* [5] int]` f+( int p )+;1262 +[* [5] int]+ f`( int p )`; 1415 1263 \end{cfa} 1416 1264 & … … 1418 1266 `int` +*+ x1 +[5]+; 1419 1267 `int` +(*+x2+)[5]+; 1420 `int (*`f+( int p )+`)[5]`;1268 +int (*+f`( int p )`+)[5]+; 1421 1269 \end{cfa} 1422 1270 \end{tabular} … … 1429 1277 \begin{cquote} 1430 1278 \lstDeleteShortInline@% 1431 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{}}1432 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1279 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 1280 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1433 1281 \begin{cfa} 1434 1282 `*` int x, y; … … 1444 1292 \begin{cquote} 1445 1293 \lstDeleteShortInline@% 1446 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{}}1447 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1294 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 1295 
\multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1448 1296 \begin{cfa} 1449 1297 `*` int x; … … 1462 1310 \begin{cquote} 1463 1311 \lstDeleteShortInline@% 1464 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{\hspace{\parindentlnth}}l@{}}1465 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{C}} \\1312 \begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{2em}}l@{}} 1313 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{C}} \\ 1466 1314 \begin{cfa} 1467 1315 [ 5 ] int z; … … 1503 1351 \begin{cquote} 1504 1352 \lstDeleteShortInline@% 1505 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{\hspace{\parindentlnth}}l@{}}1506 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{C}} \\1353 \begin{tabular}{@{}l@{\hspace{1em}}l@{\hspace{1em}}l@{}} 1354 \multicolumn{1}{c@{\hspace{1em}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{1em}}}{\textbf{C}} \\ 1507 1355 \begin{cfa} 1508 1356 const * const int x; … … 1526 1374 \begin{cquote} 1527 1375 \lstDeleteShortInline@% 1528 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{\hspace{\parindentlnth}}l@{}}1529 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{C}} \\1376 \begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{2em}}l@{}} 1377 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{C}} \\ 1530 1378 \begin{cfa} 1531 1379 extern [ 5 ] int x; … … 1549 1397 \begin{cquote} 1550 1398 \lstDeleteShortInline@% 1551 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}l@{}}1552 \multicolumn{1}{c@{\hspace{ \parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1399 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 1400 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1553 1401 \begin{cfa} 1554 1402 y = (* 
int)x; … … 1567 1415 Therefore, a programmer has the option of either continuing to use traditional C declarations or take advantage of the new style. 1568 1416 Clearly, both styles need to be supported for some time due to existing C-style header-files, particularly for UNIX-like systems. 1569 1570 The syntax of the new routine prototype declaration follows directly from the new routine definition syntax;1571 as well, parameter names are optional, \eg:1572 \begin{cfa}1573 [ int x ] f (); $\C{// returning int with no parameters}$1574 [ * int ] g (int y); $\C{// returning pointer to int with int parameter}$1575 [ ] h ( int, char ); $\C{// returning no result with int and char parameters}$1576 [ * int, int ] j ( int ); $\C{// returning pointer to int and int, with int parameter}$1577 \end{cfa}1578 This syntax allows a prototype declaration to be created by cutting and pasting source text from the routine definition header (or vice versa).1579 Like C, it is possible to declare multiple routine-prototypes in a single declaration, where the return type is distributed across \emph{all} routine names in the declaration list, \eg:1580 \begin{cquote}1581 \lstDeleteShortInline@%1582 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}1583 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1584 \begin{cfa}1585 [double] foo(), foo( int ), foo( double ) {...}1586 \end{cfa}1587 &1588 \begin{cfa}1589 double foo1(), foo2( int ), foo3( double );1590 \end{cfa}1591 \end{tabular}1592 \lstMakeShortInline@%1593 \end{cquote}1594 \CFA allows the last routine in the list to define its body.1595 1596 Declaration qualifiers can only appear at the start of a \CFA routine declaration,\footref{StorageClassSpecifier} \eg:1597 \begin{cfa}1598 extern [ int ] f ( int );1599 static [ int ] g ( int );1600 \end{cfa}1601 1602 The syntax for pointers to \CFA routines specifies the pointer name on the right, \eg:1603 \begin{cfa}1604 * [ int x ] () fp; $\C{// 
pointer to routine returning int with no parameters}$1605 * [ * int ] (int y) gp; $\C{// pointer to routine returning pointer to int with int parameter}$1606 * [ ] (int,char) hp; $\C{// pointer to routine returning no result with int and char parameters}$1607 * [ * int,int ] ( int ) jp; $\C{// pointer to routine returning pointer to int and int, with int parameter}$1608 \end{cfa}1609 While parameter names are optional, \emph{a routine name cannot be specified};1610 for example, the following is incorrect:1611 \begin{cfa}1612 * [ int x ] f () fp; $\C{// routine name "f" is not allowed}$1613 \end{cfa}1614 1417 1615 1418 … … 1790 1593 In addition to the expressive power, \lstinline|@=| provides a simple path for migrating legacy C code to \CFA, by providing a mechanism to incrementally convert initializers; the \CFA design team decided to introduce a new syntax for this escape hatch because we believe that our RAII implementation will handle the vast majority of code in a desirable way, and we wished to maintain familiar syntax for this common case. 
1791 1594 1792 1793 \subsection{Type Nesting} 1794 1795 \CFA allows \newterm{type nesting}, and type qualification of the nested types (see Figure~\ref{f:TypeNestingQualification}), whereas C hoists (refactors) nested types into the enclosing scope and has no type qualification. 1796 \begin{figure} 1797 \centering 1798 \lstDeleteShortInline@% 1799 \begin{tabular}{@{}l@{\hspace{3em}}l|l@{}} 1800 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{C Type Nesting}} & \multicolumn{1}{c}{\textbf{C Implicit Hoisting}} & \multicolumn{1}{|c}{\textbf{\CFA}} \\ 1801 \hline 1802 \begin{cfa} 1803 struct S { 1804 enum C { R, G, B }; 1805 struct T { 1806 union U { int i, j; }; 1807 enum C c; 1808 short int i, j; 1809 }; 1810 struct T t; 1811 } s; 1812 1813 int rtn() { 1814 s.t.c = R; 1815 struct T t = { R, 1, 2 }; 1816 enum C c; 1817 union U u; 1818 } 1819 \end{cfa} 1820 & 1821 \begin{cfa} 1822 enum C { R, G, B }; 1823 union U { int i, j; }; 1824 struct T { 1825 enum C c; 1826 short int i, j; 1827 }; 1828 struct S { 1829 struct T t; 1830 } s; 1831 1832 1833 1834 1835 1836 1837 1838 \end{cfa} 1839 & 1840 \begin{cfa} 1841 struct S { 1842 enum C { R, G, B }; 1843 struct T { 1844 union U { int i, j; }; 1845 enum C c; 1846 short int i, j; 1847 }; 1848 struct T t; 1849 } s; 1850 1851 int rtn() { 1852 s.t.c = `S.`R; // type qualification 1853 struct `S.`T t = { `S.`R, 1, 2 }; 1854 enum `S.`C c; 1855 union `S.T.`U u; 1856 } 1857 \end{cfa} 1858 \end{tabular} 1859 \lstMakeShortInline@% 1860 \caption{Type Nesting / Qualification} 1861 \label{f:TypeNestingQualification} 1862 \end{figure} 1863 In the left example in C, types @C@, @U@ and @T@ are implicitly hoisted outside of type @S@ into the containing block scope. 1864 In the right example in \CFA, the types are not hoisted and are accessed using the field-selection operator ``@.@'' for type qualification, as does Java, rather than the \CC type-selection operator ``@::@''. 1865 1866 1867 1595 \subsection{Default Parameters} 1868 1596 … … 1870 1598 \section{Literals} 1871 1599 1872 C already includes limited 
polymorphism for literals -- @0@ can be either an integer or a pointer literal, depending on context, while the syntactic forms of literals of the various integer and float types are very similar, differing from each other only in suffix.1600 C already includes limited polymorphism for literals -- @0@ can be either an integer or a pointer literal, depending on context, while the syntactic forms of literals of the various integer and floating-point types are very similar, differing from each other only in suffix. 1873 1601 In keeping with the general \CFA approach of adding features while respecting ``the C way'' of doing things, we have extended both C's polymorphic zero and typed literal syntax to interoperate with user-defined types, while maintaining a backwards-compatible semantics. 1874 1602 … … 1894 1622 struct Weight { double stones; }; 1895 1623 1896 void ?{}( Weight & w ) { w.stones = 0; } $\C{// operations}$1624 void ?{}( Weight & w ) { w.stones = 0; } $\C{// operations}$ 1897 1625 void ?{}( Weight & w, double w ) { w.stones = w; } 1898 1626 Weight ?+?( Weight l, Weight r ) { return (Weight){ l.stones + r.stones }; } … … 1903 1631 1904 1632 int main() { 1905 Weight w, hw = { 14 }; $\C{// 14 stone}$1633 Weight w, hw = { 14 }; $\C{// 14 stone}$ 1906 1634 w = 11@`st@ + 1@`lb@; 1907 1635 w = 70.3@`kg@; 1908 1636 w = 155@`lb@; 1909 w = 0x_9b_u@`lb@; $\C{// hexadecimal unsigned weight (155)}$1910 w = 0_233@`lb@; $\C{// octal weight (155)}$1637 w = 0x_9b_u@`lb@; $\C{// hexadecimal unsigned weight (155)}$ 1638 w = 0_233@`lb@; $\C{// octal weight (155)}$ 1911 1639 w = 5@`st@ + 8@`kg@ + 25@`lb@ + hw; 1912 1640 } 1913 1641 \end{cfa} 1914 1642 }% 1915 1916 1917 \section{Libraries}1918 \label{sec:libraries}1919 1920 As stated in Section~\ref{sec:poly-fns}, \CFA inherits a large corpus of library code, where other programming languages must rewrite or provide fragile inter-language communication with C.1921 \CFA has replacement libraries condensing hundreds of existing 
C names into tens of \CFA overloaded names, all without rewriting the actual computations.1922 In many cases, the interface is an inline wrapper providing overloading during compilation but zero cost at runtime.1923 The following sections give a glimpse of the interface reduction to many C libraries.1924 In many cases, @signed@/@unsigned@ @char@ and @short@ routines are available (but not shown) to ensure expression computations remain in a single type, as conversions can distort results.1925 1926 1927 \subsection{Limits}1928 1929 C library @limits.h@ provides lower and upper bound constants for the basic types.1930 \CFA name overloading is used to condense these typed constants, \eg:1931 \begin{cquote}1932 \lstDeleteShortInline@%1933 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}1934 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{Definition}} & \multicolumn{1}{c}{\textbf{Usage}} \\1935 \begin{cfa}1936 const short int `MIN` = -32768;1937 const int `MIN` = -2147483648;1938 const long int `MIN` = -9223372036854775808L;1939 \end{cfa}1940 &1941 \begin{cfa}1942 short int si = `MIN`;1943 int i = `MIN`;1944 long int li = `MIN`;1945 \end{cfa}1946 \end{tabular}1947 \lstMakeShortInline@%1948 \end{cquote}1949 The result is a significant reduction in names to access typed constants, \eg:1950 \begin{cquote}1951 \lstDeleteShortInline@%1952 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}1953 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1954 \begin{cfa}1955 MIN1956 MAX1957 M_PI1958 M_E1959 \end{cfa}1960 &1961 \begin{cfa}1962 SCHAR_MIN, CHAR_MIN, SHRT_MIN, INT_MIN, LONG_MIN, LLONG_MIN,1963 SCHAR_MAX, UCHAR_MAX, SHRT_MAX, INT_MAX, LONG_MAX, LLONG_MAX,1964 M_PI, M_PIl, M_CPI, M_CPIl,1965 M_E, M_El, M_CE, M_CEl1966 \end{cfa}1967 \end{tabular}1968 \lstMakeShortInline@%1969 \end{cquote}1970 1971 1972 \subsection{Math}1973 1974 C library @math.h@ provides many mathematical routines.1975 \CFA routine overloading is used to 
condense these mathematical routines, \eg:1976 \begin{cquote}1977 \lstDeleteShortInline@%1978 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}1979 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{Definition}} & \multicolumn{1}{c}{\textbf{Usage}} \\1980 \begin{cfa}1981 float `log`( float x );1982 double `log`( double );1983 double _Complex `log`( double _Complex x );1984 \end{cfa}1985 &1986 \begin{cfa}1987 float f = `log`( 3.5 );1988 double d = `log`( 3.5 );1989 double _Complex dc = `log`( 3.5+0.5I );1990 \end{cfa}1991 \end{tabular}1992 \lstMakeShortInline@%1993 \end{cquote}1994 The result is a significant reduction in names to access math routines, \eg:1995 \begin{cquote}1996 \lstDeleteShortInline@%1997 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}1998 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\1999 \begin{cfa}2000 log2001 sqrt2002 sin2003 \end{cfa}2004 &2005 \begin{cfa}2006 logf, log, logl, clogf, clog, clogl2007 sqrtf, sqrt, sqrtl, csqrtf, csqrt, csqrtl2008 sinf, sin, sinl, csinf, csin, csinl2009 \end{cfa}2010 \end{tabular}2011 \lstMakeShortInline@%2012 \end{cquote}2013 While \Celeven has type-generic math~\cite[\S~7.25]{C11} in @tgmath.h@ to provide a similar mechanism, these macros are limited, matching a routine name with a single set of floating type(s).2014 For example, it is not possible to overload @atan@ for both one and two arguments;2015 instead the names @atan@ and @atan2@ are required.2016 The key observation is that only a restricted set of type-generic macros are provided for a limited set of routine names, which do not generalize across the type system, as in \CFA.2017 2018 2019 \subsection{Standard}2020 2021 C library @stdlib.h@ provides many general routines.2022 \CFA routine overloading is used to condense these utility routines, \eg:2023 \begin{cquote}2024 \lstDeleteShortInline@%2025 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}2026 
\multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{Definition}} & \multicolumn{1}{c}{\textbf{Usage}} \\ 2027 \begin{cfa} 2028 unsigned int `abs`( int ); 2029 double `abs`( double ); 2030 double abs( double _Complex ); 2031 \end{cfa} 2032 & 2033 \begin{cfa} 2034 unsigned int i = `abs`( -1 ); 2035 double d = `abs`( -1.5 ); 2036 double d = `abs`( -1.5+0.5I ); 2037 \end{cfa} 2038 \end{tabular} 2039 \lstMakeShortInline@% 2040 \end{cquote} 2041 The result is a significant reduction in names to access utility routines, \eg: 2042 \begin{cquote} 2043 \lstDeleteShortInline@% 2044 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}} 2045 \multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 2046 \begin{cfa} 2047 abs 2048 strto 2049 random 2050 \end{cfa} 2051 & 2052 \begin{cfa} 2053 abs, labs, llabs, fabsf, fabs, fabsl, cabsf, cabs, cabsl 2054 strtol, strtoul, strtoll, strtoull, strtof, strtod, strtold 2055 srand48, mrand48, lrand48, drand48 2056 \end{cfa} 2057 \end{tabular} 2058 \lstMakeShortInline@% 2059 \end{cquote} 2060 In addition, there are polymorphic routines, like @min@ and @max@, which work on any type with operators @?<?@ or @?>?@. 2061 2062 The following shows one example where \CFA \emph{extends} an existing standard C interface to reduce complexity and provide safety. 2063 C/\Celeven provide a number of complex and overlapping storage-management operations to support the following capabilities: 2064 \begin{description}[itemsep=2pt,parsep=0pt] 2065 \item[fill] 2066 after allocation the storage is filled with a specified character. 2067 \item[resize] 2068 an existing allocation is decreased or increased in size. 2069 In either case, new storage may or may not be allocated and, if there is a new allocation, as much data as possible from the existing allocation is copied. 2070 For an increase in storage size, new storage after the copied data may be filled. 2071 \item[alignment] 2072 an allocation starts on a specified memory boundary, \eg, an address multiple of 64 or 128 for 
cache-line purposes.
\item[array]
the allocation size is scaled to the specified number of array elements.
An array may be filled, resized, or aligned.
\end{description}
Table~\ref{t:StorageManagementOperations} shows the capabilities provided by C/\Celeven allocation-routines and how all the capabilities can be combined into two \CFA routines.

\CFA storage-management routines extend the C equivalents by overloading, providing shallow type-safety, and removing the need to specify the base allocation-size.
The following example contrasts \CFA and C storage-allocation operations performing the same tasks with the same type safety:
\begin{cquote}
\begin{cfa}[aboveskip=0pt]
size_t dim = 10; $\C{// array dimension}$
char fill = '\xff'; $\C{// initialization fill value}$
int * ip;
\end{cfa}
\lstDeleteShortInline@%
\begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}
\multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\
\begin{cfa}
ip = alloc();
ip = alloc( fill );
ip = alloc( dim );
ip = alloc( dim, fill );
ip = alloc( ip, 2 * dim );
ip = alloc( ip, 4 * dim, fill );

ip = align_alloc( 16 );
ip = align_alloc( 16, fill );
ip = align_alloc( 16, dim );
ip = align_alloc( 16, dim, fill );
\end{cfa}
&
\begin{cfa}
ip = (int *)malloc( sizeof( int ) );
ip = (int *)malloc( sizeof( int ) ); memset( ip, fill, sizeof( int ) );
ip = (int *)malloc( dim * sizeof( int ) );
ip = (int *)malloc( dim * sizeof( int ) ); memset( ip, fill, dim * sizeof( int ) );
ip = (int *)realloc( ip, 2 * dim * sizeof( int ) );
ip = (int *)realloc( ip, 4 * dim * sizeof( int ) ); memset( ip, fill, 4 * dim * sizeof( int ) );

ip = memalign( 16, sizeof( int ) );
ip = memalign( 16, sizeof( int ) ); memset( ip, fill, sizeof( int ) );
ip = memalign( 16, dim * sizeof( int ) );
ip = memalign( 16, dim * sizeof(
int ) ); memset( ip, fill, dim * sizeof( int ) );
\end{cfa}
\end{tabular}
\lstMakeShortInline@%
\end{cquote}
Variadic @new@ (see Section~\ref{sec:variadic-tuples}) cannot support the same overloading because extra parameters are for initialization.
Hence, there are @new@ and @anew@ routines for single and array variables, and the fill value is given by the arguments to the constructor, \eg:
\begin{cfa}
struct S { int i, j; };
void ?{}( S & s, int i, int j ) { s.i = i; s.j = j; }
S * s = new( 2, 3 ); $\C{// allocate storage and run constructor}$
S * as = anew( dim, 2, 3 ); $\C{// each array element initialized to 2, 3}$
\end{cfa}
Note, \CC can only initialize array elements via the default constructor.

Finally, the \CFA memory-allocator has \newterm{sticky properties} for dynamic storage: fill and alignment are remembered with an object's storage in the heap.
When a @realloc@ is performed, the sticky properties are respected, so that new storage is correctly aligned and initialized with the fill character.

\begin{table}
\centering
\lstDeleteShortInline@%
\lstMakeShortInline~%
\begin{tabular}{@{}r|r|l|l|l|l@{}}
\multicolumn{1}{c}{}& & \multicolumn{1}{c|}{fill} & resize & alignment & array \\
\hline
C & ~malloc~ & no & no & no & no \\
 & ~calloc~ & yes (0 only) & no & no & yes \\
 & ~realloc~ & no/copy & yes & no & no \\
 & ~memalign~ & no & no & yes & no \\
 & ~posix_memalign~ & no & no & yes & no \\
\hline
C11 & ~aligned_alloc~ & no & no & yes & no \\
\hline
\CFA & ~alloc~ & yes/copy & no/yes & no & yes \\
 & ~align_alloc~ & yes & no & yes & yes \\
\end{tabular}
\lstDeleteShortInline~%
\lstMakeShortInline@%
\caption{Storage-Management Operations}
\label{t:StorageManagementOperations}
\end{table}


\subsection{I/O}
\label{s:IOLibrary}

The goal of \CFA I/O is to simplify the common cases, while
fully supporting polymorphism and user-defined types in a consistent way.
The approach combines ideas from \CC and Python.
The \CFA header file for the I/O library is @fstream@.

The common case is printing out a sequence of variables separated by whitespace.
\begin{cquote}
\lstDeleteShortInline@%
\begin{tabular}{@{}l@{\hspace{\parindentlnth}}l@{}}
\multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{\CC}} \\
\begin{cfa}
int x = 1, y = 2, z = 3;
sout | x `|` y `|` z | endl;
\end{cfa}
&
\begin{cfa}

cout << x `<< " "` << y `<< " "` << z << endl;
\end{cfa}
\\
\begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
1` `2` `3
\end{cfa}
&
\begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
1 2 3
\end{cfa}
\end{tabular}
\lstMakeShortInline@%
\end{cquote}
The \CFA form has half the characters of the \CC form, and is similar to Python I/O with respect to implicit separators.
Similar simplification occurs for tuple I/O, which prints all tuple values separated by ``\lstinline[showspaces=true]@, @''.
\begin{cfa}
[int, [ int, int ] ] t1 = [ 1, [ 2, 3 ] ], t2 = [ 4, [ 5, 6 ] ];
sout | t1 | t2 | endl; $\C{// print tuples}$
\end{cfa}
\begin{cfa}[showspaces=true,aboveskip=0pt]
1`, `2`, `3 4`, `5`, `6
\end{cfa}
Finally, \CFA uses the bitwise-or operator for I/O as it is the lowest-priority overloadable operator, other than assignment.
Therefore, fewer output expressions require parentheses.
\begin{cquote}
\lstDeleteShortInline@%
\begin{tabular}{@{}ll@{}}
\textbf{\CFA:}
&
\begin{cfa}
sout | x * 3 | y + 1 | z << 2 | x == y | (x | y) | (x || y) | (x > z ? 1 : 2) | endl;
\end{cfa}
\\
\textbf{\CC:}
&
\begin{cfa}
cout << x * 3 << y + 1 << `(`z << 2`)` << `(`x == y`)` << (x | y) << (x || y) << (x > z ?
1 : 2) << endl;
\end{cfa}
\\
\textbf{output:}
&
\begin{cfa}[showspaces=true,aboveskip=0pt]
3 3 12 0 3 1 2
\end{cfa}
\end{tabular}
\lstMakeShortInline@%
\end{cquote}
There is a weak similarity between the \CFA bitwise-or operator and the Shell pipe-operator for moving data, where data flows in the correct direction for input but the opposite direction for output.

The implicit separator character (space/blank) is a separator not a terminator.
The rules for implicitly adding the separator are:
\begin{itemize}[itemsep=2pt,parsep=0pt]
\item
A separator does not appear at the start or end of a line.
\item
A separator does not appear before or after a character literal or variable.
\item
A separator does not appear before or after a null (empty) C string, which is a local mechanism to disable insertion of the separator character.
\item
A separator does not appear before a C string starting with the characters: \lstinline[mathescape=off,basicstyle=\tt]@([{=$@
\item
A separator does not appear after a C string ending with the characters: \lstinline[basicstyle=\tt]@,.;!?)]}%@
\item
{\lstset{language=CFA,deletedelim=**[is][]{`}{`}}
A separator does not appear before or after a C string beginning/ending with the quote or whitespace characters: \lstinline[basicstyle=\tt,showspaces=true]@`'": \t\v\f\r\n@
}%
\item
There are routines to set and get the separator string, and manipulators to toggle separation on and off in the middle of output.
\end{itemize}


\subsection{Multi-precision Integers}
\label{s:MultiPrecisionIntegers}

\CFA has an interface to the GMP multi-precision signed-integers~\cite{GMP}, similar to the \CC interface provided by GMP.
The \CFA interface wraps GMP routines into operator routines to make programming with multi-precision integers identical to using fixed-sized integers.
The \CFA type name for
multi-precision signed-integers is @Int@ and the header file is @gmp@.
The following multi-precision factorial programs contrast using GMP with the \CFA and C interfaces.
\begin{cquote}
\lstDeleteShortInline@%
\begin{tabular}{@{}l@{\hspace{\parindentlnth}}@{\hspace{\parindentlnth}}l@{}}
\multicolumn{1}{c@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{@{\hspace{\parindentlnth}}c}{\textbf{C}} \\
\begin{cfa}
#include <gmp>
int main( void ) {
    sout | "Factorial Numbers" | endl;
    Int fact = 1;

    sout | 0 | fact | endl;
    for ( unsigned int i = 1; i <= 40; i += 1 ) {
        fact *= i;
        sout | i | fact | endl;
    }
}
\end{cfa}
&
\begin{cfa}
#include <gmp.h>
int main( void ) {
    `gmp_printf`( "Factorial Numbers\n" );
    `mpz_t` fact;
    `mpz_init_set_ui`( fact, 1 );
    `gmp_printf`( "%d %Zd\n", 0, fact );
    for ( unsigned int i = 1; i <= 40; i += 1 ) {
        `mpz_mul_ui`( fact, fact, i );
        `gmp_printf`( "%u %Zd\n", i, fact );
    }
}
\end{cfa}
\end{tabular}
\lstMakeShortInline@%
\end{cquote}


\section{Evaluation}
… …

\begin{table}
\centering
\caption{Properties of benchmark code}
\label{tab:eval}