Changeset becba789 for doc


Ignore:
Timestamp:
Aug 2, 2016, 6:40:20 PM (9 years ago)
Author:
Rob Schluntz <rschlunt@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, ctor, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, memory, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
8a443f4, 9799ec8
Parents:
e4957e7 (diff), 155cce0f (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:/u/cforall/software/cfa/cfa-cc

Location:
doc
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • doc/LaTeXmacros/common.tex

    re4957e7 rbecba789  
    1111%% Created On       : Sat Apr  9 10:06:17 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Tue Jul 12 20:37:57 2016
    14 %% Update Count     : 206
     13%% Last Modified On : Mon Aug  1 09:11:20 2016
     14%% Update Count     : 225
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    16 
    17 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1816
    1917\setlength{\textheight}{9in}
    2018%\oddsidemargin 0.0in
    21 \renewcommand{\topfraction}{0.8}        % float must be greater than X of the page before it is forced onto its own page
    22 \renewcommand{\bottomfraction}{0.8}     % float must be greater than X of the page before it is forced onto its own page
     19\renewcommand{\topfraction}{0.8}                % float must be greater than X of the page before it is forced onto its own page
     20\renewcommand{\bottomfraction}{0.8}             % float must be greater than X of the page before it is forced onto its own page
    2321\renewcommand{\floatpagefraction}{0.8}  % float must be greater than X of the page before it is forced onto its own page
    24 \renewcommand{\textfraction}{0.0}       % the entire page maybe devoted to floats with no text on the page at all
    25 
    26 \lefthyphenmin=4
     22\renewcommand{\textfraction}{0.0}               % the entire page maybe devoted to floats with no text on the page at all
     23
     24\lefthyphenmin=4                                                % hyphen only after 4 characters
    2725\righthyphenmin=4
    2826
     
    3836% Names used in the document.
    3937
    40 \newcommand{\CFA}{C$\mathbf\forall$\xspace}              % set language symbolic name
    41 \newcommand{\CFL}{Cforall\xspace}                        % set language text name
     38\newcommand{\CFA}{C$\mathbf\forall$\xspace} % set language symbolic name
     39\newcommand{\CFL}{Cforall\xspace}               % set language text name
    4240\newcommand{\CC}{\rm C\kern-.1em\hbox{+\kern-.25em+}\xspace} % CC symbolic name
    4341\newcommand{\CCeleven}{\rm C\kern-.1em\hbox{+\kern-.25em+}11\xspace} % C++11 symbolic name
    44 \def\c11{ISO/IEC C} % C11 name (cannot have numbers in latex command name)
     42\def\c11{ISO/IEC C}                                             % C11 name (cannot have numbers in latex command name)
    4543
    4644%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     
    5250\setlength{\parindentlnth}{\parindent}
    5351
    54 \newlength{\gcolumnposn}
     52\newlength{\gcolumnposn}                                % temporary hack because lstlisting does handle tabs correctly
    5553\newlength{\columnposn}
    5654\setlength{\gcolumnposn}{2.5in}
     
    6361%\patchcmd{\lsthk@TextStyle}{\let\lst@DefEsc\@empty}{}{}{\errmessage{failed to patch}}
    6462
    65 \usepackage{pslatex}                                                                    % reduce size of san serif font
    66 \usepackage{relsize}                                    % must be after change to small or selects old size
     63\usepackage{pslatex}                                    % reduce size of san serif font
     64\usepackage{relsize}                                    % must be after change to small or selects old size
    6765
    6866% reduce size of chapter/section titles
     
    120118
    121119% inline text and lowercase index: \Index{inline and lowercase index text}
     120\newcommand{\Index}{\@ifstar\@sIndex\@Index}
    122121% inline text and as-in index: \Index[as-is index text]{inline text}
     122\newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
    123123% inline text but index with different as-is text: \Index[index text]{inline text}
    124 \newcommand{\Index}{\@ifstar\@sIndex\@Index}
    125 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
    126124\newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
    127125
    128 % cannot use ©
     126% inline text and code index (cannot use ©)
    129127\newcommand{\Indexc}[1]{\lstinline$#1$\index{#1@\lstinline$#1$}}
     128% code index (cannot use ©)
    130129\newcommand{\indexc}[1]{\index{#1@\lstinline$#1$}}
    131130
     
    137136\newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
    138137\newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
     138
     139% Latin abbreviation
     140\newcommand{\abbrevFont}{\textit}       % set empty for no italics
     141\newcommand*{\eg}{%
     142        \@ifnextchar{,}{\abbrevFont{e}.\abbrevFont{g}.}%
     143                {\@ifnextchar{:}{\abbrevFont{e}.\abbrevFont{g}.}%
     144                        {\abbrevFont{e}.\abbrevFont{g}.,\xspace}}%
     145}%
     146\newcommand*{\ie}{%
     147        \@ifnextchar{,}{\abbrevFont{i}.\abbrevFont{e}.}%
     148                {\@ifnextchar{:}{\abbrevFont{i}.\abbrevFont{e}.}%
     149                        {\abbrevFont{i}.\abbrevFont{e}.,\xspace}}%
     150}%
     151\newcommand*{\etc}{%
     152        \@ifnextchar{.}{\abbrevFont{etc}}%
     153        {\abbrevFont{etc}.\xspace}%
     154}%
    139155\makeatother
    140156
     
    145161        \endlist
    146162}% quote2
     163
    147164\newenvironment{rationale}{%
    148165  \begin{quote2}\noindent$\Box$\enspace
     
    188205\newcommand{\VPageref}[2][page]{\ifx#1\@empty\else{#1}\nobreakspace\fi\pageref{#2}}
    189206
    190 % Go programming language
     207% Go programming language: https://github.com/julienc91/listings-golang/blob/master/listings-golang.sty
    191208\lstdefinelanguage{Golang}{
    192209        morekeywords=[1]{package,import,func,type,struct,return,defer,panic, recover,select,var,const,iota,},%
     
    204221}
    205222
    206 % CFA programming language, based on ANSI C
     223% CFA programming language, based on ANSI C (with some gcc additions)
    207224\lstdefinelanguage{CFA}[ANSI]{C}{
    208225        morekeywords={_Alignas,_Alignof,__alignof,__alignof__,asm,__asm,__asm__,_At,_Atomic,__attribute,__attribute__,auto,
    209226                _Bool,catch,catchResume,choose,_Complex,__complex,__complex__,__const,__const__,disable,dtype,enable,__extension__,
    210                 fallthru,finally,forall,ftype,_Generic,_Imaginary,inline,__label__,lvalue,_Noreturn,otype,restrict,_Static_assert,
     227                fallthrough,fallthru,finally,forall,ftype,_Generic,_Imaginary,inline,__label__,lvalue,_Noreturn,otype,restrict,_Static_assert,
    211228                _Thread_local,throw,throwResume,trait,try,typeof,__typeof,__typeof__,},
    212229}%
     
    215232language=CFA,
    216233columns=fullflexible,
    217 basicstyle=\linespread{0.9}\sf,
    218 stringstyle=\tt,
    219 tabsize=4,
    220 xleftmargin=\parindentlnth,
    221 extendedchars=true,
    222 escapechar=§,
    223 mathescape=true,
    224 keepspaces=true,
    225 showstringspaces=false,
    226 showlines=true,
    227 aboveskip=4pt,
     234basicstyle=\linespread{0.9}\sf,                 % reduce line spacing and use sanserif font
     235stringstyle=\tt,                                                % use typewriter font
     236tabsize=4,                                                              % 4 space tabbing
     237xleftmargin=\parindentlnth,                             % indent code to paragraph indentation
     238extendedchars=true,                                             % allow ASCII characters in the range 128-255
     239escapechar=§,                                                   % escape to latex in CFA code
     240mathescape=true,                                                % allow $...$ LaTeX math escapes in code
     241%keepspaces=true,                                               %
     242showstringspaces=false,                                 % do not show spaces with cup
     243showlines=true,                                                 % show blank lines at end of code
     244aboveskip=4pt,                                                  % spacing above/below code block
    228245belowskip=3pt,
    229 moredelim=**[is][\color{red}]{®}{®}, % red highlighting
    230 moredelim=**[is][\color{blue}]{ß}{ß}, % blue highlighting
     246moredelim=**[is][\color{red}]{®}{®},    % red highlighting
     247moredelim=**[is][\color{blue}]{ß}{ß},   % blue highlighting
    231248moredelim=**[is][\color{OliveGreen}]{¢}{¢}, % green highlighting
    232249moredelim=[is][\lstset{keywords={}}]{¶}{¶}, % temporarily turn off keywords
     
    242259\renewcommand\thebibliography[1]{
    243260  \Oldthebibliography{#1}
    244   \setlength{\parskip}{0pt}                     % reduce vertical spacing between references
     261  \setlength{\parskip}{0pt}                             % reduce vertical spacing between references
    245262  \setlength{\itemsep}{5pt plus 0.3ex}
    246263}%
    247 
    248 \newcommand*{\eg}{\textit{e.g}.\@\xspace}
    249 \newcommand*{\ie}{\textit{i.e}.\@\xspace}
    250 
    251 \makeatletter
    252 \newcommand*{\etc}{%
    253     \@ifnextchar{.}%
    254         {\textit{etc}}%
    255         {\textit{etc}.\@\xspace}%
    256 }
    257 \makeatother
    258264
    259265% Local Variables: %
  • doc/aaron_comp_II/comp_II.tex

    re4957e7 rbecba789  
    4343%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    4444
    45 \title{\Huge
    46 \vspace*{1in}
    47 Efficient Type Resolution in \CFA \\
    48 \vspace*{0.25in}
    49 \huge
    50 PhD Comprehensive II Research Proposal
     45\title{
     46\Huge \vspace*{1in} Efficient Type Resolution in \CFA \\
     47\huge \vspace*{0.25in} PhD Comprehensive II Research Proposal
    5148\vspace*{1in}
    5249}
    5350
    54 \author{\huge
    55 \vspace*{0.1in}
    56 Aaron Moss \\
     51\author{
     52\huge Aaron Moss \\
     53\Large \vspace*{0.1in} \texttt{a3moss@uwaterloo.ca} \\
    5754\Large Cheriton School of Computer Science \\
    5855\Large University of Waterloo
     
    8885
    8986\CFA\footnote{Pronounced ``C-for-all'', and written \CFA, CFA, or \CFL.} is an evolutionary modernization of the C programming language currently being designed and built at the University of Waterloo by a team led by Peter Buhr.
    90 Features added to C by \CFA include name overloading, user-defined operators, parametric-polymorphic routines, and type constructors and destructors, among others.
    91 These features make \CFA significantly more powerful and expressive than C, but impose a significant compile-time cost to implement, particularly in the expression resolver, which must evaluate the typing rules of a much more complex type system.
     87\CFA adds multiple features to C, including name overloading, user-defined operators, parametric-polymorphic routines, and type constructors and destructors, among others.
     88These features make \CFA significantly more powerful and expressive than C, but impose a significant compile-time cost to support, particularly in the expression resolver, which must evaluate the typing rules of a much more complex type system.
    9289The primary goal of this proposed research project is to develop a sufficiently performant expression resolution algorithm, experimentally validate its performance, and integrate it into \Index*{CFA-CC}, the \CFA reference compiler.
    9390Secondary goals of this project include the development of various new language features for \CFA; parametric-polymorphic (``generic'') types have already been designed and implemented, and reference types and user-defined conversions are under design consideration.
    94 The experimental performance-testing architecture for resolution algorithms will also be used to determine the compile-time cost of adding such new features to the \CFA type system.
     91The experimental performance-testing architecture for resolution algorithms will also be used to determine the compile-time cost of adding such new features to the \CFA type system.
     92More broadly, this research should provide valuable data for implementers of compilers for other programming languages with similarly powerful static type systems.
    9593
    9694\section{\CFA}
     
    10199\subsection{Polymorphic Functions}
    102100The most significant feature \CFA adds is parametric-polymorphic functions.
    103 Such functions are written using a ©forall© clause, the feature that gave the language its name:
     101Such functions are written using a ©forall© clause (which gives the language its name):
    104102\begin{lstlisting}
    105103forall(otype T)
     
    129127
    130128Monomorphic specializations of polymorphic functions can themselves be used to satisfy type assertions.
    131 For instance, ©twice© could have been define as below, using the \CFA syntax for operator overloading:
     129For instance, ©twice© could have been defined as below, using the \CFA syntax for operator overloading:
    132130\begin{lstlisting}
    133131forall(otype S | { S ?+?(S, S); })
     
    135133\end{lstlisting}
    136134This version of ©twice© will work for any type ©S© that has an addition operator defined for it, and it could have been used to satisfy the type assertion on ©four_times©.
    137 The compiler accomplishes this by creating a wrapper function calling ©twice // (2)© with ©S© bound to ©double©, then providing this wrapper function to ©four_times©\footnote{©twice // (2)© could have had a type parameter named ©T©; \CFA specifies a renaming the type parameters, which would avoid the name conflict with the parameter ©T© of ©four_times©.}.
     135The compiler accomplishes this by creating a wrapper function calling ©twice // (2)© with ©S© bound to ©double©, then providing this wrapper function to ©four_times©\footnote{©twice // (2)© could also have had a type parameter named ©T©; \CFA specifies renaming of the type parameters, which would avoid the name conflict with the type variable ©T© of ©four_times©.}.
    138136
    139137Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration in the current scope.
    140138If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as it is possible that function will be examined as a candidate for its own type assertion unboundedly repeatedly.
    141139To avoid infinite loops, the current \Index*{CFA-CC} compiler imposes a fixed limit on the possible depth of recursion, similar to that employed by most \Index*[C++]{\CC} compilers for template expansion; this restriction means that there are some semantically well-typed expressions which cannot be resolved by {CFA-CC}.
    142 One area of potential improvement this project proposes to investigate is the possibility of using the compiler's knowledge of the current set of declarations to make a more precise judgement of when further type assertion satisfaction recursion will not produce a well-typed expression.
     140One area of potential improvement this project proposes to investigate is the possibility of using the compiler's knowledge of the current set of declarations to more precicely determine when further type assertion satisfaction recursion will not produce a well-typed expression.
    143141
    144142\subsection{Name Overloading}
     
    171169Such a conversion system should be simple for user programmers to utilize, and fit naturally with the existing design of implicit conversions in C; ideally it would also be sufficiently powerful to encode C's usual arithmetic conversions itself, so that \CFA only has one set of rules for conversions.
    172170
    173 Glen Ditchfield \textbf{TODO CITE} has laid out a framework for using polymorphic conversion constructor functions to create a directed acyclic graph (DAG) of conversions.
     171Ditchfield\cite{Ditchfield:conversions} has laid out a framework for using polymorphic conversion constructor functions to create a directed acyclic graph (DAG) of conversions.
    174172A monomorphic variant of these functions can be used to mark a conversion arc in the DAG as only usable as the final step in a conversion.
    175173With these two types of conversion arcs, separate DAGs can be created for the safe and the unsafe conversions, and conversion cost can be represented as path length through the DAG.
     
    210208\CFA adds \emph{tuple types} to C, a facility for referring to multiple values with a single identifier.
    211209A variable may name a tuple, and a function may return one.
    212 Particularly relevantly for resolution, a tuple may be automatically \emph{destructured} into a list of values, as in the ©swap© function below:
     210Particularly relevantly for resolution, a tuple may be implicitly \emph{destructured} into a list of values, as in the call to ©swap© below:
    213211\begin{lstlisting}
    214212[char, char] x = [ '!', '?' ];
     
    218216
    219217x = swap( x ); // destructure [char, char] x into two elements of parameter list
    220 // ^ can't use int x for parameter, not enough arguments to swap
     218// can't use int x for parameter, not enough arguments to swap
    221219\end{lstlisting}
    222220Tuple destructuring means that the mapping from the position of a subexpression in the argument list to the position of a paramter in the function declaration is not straightforward, as some arguments may be expandable to different numbers of parameters, like ©x© above.
     
    245243
    246244\section{Expression Resolution}
    247 % TODO cite Baker, Cormack, etc.
    248 
     245The expression resolution problem is essentially to determine an optimal matching between some combination of argument interpretations and the parameter list of some overloaded instance of a function; the argument interpretations are produced by recursive invocations of expression resolution, where the base case is zero-argument functions (which are, for purposes of this discussion, semantically equivalent to named variables or constant literal expressions).
     246Assuming that the matching between a function's parameter list and a combination of argument interpretations can be done in $O(p^k)$ time, where $p$ is the number of parameters and $k$ is some positive number, if there are $O(i)$ valid interpretations for each subexpression, there will be $O(i)$ candidate functions and $O(i^p)$ possible argument combinations for each expression, so a single recursive call to expression resolution will take $O(i^{p+1} \cdot p^k)$ time if it compares all combinations.
     247Given this bound, resolution of a single top-level expression tree of depth $d$ takes $O(i^{p+1} \cdot p^{k \cdot d})$ time\footnote{The call tree will have leaves at depth $O(d)$, and each internal node will have $O(p)$ fan-out, producing $O(p^d)$ total recursive calls.}.
     248Expression resolution is somewhat unavoidably exponential in $p$, the number of function parameters, and $d$, the depth of the expression tree, but these values are fixed by the user programmer, and generally bounded by reasonably small constants.
     249$k$, on the other hand, is mostly dependent on the representation of types in the system and the efficiency of type assertion checking; if a candidate argument combination can be compared to a function parameter list in linear time in the length of the list (\ie $k = 1$), then the $p^{k \cdot d}$ term is linear in the input size of the source code for the expression, otherwise the resolution algorithm will exibit sub-linear performance scaling on code containing more-deeply nested expressions.
     250The number of valid interpretations of any subexpression, $i$, is bounded by the number of types in the system, which is possibly infinite, though practical resolution algorithms for \CFA must be able to place some finite bound on $i$, possibly at the expense of type system completeness.
     251
     252The research goal of this project is to develop a performant expression resolver for \CFA; this analysis suggests two primary areas of investigation to accomplish that end.
     253The first is efficient argument-parameter matching; Bilson\cite{Bilson03} mentions significant optimization opportunities available in the current literature to improve on the existing {CFA-CC} compiler.
     254%TODO: look up and lit review
     255The second, and likely more fruitful, area of investigation is heuristics and algorithmic approaches to reduce the number of argument interpretations considered in the common case; given the large ($p+1$) exponent on number of interpretations considered in the runtime analysis, even small reductions here could have a significant effect on overall resolver runtime.
     256The discussion below presents a number of largely orthagonal axes for expression resolution algorithm design to be investigated, noting prior work where applicable.
     257
     258\subsection{Argument-Parameter Matching}
     259The first axis we consider is argument-parameter matching --- whether the type matching for a candidate function to a set of candidate arguments is directed by the argument types or the parameter types.
     260
     261\subsubsection{Argument-directed (``Bottom-up'')}
     262Baker's algorithm for expression resolution\cite{Baker82} pre-computes argument candidates, from the leaves of the expression tree up.
     263For each candidate function, Baker attempts to match argument types to parameter types in sequence, failing if any parameter cannot be matched.
     264
     265Bilson\cite{Bilson03} similarly pre-computes argument candidates in the original \CFA compiler, but then explicitly enumerates all possible argument combinations for a multi-parameter function; these argument combinations are matched to the parameter types of the candidate function as a unit rather than individual arguments.
     266This is less efficient than Baker's approach, as the same argument may be compared to the same parameter many times, but allows a more straightforward handling of polymorphic type binding and multiple return types.
     267It is possible the efficiency losses here relative to Baker could be significantly reduced by application of memoization to the argument-parameter type comparisons.
     268
     269\subsubsection{Parameter-directed (``Top-down'')}
     270Unlike Baker and Bilson, Cormack's algorithm\cite{Cormack81} requests argument candidates which match the type of each parameter of each candidate function, from the top-level expression down; memoization of these requests is presented as an optimization.
     271As presented, this algorithm requires the result of the expression to have a known type, though an algorithm based on Cormack's could reasonably request a candidate set of any return type, though such a set may be quite large.
     272
     273\subsubsection{Hybrid}
     274This proposal includes the investigation of hybrid top-down/bottom-up argument-parameter matching.
     275A reasonable hybrid approach might be to take a top-down approach when the expression to be matched is known to have a fixed type, and a bottom-up approach in untyped contexts.
     276This may include switches from one type to another at different levels of the expression tree, for instance:
     277\begin{lstlisting}
     278forall(otype T)
     279int f(T x);  // (1)
     280
     281void* f(char y);  // (2)
     282
     283int x = f( f( '!' ) );
     284\end{lstlisting}
     285Here, the outer call to ©f© must have a return type that is (implicitly convertable to) ©int©, so a top-down approach could be used to select \textit{(1)} as the proper interpretation of ©f©. \textit{(1)}'s parameter ©x© here, however, is an unbound type variable, and can thus take a value of any complete type, providing no guidance for the choice of candidate for the inner ©f©. The leaf expression ©'!'©, however, gives us a zero-cost interpretation of the inner ©f© as \textit{(2)}, providing a minimal-cost expression resolution where ©T© is bound to ©void*©.
     286
     287Deciding when to switch between bottom-up and top-down resolution in a hybrid algorithm is a necessarily heuristic process, and though finding good heuristics for it is an open question, one reasonable approach might be to switch from top-down to bottom-up when the number of candidate functions exceeds some threshold.
     288
     289\subsection{Implicit Conversion Application}
     290Baker's\cite{Baker82} and Cormack's\cite{Cormack81} algorithms do not account for implicit conversions\footnote{Baker does briefly comment on an approach for handling implicit conversions.}; both assume that there is at most one valid interpretation of a given expression for each distinct type.
     291Integrating implicit conversion handling into their algorithms provides some choice of implementation approach.
     292
     293\subsubsection{On Parameters}
     294Bilson\cite{Bilson03} did account for implicit conversions in his algorithm, but it is not clear his approach is optimal.
     295His algorithm integrates checking for valid implicit conversions into the argument-parameter matching step, essentially trading more expensive matching for a smaller number of argument interpretations.
     296This approach may result in the same subexpression being checked for a type match with the same type multiple times, though again memoization may mitigate this cost, and this approach will not generate implicit conversions that are not useful to match the containing function.
     297
     298\subsubsection{On Arguments}
     299Another approach would be to generate a set of possible implicit conversions for each set of interpretations of a given argument.
     300This would have the benefit of detecting ambiguous interpretations of arguments at the level of the argument rather than its containing call, would also never find more than one interpretation of the argument with a given type, and would re-use calculation of implicit conversions between function candidates.
     301On the other hand, this approach may unncessarily generate argument interpretations that will never match a parameter, wasting work.
     302
     303\subsection{Candidate Set Generation}
     304Cormack\cite{Cormack81}, Baker\cite{Baker82} \& Bilson\cite{Bilson03} all generate the complete set of candidate argument interpretations before attempting to match the containing function call expression.
     305However, given that the top-level expression interpretation that is ultimately chosen will be the minimal-cost valid interpretation, any consideration of non-minimal-cost interpretations is in some sense wasted work.
     306If we assume that user programmers will generally write function calls with relatively low-cost interpretations, a possible work-saving heuristic is to generate only the lowest-cost argument interpretations first, attempt to find a valid top-level interpretation using them, and only if that fails generate the higher-cost argument interpretations.
     307
     308\subsubsection{Eager}
     309Within the eager approach taken by Cormack, Baker \& Bilson, there are still variants to explore.
     310Cormack \& Baker do not account for implict conversions, and thus do not account for the possibility of multiple valid interpretations with distinct costs; Bilson, on the other hand, sorts the list of interpretations to aid in finding minimal-cost interpretations.
     311Sorting the lists of argument or function call interpretations by cost at some point during resolution may provide useful opportunities to short-circuit expression evaluation when a minimal-cost interpretation is found, though it is not clear if this short-circuiting behaviour would justify the cost of the sort.
     312
     313\subsubsection{Lazy}
     314In the presence of implicit conversions, many argument interpretations may match a given parameter by application of an appropriate implicit conversion.
     315However, if user programmers actually use relatively few implicit conversions, then the ``on arguments'' approach to implicit conversions will generate a large number of high-cost interpretations which may never be used.
     316The essence of the lazy approach to candidate set generation is to wrap the matching algorithm into the element generator of a lazy list type, only generating as few elements at a time as possible to ensure that the next-smallest-cost interpretation has been generated.
     317Assuming that argument interpretations are provided to the parameter matching algorithm in sorted order, a sorted list of function call interpretations can be produced by generating combinations of arguments sorted by total cost\footnote{The author has developed a lazy $n$-way combination generation algorithm that can perform this task.}, then generating function call interpretations in the order suggested by this list.
     318Note that the function call interpretation chosen may have costs of its own, for instance polymorphic type binding, so in some cases a number of argument combinations (any combination whose marginal cost does not exceed the cost of the function call interpretation itself) may need to be considered to determine the next-smallest-cost function call interpretation.
     319Ideally, this candidate generation approach will lead to very few unused candidates being generated (in the expected case where the user programmer has, in fact, provided a validly-typable program), but this research project will need to determine whether or not the overheads of lazy generation exceed the benefit produced from considering fewer interpretations.
     320
     321\subsubsection{Stepwise Lazy}
     322As a compromise between the trade-offs of the eager and lazy approaches, it would also be interesting to investigate a ``stepwise lazy'' approach, where all the interpretations for some ``step'' are eagerly generated, then the interpretations in the later steps are only generated on demand.
     323Under this approach the \CFA resolver could, for instance, try expression interpretations in the following order:
     324\begin{enumerate}
     325\item Interpretations with no polymorphic type binding or implicit conversions.
     326\item Interpretations containing no polymorphic type binding and at least one safe implicit conversion.
     327\item Interpretations containing polymorphic type binding, but only safe implicit conversions.
     328\item Interpretations containing at least one unsafe implicit conversion.
     329\end{enumerate}
     330If a valid expression interpretation is found in one step, it is guaranteed to be lower-cost than any interpretation in a later step (by the structure of \CFA interpretation costs), so no step after the first one where a valid interpretation can be found need be considered.
     331This may save significant amounts of work, especially given that the first steps avoid potentially expensive handling of implicit conversions and type assertion satisfaction entirely.
     332
     333%\subsection{Parameter-Directed}
     334%\textbf{TODO: Richard's algorithm isn't Baker (Cormack?), disentangle from this section \ldots}.
     335%The expression resolution algorithm used by the existing iteration of {CFA-CC} is based on Baker's\cite{Baker82} algorithm for overload resolution in Ada.
     336%The essential idea of this algorithm is to first find the possible interpretations of the most deeply nested subexpressions, then to use these interpretations to recursively generate valid interpretations of their superexpressions.
     337%To simplify matters, the only expressions considered in this discussion of the algorithm are function application and literal expressions; other expression types can generally be considered to be variants of one of these for the purposes of the resolver, \eg variables are essentially zero-argument functions.
     338%If we consider expressions as graph nodes with arcs connecting them to their subexpressions, these expressions form a DAG, generated by the algorithm from the bottom up.
     339%Literal expressions are represented by leaf nodes, annotated with the type of the expression, while a function application will have a reference to the function declaration chosen, as well as arcs to the interpretation nodes for its argument expressions; functions are annotated with their return type (or types, in the case of multiple return values).
     340%
     341%\textbf{TODO: Figure}
     342%
     343%Baker's algorithm was designed to account for name overloading; Richard Bilson\cite{Bilson03} extended this algorithm to also handle polymorphic functions, implicit conversions \& multiple return types when designing the original \CFA compiler.
     344%The core of the algorithm is a function which Baker refers to as $gen\_calls$.
     345%$gen\_calls$ takes as arguments the name of a function $f$ and a list containing the set of possible subexpression interpretations $S_j$ for each argument of the function and returns a set of possible interpretations of calling that function on those arguments.
     346%The subexpression interpretations are generally either singleton sets generated by the single valid interpretation of a literal expression, or the results of a previous call to $gen\_calls$.
     347%If there are no valid interpretations of an expression, the set returned by $gen\_calls$ will be empty, at which point resolution can cease, since each subexpression must have at least one valid interpretation to produce an interpretation of the whole expression.
     348%On the other hand, if for some type $T$ there is more than one valid interpretation of an expression with type $T$, all interpretations of that expression with type $T$ can be collapsed into a single \emph{ambiguous expression} of type $T$, since the only way to disambiguate expressions is by their return types.
     349%If a subexpression interpretation is ambiguous, than any expression interpretation containing it will also be ambiguous.
     350%In the variant of this algorithm including implicit conversions, the interpretation of an expression as type $T$ is ambiguous only if there is more than one \emph{minimal-cost} interpretation of the expression as type $T$, as cheaper expressions are always chosen in preference to more expensive ones.
     351%
     352%Given this description of the behaviour of $gen\_calls$, its implementation is quite straightforward: for each function declaration $f_i$ matching the name of the function, consider each of the parameter types $p_j$ of $f_i$, attempting to match the type of an element of $S_j$ to $p_j$ (this may include checking of implicit conversions).
     353%If no such element can be found, there is no valid interpretation of the expression using $f_i$, while if more than one such (minimal-cost) element is found than an ambiguous interpretation with the result type of $f_i$ is produced.
     354%In the \CFA variant, which includes polymorphic functions, it is possible that a single polymorphic function definition $f_i$ can produce multiple valid interpretations by different choices of type variable bindings; these interpretations are unambiguous so long as the return type of $f_i$ is different for each type binding.
     355%If all the parameters $p_j$ of $f_i$ can be uniquely matched to a candidate interpretation, then a valid interpretation based on $f_i$ and those $p_j$ is produced.
     356%$gen\_calls$ collects the produced interpretations for each $f_i$ and returns them; a top level expression is invalid if this list is empty, ambiguous if there is more than one (minimal-cost) result, or if this single result is ambiguous, and valid otherwise.
     357%
     358%In this implementation, resolution of a single top-level expression takes time $O(\ldots)$, where \ldots. \textbf{TODO:} \textit{Look at 2.3.1 in Richard's thesis when working out complexity; I think he does get the Baker algorithm wrong on combinations though, maybe\ldots}
     359%
     360%\textbf{TODO: Basic Lit Review} \textit{Look at 2.4 in Richard's thesis for any possible more-recent citations of Baker\ldots} \textit{Look back at Baker's related work for other papers that look similar to what you're doing, then check their citations as well\ldots} \textit{Look at Richard's citations in 2.3.2 w.r.t. type data structures\ldots}
     361%\textit{CormackWright90 seems to describe a solution for the same problem, mostly focused on how to find the implicit parameters}
     362
     363\section{Proposal}
     364Baker\cite{Baker82} discussed various expression resolution algorithms that could handle name overloading, but left experimental comparison of those algorithms to future work; Bilson\cite{Bilson03} described one extension of Baker's algorithm to handle implicit conversions, but did not fully explore the space of algorithmic approaches to handle both overloaded names and implicit conversions.
     365This project is intended to experimentally test a number of expression resolution algorithms which are powerful enough to handle the \CFA type system, including both name overloading and implicit conversions.
     366This comparison will close Baker's open research question, as well as potentially improving on Bilson's \CFA compiler.
     367
     368Rather than testing all of these algorithms in-place in the \CFA compiler, a resolver prototype will be developed which acts on a simplified input language encapsulating the essential details of the \CFA type system\footnote{Note that this simplified input language is not required to be a usable programming language.}.
     369Multiple variants of this resolver prototype will be implemented, each encapsulating a different expression resolution variant, sharing as much code as feasible.
     370These variants will be instrumented to test runtime performance, and run on a variety of input files; the input files may be generated programmatically or from exisiting code in \CFA or similar languages.
     371These experimental results will allow the research team to determine the algorithm likely to be most performant in practical use, and replace {CFA-CC}'s existing expression resolver with that code.
     372The experimental results will also provide some empirical sense of the compile-time cost of various language features by comparing the results of the most performant resolver variant that supports the feature with the most performant resolver variant that doesn't, a useful capability to guide language design.
     373
     374This proposed project should provide valuable data on how to implement a performant compiler for modern programming languages such as \CFA with powerful static type systems, specifically targeting the feature interaction between name overloading and implicit conversions.
     375
     376\appendix
    249377\section{Completion Timeline}
    250378The following is a preliminary estimate of the time necessary to complete the major components of this research project:
     
    252380\begin{tabular}{ | r @{--} l | p{4in} | }
    253381\hline       May 2015 & April 2016   & Project familiarization and generic types design \& implementation. \\
    254 \hline       May 2016 & April 2017   & Design \& implement prototype resolver and run performance experiments. \\
     382\hline       May 2016 & April 2017   & Design \& implement resolver prototype and run performance experiments. \\
    255383\hline       May 2017 & August 2017  & Integrate new language features and best-performing resolver prototype into {CFA-CC}. \\
    256384\hline September 2017 & January 2018 & Thesis writing \& defense. \\
     
    259387\end{center}
    260388
    261 \section{Conclusion}
    262 
    263 \newpage
    264 
     389\addcontentsline{toc}{section}{\refname}
    265390\bibliographystyle{plain}
    266391\bibliography{cfa}
    267392
    268 
    269 \addcontentsline{toc}{section}{\indexname} % add index name to table of contents
    270 \begin{theindex}
    271 Italic page numbers give the location of the main entry for the referenced term.
    272 Plain page numbers denote uses of the indexed term.
    273 Entries for grammar non-terminals are italicized.
    274 A typewriter font is used for grammar terminals and program identifiers.
    275 \indexspace
    276 \input{comp_II.ind}
    277 \end{theindex}
     393%\addcontentsline{toc}{section}{\indexname} % add index name to table of contents
     394%\begin{theindex}
     395%Italic page numbers give the location of the main entry for the referenced term.
     396%Plain page numbers denote uses of the indexed term.
     397%Entries for grammar non-terminals are italicized.
     398%A typewriter font is used for grammar terminals and program identifiers.
     399%\indexspace
     400%\input{comp_II.ind}
     401%\end{theindex}
    278402
    279403\end{document}
  • doc/bibliography/cfa.bib

    re4957e7 rbecba789  
    16231623    year        = 1974,
    16241624}
     1625
     1626@unpublished{Ditchfield:conversions,
     1627        contributer = {a3moss@uwaterloo.ca},
     1628        author = {Glen Ditchfield},
     1629        title = {Conversions for {Cforall}},
     1630        note = {\href{http://plg.uwaterloo.ca/~cforall/Conversions/index.html}{http://\-plg.uwaterloo.ca/\-\textasciitilde cforall/\-Conversions/\-index.html}},
     1631        month = {Nov},
     1632        year = {2002},
     1633        urldate = {28 July 2016},
     1634}
     1635
    16251636
    16261637@techreport{Dijkstra65,
  • doc/user/user.tex

    re4957e7 rbecba789  
    1111%% Created On       : Wed Apr  6 14:53:29 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Wed Jul 13 08:14:39 2016
    14 %% Update Count     : 1247
     13%% Last Modified On : Mon Aug  1 09:11:24 2016
     14%% Update Count     : 1271
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1616
     
    211211however, it largely extended the language, and did not address many existing problems.\footnote{%
    212212Two important existing problems addressed were changing the type of character literals from ©int© to ©char© and enumerator from ©int© to the type of its enumerators.}
    213 \Index*{Fortran}~\cite{Fortran08}, \Index*{Ada}~\cite{Ada12}, and \Index*{Cobol}~\cite{Cobol14} are examples of programming languages that took an evolutionary approach, where modern language features (e.g., objects, concurrency) are added and problems fixed within the framework of the existing language.
     213\Index*{Fortran}~\cite{Fortran08}, \Index*{Ada}~\cite{Ada12}, and \Index*{Cobol}~\cite{Cobol14} are examples of programming languages that took an evolutionary approach, where modern language features (\eg objects, concurrency) are added and problems fixed within the framework of the existing language.
    214214\Index*{Java}~\cite{Java8}, \Index*{Go}~\cite{Go}, \Index*{Rust}~\cite{Rust} and \Index*{D}~\cite{D} are examples of the revolutionary approach for modernizing C/\CC, resulting in a new language rather than an extension of the descendent.
    215215These languages have different syntax and semantics from C, and do not interoperate directly with C, largely because of garbage collection.
     
    265265\section[Compiling CFA Program]{Compiling \CFA Program}
    266266
    267 The command ©cfa© is used to compile \CFA program(s), and is based on the GNU \Indexc{gcc} command, e.g.:
     267The command ©cfa© is used to compile \CFA program(s), and is based on the GNU \Indexc{gcc} command, \eg:
    268268\begin{lstlisting}
    269269cfa§\indexc{cfa}\index{compilation!cfa@©cfa©}§ [ gcc-options ] C/§\CFA§-files [ assembler/loader-files ]
     
    350350\section{Underscores in Constants}
    351351
    352 Numeric constants are extended to allow \Index{underscore}s within constants\index{constant!underscore}, e.g.:
     352Numeric constants are extended to allow \Index{underscore}s within constants\index{constant!underscore}, \eg:
    353353\begin{lstlisting}
    3543542®_®147®_®483®_®648;                    §\C{// decimal constant}§
     
    366366\begin{enumerate}
    367367\item
    368 A sequence of underscores is disallowed, e.g., ©12__34© is invalid.
     368A sequence of underscores is disallowed, \eg ©12__34© is invalid.
    369369\item
    370370Underscores may only appear within a sequence of digits (regardless of the digit radix).
    371 In other words, an underscore cannot start or end a sequence of digits, e.g., ©_1©, ©1_© and ©_1_© are invalid (actually, the 1st and 3rd examples are identifier names).
     371In other words, an underscore cannot start or end a sequence of digits, \eg ©_1©, ©1_© and ©_1_© are invalid (actually, the 1st and 3rd examples are identifier names).
    372372\item
    373373A numeric prefix may end with an underscore;
     
    498498\end{quote2}
    499499
    500 All type qualifiers, e.g., ©const©, ©volatile©, etc., are used in the normal way with the new declarations and also appear left to right, e.g.:
     500All type qualifiers, \eg ©const©, ©volatile©, etc., are used in the normal way with the new declarations and also appear left to right, \eg:
    501501\begin{quote2}
    502502\begin{tabular}{@{}l@{\hspace{1em}}l@{\hspace{1em}}l@{}}
     
    518518\end{tabular}
    519519\end{quote2}
    520 All declaration qualifiers, e.g., ©extern©, ©static©, etc., are used in the normal way with the new declarations but can only appear at the start of a \CFA routine declaration,\footnote{\label{StorageClassSpecifier}
    521 The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature.~\cite[\S~6.11.5(1)]{C11}} e.g.:
     520All declaration qualifiers, \eg ©extern©, ©static©, etc., are used in the normal way with the new declarations but can only appear at the start of a \CFA routine declaration,\footnote{\label{StorageClassSpecifier}
     521The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature.~\cite[\S~6.11.5(1)]{C11}} \eg:
    522522\begin{quote2}
    523523\begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{2em}}l@{}}
     
    542542Unsupported are K\&R C declarations where the base type defaults to ©int©, if no type is specified,\footnote{
    543543At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each structure declaration and type name~\cite[\S~6.7.2(2)]{C11}}
    544 e.g.:
     544\eg:
    545545\begin{lstlisting}
    546546x;                                                              §\C{// int x}§
     
    612612A \Index{pointer}/\Index{reference} is a generalization of a variable name, i.e., a mutable address that can point to more than one memory location during its lifetime.
    613613(Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime and may not occupy storage as the literal is embedded directly into instructions.)
    614 Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, e.g.:
     614Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, \eg:
    615615\begin{quote2}
    616616\begin{tabular}{@{}ll@{}}
     
    669669Except for auto-dereferencing by the compiler, this reference example is the same as the previous pointer example.
    670670Hence, a reference behaves like the variable name for the current variable it is pointing-to.
    671 The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, e.g.:
     671The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, \eg:
    672672\begin{lstlisting}
    673673r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15);
     
    677677®*®r2 = ((®*®r1 + ®*®r2) ®*® (®**®r3 - ®*®r1)) / (®**®r3 - 15);
    678678\end{lstlisting}
    679 When a reference operation appears beside a dereference operation, e.g., ©&*©, they cancel out.\footnote{
     679When a reference operation appears beside a dereference operation, \eg ©&*©, they cancel out.\footnote{
    680680The unary ©&© operator yields the address of its operand.
    681681If the operand has type ``type'', the result has type ``pointer to type''.
     
    721721®&®crc = &cx;                                   §\C{// error, cannot change crc}§
    722722\end{lstlisting}
    723 Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, e.g.:
     723Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, \eg:
    724724\begin{lstlisting}
    725725int & const r = *0;                             §\C{// where 0 is the int * zero}§
    726726\end{lstlisting}
    727727Otherwise, the compiler is managing the addresses for type ©& const© not the programmer, and by a programming discipline of only using references with references, address errors can be prevented.
     728Finally, the position of the ©const© qualifier \emph{after} the pointer/reference qualifier causes confuse for C programmers.
     729The ©const© qualifier cannot be moved before the pointer/reference qualifier for C style-declarations;
     730\CFA-style declarations attempt to address this issue:
     731\begin{quote2}
     732\begin{tabular}{@{}l@{\hspace{3em}}l@{}}
     733\multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}}        & \multicolumn{1}{c}{\textbf{C}}        \\
     734\begin{lstlisting}
     735®const® * ®const® * const int ccp;
     736®const® & ®const® & const int ccr;
     737\end{lstlisting}
     738&
     739\begin{lstlisting}
     740const int * ®const® * ®const® ccp;
     741
     742\end{lstlisting}
     743\end{tabular}
     744\end{quote2}
     745where the \CFA declaration is read left-to-right (see \VRef{s:Declarations}).
    728746
    729747\Index{Initialization} is different than \Index{assignment} because initialization occurs on the empty (uninitialized) storage on an object, while assignment occurs on possibly initialized storage of an object.
     
    785803\section{Type Operators}
    786804
    787 The new declaration syntax can be used in other contexts where types are required, e.g., casts and the pseudo-routine ©sizeof©:
     805The new declaration syntax can be used in other contexts where types are required, \eg casts and the pseudo-routine ©sizeof©:
    788806\begin{quote2}
    789807\begin{tabular}{@{}l@{\hspace{3em}}l@{}}
     
    805823
    806824\CFA also supports a new syntax for routine definition, as well as ISO C and K\&R routine syntax.
    807 The point of the new syntax is to allow returning multiple values from a routine~\cite{Galletly96,CLU}, e.g.:
     825The point of the new syntax is to allow returning multiple values from a routine~\cite{Galletly96,CLU}, \eg:
    808826\begin{lstlisting}
    809827®[ int o1, int o2, char o3 ]® f( int i1, char i2, char i3 ) {
     
    817835\Index*{Michael Tiemann}, with help from \Index*{Doug Lea}, provided named return values in g++, circa 1989.}
    818836The value of each local return variable is automatically returned at routine termination.
    819 Declaration qualifiers can only appear at the start of a routine definition, e.g.:
     837Declaration qualifiers can only appear at the start of a routine definition, \eg:
    820838\begin{lstlisting}
    821839®extern® [ int x ] g( int y ) {§\,§}
     
    849867The inability to use \CFA declarations in these two contexts is probably a blessing because it precludes programmers from arbitrarily switching between declarations forms within a declaration contexts.
    850868
    851 C-style declarations can be used to declare parameters for \CFA style routine definitions, e.g.:
     869C-style declarations can be used to declare parameters for \CFA style routine definitions, \eg:
    852870\begin{lstlisting}
    853871[ int ] f( * int, int * );              §\C{// returns an integer, accepts 2 pointers to integers}§
     
    898916
    899917The syntax of the new routine prototype declaration follows directly from the new routine definition syntax;
    900 as well, parameter names are optional, e.g.:
     918as well, parameter names are optional, \eg:
    901919\begin{lstlisting}
    902920[ int x ] f ();                                 §\C{// returning int with no parameters}§
     
    906924\end{lstlisting}
    907925This syntax allows a prototype declaration to be created by cutting and pasting source text from the routine definition header (or vice versa).
    908 It is possible to declare multiple routine-prototypes in a single declaration, but the entire type specification is distributed across \emph{all} routine names in the declaration list (see~\VRef{s:Declarations}), e.g.:
     926It is possible to declare multiple routine-prototypes in a single declaration, but the entire type specification is distributed across \emph{all} routine names in the declaration list (see~\VRef{s:Declarations}), \eg:
    909927\begin{quote2}
    910928\begin{tabular}{@{}l@{\hspace{3em}}l@{}}
     
    919937\end{tabular}
    920938\end{quote2}
    921 Declaration qualifiers can only appear at the start of a \CFA routine declaration,\footref{StorageClassSpecifier} e.g.:
     939Declaration qualifiers can only appear at the start of a \CFA routine declaration,\footref{StorageClassSpecifier} \eg:
    922940\begin{lstlisting}
    923941extern [ int ] f (int);
     
    928946\section{Routine Pointers}
    929947
    930 The syntax for pointers to \CFA routines specifies the pointer name on the right, e.g.:
     948The syntax for pointers to \CFA routines specifies the pointer name on the right, \eg:
    931949\begin{lstlisting}
    932950* [ int x ] () fp;                      §\C{// pointer to routine returning int with no parameters}§
     
    10461064p( /* positional */, /* named */, . . . );
    10471065\end{lstlisting}
    1048 While it is possible to implement both approaches, the first possibly is more complex than the second, e.g.:
     1066While it is possible to implement both approaches, the first possibly is more complex than the second, \eg:
    10491067\begin{lstlisting}
    10501068p( int x, int y, int z, . . . );
     
    10561074In the second call, the named arguments separate the positional and ellipse arguments, making it trivial to read the call.
    10571075
    1058 The problem is exacerbated with default arguments, e.g.:
     1076The problem is exacerbated with default arguments, \eg:
    10591077\begin{lstlisting}
    10601078void p( int x, int y = 2, int z = 3. . . );
     
    12641282
    12651283As mentioned, tuples can appear in contexts requiring a list of value, such as an argument list of a routine call.
    1266 In unambiguous situations, the tuple brackets may be omitted, e.g., a tuple that appears as an argument may have its
     1284In unambiguous situations, the tuple brackets may be omitted, \eg a tuple that appears as an argument may have its
    12671285square brackets omitted for convenience; therefore, the following routine invocations are equivalent:
    12681286\begin{lstlisting}
     
    13031321
    13041322Type qualifiers, i.e., const and volatile, may modify a tuple type.
    1305 The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e., the qualifier is distributed across all of the types in the tuple, e.g.:
     1323The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e., the qualifier is distributed across all of the types in the tuple, \eg:
    13061324\begin{lstlisting}
    13071325const volatile [ int, float, const int ] x;
     
    13111329[ const volatile int, const volatile float, const volatile int ] x;
    13121330\end{lstlisting}
    1313 Declaration qualifiers can only appear at the start of a \CFA tuple declaration4, e.g.:
     1331Declaration qualifiers can only appear at the start of a \CFA tuple declaration4, \eg:
    13141332\begin{lstlisting}
    13151333extern [ int, int ] w1;
     
    13191337Unfortunately, C's syntax for subscripts precluded treating them as tuples.
    13201338The C subscript list has the form ©[i][j]...© and not ©[i, j, ...]©.
    1321 Therefore, there is no syntactic way for a routine returning multiple values to specify the different subscript values, e.g., ©f[g()]© always means a single subscript value because there is only one set of brackets.
     1339Therefore, there is no syntactic way for a routine returning multiple values to specify the different subscript values, \eg ©f[g()]© always means a single subscript value because there is only one set of brackets.
    13221340Fixing this requires a major change to C because the syntactic form ©M[i, j, k]© already has a particular meaning: ©i, j, k© is a comma expression.
    13231341\end{rationale}
     
    13801398Clearly, the types of the entities being assigned must be type compatible with the value of the expression.
    13811399
    1382 Mass assignment has parallel semantics, e.g., the statement:
     1400Mass assignment has parallel semantics, \eg the statement:
    13831401\begin{lstlisting}
    13841402[ x, y, z ] = 1.5;
     
    14691487\section{Unnamed Structure Fields}
    14701488
    1471 C requires each field of a structure to have a name, except for a bit field associated with a basic type, e.g.:
     1489C requires each field of a structure to have a name, except for a bit field associated with a basic type, \eg:
    14721490\begin{lstlisting}
    14731491struct {
    1474         int f1;                 // named field
    1475         int f2 : 4;             // named field with bit field size
    1476         int : 3;                // unnamed field for basic type with bit field size
    1477         int ;                   // disallowed, unnamed field
    1478         int *;                  // disallowed, unnamed field
    1479         int (*)(int);   // disallowed, unnamed field
     1492        int f1;                                 §\C{// named field}§
     1493        int f2 : 4;                             §\C{// named field with bit field size}§
     1494        int : 3;                                §\C{// unnamed field for basic type with bit field size}§
     1495        int ;                                   §\C{// disallowed, unnamed field}§
     1496        int *;                                  §\C{// disallowed, unnamed field}§
     1497        int (*)(int);                   §\C{// disallowed, unnamed field}§
    14801498};
    14811499\end{lstlisting}
    14821500This requirement is relaxed by making the field name optional for all field declarations; therefore, all the field declarations in the example are allowed.
    14831501As for unnamed bit fields, an unnamed field is used for padding a structure to a particular size.
    1484 A list of unnamed fields is also supported, e.g.:
     1502A list of unnamed fields is also supported, \eg:
    14851503\begin{lstlisting}
    14861504struct {
    1487         int , , ;               // 3 unnamed fields
     1505        int , , ;                               §\C{// 3 unnamed fields}§
    14881506}
    14891507\end{lstlisting}
     
    14981516§\emph{expr}§ -> [ §\emph{fieldlist}§ ]
    14991517\end{lstlisting}
    1500 \emph{expr} is any expression yielding a value of type record, e.g., ©struct©, ©union©.
     1518\emph{expr} is any expression yielding a value of type record, \eg ©struct©, ©union©.
    15011519Each element of \emph{ fieldlist} is an element of the record specified by \emph{expr}.
    15021520A record-field tuple may be used anywhere a tuple can be used. An example of the use of a record-field tuple is
     
    17601778}
    17611779\end{lstlisting}
    1762 While the declaration of the local variable ©y© is useful and its scope is across all ©case© clauses, the initialization for such a variable is defined to never be executed because control always transfers over it.
    1763 Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©.
    1764 As mentioned, transfer into control structures should be forbidden;
    1765 transfers from within the ©switch© body using a ©goto© are equally unpalatable.
    1766 As well, the declaration of ©z© is cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized.
     1780While the declaration of the local variable ©y© is useful with a scope across all ©case© clauses, the initialization for such a variable is defined to never be executed because control always transfers over it.
     1781Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©, both of which are problematic.
     1782As well, the declaration of ©z© cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized.
     1783The key observation is that the ©switch© statement branches into control structure, i.e., there are multiple entry points into its statement body.
    17671784\end{enumerate}
    17681785
     
    17781795and there is only a medium amount of fall-through from one ©case© clause to the next, and most of these result from a list of case values executing common code, rather than a sequence of case actions that compound.
    17791796\end{itemize}
    1780 These observations help to put the suggested changes to the ©switch© into perspective.
     1797These observations help to put the \CFA changes to the ©switch© into perspective.
    17811798\begin{enumerate}
    17821799\item
    17831800Eliminating default fall-through has the greatest potential for affecting existing code.
    1784 However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e., a list of ©case© clauses executing common code, e.g.:
    1785 \begin{lstlisting}
     1801However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e., a list of ©case© clauses executing common code, \eg:
     1802 \begin{lstlisting}
    17861803case 1:  case 2:  case 3: ...
    17871804\end{lstlisting}
    17881805still work.
    17891806Nevertheless, reversing the default action would have a non-trivial effect on case actions that compound, such as the above example of processing shell arguments.
    1790 Therefore, to preserve backwards compatibility, it is necessary to introduce a new kind of ©switch© statement, called ©choose©, with no implicit fall-through semantics and an explicit fall-through if the last statement of a case-clause ends with the new keyword ©fallthru©, e.g.:
     1807<<<<<<< HEAD
     1808Therefore, to preserve backwards compatibility, it is necessary to introduce a new kind of ©switch© statement, called ©choose©, with no implicit fall-through semantics and an explicit fall-through if the last statement of a case-clause ends with the new keyword ©fallthru©, \eg:
     1809=======
     1810Therefore, to preserve backwards compatibility, it is necessary to introduce a new kind of ©switch© statement, called ©choose©, with no implicit fall-through semantics and an explicit fall-through if the last statement of a case-clause ends with the new keyword ©fallthrough©/©fallthru©, e.g.:
     1811>>>>>>> 080615890f586cb9954c252b55cab47f52c25758
    17911812\begin{lstlisting}
    17921813®choose® ( i ) {
     
    18151836Therefore, no change is made for this issue.
    18161837\item
    1817 Dealing with unreachable code in a ©switch©/©choose© body is solved by restricting declarations and associated initialization to the start of statement body, which is executed \emph{before} the transfer to the appropriate ©case© clause.\footnote{
    1818 Essentially, these declarations are hoisted before the statement and both declarations and statement are surrounded by a compound statement.} and precluding statements before the first ©case© clause.
    1819 Further declaration in the statement body are disallowed.
     1838Dealing with unreachable code in a ©switch©/©choose© body is solved by restricting declarations and associated initialization to the start of statement body, which is executed \emph{before} the transfer to the appropriate ©case© clause\footnote{
     1839Essentially, these declarations are hoisted before the ©switch©/©choose© statement and both declarations and statement are surrounded by a compound statement.} and precluding statements before the first ©case© clause.
     1840Further declarations at the same nesting level as the statement body are disallowed to ensure every transfer into the body is sound.
    18201841\begin{lstlisting}
    18211842switch ( x ) {
    1822         ®int i = 0;®                            §\C{// allowed
     1843        ®int i = 0;®                            §\C{// allowed only at start
    18231844  case 0:
    18241845        ...
    1825         ®int i = 0;®                            §\C{// disallowed}§
     1846        ®int j = 0;®                            §\C{// disallowed}§
    18261847  case 1:
    18271848    {
    1828                 ®int i = 0;®                    §\C{// allowed in any compound statement
     1849                ®int k = 0;®                    §\C{// allowed at different nesting levels
    18291850                ...
    18301851        }
     
    27072728Like the \Index*[C++]{\CC} lexical problem with closing template-syntax, e.g, ©Foo<Bar<int®>>®©, this issue can be solved with a more powerful lexer/parser.
    27082729
    2709 There are several ambiguous cases with operator identifiers, e.g., ©int *?*?()©, where the string ©*?*?© can be lexed as ©*©/©?*?© or ©*?©/©*?©.
    2710 Since it is common practise to put a unary operator juxtaposed to an identifier, e.g., ©*i©, users will be annoyed if they cannot do this with respect to operator identifiers.
     2730There are several ambiguous cases with operator identifiers, \eg ©int *?*?()©, where the string ©*?*?© can be lexed as ©*©/©?*?© or ©*?©/©*?©.
     2731Since it is common practise to put a unary operator juxtaposed to an identifier, \eg ©*i©, users will be annoyed if they cannot do this with respect to operator identifiers.
    27112732Even with this special hack, there are 5 general cases that cannot be handled.
    27122733The first case is for the function-call identifier ©?()©:
     
    27732794This means that a function requiring mutual exclusion could block if the lock is already held by another thread.
    27742795Blocking on a monitor lock does not block the kernel thread, it simply blocks the user thread, which yields its kernel thread while waiting to obtain the lock.
    2775 If multiple mutex parameters are specified, they will be locked in parameter order (i.e. first parameter is locked first) and unlocked in the
     2796If multiple mutex parameters are specified, they will be locked in parameter order (\ie first parameter is locked first) and unlocked in the
    27762797reverse order.
    27772798\begin{lstlisting}
     
    43424363
    43434364
     4365\section{New Keywowrds}
     4366
     4367©catch©, ©catchResume©, ©choose©, \quad ©disable©, ©dtype©, \quad ©enable©, \quad ©fallthrough©, ©fallthru©, ©finally©, ©forall©, ©ftype©, \quad ©lvalue©, \quad ©otype©, \quad ©throw©, ©throwResume©, ©trait©, ©try©
     4368
     4369
    43444370\section{Incompatible}
    43454371
     
    44714497\CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC}.
    44724498Nested types are not hoisted and can be referenced using the field selection operator ``©.©'', unlike the \CC scope-resolution operator ``©::©''.
    4473 Given that nested types in C are equivalent to not using them, i.e., they are essentially useless, it is unlikely there are any realistic usages that break because of this incompatibility.
     4499Given that nested types in C are equivalent to not using them, \ie they are essentially useless, it is unlikely there are any realistic usages that break because of this incompatibility.
    44744500\end{description}
    44754501
     
    51625188\label{s:RationalNumbers}
    51635189
    5164 Rational numbers are numbers written as a ratio, i.e., as a fraction, where the numerator (top number) and the denominator (bottom number) are whole numbers.
     5190Rational numbers are numbers written as a ratio, \ie as a fraction, where the numerator (top number) and the denominator (bottom number) are whole numbers.
    51655191When creating and computing with rational numbers, results are constantly reduced to keep the numerator and denominator as small as possible.
    51665192
Note: See TracChangeset for help on using the changeset viewer.