Cforall Going Forward
=====================
When I was catching up with a friend once, they asked me, roughly: "Is Cforall
a real language that should be taken seriously?" At the time I had to answer:
"Not yet." So how do we make that answer yes?

This is my attempt to answer that question, balancing a full answer against
getting into the weeds on every little issue. It takes into account
various general language principles along with the particular goals of
Cforall.

The main goal of Cforall is to be a modern evolution of C.
"Evolution" means improvements to C that do not change the fundamentals (the
paradigm) of the language. There are some quality of life features and also
changes to help provide safety, but not things like adding OO programming.

Also the phrase "Describe, not Prescribe" shows up around the language. It
has less concrete uses, but generally means to keep things flexible and to
avoid disruption and artificial constraints.

And with that set-up, we are going to go by problem area:

Back-end And Code Generation
----------------------------
I don't have anything new to say here, but we have known for a long time
that our output format could use some work.

The long term dream is to rework the code generation from GCC code to LLVM
so we can control the entire lowering process. This would be a massive
undertaking and is out-of-scope in the immediate future, but still may be
necessary in the long run.

Optimizations (Run Time Too)
----------------------------
I think we need at least one more order of magnitude speed up on compilation.
There are also some optimizations that could speed up run-time.

My "favourite" example is that "#include <iostream.hfa>" increases the
compilation time by about 4 seconds, without actually using anything from it.
In a production language, if including the standard I/O library increased
the compilation time by a half second, that would likely be an awkwardly
long delay already. That means we should be targeting another 10x speed-up
at least. We do not believe we can get that from just quality of
implementation matters.

I don't know where to find those improvements, although "closed traits"
(moving away from call-site binding) might help a lot.

Run-time could also use some optimizations. There are fewer pain points here,
and the concurrency tools in particular are actually blazingly fast.
But I believe there are a lot of low level optimizations that we should look
at; if we are replacing C, we need to be fast in every case we can.

For example, how expensive are all those assertions at run time? Could we
use specializations (rewriting a polymorphic function to be a monomorphic
body to apply more optimizations) to make large polymorphic functions run
faster?

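To make the specialization question concrete, here is a minimal sketch in
plain C rather than Cforall. It assumes a polymorphic function is lowered to
a single body that receives the element size and its assertions as function
pointers. The names `sum_poly`, `add_int` and `sum_int` are hypothetical and
are not taken from the Cforall compiler.

```c
#include <stddef.h>

/* Stand-in for an assertion parameter: the operation is only known at run
   time, through a function pointer (an assumption about the lowering). */
typedef void (*add_fn)(void *acc, const void *elem);

/* Polymorphic shape: one body shared by every element type. */
void sum_poly(void *acc, const void *arr, size_t n, size_t size, add_fn add) {
    const char *p = arr;
    for (size_t i = 0; i < n; i++) {
        add(acc, p + i * size);  /* indirect call for each element */
    }
}

/* The operation a call site would supply when the element type is int. */
void add_int(void *acc, const void *elem) {
    *(int *)acc += *(const int *)elem;
}

/* Specialized (monomorphic) shape: the type and operation are fixed, so the
   compiler is free to inline the addition and vectorize the loop. */
int sum_int(const int *arr, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += arr[i];
    }
    return acc;
}
```

Calling `sum_poly(&total, data, 4, sizeof(int), add_int)` and
`sum_int(data, 4)` on the same array should agree. What is not known is how
much the indirect calls cost in practice.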

Advantages of C
---------------
Although it is dated, C is a good programming language with many positive
features. We may have watered some of them down too much.

C's strength is in how close to assembly code it is. It maps very closely to
the underlying assembly instructions and memory allocations. Some of our
larger control structures definitely break away from this, but usually only
locally. Destructors and exceptions, or the underlying stack unwinding code,
spill out more, but that is acceptable in most contexts (see C++).

The serious issue is memory allocations in headers, particularly the
`__attribute__(( cfa_linkonce ))` implicitly generated by some other
language features. The less serious but definite issue is the allocation
control and optimization this takes away from the programmer. The possibly
critical issue that I haven't proven is that linkonce seems to leave multiple
copies behind and just changes all uses to a single copy. By release it has
to work on system headers and different platforms.

Relatedly, compatibility with C is still important. Reducing existing
incompatibilities, and new ones that might be discovered later, is good.
Right now, the only outstanding issue is support for C23 attributes.
(As of writing, the fix is there, but has caused the test build to fail.)

Disadvantages of C
------------------
There are some fundamental issues with C that we have not addressed.

The most notable is `#include`, or the handling of modules. Even C++ has
tried to move away from it. That has been unsuccessful, but it shows why
there is a need/want for this kind of feature. A module system is a more
powerful tool for separate compilation and name management than a simple
copy-pasted include system, it can be used to fix some header/implementation
issues and it means less recompilation of header code. That last one could
really help with some compilation time issues, as headers are often the vast
majority of the compilation time of small files.

Visibility could also be improved. C linkage has two settings for global
declarations, internal linkage (static) and external linkage (non-static).
This is not enough granularity. We use `#pragma GCC visibility` to make
names visible outside the translation unit, but only within the given
library, an extremely useful option that is not available without extensions.

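The three levels of visibility discussed here can be sketched with the GCC
extension. This is an illustration only, assuming GCC or Clang; the function
names are hypothetical.

```c
/* Internal linkage: visible only inside this translation unit. */
static int helper(void) {
    return 1;
}

/* External linkage, hidden visibility: callable from other translation
   units of the same library, but not exported from the shared object. */
#pragma GCC visibility push(hidden)
int lib_internal(void) {
    return helper() + 1;
}
#pragma GCC visibility pop

/* External linkage, default visibility (unless overridden on the command
   line): part of the library's public interface. */
int lib_public(void) {
    return lib_internal() + 1;
}
```

Standard C can only express the first and last of these. The middle level
exists only through the pragma or the matching attribute.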

In fact another general rule would be to remove the need for extensions,
or at least standardize them, including `#pragma` directives.

Error Messages
--------------
We need to improve error messages. Explaining what is wrong when something
goes wrong is critical for the user experience, and ours are not up to that.

The most general issue is that code samples are impossible to read. Honestly,
just switching to code-dump format might help, but the best output may be
referring back to the original text with a highlight.

Then the formatting of individual error messages should also be reviewed.
Consider resolution errors that print the resolution cost labelled just
as "Cost" with no clarification about what all the elements are or mean.
I don't know the best solution here, perhaps it needs to be fully labelled,
perhaps it should be dropped as noise, perhaps there is a single element of
the tuple that should be highlighted.

Every error message could probably do with some improvement. This may also
be accompanied by improvements to the tools in the compiler to build and
format error messages.

Feature Integration
-------------------
There are many features that need to be reevaluated in combination with each
other. Fewer, more flexible features mean less to learn and fit in with the
larger "Describe, not Prescribe" design philosophy. And on the language
development side, it can mean less to maintain and document.

Take virtuals and virtual destructors as an example. These
are two completely unrelated features despite their similar names and very
related purposes. They probably have some common implementation they could
share, if they cannot be combined into one general mechanism.

There are also single features that could be generalized. For example,
enum-indexed arrays are a generalization of typed arrays / arrays with
payloads. At the very least, enum-indexing should be implemented and typed
arrays implemented in terms of enum-indexing.

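For readers unfamiliar with the idea, plain C can approximate an
enum-indexed array with designated initializers. This is a sketch of the
concept, not of the Cforall feature; `Colour`, `colour_name`, `colour_rgb`
and `colour_is` are hypothetical names.

```c
#include <string.h>

enum Colour { RED, GREEN, BLUE, COLOUR_COUNT };

/* Each table is indexed by the enumerators, so every enumerator carries a
   payload without a separate lookup function. */
static const char *const colour_name[COLOUR_COUNT] = {
    [RED] = "red",
    [GREEN] = "green",
    [BLUE] = "blue",
};

static const unsigned colour_rgb[COLOUR_COUNT] = {
    [RED] = 0xFF0000u,
    [GREEN] = 0x00FF00u,
    [BLUE] = 0x0000FFu,
};

/* Returns 1 if the enumerator's name payload matches `name`, else 0. */
int colour_is(enum Colour c, const char *name) {
    return strcmp(colour_name[c], name) == 0;
}
```

C checks neither that every enumerator has an entry nor that an index is in
range, which is part of what a language-level feature could add.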

The extreme case is "fallthrough" vs. "fallthru". This has already been
handled but makes for a great example because they are interchangeable and
having both added nothing except room for confusion.

Object Orientated Inspiration
-----------------------------
Now inspiration is good, but there are features that feel like they
were included because we like them in C++ without really considering how
well they would fit into Cforall.

The poster child for this has to be my very own exception matching.
Not the throw/catch itself, but the object matching via the virtual system,
which uses an OO type hierarchy despite the fact there isn't one in the
language for anything else.
It adds a lot of things to the language for a single user facing feature.
And those bits are not all hidden: some of the extra pieces needed for the
mock methods are painfully obvious to the user.
I think the entire exception system should be reconsidered without the
assumptions of OO programming.

There are also small examples, like name qualification. Name qualification
is the same as writing longer names unless there are contexts where you can
use the name unqualified. As of writing, CFA has almost no situations where
you can use the unqualified name.

Experienced Revisits
--------------------
Now most of these sections are about some feature that should be added,
changed or removed. This last group has no real root cause, but is just
based on additional experience with Cforall.

I think the tuple redesign is a good example. Even the small fix around
unary tuples fixed some long standing conflicts with designators. And that
was only one of the syntax changes. For semantics, experience has given us
more information about what features are used and how they are used in
practice. Updating the feature to take advantage of that is great.

In the extreme case some features could just be cut because of disuse.
Right now the most likely candidate is the alternate type syntax,
but that is not a given.

Functional Programming
----------------------
You are going to need someone else to try and explain functional programming.
That person could be you, dear reader; anyone can be a functional programmer!

The people with functional programming experience are leaving soon, so the
team is going to have to find some other way to try and research those
comparisons.