Cforall Going Forward
=====================

When I was catching up with a friend once, they asked me, roughly: "Is Cforall a real language that should be taken seriously?" At the time I had to answer: "Not yet." So how do we make that answer yes?

This is my attempt to answer that question, balancing a full answer against getting into the weeds on every little issue, and taking into account various general language principles along with the particular goals of Cforall.

The main goal of Cforall is to be a modern evolution of C. "Evolution" means improvements to C that do not change the fundamentals (the paradigm) of the language. There are some quality of life features and also changes to help provide safety, but not things like adding OO programming.

Also the phrase "Describe, not Prescribe" shows up around the language. It has less concrete uses, but generally means to keep things flexible and to avoid disruption and artificial constraints.

And with that set-up, we are going to go by problem area:

Back-end And Code Generation
----------------------------

I don't have anything new to say here. But we have known for a long time that our output format could use some work.

The long term dream is to rework the code generation from GCC code to LLVM so we can control the entire lowering process. This would be a massive undertaking and is out-of-scope in the immediate future, but still may be necessary in the long run.

Optimizations (Run Time Too)
----------------------------

I think we need at least one more order of magnitude speed up on compilation. There are also some optimizations that could speed up run-time.

My "favourite" example is that an "#include" of the standard I/O header increases the compilation time by about 4 seconds, without actually using anything from it. In a production language, if including the standard I/O library increased the compilation time by a half second, that would likely be an awkwardly long delay already.
That means we should be targeting another 10x speed-up at least. We do not believe we can get that from quality-of-implementation matters alone. I don't know where to find those improvements, although "closed traits" (moving away from call-site binding) might help a lot.

Run-time could also use some optimizations. There are fewer pain points here, and the concurrency tools in particular are actually blazingly fast. But I believe there are a lot of low-level optimizations that we should look at; if we are replacing C we need to be fast in every case we can. For example, how expensive are all those assertions at run time? Could we use specializations (rewriting a polymorphic function to be a monomorphic body to apply more optimizations) to make large polymorphic functions run faster?

Advantages of C
---------------

Although it is dated, C is a good programming language with many positive features. We may have watered some of them down too much.

C's strength is in how close to assembly code it is. It maps very closely to the underlying assembly instructions and memory allocations. Some of our larger control structures definitely break away from this, but usually only locally. Destructors and exceptions, or the underlying stack unwinding code, spill out more, but are acceptable in most contexts (see C++).

The serious issue is memory allocations in headers, particularly the `__attribute__(( cfa_linkonce ))` implicitly generated by some other language features. The less serious but definite issue is the allocation control and optimization this takes away from the programmer. The possibly critical issue, which I haven't proven, is that linkonce seems to leave multiple copies behind and just changes all uses to refer to a single copy. By release it has to work on system headers and different platforms.

Relatedly, compatibility with C is still important. Reducing existing incompatibilities, and any new ones that might be discovered later, is good.
Right now, the only outstanding issue is support for C23 attributes. (As of writing, the fix is there, but it has caused the test build to fail.)

Disadvantages of C
------------------

There are some fundamental issues with C that we have not addressed.

The most notable is `#include`, or the handling of modules. Even C++ has tried to move away from it. That has been unsuccessful, but it shows that there is a need/want for this kind of feature. A module system is a more powerful tool for separate compilation and name management than a simple copy-paste include system, it can be used to fix some header/implementation issues, and it means less recompilation of header code. That last one could really help with some compilation time issues, as headers are often the vast majority of the compilation time of small files.

Visibility could also be improved. C linkage has two settings for global declarations: internal linkage (static) and external linkage (non-static). This is not enough granularity, and we use `#pragma GCC visibility` to make names visible outside the translation unit but only within the given library, an extremely useful option that is not available without extensions.

In fact another general rule would be to remove the need for non-C extensions, or at least standardize them, including `#pragma` directives.

Error Messages
--------------

We need to improve error messages. Explaining what is wrong when something goes wrong is critical for the user experience, and ours are not up to that.

The most general issue is that code samples are impossible to read. Honestly, just switching to code-dump format might help, but the best output may be referring back to the original text with a highlight.

Then the formatting of individual error messages should also be reviewed. Consider resolution errors that print the resolution cost labelled just as "Cost", with no clarification about what all the elements are or mean.
I don't know the best solution here: perhaps it needs to be fully labelled, perhaps it should be dropped as noise, perhaps there is a single element of the tuple that should be highlighted.

Every error message could probably do with some improvement. This may also be accompanied by improvements to the tools in the compiler to build and format error messages.

Feature Integration
-------------------

There are many features that need to be reevaluated in combination with each other. Fewer, more flexible features mean less to learn and fit in with the larger "Describe, not Prescribe" design philosophy. And on the language development side, it can mean less to maintain and document.

Take virtuals and virtual destructors as an example. These are two completely unrelated features despite their similar names and very related purposes. They probably have some common implementation they could share, if they cannot be combined into one general mechanism.

There are also single features that could be generalized. For example, enum-indexed arrays are a generalization of typed arrays / arrays with payloads. At the very least, enum-indexing should be implemented and typed arrays implemented in terms of enum-indexing.

The extreme case is "fallthrough" vs. "fallthru". This has already been handled, but it makes for a great example because they are interchangeable and having both added nothing except room for confusion.

Object Orientated Inspiration
-----------------------------

Now inspiration is good, but there are features that feel like they were included because we like them in C++ without really considering how well they would fit into Cforall.

The poster child for this has to be my very own exception matching. Not the throw/catch itself, but the object matching via the virtual system, which uses an OO type hierarchy despite the fact there isn't one in the language for anything else. It adds a lot of things to the language for a single user-facing feature.
And those bits are not all hidden: some of the extra pieces needed for the mock methods are painfully obvious to the user. I think the entire exception system should be reconsidered without the assumptions of OO programming.

There are also small examples, like name qualification. Name qualification is the same as writing longer names unless there are contexts where you can use the name unqualified. As of writing, CFA has almost no situations where you can use the unqualified name.

Experienced Revisits
--------------------

Now most of these sections are about some feature that should be added, changed or removed. This last group has no real root cause, but is just based on additional experience with Cforall.

I think the tuple redesign is a good example. Even the small fix around unary tuples fixed some long-standing conflicts with designators. And that was only one of the syntax changes.

For semantics, experience has given us more information about what features are used and how they are used in practice. Updating features to take advantage of that is great.

In the extreme case some features could just be cut because of disuse. Right now the most likely candidate is the alternate type syntax, but that is not a given.

Functional Programming
----------------------

You are going to need someone else to try and explain functional programming. That person could be you, dear reader; anyone can be a functional programmer! The people with functional programming experience are leaving soon, so the team is going to have to find some other way to try and research those comparisons.