Changeset 710623a


Timestamp:
Feb 11, 2026, 11:03:24 AM
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Parents:
3151bc09
Message:

first proofread of module proposal

File:
1 edited

  • doc/proposals/modules-alvin/1_stitched_modules/stitched_modules.md

    r3151bc09 r710623a  
    33## Background context
    44
    5 C doesn't have modules -- instead, C relies on the programmer (+ preprocessor) to insert the references to any symbols that are defined in other files. This is because the C compiler only processes one translation unit from top to bottom, so you have to give it everything, and in the correct order.
    6 
    7 So our C module system simply analyzes other modules and extracts any imported information to give to the compiler, right? Yes, but it's a bit tricky because C is a systems programming language, which means we care a lot about how our code is compiled. An object-oriented approach hides a class' implementation behind a pointer and a vtable (also a pointer), so importers can use a class without even knowing its size. This doesn't work for C because it has unboxed types, so we need to expose information to the compiler.
     5<span style="color:red">PAB</span> Modules are a software engineering mechanism providing information control (hiding), separate compilation, and initialization.
     6
     7C doesn't have modules
     8
     9<span style="color:red">PAB</span> C provides a complex form of module through forward declarations and definitions, `#include`, and translation units with `extern` and `static` visibility control.
     10
     11-- instead, C relies on the programmer (+ preprocessor) to insert the references to any symbols that are defined in other files. This is because the C compiler only processes one translation unit from top to bottom, so you have to give it everything, and in the correct order.
     12
     13<span style="color:red">PAB</span> Like many programming languages, C requires definition before use (DBU).
     14A consequence of DBU is that cyclic dependences among types and routines cannot be written directly.
     15
     16```
     17struct Y {                              void f() {
     18        struct Z { ... };                       if ( ! base-case ) g();  // DBU
     19        struct X x;  // DBU                     ...
     20};                                      }
     21struct X {                              void g() {
     22        struct Y.Z z;                           if ( ! base-case ) f();
     23};                                              ...
     24                                        }
     25```
     26For recursive types, the issue is knowing each type's size.
     27Breaking the cycle requires a forward declaration of one type name and replacing that type in the cycle with a pointer/reference to it.
     28For recursive routines, the issue is knowing each routine's code for inlining.
     29Breaking the cycle requires a forward declaration of one routine, which precludes inlining its calls until it is defined.
     30Finally, most (non-functional) languages do not support pure recursive data types.
     31
     32```
     33struct S {
     34        struct T t;  // DBU
     35        struct S s;  // recursive type
     36};
     37struct T {
     38        struct S s;  // recursive type
     39};
     40```
     41In theory, these types are infinitely large without some other semantic meaning.
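
As a minimal sketch in plain C (not Cforall), the two cycle-breaking techniques described above look roughly like this. The type names follow the earlier example in simplified form; `is_even` and `is_odd` are hypothetical routines chosen only to show mutual recursion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct X;                       // forward declaration: name known, size unknown

struct Y {
    struct X *x;                // a pointer has a known size, so Y is complete
    int tag;
};

struct X {
    struct Y y;                 // Y is complete here, so it can be embedded by value
};

bool is_odd(unsigned n);        // prototype breaks the routine cycle

bool is_even(unsigned n) {
    return n == 0 ? true : is_odd(n - 1);   // call precedes the definition of is_odd
}

bool is_odd(unsigned n) {
    return n == 0 ? false : is_even(n - 1);
}
```

Only one direction of the type cycle can be embedded by value; the other has to go through the pointer.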
     42
     43So our C module system simply analyzes other modules and extracts any imported information to give to the compiler, right? Yes, but it's a bit tricky because C is a systems programming language, which means we care a lot about how our code is compiled.
     44
     45<span style="color:red">PAB</span> A systems language allows direct access to the hardware and permits violations of the type system so that raw storage can be accessed.
     46Dealing with DBU is a separate problem.
     47
     48An object-oriented approach hides a class' implementation behind a pointer and a vtable (also a pointer), so importers can use a class without even knowing its size.
     49
     50<span style="color:red">PAB</span> The issue here is garbage collection (GC), not OO.
     51Languages *without* GC can place objects on the stack or heap, where objects on the stack are not hidden behind a pointer.
     52
     53This doesn't work for C because it has unboxed types, so we need to expose information to the compiler.
     54
     55<span style="color:red">PAB</span> In Cforall, types are (largely) treated uniformly, so basic types can be treated like object (`struct`) types.
     56The reason this is possible is that constructors/destructors are not restricted to members (we have no members).
     57So any type can be boxed or unboxed.
     58
     59```
     60void ?{}( int & this ) { this = 42; }  // constructor for basic type
     61int main() {
     62        int i, j, k;
     63        sout | i | j | k;
     64}
     6542 42 42
     66```
    867
    968### A previous attempt
    1069
    11 A previous attempt at C modules used "type stubs", meaning that each exported type would generate a "stub" type with only size and alignment information. Any function that returned the exported type would return the "stub" type instead, so importers wouldn't need to know any implementation details unless they imported the actual type (which would use type-punning to convert between the two). This approach unfortunately doesn't work in C because type-punning breaks strict aliasing rules, and the C spec allows small structs to be unpacked when used as arguments into functions. Additionally, extracting size and alignment information can require analyzing the entire codebase -- if we have to do all that to make just unboxed types work, perhaps there are better options.
     70A previous attempt at C modules used "type stubs", meaning that each exported type would generate a "stub" type with only size and alignment information.
     71
     72<span style="color:red">PAB</span> Your "stubs" are how Cforall handles polymorphic types.
     73
     74Any function that returned the exported type would return the "stub" type instead, so importers wouldn't need to know any implementation details unless they imported the actual type (which would use type-punning to convert between the two). This approach unfortunately doesn't work in C because type-punning breaks strict aliasing rules, and the C spec allows small structs to be unpacked when used as arguments into functions. Additionally, extracting size and alignment information can require analyzing the entire codebase -- if we have to do all that to make just unboxed types work, perhaps there are better options.
     75
     76<span style="color:red">PAB</span> You are trying to address an information hiding issue.
     77The entire type definition is needed to know its size so it must be unboxed.
     78Given the entire type definition, are fields private or public, providing strong abstraction?
     79Assuming strong abstraction, is it commercially inappropriate to show the implementation?
     80The PImpl pattern (`https://en.cppreference.com/w/cpp/language/pimpl.html`) addresses both of these concerns.
     81And more fundamentally, this approach changes the calling convention for legacy code.
     82Otherwise, the compiler must have privileged (sudo) access to source libraries that are not publicly accessible.
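
For reference, the C analogue of PImpl is the opaque-pointer idiom. The following is a minimal sketch, assuming a hypothetical `Counter` type (none of these names come from the proposal): the importer sees only an incomplete type and a set of routines, so neither the size nor the fields are exposed.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what an importer would see (normally placed in a header) --- */
typedef struct Counter Counter;          // incomplete type: size and fields hidden
Counter *counter_new(int start);
void counter_increment(Counter *c);
int counter_value(const Counter *c);
void counter_free(Counter *c);

/* --- what only the implementation sees --- */
struct Counter {
    int value;
};

Counter *counter_new(int start) {
    Counter *c = malloc(sizeof *c);      // importer cannot allocate: size unknown
    if (c != NULL) c->value = start;
    return c;
}

void counter_increment(Counter *c) { c->value += 1; }
int counter_value(const Counter *c) { return c->value; }
void counter_free(Counter *c) { free(c); }
```

The cost is that every object is boxed: it lives behind a pointer and is typically heap-allocated, which is the trade-off the surrounding discussion is concerned with.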
     83
    1284
    1385### Other languages
    1486
    15 Let's take a look at some other systems programming languages to see how they do it. C and C++ use header files, as described above. C++20 modules need to be compiled before they can be imported, which makes them acyclic. Rust compiles an entire crate all at once, so modules are essentially namespaces, and modules can import freely within a crate. Zig modules are implicitly structs, and are used by assigning the import to a name.
    16 
    17 C and C++ header files lead to a lot of manual module management, where the programmer has to ensure the .h and its corresponding .c file stay in sync. It should be possible to condense this workflow into a single file without any practical loss of functionality. C++20 modules are acyclic, forcing any mutually recursive structures into the same module -- this doesn't work with the granularity of .c/.h files, which frequently share declarations with each other. Rust modules rely on whole-crate compilation, which clashes with the C philosophy of separate compilation. Zig modules share many similarities with this prototype, and present some interesting avenues for further development. As such, we will discuss Zig modules after presenting the prototype.
     87Let's take a look at some other systems programming languages to see how they do it. C and C++ use header files, as described above. C++20 modules need to be compiled before they can be imported, which makes them acyclic.
     88
     89<span style="color:red">PAB</span> What do you mean by acyclic here?
     90
     91Rust compiles an entire crate all at once, so modules are essentially namespaces, and modules can import freely within a crate. Zig modules are implicitly structs, and are used by assigning the import to a name.
     92
     93<span style="color:red">PAB</span> Modules can correspond one-to-one to files. This is the approach used in Python, Ruby, JavaScript, and others. Modules can correspond one-to-one to directories. This is the approach that Go uses. Modules can correspond to a combination of files and directories as well; this is the case with Rust, Java, and others. `https://denisdefreyne.com/notes/zlc9l-nrkfw-wztwz`
     94
     95C and C++ header files lead to a lot of manual module management, where the programmer has to ensure the .h and its corresponding .c file stay in sync. It should be possible to condense this workflow into a single file without any practical loss of functionality.
     96
     97<span style="color:red">PAB</span> `.h` files cannot be removed. They are fundamental to C and its development ecosystem. What you are trying to do is auto-generate them.
     98
     99C++20 modules are acyclic, forcing any mutually recursive structures into the same module -- this doesn't work with the granularity of .c/.h files, which frequently share declarations with each other. Rust modules rely on whole-crate compilation, which clashes with the C philosophy of separate compilation. Zig modules share many similarities with this prototype, and present some interesting avenues for further development. As such, we will discuss Zig modules after presenting the prototype.
     100
     101<span style="color:red">PAB</span> The concern about recursive references is a DBU issue, which is orthogonal to modules, i.e., languages exist with DBU and modules.
     102Removing DBU requires a multi-pass compiler and often *whole-program* compilation to see the unboxed types.
    18103
    19104## The prototype
     
    63148```
    64149
     150<span style="color:red">PAB</span> Here is the same example broken down into DBU and recursive data-type explanations.
     151
     152
     153```
     154struct S1 {
     155        struct S2 s2;   // DBU as S2 in separate TU => whole-program compilation
     156};
     157struct S3 {
     158        struct S4 * s4p; // DBU, but recursive data type with S4 => must use pointer
     159};
     160struct S2 {
     161        struct S3 s3;   // DBU as S3 in separate TU => whole-program compilation
     162};
     163struct S4 {
     164        struct S3 s3;   // DBU as S3 in separate TU => whole-program compilation
     165};
     166```
     167The first question is how to establish connections among TUs (specifically their `.h` files) to perform implicit `#include`s.
     168The second question is how to provide information hiding during whole-program compilation.
     169
     170
    65171### Design choices
    66172
     
    69175* Imports follow the file system: Just like `#include`, we resolve modules by following the file system. Other languages have users specify module names, but for now it seemed unnecessarily complex.
    70176    * As such, the module declaration at the top of a module file (`module;`) does not take a name argument (this declaration would be used to distinguish a regular C file from a C module file).
    71     * As seen in `testing/`, files with the same name as a folder exist outside of it (ie. `yesImports/` and `yesImports.cmod`). This is in line with Rust, which initially used `mod.rs` for folder-like files before using this strategy.
     177    * As seen in `testing/`, files with the same name as a folder exist outside of it (i.e., `yesImports/` and `yesImports.cmod`). This is in line with Rust, which initially used `mod.rs` for folder-like files before using this strategy.
    72178* Symbol names are prefixed with the file path of the module, as read from the project root folder.
    73 * Imports automatically expand into module scope: The alternative is to have imports like `import "graph/node" as N;`. Doing so would require prefixing any symbols from "graph/node.cmod" (eg. `N.struct_1`). The prototype took the other approach to keep the language less verbose.
    74     * The prototype ignores name clashes, but a full module system should give the ability to disambiguate. One idea is to use the module name (eg. `"../graph".struct_2`)
     179* Imports automatically expand into module scope: The alternative is to have imports like `import "graph/node" as N;`. Doing so would require prefixing any symbols from "graph/node.cmod" (e.g., `N.struct_1`). The prototype took the other approach to keep the language less verbose.
     180    * The prototype ignores name clashes, but a full module system should give the ability to disambiguate. One idea is to use the module name (e.g., `"../graph".struct_2`).
    75181* We use `import` and `export` keywords to control module visibility.
    76182* This isn't demonstrated in the grammar, but follows from the work in my previous attempt at modules (type stubs). The idea is that struct definitions are not visible unless imported. For example, if a module imports `struct struct_3 {struct struct_4 field_1;};` but does not import `struct struct_4`, it can access `field_1` but it cannot access any fields of `field_1`. This would work similarly to how, if `field_1` were a pointer, you don't have access to the struct.
    77183
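
To make the symbol-prefixing design choice concrete, here is a minimal C sketch of one possible mangling rule. The prototype's actual scheme is not shown in this changeset, so the rule used here (non-alphanumeric path characters become `_`, followed by a `__` separator) and the name `mangle_symbol` are assumptions for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

// Hypothetical mangling: "<path with non-alphanumerics as '_'>__<symbol>".
// Returns 0 on success, -1 if the output buffer is too small.
int mangle_symbol(const char *module_path, const char *symbol,
                  char *out, size_t out_size) {
    size_t path_len = strlen(module_path);
    size_t needed = path_len + 2 + strlen(symbol) + 1;   // path + "__" + symbol + NUL
    if (needed > out_size) return -1;
    for (size_t i = 0; i < path_len; i++) {
        unsigned char ch = (unsigned char)module_path[i];
        out[i] = isalnum(ch) ? (char)ch : '_';           // '/' and '.' are not valid in C identifiers
    }
    snprintf(out + path_len, out_size - path_len, "__%s", symbol);
    return 0;
}
```

Under this rule, `struct_1` in `graph/node` would be emitted as `graph_node__struct_1`. A real scheme would also need to keep distinct paths from colliding (here `a/b` and `a_b` map to the same prefix).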
     184<span style="color:red">PAB</span> What is the purpose of `module`? How does it interact with a TU?
     185You need more here.
     186
     187```
     188// Collection TU
     189module Collection {  // namespace, collection type
     190        module Linked { // namespace, linked-list types
     191                static:  // private
     192                        ...
     193                extern:  // public
     194                        ...
     195        } {
     196                // initialization code across all linked-list types
     197        }
     198        module string {  // namespace, string type
     199                static:  // private
     200                        ...
     201                extern:  // public
     202                        ...
     203        } {
     204                // initialization code across all string types
     205        }
     206        module array {  // namespace, array type
     207                static:  // private
     208                        ...
     209                extern:  // public
     210                        ...
     211        } {
     212                // initialization code across all array types
     213        }
     214} {
     215        // general initialization code across all collections
     216}
     217```
     218
     219where usage might look like:
     220
     221```
     222#include Collection.Linked;  // import, open specific namespace
     223Stack(int) stack;
     224
     225#include Collection;  // import, open general namespace
     226String str;
     227array( int, 100 ) arr;
     228
     229#include Collection.array {  // import, closed namespace
     230        array( int, 100 ) arr;
     231}
     232```
     233
    78234### Comparison with Zig
    79235
    80 Zig has separate compilation, cicular modules and no header files! This is what my module system is trying to do, so it's really worth taking a close look:
    81 * `@import` is like treating the file as a struct. You assign to a name and use it.
    82     * This neatly unifies language concepts -- Zig's main feature is compile-time logic (`comptime`), and `@import` behaves like any other compile-time function.
    83 * Compile-time functions (`@import` works like one) are memoized, so importing twice leads to references to the same object (avoiding double-definitions).
    84     * This may explain why they use module names instead of file paths; file paths change depending on the current directory, which messes with memoization.
    85 * The Zig compiler waits until a struct/function is used before analyzing its definition/declaration.
    86     * This "lazy evalution" differs from my prototype, which performs "eager evaluation" of imports. The prototype does this partly because it's simpler to implement, but also because I need `module_input` in order to resolve parsing ambiguities.
    87     * Using functions means analyzing function declarations, not their definition. If you want inline functions or constants, those are likely handled by the `comptime` feature. Cforall doesn't have such a feature and requires backwards compatibility with C, so we can't make this assumption. Thankfully, the prototype can be adapted to work for cases such as inline functions.
    88 * Zig has the philosophy of making things explicit: no implicit constructors/destructors, passing allocators into functions, etc. There is also no private struct fields (the idea being that when you're working low-level, you may need access to the internals). I think Cforall takes a different approach, using more powerful abstractions (potentially influenced by C++); however, I think Zig has a lot of merit in wanting to make things visible and tweakable by the programmer, and we could benefit from taking some of these ideas.
     236Zig has separate compilation, circular modules and no header files! This is what my module system is trying to do, so it's really worth taking a close look:
     237
     238- `@import` is like treating the file as a struct. You assign to a name and use it.
     239
     240   - This neatly unifies language concepts -- Zig's main feature is compile-time logic (`comptime`), and `@import` behaves like any other compile-time function.
     241
     242- Compile-time functions (`@import` works like one) are memoized, so importing twice leads to references to the same object (avoiding double-definitions).
     243
     244   - This may explain why they use module names instead of file paths; file paths change depending on the current directory, which messes with memoization.
     245
     246- The Zig compiler waits until a struct/function is used before analyzing its definition/declaration.
     247
     248    - This "lazy evaluation" differs from my prototype, which performs "eager evaluation" of imports. The prototype does this partly because it's simpler to implement, but also because I need `module_input` in order to resolve parsing ambiguities.
     249
     250    - Using functions means analyzing function declarations, not their definition. If you want inline functions or constants, those are likely handled by the `comptime` feature. Cforall doesn't have such a feature and requires backwards compatibility with C, so we can't make this assumption. Thankfully, the prototype can be adapted to work for cases such as inline functions.
     251
     252- Zig has the philosophy of making things explicit: no implicit constructors/destructors, passing allocators into functions, etc. There are also no private struct fields (the idea being that when you're working low-level, you may need access to the internals). I think Cforall takes a different approach, using more powerful abstractions (potentially influenced by C++); however, I think Zig has a lot of merit in wanting to make things visible and tweakable by the programmer, and we could benefit from taking some of these ideas.
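
The memoization point in the list above can be sketched in C as a lookup table keyed by module path. This is not how Zig or the prototype implements it; `Module`, `import_module`, and the fixed-size table are hypothetical, and the sketch assumes paths have already been normalized to a canonical form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 16

typedef struct {
    const char *path;   // canonical path, used as the memoization key
    int load_count;     // how many times the module was actually analyzed
} Module;

static Module module_table[MAX_MODULES];
static size_t module_count = 0;

// Returns the cached Module for `path`, analyzing it only on the first import.
// Returns NULL when the table is full.
Module *import_module(const char *path) {
    for (size_t i = 0; i < module_count; i++) {
        if (strcmp(module_table[i].path, path) == 0) return &module_table[i];
    }
    if (module_count == MAX_MODULES) return NULL;
    Module *m = &module_table[module_count++];
    m->path = path;     // caller keeps the string alive (string literals here)
    m->load_count = 1;  // stand-in for parsing and analyzing the file once
    return m;
}
```

Importing the same path twice yields the same `Module *`, so two import names for one module end up referring to a single set of definitions.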
    89253
    90254### Ideas for future direction
    91255
    92256So with this insight, combined with the design choices section, what direction would I like to take this?
    93 * I still like the idea of resolving modules by following the file system. The fact that `import "std/array";` works similarly to `#include "std/array.h"` is really nice in my opinion.
    94     * That being said, my current grammar also allows writing `import std/array;` , which I think is a mistake. The unquoted version should be reserved for if/when we support named modules, which would look like `module file_reader;` and `import file_reader;`
    95 * Unlike Zig, Cforall still needs to compile down to C code, so prefixing symbol names with the module name is still the most reasonable solution I can come up with that still works with existing C linkers.
    96 * I'm inspired by the way Zig assigns imports to a symbol, so I'd like to try having imports require the `as` keyword (eg. `import "graph/node" as N;`). If the programmer wants the import to be expanded into scope, they can use `with N;` or `import "graph/node" as *;`. This also resolves any nasty disambiguation syntax such as `"../graph".struct_2`.
    97     * One of the struggles with this was that `import "graph/node" as N; import "graph/node" as M;` (in practice, this could happen through "diamond imports") would mean `N.struct_1` and `M.struct_1` need to refer to the same struct, without double-definition problems. With the concept of memoization, this turns out to be implementable.
    98     * I concede that `N.struct_1` and `M.struct_1` isn't the nicest thing to deal with. Rust would write `"../graph".struct_2` as `super::graph.struct_2`, but I would like to stick with import names looking the same as `#include`. As a consolation, the meaning is not ambiguous, and this is arguably an edge case.
    99     * This does increase the verbosity of the language, but it's arguably worth it for the increased readability. Note that Python, a language touted for its ease of use, works in a similar manner. Additionally, this renaming can be automated, so migrating existing systems shouldn't be a problem.
    100 * We use `import/export`, similar to C++20 modules. Rust uses `use/pub`, Zig uses `@import/pub`. For now, I don't see a need to change, and it's fairly simple to update in the future.
    101 * I'm quite conflicted on the idea that struct definitions (therefore its fields) should not be visible unless imported. While restricting field access is common in other languages, no language does it in the way I'm envisoning.
    102     * By example: if I import `struct struct_5 foo() {...}` but I don't import `struct_5`, I should be able to use `foo()` (which would include assigning to a variable) but I can't access the fields of the return value. You only get the functionality that you import.
    103     * The implementation problem: In C, in order to create a variable you need to specify its type. So you'll have to provide some way to expose the name `struct_5` to the importer. If you do that, why can't you give me the fields too?
    104     * The useability problem: The ability to access the fields of a returned value can be seen as necessary in order to properly use a function (eg. function returns named tuple). So you're forcing the programmer to do extra import/export management for not much practical gain. Additionally, this isn't very "C-like", because in regular C you would need to provide the struct definition here.
    105     * You can think of this concept as "opaque types, but from the importer's side". The function itself does nothing to hide the fact it's using `struct_5`, but the importer cannot use `struct_5` because it didn't import it. Pretty much all other languages (eg. Scala, Swift) put the opaque type on the exporter's side. In comparison, my system seems unnecessarily pedantic. If we want to consider restricting field access, public/private fields also provide better granularity (we might be able to leverage "tagged exports", described below).
    106         * It's also worth asking if hiding struct information is the right thing. Zig chooses not to have private fields, taking the philosophy that low-level often needs to reach into the internals of a struct in order to produce composable abstraction layers. Something I'm interested in knowing is: if I have a variable whose type has a field with a pointer to some other struct, can I access the other struct's fields? If you can, then it would be consistent between pointer and non-pointer fields.
    107     * Ultimately, it might be best to abandon this idea, as it is pedantic for not enough practical benefit. Just let the programmer access the struct in the same way it's written in the function declaration.
    108         * As an aside, trait information in Cforall might also be unnecessarily pedantic. Having to import a forall, trait, struct, and certain methods in order to make use of some polymorphic function seems a bit overkill (though I might be missing something).
    109 * Modules often need to expose different interfaces to different modules. For example, a thread module may need to expose more information to a garbage collection module than a regular module. The object-oriented technique of having "friend classes" is an all-or-nothing approach; it's not great because it lacks granularity. Instead, we can tag certain exports: the thread module uses `export(details) struct internals {...};` while the garbage collection module uses `import thread(+, details);` (the `+` referring to also wanting regular exports).
    110     * I've never seen this in the wild before, but a quick search shows that Perl has some form of this in the form of `%EXPORT_TAGS`. I like my method of putting it directly on the symbol definition instead of a big array at the bottom of the file, though.
     257
     258- I still like the idea of resolving modules by following the file system. The fact that `import "std/array";` works similarly to `#include "std/array.h"` is really nice in my opinion.
     259
     260    - That being said, my current grammar also allows writing `import std/array;`, which I think is a mistake. The unquoted version should be reserved for if/when we support named modules, which would look like `module file_reader;` and `import file_reader;`.
     261
     262- Unlike Zig, Cforall still needs to compile down to C code, so prefixing symbol names with the module name is still the most reasonable solution I can come up with that still works with existing C linkers.
     263
     264- I'm inspired by the way Zig assigns imports to a symbol, so I'd like to try having imports require the `as` keyword (e.g., `import "graph/node" as N;`). If the programmer wants the import to be expanded into scope, they can use `with N;` or `import "graph/node" as *;`. This also resolves any nasty disambiguation syntax such as `"../graph".struct_2`.
     265
     266    - One of the struggles with this was that `import "graph/node" as N; import "graph/node" as M;` (in practice, this could happen through "diamond imports") would mean `N.struct_1` and `M.struct_1` need to refer to the same struct, without double-definition problems. With the concept of memoization, this turns out to be implementable.
     267
     268    - I concede that `N.struct_1` and `M.struct_1` aren't the nicest things to deal with. Rust would write `"../graph".struct_2` as `super::graph.struct_2`, but I would like to stick with import names looking the same as `#include`. As a consolation, the meaning is not ambiguous, and this is arguably an edge case.
     269
     270    - This does increase the verbosity of the language, but it's arguably worth it for the increased readability. Note that Python, a language touted for its ease of use, works in a similar manner. Additionally, this renaming can be automated, so migrating existing systems shouldn't be a problem.
     271
     272- We use `import/export`, similar to C++20 modules. Rust uses `use/pub`, Zig uses `@import/pub`. For now, I don't see a need to change, and it's fairly simple to update in the future.
     273
     274- I'm quite conflicted on the idea that struct definitions (and therefore their fields) should not be visible unless imported. While restricting field access is common in other languages, no language does it in the way I'm envisioning.
     275
     276    - By example: if I import `struct struct_5 foo() {...}` but I don't import `struct_5`, I should be able to use `foo()` (which would include assigning to a variable) but I can't access the fields of the return value. You only get the functionality that you import.
     277
     278    - The implementation problem: In C, in order to create a variable you need to specify its type. So you'll have to provide some way to expose the name `struct_5` to the importer. If you do that, why can't you give me the fields too?
     279
     280    - The usability problem: The ability to access the fields of a returned value can be seen as necessary in order to properly use a function (e.g., a function returns a named tuple). So you're forcing the programmer to do extra import/export management for not much practical gain. Additionally, this isn't very "C-like", because in regular C you would need to provide the struct definition here.
     281
     282    - You can think of this concept as "opaque types, but from the importer's side". The function itself does nothing to hide the fact it's using `struct_5`, but the importer cannot use `struct_5` because it didn't import it. Pretty much all other languages (e.g., Scala, Swift) put the opaque type on the exporter's side. In comparison, my system seems unnecessarily pedantic. If we want to consider restricting field access, public/private fields also provide better granularity (we might be able to leverage "tagged exports", described below).
     283
     284        - It's also worth asking if hiding struct information is the right thing. Zig chooses not to have private fields, taking the philosophy that low-level code often needs to reach into the internals of a struct in order to produce composable abstraction layers. Something I'm interested in knowing is: if I have a variable whose type has a field with a pointer to some other struct, can I access the other struct's fields? If you can, then it would be consistent between pointer and non-pointer fields.
     285
     286    - Ultimately, it might be best to abandon this idea, as it is pedantic for not enough practical benefit. Just let the programmer access the struct in the same way it's written in the function declaration.
     287
     288        - As an aside, trait information in Cforall might also be unnecessarily pedantic. Having to import a forall, trait, struct, and certain methods in order to make use of some polymorphic function seems a bit overkill (though I might be missing something).
     289
     290- Modules often need to expose different interfaces to different modules. For example, a thread module may need to expose more information to a garbage collection module than a regular module. The object-oriented technique of having "friend classes" is an all-or-nothing approach; it's not great because it lacks granularity. Instead, we can tag certain exports: the thread module uses `export(details) struct internals {...};` while the garbage collection module uses `import thread(+, details);` (the `+` referring to also wanting regular exports).
     291
     292    - I've never seen this in the wild before, but a quick search shows that Perl has something similar in `%EXPORT_TAGS`. I like my method of putting it directly on the symbol definition instead of a big array at the bottom of the file, though.
    111293
    112294## Future work
    113295
    114 * The current request is to generate headers from analyzing .c files. My general steps would be to:
    115     * parse C code, extract necessary symbols (like `module_data` step).
    116     * *(start with a simpler model, without inline functions or constants)*
    117     * figure out what other code it's referencing (use `#include` to figure out where to look, like `module_input` step).
    118     * Heavily cyclic symbol references requires breaking a single header into multiple parts. To start, we can assume modules are not cyclic and put an `#include` in the header when symbols from other files are used. Then we can see where the cycles are happening and prioritize those first.
    119     * *tbh I'm not sure why you'd want to go back to using headers when the information is all gathered already -- you could just merge this logic into the compilation step. I think it's more for people who don't want to change anything about their code (use it more like a tool rather than changing their compiler). Perhaps it's a "gateway drug" to the full module system. It also could function as a "source of truth" for what the full module system should be doing.*
    120 * I want to take a closer look at Zig, actually run some code to validate my theories. Also look at some other "low-level languages".
    121 * Flesh out how the full C module system would work.
    122     * I'd also need to look into implementing migration tooling (likely will be able to reuse functionality from previous steps)
    123 * Write thesis.
    124 * Graduate.
     296- The current request is to generate headers from analyzing .c files. My general steps would be to:
     297    - parse C code, extract necessary symbols (like `module_data` step).
     298
     299    - (start with a simpler model, without inline functions or constants)
     300
     301    - figure out what other code it's referencing (use `#include` to figure out where to look, like `module_input` step).
     302
     303    - Heavily cyclic symbol references require breaking a single header into multiple parts. To start, we can assume modules are not cyclic and put an `#include` in the header when symbols from other files are used. Then we can see where the cycles are happening and prioritize those first.
     304
     305    - To be honest, I'm not sure why you'd want to go back to using headers when the information is all gathered already -- you could just merge this logic into the compilation step. I think it's more for people who don't want to change anything about their code (use it more like a tool rather than changing their compiler). Perhaps it's a "gateway drug" to the full module system. It also could function as a "source of truth" for what the full module system should be doing.
     306
     307- I want to take a closer look at Zig, actually run some code to validate my theories. Also look at some other "low-level languages".
     308
     309- Flesh out how the full C module system would work.
     310
     311    - I'd also need to look into implementing migration tooling (likely will be able to reuse functionality from previous steps)
     312
     313- Write thesis.
     314
     315- Graduate.