Module System Proposal
======================

A module is the basic unit of separate compilation. Different languages implement modules in different ways; in C/C++ the module is the source file and, usually, its header file.

Uses of Modules
---------------
The most straightforward purpose of modules is to enable separate compilation.
This in turn reduces recompilation, by isolating changes, and enables parallel compilation, by making modules independent.

A related feature is sharing information between modules. Information needed by other modules must be shared. However, avoiding sharing extra information can further isolate changes and can also reduce the work of compiling a single module.

Modules are also used as a base for other organizational features, such as namespacing on module names or using the module as a scope for visibility modifiers.

C Comparisons
.............
To be 99% compatible with C, Cforall pretty much has to use the C-preprocessor (or replace it with a Cforall-preprocessor that is in turn backwards compatible). To this end, how well does the C-preprocessor operate in these areas?

C is very good at separate compilation. Parallel compilation is completely unhindered and recompilation is good, although sometimes a bit preemptive. Information sharing is a bit weaker: C has a tendency to overshare because its copy-and-paste rule brings in the entire file. This is also why its recompilation can be preemptive. It is on the user to follow conventions and figure out what information needs to be, or should be, shared. (On a personal note, I have spent a lot of time working to remove extra includes from the Cforall compiler.)

C doesn't use modules to implement any behaviour. Except for preserved source location information used in error messages, they are completely erased by the preprocessor.

Module Linkage Specification
----------------------------
A proposed solution is to keep track of code and whether or not it is in the module currently being compiled. This "is_in_module" linkage* is used in the compiler (and perhaps the preprocessor) to mark different declarations. Usually, only the original source file (the `.cfa` file) and its header (a `.hfa` file) are considered to be in the module.

Prelude definitions are never considered to be inside the current module, except when compiling the prelude itself.

* That is, linkage in the sense of a linkage specifier (like mangled or overridable), not external/internal linkage (part of storage classes).
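
As a rough sketch of what this could look like, assume the linkage is stored as a set of bit flags on each declaration. Every name below (`linkage_flags`, `LINK_IN_MODULE`, `struct decl`, `mark_module`, `is_in_module`) is made up for illustration and is not taken from the Cforall compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical linkage specifier, stored as a set of bit flags. */
enum linkage_flags {
    LINK_MANGLE      = 1 << 0,  /* name is mangled */
    LINK_OVERRIDABLE = 1 << 1,  /* declaration may be overridden */
    LINK_IN_MODULE   = 1 << 2   /* proposed: declared inside the current module */
};

/* Hypothetical declaration record; a real compiler node carries much more. */
struct decl {
    const char *name;
    unsigned linkage;
};

/* Set or clear the in-module flag, leaving the other flags alone. */
static void mark_module(struct decl *d, bool in_module) {
    assert(d != NULL);
    if (in_module) {
        d->linkage |= (unsigned)LINK_IN_MODULE;
    } else {
        d->linkage &= ~(unsigned)LINK_IN_MODULE;
    }
}

/* Query the flag on a declaration. */
static bool is_in_module(const struct decl *d) {
    return (d->linkage & (unsigned)LINK_IN_MODULE) != 0;
}
```

Keeping the module status as one more flag means it travels with the declaration and does not disturb the existing linkage information.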

How to Specify the Module
-------------------------
Perhaps the trickiest issue is figuring out where the module is after the C-preprocessor has finished its work.

If we don't include the preprocessor in this (which has the distinct advantage of not needing to update the C-preprocessor), then the module needs to be blocked out in C code. This is fairly trivial in the source file; marking the end of the include statements is usually good enough.

Headers are harder because they are almost always mixed in with other includes, both in other files and their own. I have been able to think of two solutions that do not get caught up in these problems:
1.  Mark out the header include in the source file (in addition to the source file body) and have the header escape all of its includes. This gives us start and stop points for the module.
2.  Have the header mark its body in a way that mentions the source file. Most includes may have these blocks, but the non-matching ones can be discarded.

Using the preprocessor (or at least relying on the line marks/processed line directives) opens things up a bit more. With accurate knowledge of what original file a declaration came from, all that needs to be done is map files onto modules. This is less flexible, but it covers the standard layout of headers, and even many of the unconventional layouts I have seen.
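
To illustrate, here is a minimal sketch in C of reading a line mark. It assumes GCC-style preprocessor output, where a mark looks like `# 12 "list.hfa" 1`. The function name `parse_linemarker` and the file names are hypothetical, and escaped quotes inside file names are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse a GCC-style line mark such as:  # 12 "list.hfa" 1
 * On success, store the line number and file name and return 1.
 * On failure, leave the outputs untouched and return 0. */
static int parse_linemarker(const char *text, int *line, char *file, size_t file_size) {
    int number;
    char name[256];
    assert(file_size > 0);
    /* '#', a line number, then a quoted file name; trailing flags are ignored. */
    if (sscanf(text, "# %d \"%255[^\"]\"", &number, name) != 2) {
        return 0;
    }
    *line = number;
    strncpy(file, name, file_size - 1);
    file[file_size - 1] = '\0';
    return 1;
}
```

A compiler pass could remember the most recent file name seen this way and compare it against the files of the current module when marking each declaration.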

As for which files are part of the module: a source file is always part of its own module. The paired header (same path and name, except for the extension) could automatically be included in the module, but this might take away some needed flexibility. Allowing intermediate extensions (see the AST/Pass files for an example) would allow for slightly more flexibility. The other way would be to specify this in the source files themselves. Headers could say which modules they are a part of, but I think the more natural solution may be to have a file already in the module say what other files in the module it is including.
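
The automatic pairing rule can be sketched as a path comparison. This is only one possible reading of "same path and name, except for the extension": it assumes `/` as the path separator and treats everything after the first `.` in the file name as extension, which is how intermediate extensions are allowed. The function names and example paths are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of the path up to (not including) the first '.' of its final component. */
static size_t stem_length(const char *path) {
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    const char *dot = strchr(base, '.');
    return dot ? (size_t)(dot - path) : strlen(path);
}

/* True if both files share a directory and base name, ignoring all extensions. */
static bool same_module(const char *source, const char *other) {
    assert(source != NULL && other != NULL);
    size_t n = stem_length(source);
    return n == stem_length(other) && strncmp(source, other, n) == 0;
}
```

Under this rule `src/list.cfa` pairs with both `src/list.hfa` and `src/list.impl.hfa`, but not with `lib/list.hfa` or `src/listing.hfa`.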

Within that approach, the declaration could go alongside the include, be part of the include itself, or be a list of files given in the source file. Any of these options should work.
>   // With the include:
>   #pragma module "filename.hfa"
>   #include "filename.hfa"
>
>   // Part of the include:
>   #include_module "filename.hfa"
>
>   // Listed Source Files:
>   #pragma module "filename.hfa" "included-from-filename.hfa"
>   #include "filename.hfa"
>   // In the previous examples, the include in filename.hfa would be updated.

Uses of Module Linkage
----------------------
After we know what sections are in the module and which are not, how do we use this to actually support coding?

In the preprocessor, the simplest use is a conditional macro. It takes two arguments and expands to one of them depending on whether the tokens were found in the module or not. This would require an implementation directly in the preprocessor.
>   __MODULE__(if_inside_module, if_outside_module)
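
For comparison, the closest approximation I can sketch with today's C-preprocessor selects on a flag that the build sets per translation unit. This is not the proposed behaviour: it cannot tell where the tokens were found, which is exactly why the real macro needs preprocessor support. The names `MODULE_SELECT`, `IN_LIST_MODULE` and `IN_QUEUE_MODULE` are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-module flags; a build would define these for each translation unit. */
#define IN_LIST_MODULE  1
#define IN_QUEUE_MODULE 0

/* Pick one of two arguments by pasting the (already expanded) flag onto SELECT_. */
#define SELECT_1(inside, outside) inside
#define SELECT_0(inside, outside) outside
#define MODULE_SELECT_(flag, inside, outside) SELECT_##flag(inside, outside)
#define MODULE_SELECT(flag, inside, outside) MODULE_SELECT_(flag, inside, outside)

static void demo_module_select(void) {
    /* This translation unit is "inside" the list module and "outside" the queue module. */
    assert(MODULE_SELECT(IN_LIST_MODULE, 10, 20) == 10);
    assert(MODULE_SELECT(IN_QUEUE_MODULE, 10, 20) == 20);
}
```

The extra level of indirection through `MODULE_SELECT_` is needed so the flag is expanded to `0` or `1` before it is pasted.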

In the compiler proper, the linkage can be checked on declarations to decide how to handle them. A simple example is a function specifier that takes the module status into account. Say "module_inline", which becomes "inline" (if anything) in the module and "extern inline" elsewhere. This (using some GCC behaviour, namely its GNU89 inline semantics) allows every file to see the function definition and inline it, but only the module will keep a non-inlined copy. This ensures that there is only one translation unit with a copy without involving the linker.

This may also help solve other memory-allocated-in-header problems, as this memory can then only be allocated in the module.
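
The "module_inline" idea can be imitated by hand today, which gives a feel for what the compiler would do automatically. The sketch below assumes GCC (or a compatible compiler) and uses its `gnu_inline` function attribute to get GNU89 inline semantics regardless of the language standard in use. The flag `IN_SQUARE_MODULE` and the macro `MODULE_INLINE` are hypothetical stand-ins for the module linkage check.

```c
#include <assert.h>

/* Hypothetical flag: set to 1 only while compiling the module's own source file. */
#define IN_SQUARE_MODULE 1

#if IN_SQUARE_MODULE
/* Plain inline with gnu_inline: inlined where possible, and a standalone copy is emitted here. */
#define MODULE_INLINE inline __attribute__((gnu_inline))
#else
/* extern inline with gnu_inline: the body is used only for inlining, no standalone copy. */
#define MODULE_INLINE extern inline __attribute__((gnu_inline))
#endif

/* This definition would normally live in the module's header. */
MODULE_INLINE int square(int x) {
    return x * x;
}

/* A caller; calls that are not inlined resolve to the module's standalone copy. */
static int hypotenuse_squared(int a, int b) {
    assert(a >= 0 && b >= 0);
    return square(a) + square(b);
}
```

With the flag set to 0 in every other translation unit, those units can still inline `square` but rely on the module's object file for any call that is not inlined.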

It may also be used to help implement visibility. The level of granularity is still module level, but private information can be included in the header and used by the compiler while being hidden from direct use in other modules. For example, you could mark the fields of a structure as private: the layout is known to the compiler, but other modules cannot perform field access and would have to use other provided functions to manipulate and read the type. (There are a few containers in the library that do this by convention.)

Remaining Issues
----------------
Not all of these have to be solved, but there are still some areas that could really use an improvement.

First, using modules as the visibility tool does lead to a major shortcoming. Because there is only "in-module" and "out-of-module", multiple things in the same header don't know that they are in the same module, which could prevent adding inline functions in the header.

Second, this does nothing to solve the oversized header issue. It does not reduce any requirements on what includes need to be used.
