Changeset c76bd34

Timestamp:

Oct 7, 2020, 4:31:43 PM (5 years ago)

Author:

Colby Alexander Parsons <caparsons@…>

Branches:

ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum, stuck-waitfor-destruct

Children:

848439f

Parents:

ae2c27a (diff), 597c5d18 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.

Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc into master

Files:

54 added
3 deleted
168 edited
121 moved

.gitignore (modified) (1 diff)
Jenkinsfile (modified) (2 diffs)
benchmark/Makefile.am (modified) (12 diffs)
benchmark/creation/JavaThread.java (modified) (3 diffs)
benchmark/ctxswitch/JavaThread.java (modified) (2 diffs)
benchmark/io/http/filecache.cfa (modified) (7 diffs)
benchmark/io/http/main.cfa (modified) (5 diffs)
benchmark/io/http/options.cfa (modified) (5 diffs)
benchmark/io/http/options.hfa (modified) (2 diffs)
benchmark/io/http/protocol.cfa (modified) (6 diffs)
benchmark/io/http/worker.cfa (modified) (3 diffs)
benchmark/io/http/worker.hfa (modified) (3 diffs)
benchmark/io/readv.cfa (modified) (2 diffs)
benchmark/mutex/JavaThread.java (modified) (3 diffs)
benchmark/mutexC/JavaThread.java (modified) (4 diffs)
benchmark/readyQ/yield.cfa (modified) (2 diffs)
benchmark/schedint/JavaThread.java (modified) (2 diffs)
doc/LaTeXmacros/common.tex (modified) (4 diffs)
doc/LaTeXmacros/lstlang.sty (modified) (3 diffs)
doc/bibliography/pl.bib (modified) (2 diffs)
doc/papers/concurrency/Paper.tex (modified) (19 diffs)
doc/papers/concurrency/annex/local.bib (modified) (1 diff)
doc/papers/concurrency/mail2 (modified) (1 diff)
doc/papers/concurrency/response3 (added)
doc/proposals/ZeroCostPreemption.md (added)
doc/proposals/function_type_change.md (added)
doc/refrat/refrat.tex (modified) (3 diffs)
doc/theses/andrew_beach_MMath/glossaries.tex (added)
doc/theses/andrew_beach_MMath/thesis.tex (modified) (1 diff)
doc/theses/fangren_yu_COOP_S20/Makefile (added)
doc/theses/fangren_yu_COOP_S20/Report.tex (added)
doc/theses/fangren_yu_COOP_S20/cfa_developer_reference.pdf (added)
doc/theses/fangren_yu_COOP_S20/figures/DeepNodeSharing.fig (added)
doc/theses/fangren_yu_COOP_S20/figures/DeepNodeSharing.fig.bak (added)
doc/theses/thierry_delisle_PhD/.gitignore (modified) (1 diff)
doc/theses/thierry_delisle_PhD/code/readQ_example/Makefile (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/proto-gui/main.cpp (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/thrdlib/Makefile (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/thrdlib/cforall.cpp (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/thrdlib/fibre.cpp (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/thrdlib/pthread.cpp (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/thrdlib/thread.cpp (added)
doc/theses/thierry_delisle_PhD/code/readQ_example/thrdlib/thread.hpp (added)
doc/theses/thierry_delisle_PhD/code/readyQ_proto/Makefile (moved) (moved from doc/theses/thierry_delisle_PhD/code/Makefile )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/assert.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/assert.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/bitbench/select.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/bitbench/select.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/bts.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/bts.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/bts_test.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/bts_test.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/links.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/links.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/prefetch.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/prefetch.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/process.sh (moved) (moved from doc/theses/thierry_delisle_PhD/code/process.sh )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor_list.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list_fast.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor_list_fast.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list_good.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor_list_good.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/randbit.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/randbit.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/relaxed_list.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/relaxed_list.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list_layout.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/relaxed_list_layout.cpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/runperf.sh (moved) (moved from doc/theses/thierry_delisle_PhD/code/runperf.sh )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/scale.sh (moved) (moved from doc/theses/thierry_delisle_PhD/code/scale.sh )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzi-packed.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/snzi-packed.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzi.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/snzi.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzm.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/snzm.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/utils.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/utils.hpp )
doc/theses/thierry_delisle_PhD/code/readyQ_proto/work_stealing.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/work_stealing.hpp )
doc/theses/thierry_delisle_PhD/comp_II/comp_II.tex (modified) (25 diffs)
doc/theses/thierry_delisle_PhD/comp_II/img/system.fig (modified) (4 diffs)
doc/theses/thierry_delisle_PhD/comp_II/presentation.pdf (deleted)
doc/theses/thierry_delisle_PhD/thesis/Makefile (added)
doc/theses/thierry_delisle_PhD/thesis/fig/base.fig (added)
doc/theses/thierry_delisle_PhD/thesis/fig/empty.fig (added)
doc/theses/thierry_delisle_PhD/thesis/fig/emptybit.fig (added)
doc/theses/thierry_delisle_PhD/thesis/fig/emptytls.fig (added)
doc/theses/thierry_delisle_PhD/thesis/fig/emptytree.fig (added)
doc/theses/thierry_delisle_PhD/thesis/fig/resize.fig (added)
doc/theses/thierry_delisle_PhD/thesis/fig/system.fig (added)
doc/theses/thierry_delisle_PhD/thesis/glossary.tex (added)
doc/theses/thierry_delisle_PhD/thesis/text/core.tex (added)
doc/theses/thierry_delisle_PhD/thesis/text/front.tex (added)
doc/theses/thierry_delisle_PhD/thesis/text/intro.tex (added)
doc/theses/thierry_delisle_PhD/thesis/text/io.tex (added)
doc/theses/thierry_delisle_PhD/thesis/text/practice.tex (added)
doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex (added)
doc/theses/thierry_delisle_PhD/thesis/thesis.tex (added)
doc/user/Makefile (modified) (1 diff)
doc/user/user.tex (modified) (17 diffs)
examples/constructors.c (deleted)
libcfa/configure.ac (modified) (4 diffs)
libcfa/prelude/defines.hfa.in (modified) (3 diffs)
libcfa/src/Makefile.am (modified) (4 diffs)
libcfa/src/bits/locks.hfa (modified) (6 diffs)
libcfa/src/concurrency/CtxSwitch-i386.S (modified) (3 diffs)
libcfa/src/concurrency/alarm.cfa (modified) (1 diff)
libcfa/src/concurrency/clib/cfathread.cfa (added)
libcfa/src/concurrency/clib/cfathread.h (added)
libcfa/src/concurrency/coroutine.cfa (modified) (2 diffs)
libcfa/src/concurrency/coroutine.hfa (modified) (4 diffs)
libcfa/src/concurrency/exception.cfa (modified) (1 diff)
libcfa/src/concurrency/exception.hfa (modified) (2 diffs)
libcfa/src/concurrency/invoke.h (modified) (4 diffs)
libcfa/src/concurrency/io.cfa (modified) (3 diffs)
libcfa/src/concurrency/io/call.cfa.in (added)
libcfa/src/concurrency/io/setup.cfa (modified) (4 diffs)
libcfa/src/concurrency/io/types.hfa (modified) (3 diffs)
libcfa/src/concurrency/iofwd.hfa (modified) (2 diffs)
libcfa/src/concurrency/kernel.cfa (modified) (14 diffs)
libcfa/src/concurrency/kernel.hfa (modified) (3 diffs)
libcfa/src/concurrency/kernel/fwd.hfa (modified) (1 diff)
libcfa/src/concurrency/kernel/startup.cfa (modified) (2 diffs)
libcfa/src/concurrency/kernel_private.hfa (modified) (2 diffs)
libcfa/src/concurrency/monitor.cfa (modified) (15 diffs)
libcfa/src/concurrency/monitor.hfa (modified) (1 diff)
libcfa/src/concurrency/mutex.cfa (modified) (8 diffs)
libcfa/src/concurrency/preemption.cfa (modified) (2 diffs)
libcfa/src/concurrency/thread.cfa (modified) (2 diffs)
libcfa/src/concurrency/thread.hfa (modified) (2 diffs)
libcfa/src/exception.h (modified) (3 diffs)
libcfa/src/exception.hfa (modified) (2 diffs)
libcfa/src/heap.cfa (modified) (7 diffs)
libcfa/src/limits.cfa (modified) (2 diffs)
libcfa/src/limits.hfa (modified) (2 diffs)
libcfa/src/parseargs.cfa (modified) (1 diff)
src/AST/Convert.cpp (modified) (7 diffs)
src/AST/Decl.hpp (modified) (1 diff)
src/AST/ForallSubstitutor.hpp (modified) (2 diffs)
src/AST/Fwd.hpp (modified) (1 diff)
src/AST/GenericSubstitution.cpp (modified) (1 diff)
src/AST/Node.cpp (modified) (1 diff)
src/AST/Pass.hpp (modified) (6 diffs)
src/AST/Pass.impl.hpp (modified) (4 diffs)
src/AST/Pass.proto.hpp (modified) (2 diffs)
src/AST/Print.cpp (modified) (1 diff)
src/AST/SymbolTable.cpp (modified) (5 diffs)
src/AST/SymbolTable.hpp (modified) (1 diff)
src/AST/Type.cpp (modified) (6 diffs)
src/AST/Type.hpp (modified) (10 diffs)
src/AST/TypeSubstitution.cpp (modified) (1 diff)
src/AST/TypeSubstitution.hpp (modified) (1 diff)
src/Common/Examine.cc (added)
src/Common/Examine.h (added)
src/Common/Stats/ResolveTime.cc (added)
src/Common/Stats/ResolveTime.h (added)
src/Common/Stats/Stats.cc (modified) (2 diffs)
src/Common/module.mk (modified) (2 diffs)
src/Concurrency/Keywords.cc (modified) (17 diffs)
src/GenPoly/InstantiateGeneric.cc (modified) (2 diffs)
src/InitTweak/InitTweak.cc (modified) (1 diff)
src/Parser/lex.ll (modified) (3 diffs)
src/Parser/parser.yy (modified) (9 diffs)
src/ResolvExpr/CandidateFinder.cpp (modified) (5 diffs)
src/ResolvExpr/ConversionCost.cc (modified) (2 diffs)
src/ResolvExpr/ConversionCost.h (modified) (1 diff)
src/ResolvExpr/CurrentObject.cc (modified) (5 diffs)
src/ResolvExpr/Resolver.cc (modified) (9 diffs)
src/ResolvExpr/SatisfyAssertions.cpp (modified) (1 diff)
src/ResolvExpr/SpecCost.cc (modified) (1 diff)
src/ResolvExpr/Unify.cc (modified) (12 diffs)
src/SymTab/Mangler.cc (modified) (2 diffs)
src/SymTab/Validate.cc (modified) (6 diffs)
src/Virtual/Tables.cc (added)
src/Virtual/Tables.h (added)
src/Virtual/module.mk (modified) (1 diff)
tests/.expect/array.txt (modified) (1 diff)
tests/.expect/cast.txt (modified) (1 diff)
tests/.expect/enum.txt (modified) (1 diff)
tests/.expect/expression.txt (modified) (1 diff)
tests/.expect/forall.txt (modified) (1 diff)
tests/.expect/heap.txt (modified) (1 diff)
tests/.expect/identFuncDeclarator.txt (modified) (1 diff)
tests/.expect/identParamDeclarator.txt (modified) (1 diff)
tests/.expect/labelledExit.txt (modified) (1 diff)
tests/.expect/limits.txt (modified) (1 diff)
tests/.expect/maybe.txt (modified) (1 diff)
tests/.expect/nested-types.txt (modified) (1 diff)
tests/.expect/numericConstants.txt (modified) (1 diff)
tests/.expect/operators.txt (modified) (1 diff)
tests/.expect/poly-d-cycle.txt (moved) (moved from tests/.expect/poly-cycle.txt )
tests/.expect/poly-o-cycle.txt (added)
tests/.expect/result.txt (modified) (1 diff)
tests/.expect/stdincludes.txt (modified) (1 diff)
tests/.expect/switch.txt (modified) (1 diff)
tests/.expect/typedefRedef-ERR1.txt (modified) (1 diff)
tests/.expect/typedefRedef.txt (modified) (1 diff)
tests/.expect/typeof.txt (modified) (1 diff)
tests/.expect/variableDeclarator.txt (modified) (1 diff)
tests/.expect/voidPtr.txt (modified) (1 diff)
tests/Makefile.am (modified) (7 diffs)
tests/alloc2.cfa (modified) (9 diffs)
tests/array.cfa (modified) (2 diffs)
tests/bugs/196.cfa (added)
tests/builtins/.expect/sync.txt (modified) (1 diff)
tests/builtins/sync.cfa (modified) (2 diffs)
tests/cast.cfa (modified) (1 diff)
tests/concurrent/.expect/clib.txt (added)
tests/concurrent/.expect/cluster.txt (modified) (1 diff)
tests/concurrent/.expect/join.txt (added)
tests/concurrent/.expect/joinerror.sed (moved) (moved from tests/.expect/smart-pointers.txt )
tests/concurrent/clib.c (added)
tests/concurrent/cluster.cfa (modified) (1 diff)
tests/concurrent/examples/.expect/datingService.txt (modified) (1 diff)
tests/concurrent/examples/datingService.cfa (modified) (2 diffs)
tests/concurrent/futures/.expect/basic.txt (modified) (1 diff)
tests/concurrent/futures/basic.cfa (modified) (1 diff)
tests/concurrent/join.cfa (added)
tests/concurrent/joinerror.cfa (added)
tests/concurrent/park/.expect/force_preempt.txt (modified) (1 diff)
tests/concurrent/park/.expect/start_parked.txt (modified) (1 diff)
tests/concurrent/park/contention.cfa (modified) (2 diffs)
tests/concurrent/park/force_preempt.cfa (modified) (3 diffs)
tests/concurrent/park/start_parked.cfa (modified) (1 diff)
tests/enum.cfa (modified) (1 diff)
tests/exceptions/.expect/virtual-cast.txt (modified) (1 diff)
tests/exceptions/.expect/virtual-poly.txt (modified) (1 diff)
tests/exceptions/cancel/.expect/coroutine.txt (added)
tests/exceptions/cancel/coroutine.cfa (added)
tests/exceptions/virtual-cast.cfa (modified) (1 diff)
tests/exceptions/virtual-poly.cfa (modified) (1 diff)
tests/expression.cfa (modified) (1 diff)
tests/forall.cfa (modified) (3 diffs)
tests/heap.cfa (modified) (4 diffs)
tests/identFuncDeclarator.cfa (modified) (2 diffs)
tests/identParamDeclarator.cfa (modified) (2 diffs)
tests/labelledExit.cfa (modified) (2 diffs)
tests/limits.cfa (modified) (2 diffs)
tests/maybe.cfa (modified) (2 diffs)
tests/nested-types.cfa (modified) (2 diffs)
tests/numericConstants.cfa (modified) (2 diffs)
tests/operators.cfa (modified) (1 diff)
tests/poly-d-cycle.cfa (added)
tests/poly-o-cycle.cfa (moved) (moved from tests/poly-cycle.cfa ) (1 diff)
tests/pybin/tools.py (modified) (2 diffs)
tests/raii/.expect/ctor-autogen.txt (modified) (1 diff)
tests/raii/.expect/init_once.txt (modified) (1 diff)
tests/raii/ctor-autogen.cfa (modified) (1 diff)
tests/raii/init_once.cfa (modified) (2 diffs)
tests/result.cfa (modified) (2 diffs)
tests/stdincludes.cfa (modified) (2 diffs)
tests/switch.cfa (modified) (2 diffs)
tests/test.py (modified) (4 diffs)
tests/typedefRedef.cfa (modified) (2 diffs)
tests/typeof.cfa (modified) (1 diff)
tests/variableDeclarator.cfa (modified) (8 diffs)
tests/voidPtr.cfa (modified) (1 diff)
tests/warnings/.expect/self-assignment.txt (modified) (1 diff)
tests/warnings/self-assignment.cfa (modified) (2 diffs)
tests/zombies/ArrayN.c (moved) (moved from examples/ArrayN.c )
tests/zombies/Initialization.c (moved) (moved from examples/Initialization.c )
tests/zombies/Initialization2.c (moved) (moved from examples/Initialization2.c )
tests/zombies/Makefile.example (moved) (moved from examples/Makefile.example )
tests/zombies/Members.c (moved) (moved from examples/Members.c )
tests/zombies/Misc.c (moved) (moved from examples/Misc.c )
tests/zombies/MiscError.c (moved) (moved from examples/MiscError.c )
tests/zombies/Rank2.c (moved) (moved from examples/Rank2.c ) (2 diffs)
tests/zombies/Tuple.c (moved) (moved from examples/Tuple.c ) (3 diffs)
tests/zombies/abstype.c (moved) (moved from examples/abstype.c ) (2 diffs)
tests/zombies/constructors.c (added)
tests/zombies/forward.c (moved) (moved from examples/forward.c )
tests/zombies/gc_no_raii/.gitignore (moved) (moved from examples/gc_no_raii/.gitignore )
tests/zombies/gc_no_raii/bug-repro/assert.c (moved) (moved from examples/gc_no_raii/bug-repro/assert.c )
tests/zombies/gc_no_raii/bug-repro/blockers.tar.gz (moved) (moved from examples/gc_no_raii/bug-repro/blockers.tar.gz )
tests/zombies/gc_no_raii/bug-repro/blockers/explicit_cast.c (moved) (moved from examples/gc_no_raii/bug-repro/blockers/explicit_cast.c )
tests/zombies/gc_no_raii/bug-repro/blockers/file_scope.c (moved) (moved from examples/gc_no_raii/bug-repro/blockers/file_scope.c )
tests/zombies/gc_no_raii/bug-repro/blockers/recursive_realloc.c (moved) (moved from examples/gc_no_raii/bug-repro/blockers/recursive_realloc.c )
tests/zombies/gc_no_raii/bug-repro/crash.c (moved) (moved from examples/gc_no_raii/bug-repro/crash.c )
tests/zombies/gc_no_raii/bug-repro/deref.c (moved) (moved from examples/gc_no_raii/bug-repro/deref.c )
tests/zombies/gc_no_raii/bug-repro/field.c (moved) (moved from examples/gc_no_raii/bug-repro/field.c )
tests/zombies/gc_no_raii/bug-repro/find.c (moved) (moved from examples/gc_no_raii/bug-repro/find.c )
tests/zombies/gc_no_raii/bug-repro/inline.c (moved) (moved from examples/gc_no_raii/bug-repro/inline.c )
tests/zombies/gc_no_raii/bug-repro/malloc.c (moved) (moved from examples/gc_no_raii/bug-repro/malloc.c )
tests/zombies/gc_no_raii/bug-repro/not_equal.c (moved) (moved from examples/gc_no_raii/bug-repro/not_equal.c )
tests/zombies/gc_no_raii/bug-repro/oddtype.c (moved) (moved from examples/gc_no_raii/bug-repro/oddtype.c )
tests/zombies/gc_no_raii/bug-repro/push_back.c (moved) (moved from examples/gc_no_raii/bug-repro/push_back.c )
tests/zombies/gc_no_raii/bug-repro/push_back.h (moved) (moved from examples/gc_no_raii/bug-repro/push_back.h )
tests/zombies/gc_no_raii/bug-repro/realloc.c (moved) (moved from examples/gc_no_raii/bug-repro/realloc.c )
tests/zombies/gc_no_raii/bug-repro/return.c (moved) (moved from examples/gc_no_raii/bug-repro/return.c )
tests/zombies/gc_no_raii/bug-repro/return_template.c (moved) (moved from examples/gc_no_raii/bug-repro/return_template.c )
tests/zombies/gc_no_raii/bug-repro/slow_malloc.c (moved) (moved from examples/gc_no_raii/bug-repro/slow_malloc.c )
tests/zombies/gc_no_raii/bug-repro/static_const_local.c (moved) (moved from examples/gc_no_raii/bug-repro/static_const_local.c )
tests/zombies/gc_no_raii/bug-repro/test-assert.cpp (moved) (moved from examples/gc_no_raii/bug-repro/test-assert.cpp )
tests/zombies/gc_no_raii/bug-repro/void_pointer.c (moved) (moved from examples/gc_no_raii/bug-repro/void_pointer.c )
tests/zombies/gc_no_raii/bug-repro/while.c (moved) (moved from examples/gc_no_raii/bug-repro/while.c )
tests/zombies/gc_no_raii/bug-repro/zero.c (moved) (moved from examples/gc_no_raii/bug-repro/zero.c )
tests/zombies/gc_no_raii/pool-alloc/allocate-malign.c (moved) (moved from examples/gc_no_raii/pool-alloc/allocate-malign.c )
tests/zombies/gc_no_raii/pool-alloc/allocate-malloc.c (moved) (moved from examples/gc_no_raii/pool-alloc/allocate-malloc.c )
tests/zombies/gc_no_raii/pool-alloc/allocate-mmap.c (moved) (moved from examples/gc_no_raii/pool-alloc/allocate-mmap.c )
tests/zombies/gc_no_raii/pool-alloc/allocate-win-valloc.c (moved) (moved from examples/gc_no_raii/pool-alloc/allocate-win-valloc.c )
tests/zombies/gc_no_raii/premake4.lua (moved) (moved from examples/gc_no_raii/premake4.lua )
tests/zombies/gc_no_raii/src/allocate-pool.c (moved) (moved from examples/gc_no_raii/src/allocate-pool.c )
tests/zombies/gc_no_raii/src/allocate-pool.h (moved) (moved from examples/gc_no_raii/src/allocate-pool.h )
tests/zombies/gc_no_raii/src/gc.h (moved) (moved from examples/gc_no_raii/src/gc.h )
tests/zombies/gc_no_raii/src/gcpointers.c (moved) (moved from examples/gc_no_raii/src/gcpointers.c )
tests/zombies/gc_no_raii/src/gcpointers.h (moved) (moved from examples/gc_no_raii/src/gcpointers.h )
tests/zombies/gc_no_raii/src/internal/card_table.h (moved) (moved from examples/gc_no_raii/src/internal/card_table.h )
tests/zombies/gc_no_raii/src/internal/collector.c (moved) (moved from examples/gc_no_raii/src/internal/collector.c )
tests/zombies/gc_no_raii/src/internal/collector.h (moved) (moved from examples/gc_no_raii/src/internal/collector.h )
tests/zombies/gc_no_raii/src/internal/gc_tools.h (moved) (moved from examples/gc_no_raii/src/internal/gc_tools.h )
tests/zombies/gc_no_raii/src/internal/globals.h (moved) (moved from examples/gc_no_raii/src/internal/globals.h )
tests/zombies/gc_no_raii/src/internal/memory_pool.c (moved) (moved from examples/gc_no_raii/src/internal/memory_pool.c )
tests/zombies/gc_no_raii/src/internal/memory_pool.h (moved) (moved from examples/gc_no_raii/src/internal/memory_pool.h )
tests/zombies/gc_no_raii/src/internal/object_header.c (moved) (moved from examples/gc_no_raii/src/internal/object_header.c )
tests/zombies/gc_no_raii/src/internal/object_header.h (moved) (moved from examples/gc_no_raii/src/internal/object_header.h )
tests/zombies/gc_no_raii/src/internal/state.c (moved) (moved from examples/gc_no_raii/src/internal/state.c )
tests/zombies/gc_no_raii/src/internal/state.h (moved) (moved from examples/gc_no_raii/src/internal/state.h )
tests/zombies/gc_no_raii/src/test_include.c (moved) (moved from examples/gc_no_raii/src/test_include.c )
tests/zombies/gc_no_raii/src/tools.h (moved) (moved from examples/gc_no_raii/src/tools.h )
tests/zombies/gc_no_raii/src/tools/checks.h (moved) (moved from examples/gc_no_raii/src/tools/checks.h )
tests/zombies/gc_no_raii/src/tools/print.c (moved) (moved from examples/gc_no_raii/src/tools/print.c )
tests/zombies/gc_no_raii/src/tools/print.h (moved) (moved from examples/gc_no_raii/src/tools/print.h )
tests/zombies/gc_no_raii/src/tools/worklist.h (moved) (moved from examples/gc_no_raii/src/tools/worklist.h )
tests/zombies/gc_no_raii/test/badlll.c (moved) (moved from examples/gc_no_raii/test/badlll.c )
tests/zombies/gc_no_raii/test/gctest.c (moved) (moved from examples/gc_no_raii/test/gctest.c )
tests/zombies/gc_no_raii/test/operators.c (moved) (moved from examples/gc_no_raii/test/operators.c )
tests/zombies/hashtable.cfa (moved) (moved from examples/hashtable.cfa )
tests/zombies/hashtable2.cfa (moved) (moved from examples/hashtable2.cfa )
tests/zombies/huge.c (moved) (moved from examples/huge.c )
tests/zombies/includes.c (moved) (moved from examples/includes.c ) (4 diffs)
tests/zombies/index.h (moved) (moved from examples/index.h )
tests/zombies/io/cat.c (moved) (moved from examples/io/cat.c )
tests/zombies/io/filereader.c (moved) (moved from examples/io/filereader.c )
tests/zombies/io/simple/client.c (moved) (moved from examples/io/simple/client.c )
tests/zombies/io/simple/server.c (moved) (moved from examples/io/simple/server.c )
tests/zombies/io/simple/server.cfa (moved) (moved from examples/io/simple/server.cfa )
tests/zombies/io/simple/server_epoll.c (moved) (moved from examples/io/simple/server_epoll.c )
tests/zombies/io_uring.txt (moved) (moved from examples/io_uring.txt )
tests/zombies/it_out.c (moved) (moved from examples/it_out.c )
tests/zombies/multicore.c (moved) (moved from examples/multicore.c )
tests/zombies/new.c (moved) (moved from examples/new.c )
tests/zombies/poly-bench.c (moved) (moved from examples/poly-bench.c )
tests/zombies/prolog.c (moved) (moved from examples/prolog.c )
tests/zombies/quad.c (moved) (moved from examples/quad.c )
tests/zombies/s.c (moved) (moved from examples/s.c )
tests/zombies/simplePoly.c (moved) (moved from examples/simplePoly.c )
tests/zombies/simpler.c (moved) (moved from examples/simpler.c )
tests/zombies/specialize.c (moved) (moved from examples/specialize.c )
tests/zombies/square.c (moved) (moved from examples/square.c )
tests/zombies/structMember.cfa (modified) (2 diffs)
tests/zombies/twice.c (moved) (moved from examples/twice.c )
tests/zombies/wrapper/.gitignore (moved) (moved from examples/wrapper/.gitignore )
tests/zombies/wrapper/premake4.lua (moved) (moved from examples/wrapper/premake4.lua )
tests/zombies/wrapper/src/main.c (moved) (moved from examples/wrapper/src/main.c )
tests/zombies/wrapper/src/pointer.h (moved) (moved from examples/wrapper/src/pointer.h )
tests/zombies/zero_one.c (moved) (moved from examples/zero_one.c )
tools/gdb/utils-gdb.py (modified) (7 diffs)
tools/langserver/cfa-ls (deleted)
tools/vscode/uwaterloo.cforall-0.1.0/.gitignore (added)

.gitignore

rae2c27a	rc76bd34
79	79	doc/user/pointer2.tex
80	80	doc/user/EHMHierarchy.tex
	81
	82	# generated by npm
	83	package-lock.json

Jenkinsfile

-              rae2c27a
+              rc76bd34
                 echo GitLogMessage()
-                // This is a complete hack but it solves problems with automake thinking it needs to regenerate makefiles
-                // We fudged automake/missing to handle that but automake stills bakes prints inside the makefiles
-                // and these cause more problems.
-                sh 'find . -name Makefile.in -exec touch {} +'
+        }
+}
 …
                                         description: 'Which compiler to use',                                   \
                                         name: 'Compiler',                                                                       \
-                                        choices: 'gcc-9\ngcc-8\ngcc-7\ngcc-6\ngcc-5\ngcc-4.9\nclang',                                   \
+                                        choices: 'gcc-9\ngcc-8\ngcc-7\ngcc-6\ngcc-5\ngcc-4.9\nclang',   \
                                         defaultValue: 'gcc-8',                                                          \
                                 ],                                                                                              \

benchmark/Makefile.am

-              rae2c27a
+              rc76bd34
 creation_cfa_generator_DURATION = 1000000000
 creation_upp_coroutine_DURATION = ${creation_cfa_coroutine_eager_DURATION}
-creation_cfa_thread_DURATION = 10000000
-creation_upp_thread_DURATION = ${creation_cfa_thread_DURATION}
 creation_DURATION = 10000000
 …
 cleancsv:
-        rm -f compile.csv basic.csv ctxswitch.csv mutex.csv scheduling.csv
+        rm -f compile.csv basic.csv ctxswitch.csv mutex.csv schedint.csv
 jenkins$(EXEEXT): cleancsv
 …
         +make mutex.csv
         -+make mutex.diff.csv
-        +make scheduling.csv
-        -+make scheduling.diff.csv
+        +make schedint.csv
+        -+make schedint.diff.csv
 @DOifskipcompile@
         cat compile.csv
 …
         cat mutex.csv
         -cat mutex.diff.csv
-        cat scheduling.csv
-        -cat scheduling.diff.csv
+        cat schedint.csv
+        -cat schedint.diff.csv
 compile.csv:
 …
         $(srcdir)/fixcsv.sh $@
-scheduling.csv:
+schedint.csv:
         echo "building $@"
         echo "schedint-1,schedint-2,schedext-1,schedext-2" > $@
 …
 ctxswitch-python_coroutine$(EXEEXT):
         $(BENCH_V_PY)echo "#!/bin/sh" > a.out
-        echo "python3.7 $(srcdir)/ctxswitch/python_cor.py" >> a.out
+        echo "python3 $(srcdir)/ctxswitch/python_cor.py \"$$""@\"" >> a.out
         chmod a+x a.out
 ctxswitch-nodejs_coroutine$(EXEEXT):
         $(BENCH_V_NODEJS)echo "#!/bin/sh" > a.out
-        echo "nodejs $(srcdir)/ctxswitch/node_cor.js" >> a.out
+        echo "nodejs $(srcdir)/ctxswitch/node_cor.js \"$$""@\"" >> a.out
         chmod a+x a.out
 ctxswitch-nodejs_await$(EXEEXT):
         $(BENCH_V_NODEJS)echo "#!/bin/sh" > a.out
-        echo "nodejs $(srcdir)/ctxswitch/node_await.js" >> a.out
+        echo "nodejs $(srcdir)/ctxswitch/node_await.js \"$$""@\"" >> a.out
         chmod a+x a.out
 …
         $(BENCH_V_JAVAC)javac -d $(builddir) $(srcdir)/ctxswitch/JavaThread.java
         echo "#!/bin/sh" > a.out
-        echo "java JavaThread" >> a.out
+        echo "java JavaThread \"$$""@\"" >> a.out
         chmod a+x a.out
 …
         $(BENCH_V_JAVAC)javac -d $(builddir) $(srcdir)/mutex/JavaThread.java
         echo "#!/bin/sh" > a.out
-        echo "java JavaThread" >> a.out
+        echo "java JavaThread \"$$""@\"" >> a.out
         chmod a+x a.out
 …
         $(BENCH_V_JAVAC)javac -d $(builddir) $(srcdir)/schedint/JavaThread.java
         echo "#!/bin/sh" > a.out
-        echo "java JavaThread" >> a.out
+        echo "java JavaThread \"$$""@\"" >> a.out
         chmod a+x a.out
 …
 creation-python_coroutine$(EXEEXT):
         $(BENCH_V_PY)echo "#!/bin/sh" > a.out
-        echo "python3.7 $(srcdir)/creation/python_cor.py" >> a.out
+        echo "python3 $(srcdir)/creation/python_cor.py \"$$""@\"" >> a.out
         chmod a+x a.out
 creation-nodejs_coroutine$(EXEEXT):
         $(BENCH_V_NODEJS)echo "#!/bin/sh" > a.out
-        echo "nodejs $(srcdir)/creation/node_cor.js" >> a.out
+        echo "nodejs $(srcdir)/creation/node_cor.js \"$$""@\"" >> a.out
         chmod a+x a.out
 …
         $(BENCH_V_JAVAC)javac -d $(builddir) $(srcdir)/creation/JavaThread.java
         echo "#!/bin/sh" > a.out
-        echo "java JavaThread" >> a.out
+        echo "java JavaThread \"$$""@\"" >> a.out
         chmod a+x a.out
 …
 compile-array$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/array.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/array.cfa
 compile-attributes$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/attributes.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/attributes.cfa
 compile-empty$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(srcdir)/compile/empty.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(srcdir)/compile/empty.cfa
 compile-expression$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/expression.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/expression.cfa
 compile-io$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/io1.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/io1.cfa
 compile-monitor$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/concurrent/monitor.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/concurrent/monitor.cfa
 compile-operators$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/operators.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/operators.cfa
 compile-thread$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/concurrent/thread.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/concurrent/thread.cfa
 compile-typeof$(EXEEXT):
-        $(CFACOMPILE) -fsyntax-only -w $(testdir)/typeof.cfa
+        $(CFACOMPILE) -DNO_COMPILED_PRAGMA -fsyntax-only -w $(testdir)/typeof.cfa
 ## =========================================================================================================
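
Note on the wrapper recipes above: each changed recipe appends `\"$$""@\"` to the command it writes into a.out. make reduces `$$` to `$`, and splitting the text into two adjacent quoted strings keeps the recipe's own shell from expanding `$@` while it runs echo, so the generated script ends in a literal `"$@"` and forwards its arguments to the benchmark. The following is a minimal sketch of that quoting under stated assumptions: the `printf` command is a hypothetical stand-in for `java JavaThread` and the other real commands, the temporary directory is illustrative, and `$$` is written as `$` because the sketch runs outside make.

```shell
#!/bin/sh
# Sketch only: reproduces the quoting used by the recipes, not the recipes themselves.
workdir=$(mktemp -d)
wrapper="$workdir/a.out"

echo "#!/bin/sh" > "$wrapper"
# First quoted string ends with a literal "$ ; the second supplies @" .
# printf is a hypothetical stand-in for the benchmark command.
echo "printf '<%s>' \"$""@\"" >> "$wrapper"
chmod a+x "$wrapper"

# Line 2 of the generated script, then a run with two arguments.
# Invoked through sh so the sketch does not rely on the temp directory permitting execution.
generated=$(sed -n 2p "$wrapper")
output=$(sh "$wrapper" 1000 "two words")
echo "$generated"
echo "$output"
rm -rf "$workdir"
```

With the quoted `"$@"` in place, an argument containing spaces reaches the wrapped command as a single argument, which is what lets a run length be passed through a.out to the benchmark program.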

benchmark/creation/JavaThread.java

-              rae2c27a
+              rc76bd34
 public class JavaThread {
         // Simplistic low-quality Marsaglia Shift-XOR pseudo-random number generator.
-        // Bijective
+        // Bijective
         // Cycle length for non-zero values is 4G-1.
         // 0 is absorbing and should be avoided -- fixed point.
         // The returned value is typically masked to produce a positive value.
-        static volatile int Ticket = 0 ;
+        static volatile int Ticket = 0 ;
         private static int nextRandom (int x) {
-                if (x == 0) {
+                if (x == 0) {
                         // reseed the PRNG
-                        // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
-                        // Note that we use a non-atomic racy increment -- the race is rare and benign.
-                        // If the race is a concern switch to an AtomicInteger.
-                        // In addition accesses to the RW volatile global "Ticket"  variable are not
-                        // (readily) predictable at compile-time so the JIT will not be able to elide
-                        // nextRandom() invocations.
-                        x = ++Ticket ;
-                        if (x == 0) x = 1 ;
+                        // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
+                        // Note that we use a non-atomic racy increment -- the race is rare and benign.
+                        // If the race is a concern switch to an AtomicInteger.
+                        // In addition accesses to the RW volatile global "Ticket"  variable are not
+                        // (readily) predictable at compile-time so the JIT will not be able to elide
+                        // nextRandom() invocations.
+                        x = ++Ticket ;
+                        if (x == 0) x = 1 ;
+                }
                 x ^= x << 6;
                 x ^= x >>> 21;
                 x ^= x << 7;
                 return x ;
+                return x ;
+        }
         static int x = 2;
         static private int times = Integer.parseInt("10000") ;
+        static private long times = Long.parseLong("10000") ;
         public static class MyThread extends Thread {
 …
+        }
         public static void helper() throws InterruptedException {
                 for(int i = 1; i <= times; i += 1) {
+                for(long i = 1; i <= times; i += 1) {
                         MyThread m = new MyThread();
                         x = nextRandom( x );
 …
+        }
         public static void main(String[] args) throws InterruptedException {
                 if ( args.length > 2 ) System.exit( 1 );
                 if ( args.length == 2 ) { times = Integer.parseInt(args[1]); }
+                if ( args.length > 1 ) System.exit( 1 );
+                if ( args.length == 1 ) { times = Long.parseLong(args[0]); }
                 for (int i = Integer.parseInt("5"); --i >= 0 ; ) {
+                for (int i = Integer.parseInt("5"); --i >= 0 ; ) {
                         InnerMain();
                         Thread.sleep(2000);             // 2 seconds
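
Note on the generator above: the comments describe a Marsaglia shift-XOR step for which 0 is a fixed point, with a reseed from the shared `Ticket` counter whenever the state reaches 0. The following is a minimal standalone sketch of that behaviour, not the benchmark's code. The class name `XorShiftSketch`, the split into `step` and `nextRandom`, and the use of `AtomicInteger` (the alternative the comments mention if the racy increment is a concern) are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical standalone sketch; the benchmark keeps this logic inside JavaThread.
final class XorShiftSketch {
    // Assumption: an AtomicInteger replaces the benchmark's racy "volatile int Ticket".
    private static final AtomicInteger ticket = new AtomicInteger();

    // One shift-XOR step using the shift amounts shown in the diff (6, 21, 7).
    static int step(int x) {
        x ^= x << 6;
        x ^= x >>> 21;
        x ^= x << 7;
        return x;
    }

    // Reseed from the counter when the state is 0, then advance one step.
    static int nextRandom(int x) {
        if (x == 0) {
            x = ticket.incrementAndGet();
            if (x == 0) x = 1;   // the counter itself may wrap around to 0
        }
        return step(x);
    }

    public static void main(String[] args) {
        System.out.println(step(0));            // 0 is absorbing: prints 0
        System.out.println(step(1));            // prints 8385
        System.out.println(nextRandom(0) != 0); // reseeding leaves the fixed point: prints true
    }
}
```

Because each shift-XOR step is invertible, a non-zero state never maps to 0, which is why the reseed branch only matters when the caller passes in 0.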

benchmark/ctxswitch/JavaThread.java

-              rae2c27a
+              rc76bd34
 public class JavaThread {
         // Simplistic low-quality Marsaglia Shift-XOR pseudo-random number generator.
         // Bijective
+        // Bijective
         // Cycle length for non-zero values is 4G-1.
         // 0 is absorbing and should be avoided -- fixed point.
         // The returned value is typically masked to produce a positive value.
         static volatile int Ticket = 0 ;
+        static volatile int Ticket = 0 ;
         private static int nextRandom (int x) {
                 if (x == 0) {
+                if (x == 0) {
                         // reseed the PRNG
                         // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
                         // Note that we use a non-atomic racy increment -- the race is rare and benign.
                         // If the race is a concern switch to an AtomicInteger.
                         // In addition accesses to the RW volatile global "Ticket"  variable are not
                         // (readily) predictable at compile-time so the JIT will not be able to elide
                         // nextRandom() invocations.
                         x = ++Ticket ;
                         if (x == 0) x = 1 ;
+                        // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
+                        // Note that we use a non-atomic racy increment -- the race is rare and benign.
+                        // If the race is a concern switch to an AtomicInteger.
+                        // In addition accesses to the RW volatile global "Ticket"  variable are not
+                        // (readily) predictable at compile-time so the JIT will not be able to elide
+                        // nextRandom() invocations.
+                        x = ++Ticket ;
+                        if (x == 0) x = 1 ;
+                }
                 x ^= x << 6;
                 x ^= x >>> 21;
                 x ^= x << 7;
                 return x ;
+                return x ;
+        }
         static int x = 2;
         static private int times = Integer.parseInt("100000");
+        static private long times = Long.parseLong("100000");
         public static void helper() {
                 for(int i = 1; i <= times; i += 1) {
+                for(long i = 1; i <= times; i += 1) {
                         Thread.yield();
+                }
 …
+        }
         public static void main(String[] args) throws InterruptedException {
                 if ( args.length > 2 ) System.exit( 1 );
                 if ( args.length == 2 ) { times = Integer.parseInt(args[1]); }
+                if ( args.length > 1 ) System.exit( 1 );
+                if ( args.length == 1 ) { times = Long.parseLong(args[0]); }
                 for (int i = Integer.parseInt("5"); --i >= 0 ; ) {
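
Note on the argument handling above: the added lines read the iteration count from `args[0]` with `Long.parseLong` and reject more than one argument, and `times` becomes a `long`. A minimal sketch of that parsing follows. The class name `IterationArgsSketch`, the helper `parseTimes`, and the `DEFAULT_TIMES` constant are hypothetical, and the sketch throws an exception where the benchmark calls `System.exit( 1 )` so that it can be exercised without terminating the JVM.

```java
// Hypothetical sketch of the new command-line handling; not the benchmark's code.
final class IterationArgsSketch {
    // Assumption: stands in for the per-benchmark default such as Long.parseLong("100000").
    static final long DEFAULT_TIMES = 100000L;

    // Returns the iteration count: args[0] if present, otherwise the default.
    // Throws instead of calling System.exit(1) when too many arguments are given.
    static long parseTimes(String[] args) {
        if (args.length > 1) {
            throw new IllegalArgumentException("usage: JavaThread [times]");
        }
        return args.length == 1 ? Long.parseLong(args[0]) : DEFAULT_TIMES;
    }

    public static void main(String[] args) {
        long times = parseTimes(args);
        long count = 0;
        // The loop index must also be long, as in the diff, or it could overflow before reaching times.
        for (long i = 1; i <= Math.min(times, 1000L); i += 1) {
            count += 1;
        }
        System.out.println(times + " requested, " + count + " demonstrated");
    }
}
```

With a `long` counter, a value such as `5000000000`, which does not fit in an `int`, is accepted as the iteration count.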

benchmark/io/http/filecache.cfa

-              rae2c27a
+              rc76bd34
         cache_line * entries;
         size_t size;
+        int * rawfds;
+        int nfds;
 } file_cache;
 …
+}
 int put_file( cache_line & entry ) {
+int put_file( cache_line & entry, int fd ) {
         uint32_t idx = murmur3_32( (const uint8_t *)entry.file, strlen(entry.file), options.file_cache.hash_seed ) % file_cache.size;
 …
         file_cache.entries[idx] = entry;
+        file_cache.entries[idx].fd = fd;
         return i > 0 ? 1 : 0;
+}
 …
         size_t fcount = 0;
         size_t fsize = 16;
+        cache_line * raw = 0p;
+        raw = alloc(raw, fsize, true);
+        cache_line * raw = alloc(fsize);
         // Step 1 get a dense array of all files
         int walk(const char *fpath, const struct stat *sb, int typeflag) {
 …
                 if(fcount > fsize) {
                         fsize *= 2;
                         raw = alloc(raw, fsize, true);
+                        raw = alloc(fsize, raw`realloc);
+                }
 …
         file_cache.entries = anew(file_cache.size);
+        if(options.file_cache.fixed_fds) {
+                file_cache.nfds   = fcount;
+                file_cache.rawfds = alloc(fcount);
+        }
         // Step 3 fill the cache
         int conflicts = 0;
         for(i; fcount) {
+                conflicts += put_file( raw[i] );
+                int fd;
+                if(options.file_cache.fixed_fds) {
+                        file_cache.rawfds[i] = raw[i].fd;
+                        fd = i;
+                }
+                else {
+                        fd = raw[i].fd;
+                }
+                conflicts += put_file( raw[i], fd );
+        }
         printf("Filled cache from path \"%s\" with %zu files\n", path, fcount);
 …
+        }
+        return [aalloc(extra), 0];
+        size_t s = file_cache.nfds + extra;
+        int * data = alloc(s, file_cache.rawfds`realloc);
+        return [data, file_cache.nfds];
+}
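
Note on the `fixed_fds` branch above: when the option is set, the fill loop stores each file's real descriptor in `rawfds[i]` and records the index `i` in the cache entry. Otherwise the entry keeps the real descriptor. The sketch below restates that mapping in Java for illustration only. The class `FixedFdSketch`, the method `assign`, and its array parameters are hypothetical names, and the reading that the index is later used as a registered-file slot is an assumption based on the `register_fixed_files` call elsewhere in this changeset.

```java
// Hypothetical sketch of the descriptor-to-index mapping; the original is Cforall code.
final class FixedFdSketch {
    // openedFds: real descriptors found while walking the directory.
    // rawFds:    output table filled only in fixed mode (assumed to be registered later).
    // Returns the value each cache entry would hold in its fd field.
    static int[] assign(int[] openedFds, boolean fixedFds, int[] rawFds) {
        int[] cached = new int[openedFds.length];
        for (int i = 0; i < openedFds.length; i++) {
            if (fixedFds) {
                rawFds[i] = openedFds[i]; // keep the real descriptor in the dense table
                cached[i] = i;            // the entry refers to the table slot instead
            } else {
                cached[i] = openedFds[i]; // the entry holds the real descriptor directly
            }
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] opened = {7, 9, 12};
        int[] raw = new int[opened.length];
        System.out.println(java.util.Arrays.toString(assign(opened, true, raw)));  // [0, 1, 2]
        System.out.println(java.util.Arrays.toString(raw));                        // [7, 9, 12]
        System.out.println(java.util.Arrays.toString(assign(opened, false, new int[3]))); // [7, 9, 12]
    }
}
```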

benchmark/io/http/main.cfa

-              rae2c27a
+              rc76bd34
 #include <kernel.hfa>
 #include <stats.hfa>
+#include <time.hfa>
 #include <thread.hfa>
-#include "channel.hfa"
 #include "filecache.hfa"
 #include "options.hfa"
 #include "worker.hfa"
+extern void register_fixed_files( cluster &, int *, unsigned count );
+Duration default_preemption() {
+        return 0;
+}
 //=============================================================================================
 // Globals
 //=============================================================================================
-channel & wait_connect;
 struct ServerProc {
         processor self;
 …
         // Run Server Cluster
+        {
                 cluster cl = { "Server Cluster", options.clopts.flags };
+                cluster cl = { "Server Cluster", options.clopts.params };
                 #if !defined(__CFA_NO_STATISTICS__)
                         print_stats_at_exit( cl, CFA_STATS_READY_Q | CFA_STATS_IO );
                 #endif
                 options.clopts.instance = &cl;
-                channel chan = { options.clopts.chan_size };
-                &wait_connect = &chan;
                 int pipe_cnt = options.clopts.nworkers * 2;
 …
+                }
+                if(options.file_cache.fixed_fds) {
+                        register_fixed_files(cl, fds, pipe_off);
+                }
+                {
                         ServerProc procs[options.clopts.nprocs];
 …
                                 Worker workers[options.clopts.nworkers];
                                 for(i; options.clopts.nworkers) {
+                                        if( options.file_cache.fixed_fds ) {
+                                                workers[i].pipe[0] = pipe_off + (i * 2) + 0;
+                                                workers[i].pipe[1] = pipe_off + (i * 2) + 1;
+                                        }
+                                        else {
+                                        // if( options.file_cache.fixed_fds ) {
+                                        //      workers[i].pipe[0] = pipe_off + (i * 2) + 0;
+                                        //      workers[i].pipe[1] = pipe_off + (i * 2) + 1;
+                                        // }
+                                        // else
+                                        {
                                                 workers[i].pipe[0] = fds[pipe_off + (i * 2) + 0];
                                                 workers[i].pipe[1] = fds[pipe_off + (i * 2) + 1];
+                                                workers[i].sockfd  = server_fd;
+                                                workers[i].addr    = (struct sockaddr *)&address;
+                                                workers[i].addrlen = (socklen_t*)&addrlen;
+                                                workers[i].flags   = 0;
+                                        }
                                         unpark( workers[i] __cfaabi_dbg_ctx2 );
+                                        unpark( workers[i] );
+                                }
                                 printf("%d workers started on %d processors\n", options.clopts.nworkers, options.clopts.nprocs);
+                                {
-                                        Acceptor acceptor = { server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen, 0 };
                                         char buffer[128];
                                         while(!feof(stdin)) {
 …
                                         printf("Shutting Down\n");
+                                }
-                                printf("Acceptor Closed\n");
-                                // Clean-up the workers
-                                for(options.clopts.nworkers) {
-                                        put( wait_connect, -1 );
+                                }
+                        }

benchmark/io/http/options.cfa

-              rae2c27a
+              rc76bd34
 ,     // nworkers;
 ,     // flags;
-,    // chan_size;
                 false, // procstats
                 false, // viewhalts
 …
 const char * parse_options( int argc, char * argv[] ) {
-        bool uthrdpo = false;
         bool subthrd = false;
         bool eagrsub = false;
 …
                 {'t', "threads",        "Number of worker threads to use", options.clopts.nworkers},
                 {'b', "accept-backlog", "Maximum number of pending accepts", options.socket.backlog},
-                {'B', "channel-size",   "Maximum number of accepted connection pending", options.clopts.chan_size},
                 {'r', "request_len",    "Maximum number of bytes in the http request, requests with more data will be answered with Http Code 414", options.socket.buflen},
                 {'S', "seed",           "seed to use for hashing", options.file_cache.hash_seed },
                 {'C', "cache-size",     "Size of the cache to use, if set too small, uses the closest power of 2", options.file_cache.size },
                 {'l', "list-files",     "List the files in the specified path and exit", options.file_cache.list, parse_settrue },
-                {'u', "userthread",     "If set, cluster uses user-thread to poll I/O", uthrdpo, parse_settrue },
                 {'s', "submitthread",   "If set, cluster uses polling thread to submit I/O", subthrd, parse_settrue },
                 {'e', "eagersubmit",    "If set, cluster submits I/O eagerly but still aggregates submits", eagrsub, parse_settrue},
 …
         parse_args( argc, argv, opt, opt_cnt, "[OPTIONS]... [PATH]\ncforall http server", left );
+        if( uthrdpo ) {
+                options.clopts.flags |= CFA_CLUSTER_IO_POLLER_USER_THREAD;
+        }
+        if( subthrd ) {
+                options.clopts.flags |= CFA_CLUSTER_IO_POLLER_THREAD_SUBMITS;
+        }
+        if( eagrsub ) {
+                options.clopts.flags |= CFA_CLUSTER_IO_EAGER_SUBMITS;
+        }
+        options.clopts.params.poller_submits = subthrd;
+        options.clopts.params.eager_submits  = eagrsub;
         if( fixedfd ) {
 …
         if( sqkpoll ) {
                 options.clopts.flags |= CFA_CLUSTER_IO_KERNEL_POLL_SUBMITS;
+                options.clopts.params.poll_submit = true;
                 options.file_cache.fixed_fds = true;
+        }
         if( iokpoll ) {
                 options.clopts.flags |= CFA_CLUSTER_IO_KERNEL_POLL_COMPLETES;
+                options.clopts.params.poll_complete = true;
                 options.file_cache.open_flags |= O_DIRECT;
+        }
         options.clopts.flags |= (sublen << CFA_CLUSTER_IO_BUFFLEN_OFFSET);
+        options.clopts.params.num_ready = sublen;
         if( left[0] == 0p ) { return "."; }

benchmark/io/http/options.hfa

-              rae2c27a
+              rc76bd34
 #include <stdint.h>
+#include <kernel.hfa>
 struct cluster;
 …
                 int nprocs;
                 int nworkers;
+                int flags;
+                int chan_size;
+                io_context_params params;
                 bool procstats;
                 bool viewhalts;

benchmark/io/http/protocol.cfa

-              rae2c27a
+              rc76bd34
 extern "C" {
       int snprintf ( char * s, size_t n, const char * format, ... );
+        #include <linux/io_uring.h>
+}
 #include <string.h>
 #include <errno.h>
+#include "options.hfa"
 const char * http_msgs[] = {
 …
         READ:
         for() {
                 int ret = cfa_read(fd, it, count);
                 if(ret == 0 ) return [OK200, true, 0p, 0];
+                int ret = cfa_read(fd, (void*)it, count, 0, -1`s, 0p, 0p);
+                if(ret == 0 ) return [OK200, true, 0, 0];
                 if(ret < 0 ) {
                         if( errno == EAGAIN || errno == EWOULDBLOCK) continue READ;
 …
                 count -= ret;
                 if( count < 1 ) return [E414, false, 0p, 0];
+                if( count < 1 ) return [E414, false, 0, 0];
+        }
 …
         it = buffer;
         int ret = memcmp(it, "GET /", 5);
         if( ret != 0 ) return [E400, false, 0p, 0];
+        if( ret != 0 ) return [E400, false, 0, 0];
         it += 5;
 …
         ssize_t ret;
         SPLICE1: while(count > 0) {
                 ret = cfa_splice(ans_fd, &offset, pipe[1], 0p, count, SPLICE_F_MOVE | SPLICE_F_MORE);
+                ret = cfa_splice(ans_fd, &offset, pipe[1], 0p, count, SPLICE_F_MOVE | SPLICE_F_MORE, 0, -1`s, 0p, 0p);
                 if( ret < 0 ) {
                         if( errno != EAGAIN && errno != EWOULDBLOCK) continue SPLICE1;
 …
                 size_t in_pipe = ret;
                 SPLICE2: while(in_pipe > 0) {
                         ret = cfa_splice(pipe[0], 0p, fd, 0p, in_pipe, SPLICE_F_MOVE | SPLICE_F_MORE);
+                        ret = cfa_splice(pipe[0], 0p, fd, 0p, in_pipe, SPLICE_F_MOVE | SPLICE_F_MORE, 0, -1`s, 0p, 0p);
                         if( ret < 0 ) {
                                 if( errno != EAGAIN && errno != EWOULDBLOCK) continue SPLICE2;

benchmark/io/http/worker.cfa

-              rae2c27a
+              rc76bd34
 void main( Worker & this ) {
         park( __cfaabi_dbg_ctx );
+        park();
         /* paranoid */ assert( this.pipe[0] != -1 );
         /* paranoid */ assert( this.pipe[1] != -1 );
 …
         CONNECTION:
         for() {
+                int fd = take(wait_connect);
+                if (fd < 0) break;
+                int fd = cfa_accept4( this.[sockfd, addr, addrlen, flags], 0, -1`s, 0p, 0p );
+                if(fd < 0) {
+                        if( errno == ECONNABORTED ) break;
+                        abort( "accept error: (%d) %s\n", (int)errno, strerror(errno) );
+                }
                 printf("New connection %d, waiting for requests\n", fd);
 …
+        }
+}
-//=============================================================================================
-// Acceptor Thread
-//=============================================================================================
-void ?{}( Acceptor & this, int sockfd, struct sockaddr * addr, socklen_t * addrlen, int flags ) {
-        ((thread&)this){ "Acceptor Thread", *options.clopts.instance };
-        this.sockfd  = sockfd;
-        this.addr    = addr;
-        this.addrlen = addrlen;
-        this.flags   = flags;
+}
-void main( Acceptor & this ) {
-        for() {
-                int ret = cfa_accept4( this.[sockfd, addr, addrlen, flags] );
-                if(ret < 0) {
-                        if( errno == ECONNABORTED ) break;
-                        abort( "accept error: (%d) %s\n", (int)errno, strerror(errno) );
+                }
-                printf("New connection accepted\n");
-                put( wait_connect, ret );
+        }
+}

benchmark/io/http/worker.hfa

-              rae2c27a
+              rc76bd34
+}
-#include "channel.hfa"
-extern channel & wait_connect;
 //=============================================================================================
 // Worker Thread
 …
 thread Worker {
         int pipe[2];
-};
-void ?{}( Worker & this );
-void main( Worker & );
-//=============================================================================================
-// Acceptor Thread
-//=============================================================================================
-thread Acceptor {
         int sockfd;
         struct sockaddr * addr;
 …
         int flags;
 };
+void ?{}( Acceptor & this, int sockfd, struct sockaddr * addr, socklen_t * addrlen, int flags );
+void main( Acceptor & this );
+void ?{}( Worker & this);
+void main( Worker & );

benchmark/io/readv.cfa

-              rae2c27a
+              rc76bd34
 void main( Reader & ) {
         park( __cfaabi_dbg_ctx );
+        park();
         /* paranoid */ assert( true == __atomic_load_n(&run, __ATOMIC_RELAXED) );
 …
                                 for(i; nthreads) {
                                         unpark( threads[i] __cfaabi_dbg_ctx2 );
+                                        unpark( threads[i] );
+                                }
                                 wait(duration, start, end, is_tty);

benchmark/mutex/JavaThread.java

-              rae2c27a
+              rc76bd34
 public class JavaThread {
         // Simplistic low-quality Marsaglia Shift-XOR pseudo-random number generator.
         // Bijective
+        // Bijective
         // Cycle length for non-zero values is 4G-1.
         // 0 is absorbing and should be avoided -- fixed point.
         // The returned value is typically masked to produce a positive value.
         static volatile int Ticket = 0 ;
+        static volatile int Ticket = 0 ;
         private static int nextRandom (int x) {
                 if (x == 0) {
+                if (x == 0) {
                         // reseed the PRNG
                         // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
                         // Note that we use a non-atomic racy increment -- the race is rare and benign.
                         // If the race is a concern switch to an AtomicInteger.
                         // In addition accesses to the RW volatile global "Ticket"  variable are not
                         // (readily) predictable at compile-time so the JIT will not be able to elide
                         // nextRandom() invocations.
                         x = ++Ticket ;
                         if (x == 0) x = 1 ;
+                        // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
+                        // Note that we use a non-atomic racy increment -- the race is rare and benign.
+                        // If the race is a concern switch to an AtomicInteger.
+                        // In addition accesses to the RW volatile global "Ticket"  variable are not
+                        // (readily) predictable at compile-time so the JIT will not be able to elide
+                        // nextRandom() invocations.
+                        x = ++Ticket ;
+                        if (x == 0) x = 1 ;
+                }
                 x ^= x << 6;
                 x ^= x >>> 21;
                 x ^= x << 7;
                 return x ;
+                return x ;
+        }
         static int x = 2;
         static private int times = Integer.parseInt("100000000");
+        static private long times = Long.parseLong("100000000");
         public synchronized void noop() {
 …
                 JavaThread j = new JavaThread();
                 // Inhibit biased locking ...
                 x = (j.hashCode() ^ System.identityHashCode(j)) | 1 ;
                 for(int i = 1; i <= times; i += 1) {
+                x = (j.hashCode() ^ System.identityHashCode(j)) | 1 ;
+                for(long i = 1; i <= times; i += 1) {
                         x = nextRandom(x);
                         j.noop();
 …
+        }
         public static void main(String[] args) throws InterruptedException {
                 if ( args.length > 2 ) System.exit( 1 );
                 if ( args.length == 2 ) { times = Integer.parseInt(args[1]); }
+                if ( args.length > 1 ) System.exit( 1 );
+                if ( args.length == 1 ) { times = Long.parseLong(args[0]); }
                 for (int n = Integer.parseInt("5"); --n >= 0 ; ) {
+                for (int n = Integer.parseInt("5"); --n >= 0 ; ) {
                         InnerMain();
                         Thread.sleep(2000);     // 2 seconds

benchmark/mutexC/JavaThread.java

-              rae2c27a
+              rc76bd34
 class Noop {
         // Simplistic low-quality Marsaglia Shift-XOR pseudo-random number generator.
         // Bijective
+        // Bijective
         // Cycle length for non-zero values is 4G-1.
         // 0 is absorbing and should be avoided -- fixed point.
         // The returned value is typically masked to produce a positive value.
         static volatile int Ticket = 0 ;
+        static volatile int Ticket = 0 ;
         public static int nextRandom( int x ) {
                 if (x == 0) {
+                if (x == 0) {
                         // reseed the PRNG
                         // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
                         // Note that we use a non-atomic racy increment -- the race is rare and benign.
                         // If the race is a concern switch to an AtomicInteger.
                         // In addition accesses to the RW volatile global "Ticket"  variable are not
                         // (readily) predictable at compile-time so the JIT will not be able to elide
                         // nextRandom() invocations.
                         x = ++Ticket ;
                         if (x == 0) x = 1 ;
+                        // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
+                        // Note that we use a non-atomic racy increment -- the race is rare and benign.
+                        // If the race is a concern switch to an AtomicInteger.
+                        // In addition accesses to the RW volatile global "Ticket"  variable are not
+                        // (readily) predictable at compile-time so the JIT will not be able to elide
+                        // nextRandom() invocations.
+                        x = ++Ticket ;
+                        if (x == 0) x = 1 ;
+                }
                 x ^= x << 6;
                 x ^= x >>> 21;
                 x ^= x << 7;
                 return x ;
+                return x ;
+        }
+}
 …
         static int x = 2;
         static private int times = Integer.parseInt("10000000");
+        static private long times = Long.parseLong("10000000");
         public static void call( Monitor m ) throws InterruptedException {
 …
                 m.go = true;
                 //while ( ! m.go2 );
                 for ( int i = 0; i < times; i += 1 ) {
+                for ( long i = 0; i < times; i += 1 ) {
                         m.call();
                         x = Noop.nextRandom( x );
 …
         public static void main( String[] args ) throws InterruptedException {
                 if ( args.length > 2 ) System.exit( 1 );
                 if ( args.length == 2 ) { times = Integer.parseInt(args[1]); }
+                if ( args.length == 2 ) { times = Long.parseLong(args[1]); }
+                if ( args.length > 2 ) System.exit( 1 );
+                if ( args.length == 2 ) { times = Integer.parseInt(args[1]); }
+                for ( int i = Integer.parseInt("5"); --i >= 0 ; ) {
+                for ( int i = Integer.parseInt("5"); --i >= 0 ; ) {
                         InnerMain();
                         // Thread.sleep(2000);  // 2 seconds

benchmark/readyQ/yield.cfa

-              rae2c27a
+              rc76bd34
 void main( Yielder & this ) {
         park( __cfaabi_dbg_ctx );
+        park();
         /* paranoid */ assert( true == __atomic_load_n(&run, __ATOMIC_RELAXED) );
 …
                                 for(i; nthreads) {
                                         unpark( threads[i] __cfaabi_dbg_ctx2 );
+                                        unpark( threads[i] );
+                                }
                                 wait(duration, start, end, is_tty);

benchmark/schedint/JavaThread.java

-              rae2c27a
+              rc76bd34
 public class JavaThread {
         // Simplistic low-quality Marsaglia Shift-XOR pseudo-random number generator.
         // Bijective
+        // Bijective
         // Cycle length for non-zero values is 4G-1.
         // 0 is absorbing and should be avoided -- fixed point.
         // The returned value is typically masked to produce a positive value.
         static volatile int Ticket = 0 ;
+        static volatile int Ticket = 0 ;
         private static int nextRandom (int x) {
                 if (x == 0) {
+                if (x == 0) {
                         // reseed the PRNG
                         // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
                         // Note that we use a non-atomic racy increment -- the race is rare and benign.
                         // If the race is a concern switch to an AtomicInteger.
                         // In addition accesses to the RW volatile global "Ticket"  variable are not
                         // (readily) predictable at compile-time so the JIT will not be able to elide
                         // nextRandom() invocations.
                         x = ++Ticket ;
                         if (x == 0) x = 1 ;
+                        // Ticket is accessed infrequently and does not constitute a coherence hot-spot.
+                        // Note that we use a non-atomic racy increment -- the race is rare and benign.
+                        // If the race is a concern switch to an AtomicInteger.
+                        // In addition accesses to the RW volatile global "Ticket"  variable are not
+                        // (readily) predictable at compile-time so the JIT will not be able to elide
+                        // nextRandom() invocations.
+                        x = ++Ticket ;
+                        if (x == 0) x = 1 ;
+                }
                 x ^= x << 6;
                 x ^= x >>> 21;
                 x ^= x << 7;
                 return x ;
+                return x ;
+        }
         static int x = 2;
         static private int times = Integer.parseInt("1000000");
+        static private long times = Long.parseLong("1000000");
         public static void helper( Monitor m ) throws InterruptedException {
                 for(int i = 1; i <= times; i += 1) {
+                for(long i = 1; i <= times; i += 1) {
                         m.wait();               // release monitor lock
                         m.next = true;
 …
+        }
         public static void main(String[] args) throws InterruptedException {
                 if ( args.length > 2 ) System.exit( 1 );
                 if ( args.length == 2 ) { times = Integer.parseInt(args[1]); }
+                if ( args.length > 1 ) System.exit( 1 );
+                if ( args.length == 1 ) { times = Long.parseLong(args[0]); }
                 for (int n = Integer.parseInt("5"); --n >= 0 ; ) {
+                for (int n = Integer.parseInt("5"); --n >= 0 ; ) {
                         InnerMain();
                         Thread.sleep(2000);     // 2 seconds

doc/LaTeXmacros/common.tex

-              rae2c27a
+              rc76bd34
 %% Created On       : Sat Apr  9 10:06:17 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Fri Sep  4 13:56:52 2020
 %% Update Count     : 383
+%% Last Modified On : Mon Oct  5 09:34:46 2020
+%% Update Count     : 464
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \newlength{\parindentlnth}
 \setlength{\parindentlnth}{\parindent}
-\newcommand{\LstBasicStyle}[1]{{\lst@basicstyle{#1}}}
-\newcommand{\LstKeywordStyle}[1]{{\lst@basicstyle{\lst@keywordstyle{#1}}}}
-\newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}}
-\newlength{\gcolumnposn}                                % temporary hack because lstlisting does not handle tabs correctly
-\newlength{\columnposn}
-\setlength{\gcolumnposn}{2.75in}
-\setlength{\columnposn}{\gcolumnposn}
-\newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\lst@basicstyle{\LstCommentStyle{#2}}}}
-\newcommand{\CRT}{\global\columnposn=\gcolumnposn}
-% allow escape sequence in lstinline
-%\usepackage{etoolbox}
-%\patchcmd{\lsthk@TextStyle}{\let\lst@DefEsc\@empty}{}{}{\errmessage{failed to patch}}
 \usepackage{pslatex}                                    % reduce size of sans serif font
 …
 \usepackage{listings}                                                                   % format program code
 \usepackage{lstlang}
+\newcommand{\CFADefaults}{%
+\makeatletter
+\newcommand{\LstBasicStyle}[1]{{\lst@basicstyle{#1}}}
+\newcommand{\LstKeywordStyle}[1]{{\lst@basicstyle{\lst@keywordstyle{#1}}}}
+\newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}}
+\newlength{\gcolumnposn}                                % temporary hack because lstlisting does not handle tabs correctly
+\newlength{\columnposn}
+\setlength{\gcolumnposn}{2.75in}
+\setlength{\columnposn}{\gcolumnposn}
+\newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\lst@basicstyle{\LstCommentStyle{#2}}}}
+\newcommand{\CRT}{\global\columnposn=\gcolumnposn}
+% allow escape sequence in lstinline
+%\usepackage{etoolbox}
+%\patchcmd{\lsthk@TextStyle}{\let\lst@DefEsc\@empty}{}{}{\errmessage{failed to patch}}
+% allow adding to lst literate
+\def\addToLiterate#1{\protect\edef\lst@literate{\unexpanded\expandafter{\lst@literate}\unexpanded{#1}}}
+\lst@Key{add to literate}{}{\addToLiterate{#1}}
+\makeatother
+\newcommand{\CFAStyle}{%
 \lstset{
-language=CFA,
 columns=fullflexible,
 basicstyle=\linespread{0.9}\sf,                 % reduce line spacing and use sanserif font
 …
 belowskip=3pt,
 % replace/adjust listing characters that look bad in sanserif
 literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptscriptstyle\land\,$}}1
+literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.75ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptscriptstyle\land\,$}}1
         {~}{\raisebox{0.3ex}{$\scriptstyle\sim\,$}}1 {`}{\ttfamily\upshape\hspace*{-0.1ex}`}1
         {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex\textgreater}2,
+}% lstset
+}% CFAStyle
+\ifdefined\CFALatin% extra Latin-1 escape characters
+\lstnewenvironment{cfa}[1][]{
+\lstset{
+language=CFA,
 moredelim=**[is][\color{red}]{®}{®},    % red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
 moredelim=**[is][\color{blue}]{ß}{ß},   % blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_
 moredelim=**[is][\color{OliveGreen}]{¢}{¢}, % green highlighting ¢...¢ (cent symbol) emacs: C-q M-"
 moredelim=[is][\lstset{keywords={}}]{¶}{¶}, % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
+% replace/adjust listing characters that look bad in sanserif
+add to literate={`}{\ttfamily\upshape\hspace*{-0.1ex}`}1
 }% lstset
+}% CFADefaults
+\newcommand{\CFAStyle}{%
+\CFADefaults
+\lstset{#1}
+}{}
 % inline code ©...© (copyright symbol) emacs: C-q M-)
 \lstMakeShortInline©                                    % single-character for \lstinline
+}% CFAStyle
+\lstnewenvironment{cfa}[1][]
+{\CFADefaults\lstset{#1}}
+{}
+\else% regular ASCI characters
+\lstnewenvironment{cfa}[1][]{
+\lstset{
+language=CFA,
+escapechar=\$,                                                  % LaTeX escape in CFA code
+moredelim=**[is][\color{red}]{@}{@},    % red highlighting @...@
+}% lstset
+\lstset{#1}
+}{}
+% inline code @...@ (at symbol)
+\lstMakeShortInline@                                    % single-character for \lstinline
+\fi%
 % Local Variables: %

doc/LaTeXmacros/lstlang.sty

-              rae2c27a
+              rc76bd34
 %% Created On       : Sat May 13 16:34:42 2017
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Tue Jan  8 14:40:33 2019
 %% Update Count     : 21
+%% Last Modified On : Wed Sep 23 22:40:04 2020
+%% Update Count     : 24
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
                 auto, _Bool, catch, catchResume, choose, _Complex, __complex, __complex__, __const, __const__,
                 coroutine, disable, dtype, enable, exception, __extension__, fallthrough, fallthru, finally,
                 __float80, float80, __float128, float128, forall, ftype, _Generic, _Imaginary, __imag, __imag__,
+                __float80, float80, __float128, float128, forall, ftype, generator, _Generic, _Imaginary, __imag, __imag__,
                 inline, __inline, __inline__, __int128, int128, __label__, monitor, mutex, _Noreturn, one_t, or,
                 otype, restrict, __restrict, __restrict__, __signed, __signed__, _Static_assert, thread,
+                otype, restrict, __restrict, __restrict__, __signed, __signed__, _Static_assert, suspend, thread,
                 _Thread_local, throw, throwResume, timeout, trait, try, ttype, typeof, __typeof, __typeof__,
                 virtual, __volatile, __volatile__, waitfor, when, with, zero_t,
 …
 % C++ programming language
+\lstdefinelanguage{C++}[ANSI]{C++}{}
+\lstdefinelanguage{C++}[ANSI]{C++}{
+        morekeywords={nullptr,}
+}
 % uC++ programming language, based on ANSI C++

doc/bibliography/pl.bib

-              rae2c27a
+              rc76bd34
     key         = {Cforall Benchmarks},
     author      = {{\textsf{C}{$\mathbf{\forall}$} Benchmarks}},
     howpublished= {\href{https://plg.uwaterloo.ca/~cforall/doc/CforallConcurrentBenchmarks.tar}{https://\-plg.uwaterloo.ca/\-$\sim$cforall/\-doc/\-CforallConcurrentBenchmarks.tar}},
+    howpublished= {\href{https://github.com/cforall/ConcurrentBenchmarks_SPE20}{https://\-github.com/\-cforall/\-ConcurrentBenchmarks\_SPE20}},
+}
 …
     title       = {Cooperating Sequential Processes},
     institution = {Technological University},
     address     = {Eindhoven, Netherlands},
+    address     = {Eindhoven, Neth.},
     year        = 1965,
     note        = {Reprinted in \cite{Genuys68} pp. 43--112.}

doc/papers/concurrency/Paper.tex

-              rae2c27a
+              rc76bd34
 {}
 \lstnewenvironment{C++}[1][]                            % use C++ style
 {\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{uC++}[1][]
 {\lstset{language=uC++,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=uC++,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{Go}[1][]
 {\lstset{language=Golang,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=Golang,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{python}[1][]
 {\lstset{language=python,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=python,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{java}[1][]
 {\lstset{language=java,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=java,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 …
 \begin{document}
 \linenumbers                            % comment out to turn off line numbering
+%\linenumbers                           % comment out to turn off line numbering
 \maketitle
 …
 \hline
 stateful                        & thread        & \multicolumn{1}{c|}{No} & \multicolumn{1}{c}{Yes} \\
 \hline
 \hline
+\hline
+\hline
 No                                      & No            & \textbf{1}\ \ \ @struct@                              & \textbf{2}\ \ \ @mutex@ @struct@              \\
 \hline
+\hline
 Yes (stackless)         & No            & \textbf{3}\ \ \ @generator@                   & \textbf{4}\ \ \ @mutex@ @generator@   \\
 \hline
+\hline
 Yes (stackful)          & No            & \textbf{5}\ \ \ @coroutine@                   & \textbf{6}\ \ \ @mutex@ @coroutine@   \\
 \hline
+\hline
 No                                      & Yes           & \textbf{7}\ \ \ {\color{red}rejected} & \textbf{8}\ \ \ {\color{red}rejected} \\
 \hline
+\hline
 Yes (stackless)         & Yes           & \textbf{9}\ \ \ {\color{red}rejected} & \textbf{10}\ \ \ {\color{red}rejected} \\
 \hline
+\hline
 Yes (stackful)          & Yes           & \textbf{11}\ \ \ @thread@                             & \textbf{12}\ \ @mutex@ @thread@               \\
 \end{tabular}
 …
 \label{s:RuntimeStructureCluster}
 A \newterm{cluster} is a collection of user and kernel threads, where the kernel threads run the user threads from the cluster's ready queue, and the operating system runs the kernel threads on the processors from its ready queue.
+A \newterm{cluster} is a collection of user and kernel threads, where the kernel threads run the user threads from the cluster's ready queue, and the operating system runs the kernel threads on the processors from its ready queue~\cite{Buhr90a}.
 The term \newterm{virtual processor} is introduced as a synonym for kernel thread to disambiguate between user and kernel thread.
 From the language perspective, a virtual processor is an actual processor (core).
 …
 \end{cfa}
 where CPU time in nanoseconds is from the appropriate language clock.
+Each benchmark is performed @N@ times, where @N@ is selected so the benchmark runs in the range of 2--20 seconds for the specific programming language.
+Each benchmark is performed @N@ times, where @N@ is selected so the benchmark runs in the range of 2--20 seconds for the specific programming language;
+each @N@ appears after the experiment name in the following tables.
 The total time is divided by @N@ to obtain the average time for a benchmark.
 Each benchmark experiment is run 13 times and the average appears in the table.
+For languages with a runtime JIT (Java, Node.js, Python), a single half-hour long experiment is run to check stability;
+all long-experiment results are statistically equivalent, \ie median/average/standard-deviation correlate with the short-experiment results, indicating the short experiments reached a steady state.
 All omitted tests for other languages are functionally identical to the \CFA tests and available online~\cite{CforallConcurrentBenchmarks}.
-% tar --exclude-ignore=exclude -cvhf benchmark.tar benchmark
-% cp -p benchmark.tar /u/cforall/public_html/doc/concurrent_benchmark.tar
 \paragraph{Creation}
 …
 \begin{multicols}{2}
+\lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
+\begin{cfa}
+@coroutine@ MyCoroutine {};
+\begin{cfa}[xleftmargin=0pt]
+`coroutine` MyCoroutine {};
 void ?{}( MyCoroutine & this ) {
 #ifdef EAGER
 …
 void main( MyCoroutine & ) {}
 int main() {
         BENCH( for ( N ) { @MyCoroutine c;@ } )
+        BENCH( for ( N ) { `MyCoroutine c;` } )
         sout | result;
+}
 …
 \begin{tabular}[t]{@{}r*{3}{D{.}{.}{5.2}}@{}}
+\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA generator                  & 0.6           & 0.6           & 0.0           \\
+\CFA coroutine lazy             & 13.4          & 13.1          & 0.5           \\
+\CFA coroutine eager    & 144.7         & 143.9         & 1.5           \\
+\CFA thread                             & 466.4         & 468.0         & 11.3          \\
+\uC coroutine                   & 155.6         & 155.7         & 1.7           \\
+\uC thread                              & 523.4         & 523.9         & 7.7           \\
+Python generator                & 123.2         & 124.3         & 4.1           \\
+Node.js generator               & 33.4          & 33.5          & 0.3           \\
+Goroutine thread                & 751.0         & 750.5         & 3.1           \\
+Rust tokio thread               & 1860.0        & 1881.1        & 37.6          \\
+Rust thread                             & 53801.0       & 53896.8       & 274.9         \\
+Java thread                             & 120274.0      & 120722.9      & 2356.7        \\
+Pthreads thread                 & 31465.5       & 31419.5       & 140.4
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA generator (1B)                     & 0.6           & 0.6           & 0.0           \\
+\CFA coroutine lazy     (100M)  & 13.4          & 13.1          & 0.5           \\
+\CFA coroutine eager (10M)      & 144.7         & 143.9         & 1.5           \\
+\CFA thread (10M)                       & 466.4         & 468.0         & 11.3          \\
+\uC coroutine (10M)                     & 155.6         & 155.7         & 1.7           \\
+\uC thread (10M)                        & 523.4         & 523.9         & 7.7           \\
+Python generator (10M)          & 123.2         & 124.3         & 4.1           \\
+Node.js generator (10M)         & 33.4          & 33.5          & 0.3           \\
+Goroutine thread (10M)          & 751.0         & 750.5         & 3.1           \\
+Rust tokio thread (10M)         & 1860.0        & 1881.1        & 37.6          \\
+Rust thread     (250K)                  & 53801.0       & 53896.8       & 274.9         \\
+Java thread (250K)                      & 119256.0      & 119679.2      & 2244.0        \\
+% Java thread (1 000 000)               & 123100.0      & 123052.5      & 751.6         \\
+Pthreads thread (250K)          & 31465.5       & 31419.5       & 140.4
 \end{tabular}
 \end{multicols}
 …
 Internal scheduling is measured using a cycle of two threads signalling and waiting.
 Figure~\ref{f:schedint} shows the code for \CFA, with results in Table~\ref{t:schedint}.
+Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects.
+Java scheduling is significantly greater because the benchmark explicitly creates multiple threads in order to prevent the JIT from making the program sequential, \ie removing all locking.
+Note, the \CFA incremental cost for bulk acquire is a fixed cost for small numbers of mutex objects.
+User-level threading has one kernel thread, eliminating contention between the threads (direct handoff of the kernel thread).
+Kernel-level threading has two kernel threads allowing some contention.
 \begin{multicols}{2}
 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
 \begin{cfa}
+\setlength{\tabcolsep}{3pt}
+\begin{cfa}[xleftmargin=0pt]
 volatile int go = 0;
+@condition c;@
 @monitor@ M {} m1/*, m2, m3, m4*/;
 void call( M & @mutex p1/*, p2, p3, p4*/@ ) {
         @signal( c );@
+}
 void wait( M & @mutex p1/*, p2, p3, p4*/@ ) {
+`condition c;`
+`monitor` M {} m1/*, m2, m3, m4*/;
+void call( M & `mutex p1/*, p2, p3, p4*/` ) {
+        `signal( c );`
+}
+void wait( M & `mutex p1/*, p2, p3, p4*/` ) {
         go = 1; // continue other thread
         for ( N ) { @wait( c );@ } );
+        for ( N ) { `wait( c );` } );
+}
 thread T {};
 …
 \begin{tabular}{@{}r*{3}{D{.}{.}{5.2}}@{}}
+\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA @signal@, 1 monitor        & 364.4         & 364.2         & 4.4           \\
+\CFA @signal@, 2 monitor        & 484.4         & 483.9         & 8.8           \\
+\CFA @signal@, 4 monitor        & 709.1         & 707.7         & 15.0          \\
+\uC @signal@ monitor            & 328.3         & 327.4         & 2.4           \\
+Rust cond. variable                     & 7514.0        & 7437.4        & 397.2         \\
+Java @notify@ monitor           & 9623.0        & 9654.6        & 236.2         \\
+Pthreads cond. variable         & 5553.7        & 5576.1        & 345.6
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA @signal@, 1 monitor (10M)  & 364.4         & 364.2         & 4.4           \\
+\CFA @signal@, 2 monitor (10M)  & 484.4         & 483.9         & 8.8           \\
+\CFA @signal@, 4 monitor (10M)  & 709.1         & 707.7         & 15.0          \\
+\uC @signal@ monitor (10M)              & 328.3         & 327.4         & 2.4           \\
+Rust cond. variable     (1M)            & 7514.0        & 7437.4        & 397.2         \\
+Java @notify@ monitor (1M)              & 8717.0        & 8774.1        & 471.8         \\
+% Java @notify@ monitor (100 000 000)           & 8634.0        & 8683.5        & 330.5         \\
+Pthreads cond. variable (1M)    & 5553.7        & 5576.1        & 345.6
 \end{tabular}
 \end{multicols}
 …
 External scheduling is measured using a cycle of two threads calling and accepting the call using the @waitfor@ statement.
 Figure~\ref{f:schedext} shows the code for \CFA with results in Table~\ref{t:schedext}.
 Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects.
+Note, the \CFA incremental cost for bulk acquire is a fixed cost for small numbers of mutex objects.
 \begin{multicols}{2}
 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
+\setlength{\tabcolsep}{5pt}
 \vspace*{-16pt}
 \begin{cfa}
 @monitor@ M {} m1/*, m2, m3, m4*/;
 void call( M & @mutex p1/*, p2, p3, p4*/@ ) {}
 void wait( M & @mutex p1/*, p2, p3, p4*/@ ) {
         for ( N ) { @waitfor( call : p1/*, p2, p3, p4*/ );@ }
+\begin{cfa}[xleftmargin=0pt]
+`monitor` M {} m1/*, m2, m3, m4*/;
+void call( M & `mutex p1/*, p2, p3, p4*/` ) {}
+void wait( M & `mutex p1/*, p2, p3, p4*/` ) {
+        for ( N ) { `waitfor( call : p1/*, p2, p3, p4*/ );` }
+}
 thread T {};
 …
 \columnbreak
 \vspace*{-16pt}
+\vspace*{-18pt}
 \captionof{table}{External-scheduling comparison (nanoseconds)}
 \label{t:schedext}
 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}}
 \multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
 \CFA @waitfor@, 1 monitor       & 367.1 & 365.3 & 5.0   \\
 \CFA @waitfor@, 2 monitor       & 463.0 & 464.6 & 7.1   \\
 \CFA @waitfor@, 4 monitor       & 689.6 & 696.2 & 21.5  \\
 \uC \lstinline[language=uC++]|_Accept| monitor  & 328.2 & 329.1 & 3.4   \\
 Go \lstinline[language=Golang]|select| channel  & 365.0 & 365.5 & 1.2
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA @waitfor@, 1 monitor (10M) & 367.1 & 365.3 & 5.0   \\
+\CFA @waitfor@, 2 monitor (10M) & 463.0 & 464.6 & 7.1   \\
+\CFA @waitfor@, 4 monitor (10M) & 689.6 & 696.2 & 21.5  \\
+\uC \lstinline[language=uC++]|_Accept| monitor (10M)    & 328.2 & 329.1 & 3.4   \\
+Go \lstinline[language=Golang]|select| channel (10M)    & 365.0 & 365.5 & 1.2
 \end{tabular}
 \end{multicols}
 …
 \begin{multicols}{2}
 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
 \begin{cfa}
 @monitor@ M {} m1/*, m2, m3, m4*/;
 call( M & @mutex p1/*, p2, p3, p4*/@ ) {}
+\setlength{\tabcolsep}{3pt}
+\begin{cfa}[xleftmargin=0pt]
+`monitor` M {} m1/*, m2, m3, m4*/;
+call( M & `mutex p1/*, p2, p3, p4*/` ) {}
 int main() {
         BENCH( for( N ) call( m1/*, m2, m3, m4*/ ); )
 …
 \label{t:mutex}
 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}}
+\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+test-and-test-set lock                  & 19.1  & 18.9  & 0.4   \\
+\CFA @mutex@ function, 1 arg.   & 48.3  & 47.8  & 0.9   \\
+\CFA @mutex@ function, 2 arg.   & 86.7  & 87.6  & 1.9   \\
+\CFA @mutex@ function, 4 arg.   & 173.4 & 169.4 & 5.9   \\
+\uC @monitor@ member rtn.               & 54.8  & 54.8  & 0.1   \\
+Goroutine mutex lock                    & 34.0  & 34.0  & 0.0   \\
+Rust mutex lock                                 & 33.0  & 33.2  & 0.8   \\
+Java synchronized method                & 31.0  & 31.0  & 0.0   \\
+Pthreads mutex Lock                             & 31.0  & 31.1  & 0.4
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+test-and-test-set lock (50M)            & 19.1  & 18.9  & 0.4   \\
+\CFA @mutex@ function, 1 arg. (50M)     & 48.3  & 47.8  & 0.9   \\
+\CFA @mutex@ function, 2 arg. (50M)     & 86.7  & 87.6  & 1.9   \\
+\CFA @mutex@ function, 4 arg. (50M)     & 173.4 & 169.4 & 5.9   \\
+\uC @monitor@ member rtn. (50M)         & 54.8  & 54.8  & 0.1   \\
+Goroutine mutex lock (50M)                      & 34.0  & 34.0  & 0.0   \\
+Rust mutex lock (50M)                           & 33.0  & 33.2  & 0.8   \\
+Java synchronized method (50M)          & 31.0  & 30.9  & 0.5   \\
+% Java synchronized method (10 000 000 000)             & 31.0 & 30.2 & 0.9 \\
+Pthreads mutex Lock (50M)                       & 31.0  & 31.1  & 0.4
 \end{tabular}
 \end{multicols}
 …
 % To: "Peter A. Buhr" <pabuhr@plg2.cs.uwaterloo.ca>
 % Date: Fri, 24 Jan 2020 13:49:18 -0500
+%
+%
 % I can also verify that the previous version, which just tied a bunch of promises together, *does not* go back to the
 % event loop at all in the current version of Node. Presumably they're taking advantage of the fact that the ordering of
 …
 \begin{multicols}{2}
+\lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
+\begin{cfa}[aboveskip=0pt,belowskip=0pt]
+@coroutine@ C {};
+void main( C & ) { for () { @suspend;@ } }
+\begin{cfa}[xleftmargin=0pt]
+`coroutine` C {};
+void main( C & ) { for () { `suspend;` } }
 int main() { // coroutine test
         C c;
         BENCH( for ( N ) { @resume( c );@ } )
+        BENCH( for ( N ) { `resume( c );` } )
         sout | result;
+}
 int main() { // thread test
         BENCH( for ( N ) { @yield();@ } )
+        BENCH( for ( N ) { `yield();` } )
         sout | result;
+}
 …
 \label{t:ctx-switch}
 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}}
+\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+C function                      & 1.8           & 1.8           & 0.0   \\
+\CFA generator          & 1.8           & 2.0           & 0.3   \\
+\CFA coroutine          & 32.5          & 32.9          & 0.8   \\
+\CFA thread                     & 93.8          & 93.6          & 2.2   \\
+\uC coroutine           & 50.3          & 50.3          & 0.2   \\
+\uC thread                      & 97.3          & 97.4          & 1.0   \\
+Python generator        & 40.9          & 41.3          & 1.5   \\
+Node.js await           & 1852.2        & 1854.7        & 16.4  \\
+Node.js generator       & 33.3          & 33.4          & 0.3   \\
+Goroutine thread        & 143.0         & 143.3         & 1.1   \\
+Rust async await        & 32.0          & 32.0          & 0.0   \\
+Rust tokio thread       & 143.0         & 143.0         & 1.7   \\
+Rust thread                     & 332.0         & 331.4         & 2.4   \\
+Java thread                     & 405.0         & 415.0         & 17.6  \\
+Pthreads thread         & 334.3         & 335.2         & 3.9
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+C function (10B)                        & 1.8           & 1.8           & 0.0   \\
+\CFA generator (5B)                     & 1.8           & 2.0           & 0.3   \\
+\CFA coroutine (100M)           & 32.5          & 32.9          & 0.8   \\
+\CFA thread (100M)                      & 93.8          & 93.6          & 2.2   \\
+\uC coroutine (100M)            & 50.3          & 50.3          & 0.2   \\
+\uC thread (100M)                       & 97.3          & 97.4          & 1.0   \\
+Python generator (100M)         & 40.9          & 41.3          & 1.5   \\
+Node.js await (5M)                      & 1852.2        & 1854.7        & 16.4  \\
+Node.js generator (100M)        & 33.3          & 33.4          & 0.3   \\
+Goroutine thread (100M)         & 143.0         & 143.3         & 1.1   \\
+Rust async await (100M)         & 32.0          & 32.0          & 0.0   \\
+Rust tokio thread (100M)        & 143.0         & 143.0         & 1.7   \\
+Rust thread (25M)                       & 332.0         & 331.4         & 2.4   \\
+Java thread (100M)                      & 405.0         & 415.0         & 17.6  \\
+% Java thread (  100 000 000)                   & 413.0 & 414.2 & 6.2 \\
+% Java thread (5 000 000 000)                   & 415.0 & 415.2 & 6.1 \\
+Pthreads thread (25M)           & 334.3         & 335.2         & 3.9
 \end{tabular}
 \end{multicols}
 …
 Languages using 1:1 threading based on pthreads can at best meet or exceed, due to language overhead, the pthread results.
 Note, pthreads has a fast zero-contention mutex lock checked in user space.
+Languages with M:N threading have better performance than 1:1 because there is no operating-system interactions.
+Languages with M:N threading have better performance than 1:1 because there is no operating-system interactions (context-switching or locking).
+As well, for locking experiments, M:N threading has less contention if only one kernel thread is used.
 Languages with stackful coroutines have higher cost than stackless coroutines because of stack allocation and context switching;
 however, stackful \uC and \CFA coroutines have approximately the same performance as stackless Python and Node.js generators.
 The \CFA stackless generator is approximately 25 times faster for suspend/resume and 200 times faster for creation than stackless Python and Node.js generators.
+The Node.js context-switch is costly when asynchronous await must enter the event engine because a promise is not fulfilled.
+Finally, the benchmark results correlate across programming languages with and without JIT, indicating the JIT has completed any runtime optimizations.
 …
 The authors recognize the design assistance of Aaron Moss, Rob Schluntz, Andrew Beach, and Michael Brooks; David Dice for commenting and helping with the Java benchmarks; and Gregor Richards for helping with the Node.js benchmarks.
 This research is funded by a grant from Waterloo-Huawei (\url{http://www.huawei.com}) Joint Innovation Lab. %, and Peter Buhr is partially funded by the Natural Sciences and Engineering Research Council of Canada.
+This research is funded by the NSERC/Waterloo-Huawei (\url{http://www.huawei.com}) Joint Innovation Lab. %, and Peter Buhr is partially funded by the Natural Sciences and Engineering Research Council of Canada.
 {%
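
Note on the Paper.tex changes above: the revised text says each benchmark runs its body @N@ times, divides the total time by @N@, and repeats the experiment 13 times, with the tables reporting median, average and standard deviation. The following is a minimal Python sketch of that procedure, not the authors' harness. The names `bench`, `experiment` and the injectable `clock` parameter are hypothetical, introduced only for illustration.

```python
import statistics
import time

RUNS = 13  # the paper repeats each benchmark experiment 13 times


def bench(body, n, clock=time.perf_counter_ns):
    """Average cost in nanoseconds of one call to body, measured over n calls."""
    start = clock()
    for _ in range(n):
        body()
    end = clock()
    return (end - start) / n  # total time divided by N


def experiment(body, n, runs=RUNS, clock=time.perf_counter_ns):
    """Repeat bench() `runs` times and summarise the per-iteration averages."""
    samples = [bench(body, n, clock) for _ in range(runs)]
    return {
        "median": statistics.median(samples),
        "average": statistics.mean(samples),
        "std_dev": statistics.stdev(samples),  # sample standard deviation
    }


if __name__ == "__main__":
    # Hypothetical usage: time an empty body; n should be tuned per language
    # so that one run lasts a few seconds, as the paper describes.
    print(experiment(lambda: None, n=1_000_000))
```

Passing the clock as a parameter is a design choice for this sketch only: it lets the arithmetic be checked with a deterministic fake clock, independent of machine timing.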

doc/papers/concurrency/annex/local.bib

-              rae2c27a
+              rc76bd34
 @manual{Cpp-Transactions,
 keywords = {C++, Transactional Memory},
-title = {Technical Specification for C++ Extensions for Transactional Memory},
+title = {Tech. Spec. for C++ Extensions for Transactional Memory},
 organization= {International Standard ISO/IEC TS 19841:2015 },
 publisher = {American National Standards Institute},

doc/papers/concurrency/mail2

-              rae2c27a
+              rc76bd34
 Software: Practice and Experience Editorial Office
+Date: Wed, 2 Sep 2020 20:55:34 +0000
+From: Richard Jones <onbehalfof@manuscriptcentral.com>
+Reply-To: R.E.Jones@kent.ac.uk
+To: tdelisle@uwaterloo.ca, pabuhr@uwaterloo.ca
+Subject: Software: Practice and Experience - Decision on Manuscript ID
+ SPE-19-0219.R2
+-Sep-2020
+Dear Dr Buhr,
+Many thanks for submitting SPE-19-0219.R2 entitled "Advanced Control-flow and Concurrency in Cforall" to Software: Practice and Experience. The paper has now been reviewed and the comments of the referees are included at the bottom of this letter. I apologise for the length of time it has taken to get these.
+Both reviewers consider this paper to be close to acceptance. However, before I can accept this paper, I would like you address the comments of Reviewer 2, particularly with regard to the description of the adaptation Java harness to deal with warmup. I would expect to see a convincing argument that the computation has reached a steady state. I would also like you to provide the values for N for each benchmark run. This should be very straightforward for you to do. There are a couple of papers on steady state that you may wish to consult (though I am certainly not pushing my own work).
+) Barrett, Edd; Bolz-Tereick, Carl Friedrich; Killick, Rebecca; Mount, Sarah and Tratt, Laurence. Virtual Machine Warmup Blows Hot and Cold. OOPSLA 2017. https://doi.org/10.1145/3133876
+Virtual Machines (VMs) with Just-In-Time (JIT) compilers are traditionally thought to execute programs in two phases: the initial warmup phase determines which parts of a program would most benefit from dynamic compilation, before JIT compiling those parts into machine code; subsequently the program is said to be at a steady state of peak performance. Measurement methodologies almost always discard data collected during the warmup phase such that reported measurements focus entirely on peak performance. We introduce a fully automated statistical approach, based on changepoint analysis, which allows us to determine if a program has reached a steady state and, if so, whether that represents peak performance or not. Using this, we show that even when run in the most controlled of circumstances, small, deterministic, widely studied microbenchmarks often fail to reach a steady state of peak performance on a variety of common VMs. Repeating our experiment on 3 different machines, we found that at most 43.5% of pairs consistently reach a steady state of peak performance.
+) Kalibera, Tomas and Jones, Richard. Rigorous Benchmarking in Reasonable Time. ISMM  2013. https://doi.org/10.1145/2555670.2464160
+Experimental evaluation is key to systems research. Because modern systems are complex and non-deterministic, good experimental methodology demands that researchers account for uncertainty. To obtain valid results, they are expected to run many iterations of benchmarks, invoke virtual machines (VMs) several times, or even rebuild VM or benchmark binaries more than once. All this repetition costs time to complete experiments. Currently, many evaluations give up on sufficient repetition or rigorous statistical methods, or even run benchmarks only in training sizes. The results reported often lack proper variation estimates and, when a small difference between two systems is reported, some are simply unreliable.In contrast, we provide a statistically rigorous methodology for repetition and summarising results that makes efficient use of experimentation time. Time efficiency comes from two key observations. First, a given benchmark on a given platform is typically prone to much less non-determinism than the common worst-case of published corner-case studies. Second, repetition is most needed where most uncertainty arises (whether between builds, between executions or between iterations). We capture experimentation cost with a novel mathematical model, which we use to identify the number of repetitions at each level of an experiment necessary and sufficient to obtain a given level of precision.We present our methodology as a cookbook that guides researchers on the number of repetitions they should run to obtain reliable results. We also show how to present results with an effect size confidence interval. As an example, we show how to use our methodology to conduct throughput experiments with the DaCapo and SPEC CPU benchmarks on three recent platforms.
+You have 42 days from the date of this email to submit your revision. If you are unable to complete the revision within this time, please contact me to request a short extension.
+You can upload your revised manuscript and submit it through your Author Center. Log into https://mc.manuscriptcentral.com/spe and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions".
+When submitting your revised manuscript, you will be able to respond to the comments made by the referee(s) in the space provided.  You can use this space to document any changes you make to the original manuscript.
+If you would like help with English language editing, or other article preparation support, Wiley Editing Services offers expert help with English Language Editing, as well as translation, manuscript formatting, and figure formatting at www.wileyauthors.com/eeo/preparation. You can also check out our resources for Preparing Your Article for general guidance about writing and preparing your manuscript at www.wileyauthors.com/eeo/prepresources.
+Once again, thank you for submitting your manuscript to Software: Practice and Experience. I look forward to receiving your revision.
+Sincerely,
+Richard
+Prof. Richard Jones
+Editor, Software: Practice and Experience
+R.E.Jones@kent.ac.uk
+Referee(s)' Comments to Author:
+Reviewing: 1
+Comments to the Author
+Overall, I felt that this draft was an improvement on previous drafts and I don't have further changes to request.
+I appreciated the new language to clarify the relationship of external and internal scheduling, for example, as well as the new measurements of Rust tokio. Also, while I still believe that the choice between thread/generator/coroutine and so forth could be made crisper and clearer, the current draft of Section 2 did seem adequate to me in terms of specifying the considerations that users would have to take into account to make the choice.
+Reviewing: 2
+Comments to the Author
+First: let me apologise for the delay on this review. I'll blame the global pandemic combined with my institution's senior management's counterproductive decisions for taking up most of my time and all of my energy.
+At this point, reading the responses, I think we've been around the course enough times that further iteration is unlikely to really improve the paper any further, so I'm happy to recommend acceptance.    My main comments are that there were some good points in the responses to *all* the reviews and I strongly encourage the authors to incorporate those discursive responses into the final paper so they may benefit readers as well as reviewers.   I agree with the recommendations of reviewer #2 that the paper could usefully be split in to two, which I think I made to a previous revision, but I'm happy to leave that decision to the Editor.
+Finally, the paper needs to describe how the Java harness was adapted to deal with warmup; why the computation has warmed up and reached a steady state - similarly for js and Python. The tables should also give the "N" chosen for each benchmark run.
+minor points
+* don't start sentences with "However"
+* most downloaded isn't an "Award"
+Date: Thu, 1 Oct 2020 05:34:29 +0000
+From: Richard Jones <onbehalfof@manuscriptcentral.com>
+Reply-To: R.E.Jones@kent.ac.uk
+To: pabuhr@uwaterloo.ca
+Subject: Revision reminder - SPE-19-0219.R2
+-Oct-2020
+Dear Dr Buhr
+SPE-19-0219.R2
+This is a reminder that your opportunity to revise and re-submit your manuscript will expire 14 days from now. If you require more time please contact me directly and I may grant an extension to this deadline, otherwise the option to submit a revision online, will not be available.
+If your article is of potential interest to the general public, (which means it must be timely, groundbreaking, interesting and impact on everyday society) then please e-mail ejp@wiley.co.uk explaining the public interest side of the research. Wiley will then investigate the potential for undertaking a global press campaign on the article.
+I look forward to receiving your revision.
+Sincerely,
+Prof. Richard Jones
+Editor, Software: Practice and Experience
+https://mc.manuscriptcentral.com/spe
+Date: Tue, 6 Oct 2020 15:29:41 +0000
+From: Mayank Roy Chowdhury <onbehalfof@manuscriptcentral.com>
+Reply-To: speoffice@wiley.com
+To: tdelisle@uwaterloo.ca, pabuhr@uwaterloo.ca
+Subject: SPE-19-0219.R3 successfully submitted
+-Oct-2020
+Dear Dr Buhr,
+Your manuscript entitled "Advanced Control-flow and Concurrency in Cforall" has been successfully submitted online and is presently being given full consideration for publication in Software: Practice and Experience.
+Your manuscript number is SPE-19-0219.R3.  Please mention this number in all future correspondence regarding this submission.
+You can view the status of your manuscript at any time by checking your Author Center after logging into https://mc.manuscriptcentral.com/spe.  If you have difficulty using this site, please click the 'Get Help Now' link at the top right corner of the site.
+Thank you for submitting your manuscript to Software: Practice and Experience.
+Sincerely,
+Software: Practice and Experience Editorial Office

doc/refrat/refrat.tex

-              rae2c27a
+              rc76bd34
 %% Created On       : Wed Apr  6 14:52:25 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Wed Jan 31 17:30:23 2018
 %% Update Count     : 108
+%% Last Modified On : Mon Oct  5 09:02:53 2020
+%% Update Count     : 110
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \usepackage{upquote}                                                                    % switch curled `'" to straight
 \usepackage{calc}
-\usepackage{xspace}
 \usepackage{varioref}                                                                   % extended references
-\usepackage{listings}                                                                   % format program code
 \usepackage[flushmargin]{footmisc}                                              % support label/reference in footnote
 \usepackage{latexsym}                                   % \Box glyph
 \usepackage{mathptmx}                                   % better math font with "times"
 \usepackage[usenames]{color}
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\renewcommand{\UrlFont}{\small\sf}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®}#1}}
+{}
+\newcommand{\CFALatin}{}
 % inline code ©...© (copyright symbol) emacs: C-q M-)
 % red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
 …
 % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
 % math escape $...$ (dollar symbol)
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\renewcommand{\UrlFont}{\small\sf}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®},#1}}
+{}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Names used in the document.
 \newcommand{\Version}{\input{../../version}}
+\newcommand{\Version}{\input{build/version}}
 \newcommand{\Textbf}[2][red]{{\color{#1}{\textbf{#2}}}}
 \newcommand{\Emph}[2][red]{{\color{#1}\textbf{\emph{#2}}}}

doc/theses/andrew_beach_MMath/thesis.tex

-              rae2c27a
+              rc76bd34
 \usepackage[toc,abbreviations]{glossaries-extra}
+% Main glossary entries -- definitions of relevant terminology
+\newglossaryentry{computer}
+{
+name=computer,
+description={A programmable machine that receives input data,
+               stores and manipulates the data, and provides
+               formatted output}
+}
+% Nomenclature glossary entries -- New definitions, or unusual terminology
+\newglossary*{nomenclature}{Nomenclature}
+\newglossaryentry{dingledorf}
+{
+type=nomenclature,
+name=dingledorf,
+description={A person of supposed average intelligence who makes incredibly
+               brainless misjudgments}
+}
+% List of Abbreviations (abbreviations are from the glossaries-extra package)
+\newabbreviation{aaaaz}{AAAAZ}{American Association of Amateur Astronomers
+               and Zoologists}
+% List of Symbols
+\newglossary*{symbols}{List of Symbols}
+\newglossaryentry{rvec}
+{
+name={$\mathbf{v}$},
+sort={label},
+type=symbols,
+description={Random vector: a location in n-dimensional Cartesian space, where
+               each dimensional component is determined by a random process}
+}
+% Define all the glossaries.
+\input{glossaries}
 % Generate the glossaries defined above.

doc/theses/thierry_delisle_PhD/.gitignore

-              rae2c27a
+              rc76bd34
 comp_II/comp_II.pdf
 comp_II/comp_II.ps
+comp_II/presentation.pdf
+thesis/build/
+thesis/fig/*.fig.bak
+thesis/thesis.pdf
+thesis/thesis.ps
 !Makefile

doc/theses/thierry_delisle_PhD/comp_II/comp_II.tex

-              rae2c27a
+              rc76bd34
 \section{Introduction}
 \subsection{\CFA and the \CFA concurrency package}
 \CFA\cite{Moss18} is a modern, polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language.
+\CFA~\cite{Moss18} is a modern, polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language.
 It aims to add high-productivity features while maintaining the predictable performance of C.
 As such, concurrency in \CFA\cite{Delisle19} aims to offer simple and safe high-level tools while still allowing performant code.
 \CFA concurrent code is written in the synchronous programming paradigm but uses \glspl{uthrd} in order to achieve the simplicity and maintainability of synchronous programming without sacrificing the efficiency of asynchronous programming.
+As such, concurrency in \CFA~\cite{Delisle19} aims to offer simple and safe high-level tools while still allowing performant code.
+\CFA concurrent code is written in the synchronous programming paradigm but uses \glspl{uthrd} to achieve the simplicity and maintainability of synchronous programming without sacrificing the efficiency of asynchronous programming.
 As such, the \CFA \newterm{scheduler} is a preemptive user-level scheduler that maps \glspl{uthrd} onto \glspl{kthrd}.
+\subsection{Scheduling}
 \newterm{Scheduling} occurs when execution switches from one thread to another, where the second thread is implicitly chosen by the scheduler.
 This scheduling is an indirect handoff, as opposed to generators and coroutines which explicitly switch to the next generator and coroutine respectively.
+This scheduling is an indirect handoff, as opposed to generators and coroutines that explicitly switch to the next generator and coroutine respectively.
 The cost of switching between two threads for an indirect handoff has two components:
 \begin{enumerate}
 …
 and the cost of scheduling, \ie deciding which thread to run next among all the threads ready to run.
 \end{enumerate}
 The first cost is generally constant and fixed\footnote{Affecting the constant context-switch cost is whether it is done in one step, after the scheduling, or in two steps, context-switching to a third fixed thread before scheduling.}, while the scheduling cost can vary based on the system state.
 Adding multiple \glspl{kthrd} does not fundamentally change the scheduler semantics or requirements, it simply adds new correctness requirements, \ie \newterm{linearizability}\footnote{Meaning, however fast the CPU threads run, there is an equivalent sequential order that gives the same result.}, and a new dimension to performance: scalability, where scheduling cost now also depends on contention.
+The first cost is generally constant\footnote{Affecting the constant context-switch cost is whether it is done in one step, where the first thread schedules the second, or in two steps, where the first thread context switches to a third scheduler thread.}, while the scheduling cost can vary based on the system state.
+Adding multiple \glspl{kthrd} does not fundamentally change the scheduler semantics or requirements, it simply adds new correctness requirements, \ie \newterm{linearizability}\footnote{Meaning however fast the CPU threads run, there is an equivalent sequential order that gives the same result.}, and a new dimension to performance: scalability, where scheduling cost also depends on contention.
 The more threads switch, the more the administration cost of scheduling becomes noticeable.
 It is therefore important to build a scheduler with the lowest possible cost and latency.
 Another important consideration is \newterm{fairness}.
 In principle, scheduling should give the illusion of perfect fairness, where all threads ready to run are running \emph{simultaneously}.
+In practice, there can be advantages to unfair scheduling, similar to the express cash register at a grocery store.
 While the illusion of simultaneity is easier to reason about, it can break down if the scheduler allows too much unfairness.
 Therefore, the scheduler should offer as much fairness as needed to guarantee eventual progress, but use unfairness to help performance.
+In practice, threads must wait in turn but there can be advantages to unfair scheduling, similar to the express cash register at a grocery store.
 The goal of this research is to produce a scheduler that is simple for programmers to understand and offers good performance.
+\subsection{Research Goal}
+The goal of this research is to produce a scheduler that is simple for programmers to understand and offers good general performance.
 Here understandability does not refer to the API but to how much scheduling concerns programmers need to take into account when writing a \CFA concurrent package.
 Therefore, the main goal of this proposal is:
+Therefore, the main consequence of this goal is:
 \begin{quote}
 The \CFA scheduler should be \emph{viable} for \emph{any} workload.
 \end{quote}
 For a general-purpose scheduler, it is impossible to produce an optimal algorithm as it would require knowledge of the future behaviour of threads.
 As such, scheduling performance is generally either defined by the best-case scenario, \ie a workload to which the scheduler is tailored, or the worst-case scenario, \ie the scheduler behaves no worse than \emph{X}.
+For a general-purpose scheduler, it is impossible to produce an optimal algorithm as that requires knowledge of the future behaviour of threads.
+As such, scheduling performance is generally either defined by a best-case scenario, \ie a workload to which the scheduler is tailored, or a worst-case scenario, \ie the scheduler behaves no worse than \emph{X}.
 For this proposal, the performance is evaluated using the second approach to allow \CFA programmers to rely on scheduling performance.
 Because there is no optimal scheduler, ultimately \CFA may allow programmers to write their own scheduler; but that is not the subject of this proposal, which considers only the default scheduler.
 …
         \item creating an abstraction layer over the operating system to handle kernel-threads spinning unnecessarily,
         \item scheduling blocking I/O operations,
         \item and writing sufficient library tools to allow developers to indirectly use the scheduler, either through tuning knobs or replacing the default scheduler.
+        \item and writing sufficient library tools to allow developers to indirectly use the scheduler, either through tuning knobs in the default scheduler or replacing the default scheduler.
 \end{enumerate}
 …
 \paragraph{Performance} The performance of a scheduler can generally be measured in terms of scheduling cost, scalability and latency.
 \newterm{Scheduling cost} is the cost to switch from one thread to another, as mentioned above.
+For simple applications, where a single kernel thread does most of the scheduling, it is generally the dominating cost.
+\newterm{Scalability} is the cost of adding multiple kernel threads because it increases the time for context switching because of contention by multiple threads accessing shared resources, \eg the ready queue.
+For compute-bound concurrent applications with little context switching, the scheduling cost is negligible.
+For applications with high context-switch rates, scheduling cost can begin to dominate.
+\newterm{Scalability} is the cost of adding multiple kernel threads.
+It can increase the time for scheduling because of contention from the multiple threads accessing shared resources, \eg a single ready queue.
 Finally, \newterm{tail latency} is service delay and relates to thread fairness.
 Specifically, latency measures how long a thread waits to run once scheduled and is evaluated in the worst case.
+Specifically, latency measures how long a thread waits to run once scheduled and is evaluated by the worst case.
 The \CFA scheduler should offer good performance for all three metrics.
 …
 \newterm{Eventual progress} guarantees every scheduled thread is eventually run, \ie prevent starvation.
 As a hard requirement, the \CFA scheduler must guarantee eventual progress, otherwise the above-mentioned illusion of simultaneous execution is broken and the scheduler becomes much more complex to reason about.
 \newterm{Predictability} and \newterm{reliability} mean similar workloads achieve similar performance and programmer execution intuition is respected.
 For example, a thread that yields aggressively should not run more often than other tasks.
+\newterm{Predictability} and \newterm{reliability} mean similar workloads achieve similar performance so programmer execution intuition is respected.
+For example, a thread that yields aggressively should not run more often than other threads.
 While this is intuitive, it does not hold true for many work-stealing or feedback based schedulers.
 The \CFA scheduler must guarantee eventual progress and should be predictable and offer reliable performance.
+The \CFA scheduler must guarantee eventual progress, should be predictable, and offer reliable performance.
 \paragraph{Efficiency} Finally, efficient usage of CPU resources is also an important requirement and is discussed in depth towards the end of the proposal.
 \newterm{Efficiency} means avoiding using CPU cycles when there are no threads to run, and conversely, use all CPUs available when the workload can benefit from it.
+\newterm{Efficiency} means avoiding using CPU cycles when there are no threads to run (to conserve energy), and conversely, using as many available CPU cycles as possible when the workload can benefit from it.
 Balancing these two states is where the complexity lies.
 The \CFA scheduler should be efficient with respect to the underlying (shared) computer.
 …
 \begin{enumerate}
         \item Threads live long enough for useful feedback information to be gathered.
         \item Threads belong to multiple users so fairness across threads is insufficient.
+        \item Threads belong to multiple users so fairness across users is important.
 \end{enumerate}
 …
 In the case of the \CFA scheduler, every thread runs in the same user space and is controlled by the same user.
 Fairness across users is therefore a given and it is then possible to safely ignore the possibility that threads are malevolent.
+This approach allows for a much simpler fairness metric and in this proposal \emph{fairness} is defined as: when multiple threads are cycling through the system, the total ordering of threads being scheduled, \ie pushed onto the ready queue, should not differ much from the total ordering of threads being executed, \ie popped from the ready queue.
+This approach allows for a much simpler fairness metric, and in this proposal, \emph{fairness} is defined as:
+\begin{quote}
+When multiple threads are cycling through the system, the total ordering of threads being scheduled, \ie pushed onto the ready queue, should not differ much from the total ordering of threads being executed, \ie popped from the ready queue.
+\end{quote}
 Since feedback is not necessarily feasible within the lifetime of all threads and a simple fairness metric can be used, the scheduling strategy proposed for the \CFA runtime does not use per-threads feedback.
 …
 Threads with equal priority are scheduled using a secondary strategy, often something simple like round robin or FIFO.
 A consequence of priority is that, as long as there is a thread with a higher priority that desires to run, a thread with a lower priority does not run.
 This possible starving of threads can dramatically increase programming complexity since starving threads and priority inversion (prioritizing a lower priority thread) can both lead to serious problems.
+The potential for thread starvation dramatically increases programming complexity since starving threads and priority inversion (prioritizing a lower priority thread) can both lead to serious problems.
 An important observation is that threads do not need to have explicit priorities for problems to occur.
 Indeed, any system with multiple ready queues that attempts to exhaust one queue before accessing the other queues, essentially provide implicit priority, which can encounter starvation problems.
+Indeed, any system with multiple ready queues that attempts to exhaust one queue before accessing the other queues, essentially provides implicit priority, which can encounter starvation problems.
 For example, a popular scheduling strategy that suffers from implicit priorities is work stealing.
 \newterm{Work stealing} is generally presented as follows:
 …
         \item If a processor's ready queue is empty, attempt to run threads from some other processor's ready queue.
 \end{enumerate}
 In a loaded system\footnote{A \newterm{loaded system} is a system where threads are being run at the same rate they are scheduled.}, if a thread does not yield, block, or preempt for an extended period of time, threads on the same processor's list starve if no other processors exhaust their list.
 Since priorities can be complex for programmers to incorporate into their execution intuition, the scheduling strategy proposed for the \CFA runtime does not use a strategy with either implicit or explicit thread priorities.
+Since priorities can be complex for programmers to incorporate into their execution intuition, the \CFA scheduling strategy does not provide explicit priorities and attempts to eliminate implicit priorities.
 \subsection{Schedulers without feedback or priorities}
 …
 Thankfully, strict FIFO is not needed for sufficient fairness.
 Since concurrency is inherently non-deterministic, fairness concerns in scheduling are only a problem if a thread repeatedly runs before another thread can run.
 Some relaxation is possible because non-determinism means programmers already handle ordering problems to produce correct code and hence rely on weak guarantees, \eg that a specific thread will \emph{eventually} run.
+Some relaxation is possible because non-determinism means programmers already handle ordering problems to produce correct code and hence rely on weak guarantees, \eg that a thread \emph{eventually} runs.
 Since some reordering does not break correctness, the FIFO fairness guarantee can be significantly relaxed without causing problems.
 For this proposal, the target guarantee is that the \CFA scheduler provides \emph{probable} FIFO ordering, which allows reordering but makes it improbable that threads are reordered far from their position in total ordering.
 The \CFA scheduler fairness is defined as follows:
 \begin{itemize}
         \item Given two threads $X$ and $Y$, the odds that thread $X$ runs $N$ times \emph{after} thread $Y$ is scheduled but \emph{before} it is run, decreases exponentially with regard to $N$.
 \end{itemize}
+\begin{quote}
+Given two threads $X$ and $Y$, the odds that thread $X$ runs $N$ times \emph{after} thread $Y$ is scheduled but \emph{before} it is run, decreases exponentially with regard to $N$.
+\end{quote}
 While this is not a bounded guarantee, the probability that unfairness persists for long periods of time decreases exponentially, making persistent unfairness virtually impossible.
 …
 The described queue uses an array of underlying strictly FIFO queues as shown in Figure~\ref{fig:base}\footnote{For this section, the number of underlying queues is assumed to be constant.
 Section~\ref{sec:resize} discusses resizing the array.}.
 Pushing new data is done by selecting one of these underlying queues at random, recording a timestamp for the operation and pushing to the selected queue.
+Pushing new data is done by selecting one of the underlying queues at random, recording a timestamp for the operation, and pushing to the selected queue.
 Popping is done by selecting two queues at random and popping from the queue with the oldest timestamp.
 A higher number of underlying queues lead to less contention on each queue and therefore better performance.
 In a loaded system, it is highly likely the queues are non-empty, \ie several tasks are on each of the underlying queues.
 This means that selecting a queue at random to pop from is highly likely to yield a queue with available items.
+A higher number of underlying queues leads to less contention on each queue and therefore better performance.
+In a loaded system, it is highly likely the queues are non-empty, \ie several threads are on each of the underlying queues.
+For this case, selecting a queue at random to pop from is highly likely to yield a queue with available items.
 In Figure~\ref{fig:base}, ignoring the ellipsis, the chance of getting an empty queue is 2/7 per pick, meaning two random picks yield an item approximately 9 times out of 10.
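The push and pop operations described above can be sketched as follows. This is a minimal, single-threaded illustration only, not the \CFA implementation: the class name `RelaxedFIFO`, the `size` counter, and the use of a logical counter in place of a real timestamp are all assumptions made for the example, and the two random picks are made with replacement.

```python
import itertools
import random
from collections import deque


class RelaxedFIFO:
    """Single-threaded sketch of a relaxed FIFO built from strict FIFO sub-queues."""

    def __init__(self, nqueues, seed=None):
        self.queues = [deque() for _ in range(nqueues)]
        self.clock = itertools.count()      # assumed stand-in for a real timestamp
        self.rng = random.Random(seed)
        self.size = 0                       # assumed reliable emptiness information

    def push(self, item):
        q = self.rng.choice(self.queues)    # select one sub-queue at random
        q.append((next(self.clock), item))  # record the timestamp with the item
        self.size += 1

    def pop(self):
        if self.size == 0:
            return None                     # the whole structure is empty
        while True:                         # non-empty, so retry until an item is found
            a = self.rng.choice(self.queues)
            b = self.rng.choice(self.queues)
            candidates = [q for q in (a, b) if q]
            if not candidates:
                continue                    # both picks were empty, try again
            oldest = min(candidates, key=lambda q: q[0][0])
            self.size -= 1
            return oldest.popleft()[1]      # pop from the pick with the oldest head
```

With a single sub-queue the sketch degenerates to a strict FIFO; with more sub-queues items may be reordered, but every pushed item is eventually popped.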
 …
                 \input{base.pstex_t}
         \end{center}
         \caption{Relaxed FIFO list at the base of the scheduler: an array of strictly FIFO lists.
         The timestamp is in all nodes and cell arrays.}
+        \caption{Loaded relaxed FIFO list based on an array of strictly FIFO lists.
+        A timestamp appears in each node and array cell.}
         \label{fig:base}
 \end{figure}
 …
                 \input{empty.pstex_t}
         \end{center}
         \caption{``More empty'' state of the queue: the array contains many empty cells.}
+        \caption{Underloaded relaxed FIFO list where the array contains many empty cells.}
         \label{fig:empty}
 \end{figure}
 When the ready queue is \emph{more empty}, \ie several of the queues are empty, selecting a random queue for popping is less likely to yield a successful selection and more attempts are needed, resulting in a performance degradation.
+In an underloaded system, several of the queues are empty, so selecting a random queue for popping is less likely to yield a successful selection and more attempts are needed, resulting in a performance degradation.
 Figure~\ref{fig:empty} shows an example with fewer elements, where the chance of getting an empty queue is 5/7 per pick, meaning two random picks yield an item only half the time.
 Since the ready queue is not empty, the pop operation \emph{must} find an element before returning and therefore must retry.
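The two figures quoted for the loaded and underloaded examples can be checked with a short calculation. It assumes the two picks are independent and uniform over the $n$ sub-queues (sampling with replacement, which the text does not state explicitly), with $e$ of them empty:

```latex
\begin{align*}
P(\text{item found}) &= 1 - \left(\frac{e}{n}\right)^{2} \\
e = 2,\ n = 7: \quad & 1 - \frac{4}{49} = \frac{45}{49} \approx 0.92 \\
e = 5,\ n = 7: \quad & 1 - \frac{25}{49} = \frac{24}{49} \approx 0.49
\end{align*}
```

Under the same assumption the number of attempts is geometrically distributed, so the expected number of pop attempts is $49/45 \approx 1.09$ in the loaded example and $49/24 \approx 2.04$ in the underloaded one.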
 …
 \end{table}
 Performance can be improved in case~D (Table~\ref{tab:perfcases}) by adding information to help processors find which inner queues are used.
+Performance can be improved in Table~\ref{tab:perfcases} case~D by adding information to help processors find which inner queues are used.
 This addition aims to avoid the cost of retrying the pop operation but does not affect contention on the underlying queues and can incur some management cost for both push and pop operations.
 The approach used to encode this information can vary in density and be either global or local.
 …
 With a multi-word bitmask, this maximum limit can be increased arbitrarily, but it is not possible to check if the queue is empty by reading the bitmask atomically.
 Finally, a dense bitmap, either single or multi-word, causes additional problems in case C (Table 1), because many processors are continuously scanning the bitmask to find the few available threads.
+Finally, a dense bitmap, either single or multi-word, causes additional problems in Table~\ref{tab:perfcases} case C, because many processors are continuously scanning the bitmask to find the few available threads.
 This increased contention on the bitmask(s) reduces performance because of cache misses after updates and the bitmask is updated more frequently by the scanning processors racing to read and/or update that information.
 This increased update frequency means the information in the bitmask is more often stale before a processor can use it to find an item, \ie mask read says there are available user threads but none on queue.
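The dense bitmask idea discussed above can be sketched as follows. This is a single-threaded illustration under stated assumptions: the class name `MaskedQueues` is hypothetical, timestamps are omitted for brevity, and a Python integer stands in for the machine word(s), so none of the contention or staleness problems described in the text appear here.

```python
import random
from collections import deque


class MaskedQueues:
    """Sketch: sub-queues plus a bitmask recording which ones hold items."""

    def __init__(self, nqueues, seed=None):
        self.queues = [deque() for _ in range(nqueues)]
        self.mask = 0                        # bit i set <=> queues[i] is non-empty
        self.rng = random.Random(seed)

    def push(self, item):
        i = self.rng.randrange(len(self.queues))
        self.queues[i].append(item)
        self.mask |= 1 << i                  # mark sub-queue i as non-empty

    def pop(self):
        if self.mask == 0:
            return None                      # one read shows every sub-queue is empty
        occupied = [i for i in range(len(self.queues)) if self.mask >> i & 1]
        i = self.rng.choice(occupied)        # never lands on an empty sub-queue
        item = self.queues[i].popleft()
        if not self.queues[i]:
            self.mask &= ~(1 << i)           # clear the bit when the sub-queue drains
        return item
```

In a concurrent setting every push and pop would update the shared mask, which is the source of the contention and stale reads described above.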
 …
 \begin{figure}
         \begin{center}
+                {\resizebox{0.8\textwidth}{!}{\input{emptybit}}}
+        \end{center}
+        \caption{``More empty'' queue with added bitmask to indicate which array cells have items.}
+                {\resizebox{0.73\textwidth}{!}{\input{emptybit}}}
+        \end{center}
+        \vspace*{-5pt}
+        \caption{Underloaded queue with added bitmask to indicate which array cells have items.}
         \label{fig:emptybit}
+        \begin{center}
+                {\resizebox{0.73\textwidth}{!}{\input{emptytree}}}
+        \end{center}
+        \vspace*{-5pt}
+        \caption{Underloaded queue with added binary search tree to indicate which array cells have items.}
+        \label{fig:emptytree}
+        \begin{center}
+                {\resizebox{0.9\textwidth}{!}{\input{emptytls}}}
+        \end{center}
+        \vspace*{-5pt}
+        \caption{Underloaded queue with added per processor bitmask to indicate which array cells have items.}
+        \label{fig:emptytls}
 \end{figure}
+Figure~\ref{fig:emptytree} shows another approach using a hierarchical tree data-structure to reduce contention and has been shown to work in similar cases~\cite{ellen2007snzi}\footnote{This particular paper seems to be patented in the US.
+How does that affect \CFA? Can I use it in my work?}.
+However, this approach may lead to poorer performance in case~B (Table~\ref{tab:perfcases}) due to the inherent pointer chasing cost and already low contention cost in that case.
+\begin{figure}
+        \begin{center}
+                {\resizebox{0.8\textwidth}{!}{\input{emptytree}}}
+        \end{center}
+        \caption{``More empty'' queue with added binary search tree to indicate which array cells have items.}
+        \label{fig:emptytree}
+\end{figure}
+Finally, a third approach is to use dense information, similar to the bitmap, but have each thread keep its own independent copy of it.
+Figure~\ref{fig:emptytree} shows an approach using a hierarchical tree data-structure to reduce contention and has been shown to work in similar cases~\cite{ellen2007snzi}.
+However, this approach may lead to poorer performance in Table~\ref{tab:perfcases} case~B due to the inherent pointer chasing cost and already low contention cost in that case.
+Figure~\ref{fig:emptytls} shows an approach using dense information, similar to the bitmap, but with each thread keeping its own independent copy of it.
 While this approach can offer good scalability \emph{and} low latency, the liveliness of the information can become a problem.
 In the simple cases, local copies of which underlying queues are empty can become stale and end up not being useful for the pop operation.
+In the simple cases, local copies can become stale and end up not being useful for the pop operation.
 A more serious problem is that reliable information is necessary for some parts of this algorithm to be correct.
 As mentioned in this section, processors must know \emph{reliably} whether the list is empty or not to decide if they can return \texttt{NULL} or if they must keep looking during a pop operation.
 Section~\ref{sec:sleep} discusses another case where reliable information is required for the algorithm to be correct.
-\begin{figure}
-        \begin{center}
-                \input{emptytls}
-        \end{center}
-        \caption{``More empty'' queue with added per processor bitmask to indicate which array cells have items.}
-        \label{fig:emptytls}
-\end{figure}
 There is a fundamental tradeoff among these approaches.
 Dense global information about empty underlying queues helps zero-contention cases at the cost of high-contention case.
 Sparse global information helps high-contention cases but increases latency in zero-contention-cases, to read and ``aggregate'' the information\footnote{Hierarchical structures, \eg binary search tree, effectively aggregate information but follow pointer chains, learning information at each node.
+Dense global information about empty underlying queues helps zero-contention cases at the cost of the high-contention case.
+Sparse global information helps high-contention cases but increases latency in zero-contention cases to read and ``aggregate'' the information\footnote{Hierarchical structures, \eg binary search tree, effectively aggregate information but follow pointer chains, learning information at each node.
 Similarly, other sparse schemes need to read multiple cachelines to acquire all the information needed.}.
+Finally, dense local information has both the advantages of low latency in zero-contention cases and scalability in high-contention cases. However the information can become stale making it difficult to use to ensure correctness.
+Finally, dense local information has both the advantages of low latency in zero-contention cases and scalability in high-contention cases.
+However, the information can become stale making it difficult to use to ensure correctness.
 The fact that these solutions have these fundamental limits suggests to me that a better solution would combine these properties in an interesting way.
 Also, the lock discussed in Section~\ref{sec:resize} allows for solutions that adapt to the number of processors, which could also prove useful.
 …
 How much scalability is actually needed is highly debatable.
 \emph{libfibre}\cite{libfibre} has compared favourably to other schedulers in webserver tests\cite{Karsten20} and uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask.
+\emph{libfibre}~\cite{libfibre} has compared favourably to other schedulers in webserver tests~\cite{Karsten20} and uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask.
 As such, the single atomic instruction on a shared cacheline may be sufficiently performant.
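As an illustration of why a single atomic instruction on a shared cacheline can be enough, the following is a minimal C sketch of a shared bitmask that records which underlying queues are non-empty. It is not the prototype's code: the names `nonempty_mask`, `mark_nonempty`, `mark_empty` and `find_nonempty`, and the limit of 64 queues, are assumptions made for the example. The sketch also ignores the window between updating a queue and updating its bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NQUEUES 64  /* assumption: one bit per underlying queue */

/* Shared word: bit i is set when underlying queue i is believed non-empty. */
static _Atomic uint64_t nonempty_mask = 0;

/* After pushing onto queue i: one atomic OR on the shared cacheline. */
static void mark_nonempty(unsigned i) {
    assert(i < NQUEUES);
    atomic_fetch_or(&nonempty_mask, UINT64_C(1) << i);
}

/* After popping the last item of queue i: one atomic AND. */
static void mark_empty(unsigned i) {
    assert(i < NQUEUES);
    atomic_fetch_and(&nonempty_mask, ~(UINT64_C(1) << i));
}

/* Index of the lowest-numbered queue marked non-empty, or -1 if none. */
static int find_nonempty(void) {
    uint64_t m = atomic_load(&nonempty_mask);
    if (m == 0) return -1;          /* every queue is marked empty */
    int i = 0;
    while (!(m & 1)) {              /* scan up to the lowest set bit */
        m >>= 1;
        i++;
    }
    return i;
}
```

A pop operation would call `find_nonempty()` to pick a queue and return `NULL` only when it yields -1. Every push and pop touches the same word, which is the contention cost the text weighs against the low latency of dense global information.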
 I have built a prototype of this ready queue in the shape of a data queue, \ie nodes on the queue are structures with a single int representing a thread and intrusive data fields.
 Using this prototype, I ran preliminary performance experiments that confirm the expected performance in Table~\ref{tab:perfcases}.
 However, these experiments only offer a hint at the actual performance of the scheduler since threads form more complex operations than simple integer nodes, \eg threads are not independent of each other, when a thread blocks some other thread must intervene to wake it.
+I have built a prototype of this ready queue in the shape of a data queue, \ie nodes on the queue are structures with a single $int$ representing a thread and intrusive data fields.
+Using this prototype, preliminary performance experiments confirm the expected performance in Table~\ref{tab:perfcases}.
+However, these experiments only offer a hint at the actual performance of the scheduler since threads are involved in more complex operations, \eg threads are not independent of each other: when a thread blocks some other thread must intervene to wake it.
 I have also integrated this prototype into the \CFA runtime, but have not yet created performance experiments to compare results, as creating one-to-one comparisons between the prototype and the \CFA runtime will be complex.
 …
 Threads on a cluster are always scheduled on one of the processors of the cluster.
 Currently, the runtime handles dynamically adding and removing processors from clusters at any time.
 Since this is part of the existing design, the proposed scheduler must also support this behaviour.
+Since this feature is part of the existing design, the proposed scheduler must also support this behaviour.
 However, dynamically resizing a cluster is considered a rare event associated with setup, tear down and major configuration changes.
 This assumption is made both in the design of the proposed scheduler as well as in the original design of the \CFA runtime system.
 As such, the proposed scheduler must honour the correctness of this behaviour but does not have any performance objectives with regard to resizing a cluster.
 How long adding or removing processors take and how much this disrupts the performance of other threads is considered a secondary concern since it should be amortized over long periods of times.
+That is, the time to add or remove processors and how much this disrupts the performance of other threads is considered a secondary concern since it should be amortized over long periods of times.
 However, as mentioned in Section~\ref{sec:queue}, contention on the underlying queues can have a direct impact on performance.
 The number of underlying queues must therefore be adjusted as the number of processors grows or shrinks.
 …
 There are possible alternatives to the reader-writer lock solution.
 This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject\cite{michael2004hazard, brown2015reclaiming}.
+This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject~\cite{brown2015reclaiming, michael2004hazard}.
 However, the reader-writer lock solution is simple and can be leveraged to solve other problems (\eg processor ordering and memory reclamation of threads), which makes it an attractive solution.
 …
 Individual processors always finish scheduling user threads before looking for new work, which means that the last processor to go to sleep cannot miss threads scheduled from inside the cluster (if they do, that demonstrates the ready queue is not linearizable).
 However, this guarantee does not hold if threads are scheduled from outside the cluster, either due to an external event like timers and I/O, or due to a user (or kernel) thread migrating from a different cluster.
 In this case, missed signals can lead to the cluster deadlocking\footnote{Clusters should only deadlock in cases where a \CFA programmer \emph{actually} write \CFA code that leads to a deadlock.}.
+In this case, missed signals can lead to the cluster deadlocking\footnote{Clusters should only deadlock in cases where a \CFA programmer \emph{actually} writes \CFA code that leads to a deadlock.}.
 Therefore, it is important that the scheduling of threads include a mechanism where signals \emph{cannot} be missed.
 For performance reasons, it can be advantageous to have a secondary mechanism that allows signals to be missed in cases where it cannot lead to a deadlock.
+To be safe, this process must include a ``handshake'' where it is guaranteed that either~: the sleeping processor notices that a user thread is scheduled after the sleeping processor signalled its intent to block or code scheduling threads sees the intent to sleep before scheduling and be able to wake-up the processor.
+To be safe, this process must include a ``handshake'' where it is guaranteed that either:
+\begin{enumerate}
+\item
+the sleeping processor notices that a user thread is scheduled after the sleeping processor signalled its intent to block or
+\item
+code scheduling threads sees the intent to sleep before scheduling and be able to wake-up the processor.
+\end{enumerate}
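The two-sided guarantee above can be sketched in C with sequentially consistent atomics. Each side first publishes its own state and then checks the other's, so at least one of them must observe the other. This is a sketch under assumptions, not the runtime's implementation: `ready_count` stands in for the ready queue, `wake_requested` stands in for a real wake-up mechanism, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int  ready_count      = 0;      /* stand-in for the ready queue */
static atomic_bool intends_to_sleep = false;  /* processor's published intent */
static atomic_bool wake_requested   = false;  /* stand-in for a real wake-up  */

/* Processor side: returns true if it is safe to block,
 * false if a thread was scheduled in the meantime. */
static bool prepare_to_sleep(void) {
    atomic_store(&intends_to_sleep, true);        /* 1. publish intent to block */
    if (atomic_load(&ready_count) > 0) {          /* 2. re-check for work       */
        atomic_store(&intends_to_sleep, false);   /* work arrived: cancel sleep */
        return false;
    }
    return true;  /* caller may now block until a wake-up is requested */
}

/* Scheduling side: make a thread ready, then look for a sleeper. */
static void schedule_thread(void) {
    int before = atomic_fetch_add(&ready_count, 1);  /* 1. publish the thread */
    assert(before >= 0);
    if (atomic_load(&intends_to_sleep)) {            /* 2. check for intent   */
        atomic_store(&wake_requested, true);         /* wake the processor    */
    }
}
```

Because the default atomic operations are sequentially consistent, the outcome where the processor reads an empty queue and the scheduler reads no intent to sleep cannot occur. Actually blocking and waking the kernel thread is left out, since that depends on the limited tools that pthreads and Linux provide.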
 This matter is complicated by the fact that pthreads and Linux offer few tools to implement this solution and no guarantee of ordering of threads waking up for most of these tools.
 Another important issue is avoiding kernel threads sleeping and waking frequently because there is a significant operating-system cost.
 This scenario happens when a program oscillates between high and low activity, needing most and then fewer processors.
+This scenario happens when a program oscillates between high and low activity, needing most and then few processors.
 A possible partial solution is to order the processors so that the one which most recently went to sleep is woken up.
 This allows other sleeping processors to reach deeper sleep states (when these are available) while keeping ``hot'' processors warmer.
 …
 Processors that are unnecessarily unblocked lead to unnecessary contention, CPU usage, and power consumption, while too many sleeping processors can lead to suboptimal throughput.
 Furthermore, transitions from sleeping to awake and vice versa also add unnecessary latency.
 There is already a wealth of research on the subject\cite{schillings1996engineering, wiki:thunderherd} and I may use an existing approach for the idle-sleep heuristic in this project, \eg\cite{Karsten20}.
+There is already a wealth of research on the subject~\cite{schillings1996engineering, wiki:thunderherd} and I may use an existing approach for the idle-sleep heuristic in this project, \eg~\cite{Karsten20}.
 \subsection{Asynchronous I/O}
 …
 an event-engine to (de)multiplex the operations,
 \item
 and a synchronous interface for users to use.
+and a synchronous interface for users.
 \end{enumerate}
 None of these components currently exist in \CFA and I will need to build all three for this project.
 \paragraph{OS Abstraction}
 One fundamental part for converting blocking I/O operations into non-blocking ones is having an underlying asynchronous I/O interface to direct the I/O operations.
+\paragraph{OS Asynchronous Abstraction}
+One fundamental part for converting blocking I/O operations into non-blocking is having an underlying asynchronous I/O interface to direct the I/O operations.
 While there exist many different APIs for asynchronous I/O, it is not part of this proposal to create a novel API.
 It is sufficient to make one work in the complex context of the \CFA runtime.
 \uC uses the $select$\cite{select} as its interface, which handles ttys, pipes and sockets, but not disk.
+\uC uses the $select$~\cite{select} as its interface, which handles ttys, pipes and sockets, but not disk.
 $select$ entails significant complexity and is being replaced in UNIX operating systems, which makes it a less interesting alternative.
 Another popular interface is $epoll$\cite{epoll}, which is supposed to be cheaper than $select$.
 However, $epoll$ also does not handle the file system and anecdotal evidence suggest it has problems with Linux pipes and $TTY$s.
 A popular cross-platform alternative is $libuv$\cite{libuv}, which offers asynchronous sockets and asynchronous file system operations (among other features).
+Another popular interface is $epoll$~\cite{epoll}, which is supposed to be cheaper than $select$.
+However, $epoll$ also does not handle the file system and anecdotal evidence suggest it has problems with Linux pipes and ttys.
+A popular cross-platform alternative is $libuv$~\cite{libuv}, which offers asynchronous sockets and asynchronous file system operations (among other features).
 However, as a full-featured library it includes much more than I need and could conflict with other features of \CFA unless significant effort is made to merge them together.
 A very recent alternative that I am investigating is $io_uring$\cite{io_uring}.
+A very recent alternative that I am investigating is $io_uring$~\cite{io_uring}.
 It claims to address some of the issues with $epoll$ and my early investigation suggests that the claim is accurate.
 $io_uring$ uses a much more general approach where system calls are registered to a queue and later executed by the kernel, rather than relying on system calls to return an error instead of blocking and subsequently waiting for changes on file descriptors.
 I believe this approach allows for fewer problems, \eg the manpage for $open$\cite{open} states:
+$io_uring$ uses a much more general approach where system calls are registered to a queue and later executed by the kernel, rather than relying on system calls to support returning an error instead of blocking.
+I believe this approach allows for fewer problems, \eg the manpage for $open$~\cite{open} states:
 \begin{quote}
 Note that [the $O_NONBLOCK$ flag] has no effect for regular files and block devices;
 …
 Since $O_NONBLOCK$ semantics might eventually be implemented, applications should not depend upon blocking behaviour when specifying this flag for regular files and block devices.
 \end{quote}
 This makes approach based on $epoll$/$select$ less reliable since they may not work for every file descriptors.
 For this reason, I plan to use $io_uring$ as the OS abstraction for the \CFA runtime unless further work shows problems I haven't encountered yet.
 However, only a small subset of the features are available in Ubuntu as of April 2020\cite{wiki:ubuntu-linux}, which will limit performance comparisons.
+This makes approaches based on $select$/$epoll$ less reliable since they may not work for every file descriptors.
+For this reason, I plan to use $io_uring$ as the OS abstraction for the \CFA runtime unless further work encounters a fatal problem.
+However, only a small subset of the features are available in Ubuntu as of April 2020~\cite{wiki:ubuntu-linux}, which will limit performance comparisons.
 I do not believe this will affect the comparison result.
 \paragraph{Event Engine}
 Laying on top of the asynchronous interface layer is the event engine.
+Above the OS asynchronous abstraction is the event engine.
 This engine is responsible for multiplexing (batching) the synchronous I/O requests into asynchronous I/O requests and demultiplexing the results to appropriate blocked user threads.
 This step can be straightforward for simple cases, but becomes quite complex when there are thousands of user threads performing both reads and writes, possibly on overlapping file descriptors.
 …
 The interface can be novel but it is preferable to match the existing POSIX interface when possible to be compatible with existing code.
 Matching allows C programs written using this interface to be transparently converted to \CFA with minimal effort.
 Where new functionality is needed, I will create a novel interface to fill gaps and provide advanced features.
+Where new functionality is needed, I will add novel interface extensions to fill gaps and provide advanced features.
 …
 \section{Discussion}
 I believe that runtime systems and scheduling are still open topics.
 Many ``state of the art'' production frameworks still use single-threaded event loops because of performance considerations, \eg \cite{nginx-design}, and, to my knowledge, no widely available system language offers modern threading facilities.
+Many ``state of the art'' production frameworks still use single-threaded event loops because of performance considerations, \eg~\cite{nginx-design}, and, to my knowledge, no widely available system language offers modern threading facilities.
 I believe the proposed work offers a novel runtime and scheduling package, where existing work only offers fragments that users must assemble themselves when possible.

doc/theses/thierry_delisle_PhD/comp_II/img/system.fig

-              rae2c27a
+              rc76bd34
 #FIG 3.2  Produced by xfig version 3.2.5c
+#FIG 3.2  Produced by xfig version 3.2.7b
 Landscape
 Center
 …
 3 0 1 -1 -1 0 0 20 0.000 1 0.0000 4500 3600 15 15 4500 3600 4515 3615
 -6
-3225 4125 4650 4425
-4350 4200 4650 4350
-3 0 1 -1 -1 0 0 20 0.000 1 0.0000 4425 4275 15 15 4425 4275 4440 4290
-3 0 1 -1 -1 0 0 20 0.000 1 0.0000 4500 4275 15 15 4500 4275 4515 4290
-3 0 1 -1 -1 0 0 20 0.000 1 0.0000 4575 4275 15 15 4575 4275 4590 4290
--6
-1 0 1 -1 -1 0 0 -1 0.000 1 0.0000 3450 4275 225 150 3450 4275 3675 4425
-1 0 1 -1 -1 0 0 -1 0.000 1 0.0000 4050 4275 225 150 4050 4275 4275 4425
--6
-6675 4125 7500 4425
-7200 4200 7500 4350
-3 0 1 -1 -1 0 0 20 0.000 1 0.0000 7275 4275 15 15 7275 4275 7290 4290
-3 0 1 -1 -1 0 0 20 0.000 1 0.0000 7350 4275 15 15 7350 4275 7365 4290
-3 0 1 -1 -1 0 0 20 0.000 1 0.0000 7425 4275 15 15 7425 4275 7440 4290
--6
-1 0 1 -1 -1 0 0 -1 0.000 1 0.0000 6900 4275 225 150 6900 4275 7125 4425
--6
 6675 3525 8025 3975
 1 0 1 -1 -1 0 0 -1 0.000 0 0 -1 1 0 2
 …
 3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 3975 2850 150 150 3975 2850 4125 2850
 3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 7200 2775 150 150 7200 2775 7350 2775
 3 0 1 0 0 0 0 0 0.000 1 0.0000 2250 4830 30 30 2250 4830 2280 4860
+3 0 1 0 0 0 0 0 0.000 1 0.0000 2250 4830 30 30 2250 4830 2280 4830
 3 0 1 0 0 0 0 0 0.000 1 0.0000 7200 2775 30 30 7200 2775 7230 2805
 3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 3525 3600 150 150 3525 3600 3675 3600
+3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 3875 4800 100 100 3875 4800 3975 4800
+1 0 1 -1 -1 0 0 -1 0.000 1 0.0000 4650 4800 150 75 4650 4800 4800 4875
+3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 4625 4838 100 100 4625 4838 4725 4838
 2 0 1 -1 -1 0 0 -1 0.000 0 0 0 0 0 5
 4200 2400 3750 1950 3750 1950 4200 2400 4200
 …
 1 1.00 45.00 90.00
 3750 7875 2325 7200 2325 7200 2550
+2 1 1 -1 -1 0 0 -1 3.000 0 0 0 0 0 5
+4950 6750 4950 6750 4725 6975 4725 6975 4950
 2 0 1 -1 -1 0 0 -1 0.000 0 0 0 0 0 5
 4950 5850 4725 5625 4725 5625 4950 5850 4950
+2 1 1 -1 -1 0 0 -1 3.000 0 0 0 0 0 5
+4950 6750 4950 6750 4725 6975 4725 6975 4950
+1 -1 0 0 0 10 0.0000 2 105 720 5550 4425 Processors\001
+1 -1 0 0 0 10 0.0000 2 120 1005 4200 3225 Blocked Tasks\001
+1 -1 0 0 0 10 0.0000 2 150 870 4200 3975 Ready Tasks\001
+1 -1 0 0 0 10 0.0000 2 135 1095 7350 1725 Other Cluster(s)\001
+1 -1 0 0 0 10 0.0000 2 105 840 4650 1725 User Cluster\001
+1 -1 0 0 0 10 0.0000 2 150 615 2175 3675 Manager\001
+1 -1 0 0 0 10 0.0000 2 105 990 2175 3525 Discrete-event\001
+1 -1 0 0 0 10 0.0000 2 135 795 2175 4350 preemption\001
+0 -1 0 0 0 10 0.0000 2 150 1290 2325 4875 generator/coroutine\001
+0 -1 0 0 0 10 0.0000 2 120 270 4050 4875 task\001
+0 -1 0 0 0 10 0.0000 2 105 450 7050 4875 cluster\001
+0 -1 0 0 0 10 0.0000 2 105 660 5925 4875 processor\001
+0 -1 0 0 0 10 0.0000 2 105 555 4875 4875 monitor\001
+1 -1 0 0 0 10 0.0000 2 135 900 5550 4425 Processors\001
+1 -1 0 0 0 10 0.0000 2 165 1170 4200 3975 Ready Threads\001
+1 -1 0 0 0 10 0.0000 2 165 1440 7350 1725 Other Cluster(s)\001
+1 -1 0 0 0 10 0.0000 2 135 1080 4650 1725 User Cluster\001
+1 -1 0 0 0 10 0.0000 2 165 630 2175 3675 Manager\001
+1 -1 0 0 0 10 0.0000 2 135 1260 2175 3525 Discrete-event\001
+1 -1 0 0 0 10 0.0000 2 150 900 2175 4350 preemption\001
+0 -1 0 0 0 10 0.0000 2 135 630 7050 4875 cluster\001
+1 -1 0 0 0 10 0.0000 2 135 1350 4200 3225 Blocked Threads\001
+0 -1 0 0 0 10 0.0000 2 135 540 4800 4875 thread\001
+0 -1 0 0 0 10 0.0000 2 120 810 5925 4875 processor\001
+0 -1 0 0 0 10 0.0000 2 165 1710 2325 4875 generator/coroutine\001

doc/user/Makefile

-              rae2c27a
+              rc76bd34
 
 ${DOCUMENT} : ${BASE}.ps
-	ps2pdf $<
+	ps2pdf -dPDFSETTINGS=/prepress $<
 
 ${BASE}.ps : ${BASE}.dvi

doc/user/user.tex

-              rae2c27a
+              rc76bd34
 %% Created On       : Wed Apr  6 14:53:29 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Fri Mar  6 13:34:52 2020
 %% Update Count     : 3924
+%% Last Modified On : Mon Oct  5 08:57:29 2020
+%% Update Count     : 3998
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \usepackage{upquote}                                                                    % switch curled `'" to straight
 \usepackage{calc}
-\usepackage{xspace}
 \usepackage{varioref}                                                                   % extended references
+\usepackage{listings}                                                                   % format program code
+\usepackage[labelformat=simple,aboveskip=0pt,farskip=0pt]{subfig}
+\renewcommand{\thesubfigure}{\alph{subfigure})}
 \usepackage[flushmargin]{footmisc}                                              % support label/reference in footnote
 \usepackage{latexsym}                                   % \Box glyph
 \usepackage{mathptmx}                                   % better math font with "times"
 \usepackage[usenames]{color}
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®},#1}}
+{}
+\newcommand{\CFALatin}{}
 % inline code ©...© (copyright symbol) emacs: C-q M-)
 % red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
 …
 % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
 % math escape $...$ (dollar symbol)
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\renewcommand\footnoterule{\kern -3pt\rule{0.3\linewidth}{0.15pt}\kern 2pt}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®},#1}}
+{}
+\newsavebox{\myboxA}
+\newsavebox{\myboxB}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \newcommand{\G}[1]{{\Textbf[OliveGreen]{#1}}}
 \newcommand{\KWC}{K-W C\xspace}
-\newsavebox{\LstBox}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 The signature feature of \CFA is \emph{\Index{overload}able} \Index{parametric-polymorphic} functions~\cite{forceone:impl,Cormack90,Duggan96} with functions generalized using a ©forall© clause (giving the language its name):
 \begin{lstlisting}
+\begin{cfa}
 ®forall( otype T )® T identity( T val ) { return val; }
 int forty_two = identity( 42 ); §\C{// T is bound to int, forty\_two == 42}§
 \end{lstlisting}
+\end{cfa}
 % extending the C type system with parametric polymorphism and overloading, as opposed to the \Index*[C++]{\CC{}} approach of object-oriented extensions.
 \CFA{}\hspace{1pt}'s polymorphism was originally formalized by \Index*{Glen Ditchfield}\index{Ditchfield, Glen}~\cite{Ditchfield92}, and first implemented by \Index*{Richard Bilson}\index{Bilson, Richard}~\cite{Bilson03}.
 …
 \begin{comment}
 A simple example is leveraging the existing type-unsafe (©void *©) C ©bsearch© to binary search a sorted floating array:
 \begin{lstlisting}
+\begin{cfa}
 void * bsearch( const void * key, const void * base, size_t dim, size_t size,
                                 int (* compar)( const void *, const void * ));
 …
 double key = 5.0, vals[10] = { /* 10 sorted floating values */ };
 double * val = (double *)bsearch( &key, vals, 10, sizeof(vals[0]), comp ); §\C{// search sorted array}§
 \end{lstlisting}
+\end{cfa}
 which can be augmented simply with polymorphic, type-safe, \CFA-overloaded wrappers:
 \begin{lstlisting}
+\begin{cfa}
 forall( otype T | { int ?<?( T, T ); } ) T * bsearch( T key, const T * arr, size_t size ) {
         int comp( const void * t1, const void * t2 ) { /* as above with double changed to T */ }
 …
 double * val = bsearch( 5.0, vals, 10 ); §\C{// selection based on return type}§
 int posn = bsearch( 5.0, vals, 10 );
 \end{lstlisting}
+\end{cfa}
 The nested function ©comp© provides the hidden interface from typed \CFA to untyped (©void *©) C, plus the cast of the result.
 Providing a hidden ©comp© function in \CC is awkward as lambdas do not use C calling-conventions and template declarations cannot appear at block scope.
 …
 \CFA has replacement libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions, all without rewriting the actual computations.
 For example, it is possible to write a type-safe \CFA wrapper ©malloc© based on the C ©malloc©:
 \begin{lstlisting}
+\begin{cfa}
 forall( dtype T | sized(T) ) T * malloc( void ) { return (T *)malloc( sizeof(T) ); }
 int * ip = malloc(); §\C{// select type and size from left-hand side}§
 double * dp = malloc();
 struct S {...} * sp = malloc();
 \end{lstlisting}
+\end{cfa}
 where the return type supplies the type/size of the allocation, which is impossible in most type systems.
 \end{comment}
 …
 the same level as a ©case© clause; the target label may be case ©default©, but only associated
 with the current ©switch©/©choose© statement.
-\subsection{Loop Control}
-The ©for©/©while©/©do-while© loop-control allows empty or simplified ranges (see Figure~\ref{f:LoopControlExamples}).
-\begin{itemize}
-\item
-The loop index is polymorphic in the type of the comparison value N (when the start value is implicit) or the start value M.
-\item
-An empty conditional implies comparison value of ©1© (true).
-\item
-A comparison N is implicit up-to exclusive range [0,N©®)®©.
-\item
-A comparison ©=© N is implicit up-to inclusive range [0,N©®]®©.
-\item
-The up-to range M ©~©\index{~@©~©} N means exclusive range [M,N©®)®©.
-\item
-The up-to range M ©~=©\index{~=@©~=©} N means inclusive range [M,N©®]®©.
-\item
-The down-to range M ©-~©\index{-~@©-~©} N means exclusive range [N,M©®)®©.
-\item
-The down-to range M ©-~=©\index{-~=@©-~=©} N means inclusive range [N,M©®]®©.
-\item
-©0© is the implicit start value;
-\item
-©1© is the implicit increment value.
-\item
-The up-to range uses operator ©+=© for increment;
-\item
-The down-to range uses operator ©-=© for decrement.
-\item
-©@© means put nothing in this field.
-\item
-©:© means start another index.
-\end{itemize}
 \begin{figure}
 …
+\subsection{Loop Control}
+The ©for©/©while©/©do-while© loop-control allows empty or simplified ranges (see Figure~\ref{f:LoopControlExamples}).
+\begin{itemize}
+\item
+The loop index is polymorphic in the type of the comparison value N (when the start value is implicit) or the start value M.
+\item
+An empty conditional implies comparison value of ©1© (true).
+\item
+A comparison N is implicit up-to exclusive range [0,N©®)®©.
+\item
+A comparison ©=© N is implicit up-to inclusive range [0,N©®]®©.
+\item
+The up-to range M ©~©\index{~@©~©} N means exclusive range [M,N©®)®©.
+\item
+The up-to range M ©~=©\index{~=@©~=©} N means inclusive range [M,N©®]®©.
+\item
+The down-to range M ©-~©\index{-~@©-~©} N means exclusive range [N,M©®)®©.
+\item
+The down-to range M ©-~=©\index{-~=@©-~=©} N means inclusive range [N,M©®]®©.
+\item
+©0© is the implicit start value;
+\item
+©1© is the implicit increment value.
+\item
+The up-to range uses operator ©+=© for increment;
+\item
+The down-to range uses operator ©-=© for decrement.
+\item
+©@© means put nothing in this field.
+\item
+©:© means start another index.
+\end{itemize}
 %\subsection{\texorpdfstring{Labelled \protect\lstinline@continue@ / \protect\lstinline@break@}{Labelled continue / break}}
 \subsection{\texorpdfstring{Labelled \LstKeywordStyle{continue} / \LstKeywordStyle{break} Statement}{Labelled continue / break Statement}}
 …
 for ©break©, the target label can also be associated with a ©switch©, ©if© or compound (©{}©) statement.
 \VRef[Figure]{f:MultiLevelExit} shows ©continue© and ©break© indicating the specific control structure, and the corresponding C program using only ©goto© and labels.
 The innermost loop has 7 exit points, which cause continuation or termination of one or more of the 7 \Index{nested control-structure}s.
+The innermost loop has 8 exit points, which cause continuation or termination of one or more of the 7 \Index{nested control-structure}s.
 \begin{figure}
+\begin{tabular}{@{\hspace{\parindentlnth}}l@{\hspace{\parindentlnth}}l@{\hspace{\parindentlnth}}l@{}}
+\multicolumn{1}{@{\hspace{\parindentlnth}}c@{\hspace{\parindentlnth}}}{\textbf{\CFA}}   & \multicolumn{1}{@{\hspace{\parindentlnth}}c}{\textbf{C}}      \\
+\begin{cfa}
+®LC:® {
+        ... §declarations§ ...
+        ®LS:® switch ( ... ) {
+          case 3:
+                ®LIF:® if ( ... ) {
+                        ®LF:® for ( ... ) {
+                                ®LW:® while ( ... ) {
+                                        ... break ®LC®; ...
+                                        ... break ®LS®; ...
+                                        ... break ®LIF®; ...
+                                        ... continue ®LF;® ...
+                                        ... break ®LF®; ...
+                                        ... continue ®LW®; ...
+                                        ... break ®LW®; ...
+                                } // while
+                        } // for
+                } else {
+                        ... break ®LIF®; ...
+                } // if
+        } // switch
+\centering
+\begin{lrbox}{\myboxA}
+\begin{cfa}[tabsize=3]
+®Compound:® {
+        ®Try:® try {
+                ®For:® for ( ... ) {
+                        ®While:® while ( ... ) {
+                                ®Do:® do {
+                                        ®If:® if ( ... ) {
+                                                ®Switch:® switch ( ... ) {
+                                                        case 3:
+                                                                ®break Compound®;
+                                                                ®break Try®;
+                                                                ®break For®;      /* or */  ®continue For®;
+                                                                ®break While®;  /* or */  ®continue While®;
+                                                                ®break Do®;      /* or */  ®continue Do®;
+                                                                ®break If®;
+                                                                ®break Switch®;
+                                                        } // switch
+                                                } else {
+                                                        ... ®break If®; ...     // terminate if
+                                                } // if
+                                } while ( ... ); // do
+                        } // while
+                } // for
+        } ®finally® { // always executed
+        } // try
 } // compound
 \end{cfa}
+&
+\begin{cfa}
+\end{lrbox}
+\begin{lrbox}{\myboxB}
+\begin{cfa}[tabsize=3]
+{
+        ... §declarations§ ...
+        switch ( ... ) {
+          case 3:
+                if ( ... ) {
+                        for ( ... ) {
+                                while ( ... ) {
+                                        ... goto ®LC®; ...
+                                        ... goto ®LS®; ...
+                                        ... goto ®LIF®; ...
+                                        ... goto ®LFC®; ...
+                                        ... goto ®LFB®; ...
+                                        ... goto ®LWC®; ...
+                                        ... goto ®LWB®; ...
+                                  ®LWC®: ; } ®LWB:® ;
+                          ®LFC:® ; } ®LFB:® ;
+                } else {
+                        ... goto ®LIF®; ...
+                } ®LIF:® ;
+        } ®LS:® ;
+} ®LC:® ;
+\end{cfa}
+&
+\begin{cfa}
+// terminate compound
+// terminate switch
+// terminate if
+// continue loop
+// terminate loop
+// continue loop
+// terminate loop
+// terminate if
+\end{cfa}
+\end{tabular}
+                ®ForC:® for ( ... ) {
+                        ®WhileC:® while ( ... ) {
+                                ®DoC:® do {
+                                        if ( ... ) {
+                                                switch ( ... ) {
+                                                        case 3:
+                                                                ®goto Compound®;
+                                                                ®goto Try®;
+                                                                ®goto ForB®;      /* or */  ®goto ForC®;
+                                                                ®goto WhileB®;  /* or */  ®goto WhileC®;
+                                                                ®goto DoB®;      /* or */  ®goto DoC®;
+                                                                ®goto If®;
+                                                                ®goto Switch®;
+                                                        } ®Switch:® ;
+                                                } else {
+                                                        ... ®goto If®; ...      // terminate if
+                                                } ®If:®;
+                                } while ( ... ); ®DoB:® ;
+                        } ®WhileB:® ;
+                } ®ForB:® ;
+} ®Compound:® ;
+\end{cfa}
+\end{lrbox}
+\subfloat[\CFA]{\label{f:CFibonacci}\usebox\myboxA}
+\hspace{2pt}
+\vrule
+\hspace{2pt}
+\subfloat[C]{\label{f:CFAFibonacciGen}\usebox\myboxB}
 \caption{Multi-level Exit}
 \label{f:MultiLevelExit}
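The goto column of the figure can be tried in plain C. The following is a minimal sketch, not code from the manual: the function `first_row_with` and the labels `NextRow` and `Done` are hypothetical names chosen for illustration. `goto NextRow` plays the role of a labelled `continue` on the outer loop, and `goto Done` plays the role of a labelled `break` out of both loops.

```c
#include <assert.h>

/* Hypothetical helper: return the index of the first row that contains key
   before any negative entry in that row, or -1 if there is no such row. */
int first_row_with( int grid[][3], int rows, int key ) {
    int found = -1;
    for ( int r = 0; r < rows; r += 1 ) {
        for ( int c = 0; c < 3; c += 1 ) {
            if ( grid[r][c] < 0 ) goto NextRow;     /* like "continue Rows" */
            if ( grid[r][c] == key ) {
                found = r;
                goto Done;                          /* like "break Rows" */
            }
        }
      NextRow: ;                                    /* end of outer-loop body */
    }
  Done: ;                                           /* just past the outer loop */
    return found;
}
```

Each target label is followed by an empty statement, as in the figure, so the label can sit directly before a closing brace.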
 …
 try {
         f(...);
 } catch( E e ; §boolean-predicate§ ) {          §\C[8cm]{// termination handler}§
+} catch( E e ; §boolean-predicate§ ) {          §\C{// termination handler}§
         // recover and continue
 } catchResume( E e ; §boolean-predicate§ ) { §\C{// resumption handler}\CRT§
+} catchResume( E e ; §boolean-predicate§ ) { §\C{// resumption handler}§
         // repair and return
 } finally {
 …
 For implicit formatted input, the common case is reading a sequence of values separated by whitespace, where the type of an input constant must match with the type of the input variable.
 \begin{cquote}
 \begin{lrbox}{\LstBox}
+\begin{lrbox}{\myboxA}
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 int x;   double y;   char z;
 …
 \end{lrbox}
 \begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{3em}}l@{}}
 \multicolumn{1}{@{}l@{}}{\usebox\LstBox} \\
+\multicolumn{1}{@{}l@{}}{\usebox\myboxA} \\
 \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CFA}}        & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CC}}       & \multicolumn{1}{c}{\textbf{Python}}   \\
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 …
 For example, an initial alignment and fill capability are preserved during a resize copy so the copy has the same alignment and extended storage is filled.
 Without sticky properties it is dangerous to use ©realloc©, resulting in an idiom of manually performing the reallocation to maintain correctness.
+\begin{cfa}
+\end{cfa}
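The manual idiom mentioned above might look like the following in standard C. This is a sketch under stated assumptions, not the \CFA implementation: `resize_aligned` is a hypothetical helper, the caller must remember the old size, alignment and fill value itself, and `align` is assumed to be a power of two supported by `aligned_alloc`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resize a block by hand so that alignment and fill are
   preserved, which plain realloc does not promise. Returns NULL on failure
   and leaves the old block untouched in that case. */
void * resize_aligned( void * old, size_t old_size, size_t new_size,
                       size_t align, unsigned char fill ) {
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    size_t rounded = ( new_size + align - 1 ) / align * align;
    void * fresh = aligned_alloc( align, rounded );
    if ( fresh == NULL ) return NULL;

    /* Copy the surviving prefix, then fill only the extended storage. */
    size_t keep = old == NULL ? 0 : ( old_size < new_size ? old_size : new_size );
    if ( keep > 0 ) memcpy( fresh, old, keep );
    if ( new_size > keep ) memset( (char *)fresh + keep, fill, new_size - keep );

    free( old );
    return fresh;
}
```

Every call site has to pass the same alignment and fill again, which is exactly the bookkeeping that sticky properties are meant to remove.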
 \CFA memory management extends allocation to support constructors for initialization of allocated storage, \eg in
 …
         // §\CFA§ safe general allocation, fill, resize, alignment, array
+        T * alloc( void );§\indexc{alloc}§
+        T * alloc( size_t dim );
+        T * alloc( T ptr[], size_t dim );
+        T * alloc_set( char fill );§\indexc{alloc_set}§
+        T * alloc_set( T fill );
+        T * alloc_set( size_t dim, char fill );
+        T * alloc_set( size_t dim, T fill );
+        T * alloc_set( size_t dim, const T fill[] );
+        T * alloc_set( T ptr[], size_t dim, char fill );
+        T * alloc_align( size_t align );
+        T * alloc_align( size_t align, size_t dim );
+        T * alloc_align( T ptr[], size_t align ); // aligned realloc array
+        T * alloc_align( T ptr[], size_t align, size_t dim ); // aligned realloc array
+        T * alloc_align_set( size_t align, char fill );
+        T * alloc_align_set( size_t align, T fill );
+        T * alloc_align_set( size_t align, size_t dim, char fill );
+        T * alloc_align_set( size_t align, size_t dim, T fill );
+        T * alloc_align_set( size_t align, size_t dim, const T fill[] );
+        T * alloc_align_set( T ptr[], size_t align, size_t dim, char fill );
+        T * alloc( void );§\indexc{alloc}§                                      §\C[3.5in]{// variable, T size}§
+        T * alloc( size_t dim );                                                        §\C{// array[dim], T size elements}§
+        T * alloc( T ptr[], size_t dim );                                       §\C{// realloc array[dim], T size elements}§
+        T * alloc_set( char fill );§\indexc{alloc_set}§         §\C{// variable, T size, fill bytes with value}§
+        T * alloc_set( T fill );                                                        §\C{// variable, T size, fill with value}§
+        T * alloc_set( size_t dim, char fill );                         §\C{// array[dim], T size elements, fill bytes with value}§
+        T * alloc_set( size_t dim, T fill );                            §\C{// array[dim], T size elements, fill elements with value}§
+        T * alloc_set( size_t dim, const T fill[] );            §\C{// array[dim], T size elements, fill elements with array}§
+        T * alloc_set( T ptr[], size_t dim, char fill );        §\C{// realloc array[dim], T size elements, fill bytes with value}§
+        T * alloc_align( size_t align );                                        §\C{// aligned variable, T size}§
+        T * alloc_align( size_t align, size_t dim );            §\C{// aligned array[dim], T size elements}§
+        T * alloc_align( T ptr[], size_t align );                       §\C{// realloc new aligned array}§
+        T * alloc_align( T ptr[], size_t align, size_t dim ); §\C{// realloc new aligned array[dim]}§
+        T * alloc_align_set( size_t align, char fill );         §\C{// aligned variable, T size, fill bytes with value}§
+        T * alloc_align_set( size_t align, T fill );            §\C{// aligned variable, T size, fill with value}§
+        T * alloc_align_set( size_t align, size_t dim, char fill ); §\C{// aligned array[dim], T size elements, fill bytes with value}§
+        T * alloc_align_set( size_t align, size_t dim, T fill ); §\C{// aligned array[dim], T size elements, fill elements with value}§
+        T * alloc_align_set( size_t align, size_t dim, const T fill[] ); §\C{// aligned array[dim], T size elements, fill elements with array}§
+        T * alloc_align_set( T ptr[], size_t align, size_t dim, char fill ); §\C{// realloc new aligned array[dim], fill new bytes with value}§
         // §\CFA§ safe initialization/copy, i.e., implicit size specification

libcfa/configure.ac

-              rae2c27a
+              rc76bd34
 AH_TEMPLATE([CFA_HAVE_IORING_OP_PROVIDE_BUFFERS],[Defined if io_uring support is present when compiling libcfathread and supports the operation IORING_OP_PROVIDE_BUFFERS.])
 AH_TEMPLATE([CFA_HAVE_IORING_OP_REMOVE_BUFFER],[Defined if io_uring support is present when compiling libcfathread and supports the operation IORING_OP_REMOVE_BUFFER.])
+AH_TEMPLATE([CFA_HAVE_IORING_OP_TEE],[Defined if io_uring support is present when compiling libcfathread and supports the operation IORING_OP_TEE.])
 AH_TEMPLATE([CFA_HAVE_IOSQE_FIXED_FILE],[Defined if io_uring support is present when compiling libcfathread and supports the flag FIXED_FILE.])
 AH_TEMPLATE([CFA_HAVE_IOSQE_IO_DRAIN],[Defined if io_uring support is present when compiling libcfathread and supports the flag IO_DRAIN.])
 …
 AH_TEMPLATE([CFA_HAVE_SPLICE_F_FD_IN_FIXED],[Defined if io_uring support is present when compiling libcfathread and supports the flag SPLICE_F_FD_IN_FIXED.])
 AH_TEMPLATE([CFA_HAVE_IORING_SETUP_ATTACH_WQ],[Defined if io_uring support is present when compiling libcfathread and supports the flag IORING_SETUP_ATTACH_WQ.])
+AH_TEMPLATE([HAVE_PREADV2],[Defined if preadv2 support is present when compiling libcfathread.])
+AH_TEMPLATE([HAVE_PWRITEV2],[Defined if pwritev2 support is present when compiling libcfathread.])
+AH_TEMPLATE([CFA_HAVE_PREADV2],[Defined if preadv2 support is present when compiling libcfathread.])
+AH_TEMPLATE([CFA_HAVE_PWRITEV2],[Defined if pwritev2 support is present when compiling libcfathread.])
+AH_TEMPLATE([CFA_HAVE_STATX],[Defined if statx support is present when compiling libcfathread.])
+AH_TEMPLATE([CFA_HAVE_OPENAT2],[Defined if openat2 support is present when compiling libcfathread.])
 AH_TEMPLATE([__CFA_NO_STATISTICS__],[Defined if libcfathread was compiled without support for statistics.])
 define(ioring_ops, [IORING_OP_NOP,IORING_OP_READV,IORING_OP_WRITEV,IORING_OP_FSYNC,IORING_OP_READ_FIXED,IORING_OP_WRITE_FIXED,IORING_OP_POLL_ADD,IORING_OP_POLL_REMOVE,IORING_OP_SYNC_FILE_RANGE,IORING_OP_SENDMSG,IORING_OP_RECVMSG,IORING_OP_TIMEOUT,IORING_OP_TIMEOUT_REMOVE,IORING_OP_ACCEPT,IORING_OP_ASYNC_CANCEL,IORING_OP_LINK_TIMEOUT,IORING_OP_CONNECT,IORING_OP_FALLOCATE,IORING_OP_OPENAT,IORING_OP_CLOSE,IORING_OP_FILES_UPDATE,IORING_OP_STATX,IORING_OP_READ,IORING_OP_WRITE,IORING_OP_FADVISE,IORING_OP_MADVISE,IORING_OP_SEND,IORING_OP_RECV,IORING_OP_OPENAT2,IORING_OP_EPOLL_CTL,IORING_OP_SPLICE,IORING_OP_PROVIDE_BUFFERS,IORING_OP_REMOVE_BUFFER])
+define(ioring_ops, [IORING_OP_NOP,IORING_OP_READV,IORING_OP_WRITEV,IORING_OP_FSYNC,IORING_OP_READ_FIXED,IORING_OP_WRITE_FIXED,IORING_OP_POLL_ADD,IORING_OP_POLL_REMOVE,IORING_OP_SYNC_FILE_RANGE,IORING_OP_SENDMSG,IORING_OP_RECVMSG,IORING_OP_TIMEOUT,IORING_OP_TIMEOUT_REMOVE,IORING_OP_ACCEPT,IORING_OP_ASYNC_CANCEL,IORING_OP_LINK_TIMEOUT,IORING_OP_CONNECT,IORING_OP_FALLOCATE,IORING_OP_OPENAT,IORING_OP_CLOSE,IORING_OP_FILES_UPDATE,IORING_OP_STATX,IORING_OP_READ,IORING_OP_WRITE,IORING_OP_FADVISE,IORING_OP_MADVISE,IORING_OP_SEND,IORING_OP_RECV,IORING_OP_OPENAT2,IORING_OP_EPOLL_CTL,IORING_OP_SPLICE,IORING_OP_PROVIDE_BUFFERS,IORING_OP_REMOVE_BUFFER,IORING_OP_TEE])
 define(ioring_flags, [IOSQE_FIXED_FILE,IOSQE_IO_DRAIN,IOSQE_ASYNC,IOSQE_IO_LINK,IOSQE_IO_HARDLINK,SPLICE_F_FD_IN_FIXED,IORING_SETUP_ATTACH_WQ])
 …
         ])
 ])
+AC_CHECK_FUNCS([preadv2 pwritev2])
+AC_CHECK_FUNC([preadv2], [AC_DEFINE([CFA_HAVE_PREADV2])])
+AC_CHECK_FUNC([pwritev2], [AC_DEFINE([CFA_HAVE_PWRITEV2])])
 AC_CONFIG_FILES([
 …
         prelude/Makefile
         ])
+AC_CONFIG_FILES([src/concurrency/io/call.cfa], [python3 ${srcdir}/src/concurrency/io/call.cfa.in > src/concurrency/io/call.cfa])
 AC_CONFIG_HEADERS(prelude/defines.hfa)

libcfa/prelude/defines.hfa.in

-              rae2c27a
+              rc76bd34
 /* Defined if io_uring support is present when compiling libcfathread and
+   supports the operation IORING_OP_TEE. */
+#undef CFA_HAVE_IORING_OP_TEE
+/* Defined if io_uring support is present when compiling libcfathread and
    supports the operation IORING_OP_TIMEOUT. */
 #undef CFA_HAVE_IORING_OP_TIMEOUT
 …
 #undef CFA_HAVE_LINUX_IO_URING_H
+/* Defined if openat2 support is present when compiling libcfathread. */
+#undef CFA_HAVE_OPENAT2
+/* Defined if preadv2 support is present when compiling libcfathread. */
+#undef CFA_HAVE_PREADV2
+/* Defined if pwritev2 support is present when compiling libcfathread. */
+#undef CFA_HAVE_PWRITEV2
 /* Defined if io_uring support is present when compiling libcfathread and
    supports the flag SPLICE_F_FD_IN_FIXED. */
 #undef CFA_HAVE_SPLICE_F_FD_IN_FIXED
+/* Defined if statx support is present when compiling libcfathread. */
+#undef CFA_HAVE_STATX
 /* Location of include files. */
 #undef CFA_INCDIR
 …
 #undef HAVE_MEMORY_H
-/* Define to 1 if you have the `preadv2' function. */
-#undef HAVE_PREADV2
-/* Define to 1 if you have the `pwritev2' function. */
-#undef HAVE_PWRITEV2
 /* Define to 1 if you have the <stdint.h> header file. */
 #undef HAVE_STDINT_H

libcfa/src/Makefile.am

-              rae2c27a
+              rc76bd34
         iterator.hfa \
         limits.hfa \
+        memory.hfa \
         parseargs.hfa \
         rational.hfa \
 …
 inst_thread_headers_nosrc = \
         bits/random.hfa \
+        concurrency/clib/cfathread.h \
         concurrency/invoke.h \
         concurrency/kernel/fwd.hfa
 …
         concurrency/alarm.cfa \
         concurrency/alarm.hfa \
+        concurrency/clib/cfathread.cfa \
         concurrency/CtxSwitch-@ARCHITECTURE@.S \
         concurrency/invoke.c \
 …
         concurrency/io/setup.cfa \
         concurrency/io/types.hfa \
         concurrency/iocall.cfa \
+        concurrency/io/call.cfa \
         concurrency/iofwd.hfa \
         concurrency/kernel_private.hfa \

libcfa/src/bits/locks.hfa

-              rae2c27a
+              rc76bd34
         struct $thread;
         extern void park( __cfaabi_dbg_ctx_param );
         extern void unpark( struct $thread * this __cfaabi_dbg_ctx_param2 );
+        extern void park( void );
+        extern void unpark( struct $thread * this );
         static inline struct $thread * active_thread ();
 …
                                         /* paranoid */ verify( expected == 0p );
                                         if(__atomic_compare_exchange_n(&this.ptr, &expected, active_thread(), false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
                                                 park( __cfaabi_dbg_ctx );
+                                                park();
                                                 return true;
+                                        }
 …
                                 else {
                                         if(__atomic_compare_exchange_n(&this.ptr, &expected, 0p, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
                                                 unpark( expected __cfaabi_dbg_ctx2 );
+                                                unpark( expected );
                                                 return true;
+                                        }
 …
                                 /* paranoid */ verify( expected == 0p );
                                 if(__atomic_compare_exchange_n(&this.ptr, &expected, active_thread(), false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
                                         park( __cfaabi_dbg_ctx );
+                                        park();
                                         /* paranoid */ verify( this.ptr == 1p );
                                         return true;
 …
                         struct $thread * got = __atomic_exchange_n( &this.ptr, 1p, __ATOMIC_SEQ_CST);
                         if( got == 0p ) return false;
                         unpark( got __cfaabi_dbg_ctx2 );
+                        unpark( got );
                         return true;
+                }
 …
                                 struct oneshot * expected = this.ptr;
                                 // was this abandoned?
+                                if( expected == 3p ) { free( &this ); return false; }
+                                #if defined(__GNUC__) && __GNUC__ >= 7
+                                        #pragma GCC diagnostic push
+                                        #pragma GCC diagnostic ignored "-Wfree-nonheap-object"
+                                #endif
+                                        if( expected == 3p ) { free( &this ); return false; }
+                                #if defined(__GNUC__) && __GNUC__ >= 7
+                                        #pragma GCC diagnostic pop
+                                #endif
                                 /* paranoid */ verify( expected != 1p ); // Future is already fulfilled, should not happen

libcfa/src/concurrency/CtxSwitch-i386.S

-              rae2c27a
+              rc76bd34
 // Created On       : Tue Dec 6 12:27:26 2016
 // Last Modified By : Peter A. Buhr
 // Last Modified On : Sun Aug 16 08:46:22 2020
 // Update Count     : 4
+// Last Modified On : Sun Sep  6 18:23:37 2020
+// Update Count     : 5
 //
 …
         // Copy the "from" context argument from the stack to register eax
         // Return address is at 0(%esp), with parameters following
+        // Return address is at 0(%esp), with parameters following.
         movl 4(%esp),%eax
 …
         movl %ebp,FP_OFFSET(%eax)
         // Copy the "to" context argument from the stack to register eax
         // Having pushed three words (= 12 bytes) on the stack, the
         // argument is now at 8 + 12 = 20(%esp)
+        // Copy the "to" context argument from the stack to register eax. Having
+        // pushed 3 words (= 12 bytes) on the stack, the argument is now at
+        // 8 + 12 = 20(%esp).
         movl 20(%esp),%eax

libcfa/src/concurrency/alarm.cfa

-              rae2c27a
+              rc76bd34
         register_self( &node );
-        park( __cfaabi_dbg_ctx );
+        park();
         /* paranoid */ verify( !node.set );

libcfa/src/concurrency/coroutine.cfa

-              rae2c27a
+              rc76bd34
 //-----------------------------------------------------------------------------
+FORALL_DATA_INSTANCE(CoroutineCancelled,
+                (dtype coroutine_t | sized(coroutine_t)), (coroutine_t))
+struct __cfaehm_node {
+        struct _Unwind_Exception unwind_exception;
+        struct __cfaehm_node * next;
+        int handler_index;
+};
+forall(dtype T)
+void mark_exception(CoroutineCancelled(T) *) {}
+forall(dtype T | sized(T))
+void copy(CoroutineCancelled(T) * dst, CoroutineCancelled(T) * src) {
+        dst->the_coroutine = src->the_coroutine;
+        dst->the_exception = src->the_exception;
+}
+forall(dtype T)
+const char * msg(CoroutineCancelled(T) *) {
+        return "CoroutineCancelled(...)";
+}
+// This code should not be inlined. It is the error path on resume.
+forall(dtype T | is_coroutine(T))
+void __cfaehm_cancelled_coroutine( T & cor, $coroutine * desc ) {
+        verify( desc->cancellation );
+        desc->state = Cancelled;
+        exception_t * except = (exception_t *)(1 + (__cfaehm_node *)desc->cancellation);
+        CoroutineCancelled(T) except;
+        except.the_coroutine = &cor;
+        except.the_exception = except;
+        throwResume except;
+        except->virtual_table->free( except );
+        free( desc->cancellation );
+        desc->cancellation = 0p;
+}
+//-----------------------------------------------------------------------------
 // Global state variables
 …
         this->storage->limit = storage;
         this->storage->base  = (void*)((intptr_t)storage + size);
+        this->storage->exception_context.top_resume = 0p;
+        this->storage->exception_context.current_exception = 0p;
         __attribute__((may_alias)) intptr_t * istorage = (intptr_t*)&this->storage;
         *istorage |= userStack ? 0x1 : 0x0;

libcfa/src/concurrency/coroutine.hfa

-              rae2c27a
+              rc76bd34
 #include <assert.h>
 #include "invoke.h"
+#include "../exception.hfa"
+//-----------------------------------------------------------------------------
+// Exception thrown from resume when a coroutine stack is cancelled.
+// Should not have to be sized (see trac #196).
+FORALL_DATA_EXCEPTION(CoroutineCancelled,
+                (dtype coroutine_t | sized(coroutine_t)), (coroutine_t)) (
+        coroutine_t * the_coroutine;
+        exception_t * the_exception;
+);
+forall(dtype T)
+void mark_exception(CoroutineCancelled(T) *);
+forall(dtype T | sized(T))
+void copy(CoroutineCancelled(T) * dst, CoroutineCancelled(T) * src);
+forall(dtype T)
+const char * msg(CoroutineCancelled(T) *);
 //-----------------------------------------------------------------------------
 …
 // Anything that implements this trait can be resumed.
 // Anything that is resumed is a coroutine.
+trait is_coroutine(dtype T) {
+      void main(T & this);
+      $coroutine * get_coroutine(T & this);
+trait is_coroutine(dtype T | sized(T)
+                | is_resumption_exception(CoroutineCancelled(T))
+                | VTABLE_ASSERTION(CoroutineCancelled, (T))) {
+        void main(T & this);
+        $coroutine * get_coroutine(T & this);
 };
 …
+        }
+}
+forall(dtype T | is_coroutine(T))
+void __cfaehm_cancelled_coroutine( T & cor, $coroutine * desc );
 // Resume implementation inlined for performance
 …
         // always done for performance testing
         $ctx_switch( src, dst );
+        if ( unlikely(dst->cancellation) ) {
+                __cfaehm_cancelled_coroutine( cor, dst );
+        }
         return cor;

libcfa/src/concurrency/exception.cfa

-              rae2c27a
+              rc76bd34
 STOP_AT_END_FUNCTION(coroutine_cancelstop,
+        // TODO: Instead pass information to the last resumer.
+        struct $coroutine * src = ($coroutine *)stop_param;
+        struct $coroutine * dst = src->last;
+        $ctx_switch( src, dst );
         abort();
+)

libcfa/src/concurrency/exception.hfa

-              rae2c27a
+              rc76bd34
 #include "bits/defs.hfa"
 #include "invoke.h"
-struct _Unwind_Exception;
-// It must also be usable as a C header file.
 #ifdef __cforall
 extern "C" {
+#define HIDE_EXPORTS
 #endif
+#include "unwind.h"
 struct exception_context_t * this_exception_context(void) OPTIONAL_THREAD;
 …
 #ifdef __cforall
+#undef HIDE_EXPORTS
+}
 #endif

libcfa/src/concurrency/invoke.h

-              rae2c27a
+              rc76bd34
         };
         enum __Coroutine_State { Halted, Start, Primed, Blocked, Ready, Active };
+        enum __Coroutine_State { Halted, Start, Primed, Blocked, Ready, Active, Cancelled };
         struct $coroutine {
 …
         };
+        // Wrapper for gdb
+        struct cfathread_coroutine_t { struct $coroutine debug; };
         static inline struct __stack_t * __get_stack( struct $coroutine * cor ) {
 …
                 struct __condition_node_t * dtor_node;
         };
+        // Wrapper for gdb
+        struct cfathread_monitor_t { struct $monitor debug; };
         struct __monitor_group_t {
 …
                 } node;
+                #ifdef __CFA_DEBUG__
+                        // previous function to park/unpark the thread
+                        const char * park_caller;
+                        int park_result;
+                        enum __Coroutine_State park_state;
+                        bool park_stale;
+                        const char * unpark_caller;
+                        int unpark_result;
+                        enum __Coroutine_State unpark_state;
+                        bool unpark_stale;
+                #if defined( __CFA_WITH_VERIFY__ )
+                        unsigned long long canary;
                 #endif
         };
+        // Wrapper for gdb
+        struct cfathread_thread_t { struct $thread debug; };
         #ifdef __CFA_DEBUG__

libcfa/src/concurrency/io.cfa

-              rae2c27a
+              rc76bd34
                 if( block ) {
                         enable_interrupts( __cfaabi_dbg_ctx );
                         park( __cfaabi_dbg_ctx );
+                        park();
                         disable_interrupts();
+                }
 …
                 if(nextt) {
                         unpark( nextt __cfaabi_dbg_ctx2 );
+                        unpark( nextt );
                         enable_interrupts( __cfaabi_dbg_ctx );
                         return true;
 …
         static inline void process(struct io_uring_cqe & cqe ) {
+                struct __io_user_data_t * data = (struct __io_user_data_t *)(uintptr_t)cqe.user_data;
+                __cfadbg_print_safe( io, "Kernel I/O : Syscall completed : cqe %p, result %d for %p\n", data, cqe.res, data->thrd );
+                data->result = cqe.res;
+                post( data->sem );
+                struct io_future_t * future = (struct io_future_t *)(uintptr_t)cqe.user_data;
+                __cfadbg_print_safe( io, "Kernel I/O : Syscall completed : cqe %p, result %d for %p\n", future, cqe.res, data->thrd );
+                fulfil( *future, cqe.res );
+        }

libcfa/src/concurrency/io/setup.cfa

-              rae2c27a
+              rc76bd34
         static void * iopoll_loop( __attribute__((unused)) void * args ) {
                 __processor_id_t id;
+                id.full_proc = false;
                 id.id = doregister(&id);
                 __cfaabi_dbg_print_safe( "Kernel : IO poller thread starting\n" );
 …
                                         thrd.link.next = 0p;
                                         thrd.link.prev = 0p;
-                                        __cfaabi_dbg_debug_do( thrd.unpark_stale = true );
                                         // Fixup the thread state
 …
                                 // unpark the fast io_poller
                                 unpark( &thrd __cfaabi_dbg_ctx2 );
+                                unpark( &thrd );
+                        }
                         else {
 …
+                        }
                 } else {
                         unpark( &thrd __cfaabi_dbg_ctx2 );
+                        unpark( &thrd );
+                }

libcfa/src/concurrency/io/types.hfa

-              rae2c27a
+              rc76bd34
 #pragma once
+extern "C" {
+        #include <linux/types.h>
+}
+#include "bits/locks.hfa"
 #if defined(CFA_HAVE_LINUX_IO_URING_H)
-        extern "C" {
-                #include <linux/types.h>
+        }
-      #include "bits/locks.hfa"
         #define LEADER_LOCK
         struct __leaderlock_t {
 …
         };
-        //-----------------------------------------------------------------------
-        // IO user data
-        struct __io_user_data_t {
-                __s32 result;
-                oneshot sem;
-        };
         //-----------------------------------------------------------------------
         // Misc
 …
         void __ioctx_prepare_block($io_ctx_thread & ctx, struct epoll_event & ev);
 #endif
+//-----------------------------------------------------------------------
+// IO user data
+struct io_future_t {
+        future_t self;
+        __s32 result;
+};
+static inline {
+        bool fulfil( io_future_t & this, __s32 result ) {
+                this.result = result;
+                return fulfil(this.self);
+        }
+        // Wait for the future to be fulfilled
+        bool wait( io_future_t & this ) {
+                return wait(this.self);
+        }
+}
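The pattern in this hunk, together with the `process` change in io.cfa, is that the submitter stores a pointer to its future in the 64-bit `user_data` field and the completion handler casts it back and fulfils the future with the result. A single-threaded C sketch of that round trip follows. All names (`demo_future`, `demo_cqe`, `demo_submit`, `demo_process`) are hypothetical stand-ins, and a plain flag replaces the real `future_t`, which blocks and wakes threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for io_future_t and struct io_uring_cqe. */
struct demo_future { bool fulfilled; int32_t result; };
struct demo_cqe    { uint64_t user_data; int32_t res; };

/* Submission side: reset the future and encode its address as user data. */
static inline uint64_t demo_submit( struct demo_future * f ) {
    f->fulfilled = false;
    f->result = 0;
    return (uint64_t)(uintptr_t)f;
}

/* Completion side: recover the future and publish the result once.
   Returns false if the future was already fulfilled. */
static inline bool demo_process( struct demo_cqe cqe ) {
    struct demo_future * f = (struct demo_future *)(uintptr_t)cqe.user_data;
    if ( f->fulfilled ) return false;
    f->result = cqe.res;        /* store the result before marking it ready */
    f->fulfilled = true;
    return true;
}
```

Going through `uintptr_t` keeps the pointer-to-integer conversion well defined on targets where pointers are narrower than 64 bits.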

libcfa/src/concurrency/iofwd.hfa

-              rae2c27a
+              rc76bd34
 struct cluster;
+struct io_future_t;
 struct io_context;
 struct io_cancellation;
 …
 struct statx;
+extern ssize_t cfa_preadv2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_pwritev2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_fsync(int fd, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_sync_file_range(int fd, int64_t offset, int64_t nbytes, unsigned int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_sendmsg(int sockfd, const struct msghdr *msg, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_recvmsg(int sockfd, struct msghdr *msg, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_send(int sockfd, const void *buf, size_t len, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_recv(int sockfd, void *buf, size_t len, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_fallocate(int fd, int mode, uint64_t offset, uint64_t len, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_fadvise(int fd, uint64_t offset, uint64_t len, int advice, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_madvise(void *addr, size_t length, int advice, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_openat(int dirfd, const char *pathname, int flags, mode_t mode, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_close(int fd, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern int cfa_statx(int dirfd, const char *pathname, int flags, unsigned int mask, struct statx *statxbuf, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_read(int fd, void *buf, size_t count, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_write(int fd, void *buf, size_t count, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_splice(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+extern ssize_t cfa_tee(int fd_in, int fd_out, size_t len, unsigned int flags, int submit_flags = 0, Duration timeout = -1`s, io_cancellation * cancellation = 0p, io_context * context = 0p);
+//----------
+// synchronous calls
+#if defined(CFA_HAVE_PREADV2)
+        extern ssize_t cfa_preadv2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+#endif
+#if defined(CFA_HAVE_PWRITEV2)
+        extern ssize_t cfa_pwritev2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+#endif
+extern int cfa_fsync(int fd, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_epoll_ctl(int epfd, int op, int fd, struct epoll_event *event, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_sendmsg(int sockfd, const struct msghdr *msg, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_recvmsg(int sockfd, struct msghdr *msg, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_send(int sockfd, const void *buf, size_t len, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_recv(int sockfd, void *buf, size_t len, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_fallocate(int fd, int mode, off_t offset, off_t len, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_posix_fadvise(int fd, off_t offset, off_t len, int advice, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_madvise(void *addr, size_t length, int advice, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern int cfa_openat(int dirfd, const char *pathname, int flags, mode_t mode, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+#if defined(CFA_HAVE_OPENAT2)
+        extern int cfa_openat2(int dirfd, const char *pathname, struct open_how * how, size_t size, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+#endif
+extern int cfa_close(int fd, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+#if defined(CFA_HAVE_STATX)
+        extern int cfa_statx(int dirfd, const char *pathname, int flags, unsigned int mask, struct statx *statxbuf, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+#endif
+extern ssize_t cfa_read(int fd, void * buf, size_t count, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_write(int fd, void * buf, size_t count, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_splice(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+extern ssize_t cfa_tee(int fd_in, int fd_out, size_t len, unsigned int flags, int submit_flags, Duration timeout, io_cancellation * cancellation, io_context * context);
+//----------
+// asynchronous calls
+#if defined(CFA_HAVE_PREADV2)
+        extern void async_preadv2(io_future_t & future, int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+#endif
+#if defined(CFA_HAVE_PWRITEV2)
+        extern void async_pwritev2(io_future_t & future, int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+#endif
+extern void async_fsync(io_future_t & future, int fd, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_epoll_ctl(io_future_t & future, int epfd, int op, int fd, struct epoll_event *event, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_sync_file_range(io_future_t & future, int fd, off64_t offset, off64_t nbytes, unsigned int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_sendmsg(io_future_t & future, int sockfd, const struct msghdr *msg, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_recvmsg(io_future_t & future, int sockfd, struct msghdr *msg, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_send(io_future_t & future, int sockfd, const void *buf, size_t len, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_recv(io_future_t & future, int sockfd, void *buf, size_t len, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_accept4(io_future_t & future, int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_connect(io_future_t & future, int sockfd, const struct sockaddr *addr, socklen_t addrlen, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_fallocate(io_future_t & future, int fd, int mode, off_t offset, off_t len, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_posix_fadvise(io_future_t & future, int fd, off_t offset, off_t len, int advice, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_madvise(io_future_t & future, void *addr, size_t length, int advice, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_openat(io_future_t & future, int dirfd, const char *pathname, int flags, mode_t mode, int submit_flags, io_cancellation * cancellation, io_context * context);
+#if defined(CFA_HAVE_OPENAT2)
+        extern void async_openat2(io_future_t & future, int dirfd, const char *pathname, struct open_how * how, size_t size, int submit_flags, io_cancellation * cancellation, io_context * context);
+#endif
+extern void async_close(io_future_t & future, int fd, int submit_flags, io_cancellation * cancellation, io_context * context);
+#if defined(CFA_HAVE_STATX)
+        extern void async_statx(io_future_t & future, int dirfd, const char *pathname, int flags, unsigned int mask, struct statx *statxbuf, int submit_flags, io_cancellation * cancellation, io_context * context);
+#endif
+extern void async_read(io_future_t & future, int fd, void * buf, size_t count, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_write(io_future_t & future, int fd, void * buf, size_t count, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_splice(io_future_t & future, int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
+extern void async_tee(io_future_t & future, int fd_in, int fd_out, size_t len, unsigned int flags, int submit_flags, io_cancellation * cancellation, io_context * context);
 //-----------------------------------------------------------------------------

libcfa/src/concurrency/kernel.cfa

-              rae2c27a
+              rc76bd34
         $coroutine * proc_cor = get_coroutine(this->runner);
-        // Update global state
-        kernelTLS.this_thread = thrd_dst;
         // set state of processor coroutine to inactive
         verify(proc_cor->state == Active);
 …
                 thrd_dst->state = Active;
+                __cfaabi_dbg_debug_do(
+                        thrd_dst->park_stale   = true;
+                        thrd_dst->unpark_stale = true;
+                )
+                // Update global state
+                kernelTLS.this_thread = thrd_dst;
                 /* paranoid */ verify( ! kernelTLS.preemption_state.enabled );
                 /* paranoid */ verify( kernelTLS.this_thread == thrd_dst );
+                /* paranoid */ verify( thrd_dst->context.SP );
                 /* paranoid */ verifyf( ((uintptr_t)thrd_dst->context.SP) < ((uintptr_t)__get_stack(thrd_dst->curr_cor)->base ) || thrd_dst->curr_cor == proc_cor, "ERROR : Destination $thread %p has been corrupted.\n StackPointer too small.\n", thrd_dst ); // add escape condition if we are setting up the processor
                 /* paranoid */ verifyf( ((uintptr_t)thrd_dst->context.SP) > ((uintptr_t)__get_stack(thrd_dst->curr_cor)->limit) || thrd_dst->curr_cor == proc_cor, "ERROR : Destination $thread %p has been corrupted.\n StackPointer too large.\n", thrd_dst ); // add escape condition if we are setting up the processor
+                /* paranoid */ verify( 0x0D15EA5E0D15EA5E == thrd_dst->canary );
                 // set context switch to the thread that the processor is executing
-                verify( thrd_dst->context.SP );
                 __cfactx_switch( &proc_cor->context, &thrd_dst->context );
                 // when __cfactx_switch returns we are back in the processor coroutine
+                /* paranoid */ verify( 0x0D15EA5E0D15EA5E == thrd_dst->canary );
                 /* paranoid */ verifyf( ((uintptr_t)thrd_dst->context.SP) > ((uintptr_t)__get_stack(thrd_dst->curr_cor)->limit), "ERROR : Destination $thread %p has been corrupted.\n StackPointer too large.\n", thrd_dst );
                 /* paranoid */ verifyf( ((uintptr_t)thrd_dst->context.SP) < ((uintptr_t)__get_stack(thrd_dst->curr_cor)->base ), "ERROR : Destination $thread %p has been corrupted.\n StackPointer too small.\n", thrd_dst );
+                /* paranoid */ verify( thrd_dst->context.SP );
                 /* paranoid */ verify( kernelTLS.this_thread == thrd_dst );
                 /* paranoid */ verify( ! kernelTLS.preemption_state.enabled );
+                // Reset global state
+                kernelTLS.this_thread = 0p;
                 // We just finished running a thread, there are a few things that could have happened.
 …
                         // The thread has halted, it should never be scheduled/run again
                         // We may need to wake someone up here since
-                        unpark( this->destroyer __cfaabi_dbg_ctx2 );
+                        unpark( this->destroyer );
                         this->destroyer = 0p;
                         break RUNNING;
 …
                 // set state of processor coroutine to active and the thread to inactive
                 int old_ticket = __atomic_fetch_sub(&thrd_dst->ticket, 1, __ATOMIC_SEQ_CST);
-                __cfaabi_dbg_debug_do( thrd_dst->park_result = old_ticket; )
                 switch(old_ticket) {
                         case 1:
 …
         // Just before returning to the processor, set the processor coroutine to active
         proc_cor->state = Active;
-        kernelTLS.this_thread = 0p;
         /* paranoid */ verify( ! kernelTLS.preemption_state.enabled );
 …
                         __x87_store;
                 #endif
-                verify( proc_cor->context.SP );
+                /* paranoid */ verify( proc_cor->context.SP );
+                /* paranoid */ verify( 0x0D15EA5E0D15EA5E == thrd_src->canary );
                 __cfactx_switch( &thrd_src->context, &proc_cor->context );
+                /* paranoid */ verify( 0x0D15EA5E0D15EA5E == thrd_src->canary );
                 #if defined( __i386 ) || defined( __x86_64 )
                         __x87_load;
 …
         /* paranoid */ #endif
         /* paranoid */ verifyf( thrd->link.next == 0p, "Expected null got %p", thrd->link.next );
+        /* paranoid */ verify( 0x0D15EA5E0D15EA5E == thrd->canary );
         if (thrd->preempted == __NO_PREEMPTION) thrd->state = Ready;
 …
 // KERNEL ONLY unpark without disabling interrupts
-void __unpark(  struct __processor_id_t * id, $thread * thrd __cfaabi_dbg_ctx_param2 ) {
-        // record activity
-        __cfaabi_dbg_record_thrd( *thrd, false, caller );
+void __unpark(  struct __processor_id_t * id, $thread * thrd ) {
         int old_ticket = __atomic_fetch_add(&thrd->ticket, 1, __ATOMIC_SEQ_CST);
-        __cfaabi_dbg_debug_do( thrd->unpark_result = old_ticket; thrd->unpark_state = thrd->state; )
         switch(old_ticket) {
                 case 1:
 …
+}
-void unpark( $thread * thrd __cfaabi_dbg_ctx_param2 ) {
+void unpark( $thread * thrd ) {
         if( !thrd ) return;
         disable_interrupts();
-        __unpark( (__processor_id_t*)kernelTLS.this_processor, thrd __cfaabi_dbg_ctx_fwd2 );
+        __unpark( (__processor_id_t*)kernelTLS.this_processor, thrd );
         enable_interrupts( __cfaabi_dbg_ctx );
+}
-void park( __cfaabi_dbg_ctx_param ) {
+void park( void ) {
         /* paranoid */ verify( kernelTLS.preemption_state.enabled );
         disable_interrupts();
         /* paranoid */ verify( ! kernelTLS.preemption_state.enabled );
         /* paranoid */ verify( kernelTLS.this_thread->preempted == __NO_PREEMPTION );
-        // record activity
-        __cfaabi_dbg_record_thrd( *kernelTLS.this_thread, true, caller );
         returnToKernel();
 …
         disable_interrupts();
                 /* paranoid */ verify( ! kernelTLS.preemption_state.enabled );
-                bool ret = post( this->idle );
+                post( this->idle );
         enable_interrupts( __cfaabi_dbg_ctx );
+}
 …
                 // atomically release spin lock and block
                 unlock( lock );
-                park( __cfaabi_dbg_ctx );
+                park();
                 return true;
+        }
 …
         // make new owner
-        unpark( thrd __cfaabi_dbg_ctx2 );
+        unpark( thrd );
         return thrd != 0p;
 …
         count += diff;
         for(release) {
-                unpark( pop_head( waiting ) __cfaabi_dbg_ctx2 );
+                unpark( pop_head( waiting ) );
+        }
 …
                         this.prev_thrd = kernelTLS.this_thread;
+                }
-                void __cfaabi_dbg_record_thrd($thread & this, bool park, const char prev_name[]) {
-                        if(park) {
-                                this.park_caller   = prev_name;
-                                this.park_stale    = false;
+                        }
-                        else {
-                                this.unpark_caller = prev_name;
-                                this.unpark_stale  = false;
+                        }
+                }
+        }
+)

libcfa/src/concurrency/kernel.hfa

-              rae2c27a
+              rc76bd34
 extern "C" {
-#include <bits/pthreadtypes.h>
+        #include <bits/pthreadtypes.h>
+        #include <linux/types.h>
+}
 …
 // Processor id, required for scheduling threads
 struct __processor_id_t {
-        unsigned id;
+        unsigned id:24;
+        bool full_proc:1;
         #if !defined(__CFA_NO_STATISTICS__)
 …
 struct io_cancellation {
-        uint32_t target;
+        __u64 target;
 };
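
The new `__processor_id_t` layout narrows `id` to a 24-bit field, while startup.cfa still assigns `id = -1u`. The following is a minimal C sketch of what that assignment stores, assuming Cforall follows C bit-field semantics here. The names `processor_id`, `make_id`, and `stored_id` are hypothetical and do not appear in the changeset.

```c
#include <assert.h>
#include <stdbool.h>

/* Same two fields as the new __processor_id_t; the statistics members are omitted. */
struct processor_id {
    unsigned id : 24;
    bool full_proc : 1;
};

/* Hypothetical helper: build an id from a raw value, as a constructor might. */
static struct processor_id make_id(unsigned raw, bool full_proc) {
    struct processor_id p;
    p.id = raw;               /* only the low 24 bits are kept */
    p.full_proc = full_proc;
    assert(p.id <= 0xFFFFFFu);
    return p;
}

/* Read the field back as a plain unsigned value. */
static unsigned stored_id(struct processor_id p) {
    return p.id;
}
```

Under these assumptions `stored_id(make_id(-1u, true))` is `0xFFFFFF`, so any later comparison against the "unregistered" value has to use the truncated constant rather than `-1u`.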

libcfa/src/concurrency/kernel/fwd.hfa

-              rae2c27a
+              rc76bd34
         extern "Cforall" {
-                extern void park( __cfaabi_dbg_ctx_param );
-                extern void unpark( struct $thread * this __cfaabi_dbg_ctx_param2 );
+                extern void park( void );
+                extern void unpark( struct $thread * this );
                 static inline struct $thread * active_thread () { return TL_GET( this_thread ); }

libcfa/src/concurrency/kernel/startup.cfa

-              rae2c27a
+              rc76bd34
         link.next = 0p;
         link.prev = 0p;
+        #if defined( __CFA_WITH_VERIFY__ )
+                canary = 0x0D15EA5E0D15EA5E;
+        #endif
         node.next = 0p;
 …
         this.name = name;
         this.cltr = &_cltr;
         id = -1u;
+        full_proc = true;
         destroyer = 0p;
         do_terminate = false;

libcfa/src/concurrency/kernel_private.hfa

-              rae2c27a
+              rc76bd34
 // KERNEL ONLY unpark without disabling interrupts
-void __unpark( struct __processor_id_t *, $thread * thrd __cfaabi_dbg_ctx_param2 );
+void __unpark( struct __processor_id_t *, $thread * thrd );
 static inline bool __post(single_sem & this, struct __processor_id_t * id) {
 …
                 else {
                         if(__atomic_compare_exchange_n(&this.ptr, &expected, 0p, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
-                                __unpark( id, expected __cfaabi_dbg_ctx2 );
+                                __unpark( id, expected );
                                 return true;
+                        }

libcfa/src/concurrency/monitor.cfa

-              rae2c27a
+              rc76bd34
         __cfaabi_dbg_print_safe( "Kernel : %10p Entering mon %p (%p)\n", thrd, this, this->owner);
-        if( !this->owner ) {
+        if( unlikely(0 != (0x1 & (uintptr_t)this->owner)) ) {
+                abort( "Attempt by thread \"%.256s\" (%p) to access joined monitor %p.", thrd->self_cor.name, thrd, this );
+        }
+        else if( !this->owner ) {
                 // No one has the monitor, just take it
                 __set_owner( this, thrd );
 …
                 unlock( this->lock );
-                park( __cfaabi_dbg_ctx );
+                park();
                 __cfaabi_dbg_print_safe( "Kernel : %10p Entered  mon %p\n", thrd, this);
 …
+}
-static void __dtor_enter( $monitor * this, fptr_t func ) {
+static void __dtor_enter( $monitor * this, fptr_t func, bool join ) {
         // Lock the monitor spinlock
         lock( this->lock __cfaabi_dbg_ctx2 );
 …
                 return;
+        }
-        else if( this->owner == thrd) {
+        else if( this->owner == thrd && !join) {
                 // We already have the monitor... but we're about to destroy it so the nesting will fail
                 // Abort!
                 abort( "Attempt to destroy monitor %p by thread \"%.256s\" (%p) in nested mutex.", this, thrd->self_cor.name, thrd );
+        }
+        // SKULLDUGGERY: join acts as a dtor, so it would normally trigger the check above;
+        // to avoid that, it sets the owner to the special value thrd | 1p before exiting
+        else if( this->owner == ($thread*)(1 | (uintptr_t)thrd) ) {
+                // restore the owner and just return
+                __cfaabi_dbg_print_safe( "Kernel : Destroying free mon %p\n", this);
+                // No one has the monitor, just take it
+                this->owner = thrd;
+                verifyf( kernelTLS.this_thread == this->owner, "Expected owner to be %p, got %p (r: %i, m: %p)", kernelTLS.this_thread, this->owner, this->recursion, this );
+                unlock( this->lock );
+                return;
+        }
 …
                 // Release the next thread
                 /* paranoid */ verifyf( urgent->owner->waiting_thread == this->owner, "Expected owner to be %p, got %p (r: %i, m: %p)", kernelTLS.this_thread, this->owner, this->recursion, this );
-                unpark( urgent->owner->waiting_thread __cfaabi_dbg_ctx2 );
+                unpark( urgent->owner->waiting_thread );
                 // Park current thread waiting
-                park( __cfaabi_dbg_ctx );
+                park();
                 // Some one was waiting for us, enter
 …
                 // Park current thread waiting
-                park( __cfaabi_dbg_ctx );
+                park();
                 /* paranoid */ verifyf( kernelTLS.this_thread == this->owner, "Expected owner to be %p, got %p (r: %i, m: %p)", kernelTLS.this_thread, this->owner, this->recursion, this );
 …
         //We need to wake-up the thread
         /* paranoid */ verifyf( !new_owner || new_owner == this->owner, "Expected owner to be %p, got %p (m: %p)", new_owner, this->owner, this );
-        unpark( new_owner __cfaabi_dbg_ctx2 );
+        unpark( new_owner );
+}
 // Leave single monitor for the last time
-void __dtor_leave( $monitor * this ) {
+void __dtor_leave( $monitor * this, bool join ) {
         __cfaabi_dbg_debug_do(
                 if( TL_GET( this_thread ) != this->owner ) {
                         abort( "Destroyed monitor %p has inconsistent owner, expected %p got %p.\n", this, TL_GET( this_thread ), this->owner);
+                }
-                if( this->recursion != 1 ) {
+                if( this->recursion != 1  && !join ) {
                         abort( "Destroyed monitor %p has %d outstanding nested calls.\n", this, this->recursion - 1);
+                }
+        )
+        this->owner = ($thread*)(1 | (uintptr_t)this->owner);
+}
 …
+}
+// Join a thread
+forall( dtype T | is_thread(T) )
+T & join( T & this ) {
+        $monitor *    m = get_monitor(this);
+        void (*dtor)(T& mutex this) = ^?{};
+        monitor_dtor_guard_t __guard = { &m, (fptr_t)dtor, true };
+        {
+                return this;
+        }
+}
 // Enter multiple monitor
 // relies on the monitor array being sorted
 …
 // Ctor for monitor guard
 // Sorts monitors before entering
-void ?{}( monitor_dtor_guard_t & this, $monitor * m [], fptr_t func ) {
+void ?{}( monitor_dtor_guard_t & this, $monitor * m [], fptr_t func, bool join ) {
         // optimization
         $thread * thrd = TL_GET( this_thread );
 …
         this.prev = thrd->monitors;
+        // Save whether we are in a join or not
+        this.join = join;
         // Update thread context (needed for conditions)
         (thrd->monitors){m, 1, func};
-        __dtor_enter( this.m, func );
+        __dtor_enter( this.m, func, join );
+}
 …
 void ^?{}( monitor_dtor_guard_t & this ) {
         // Leave the monitors in order
-        __dtor_leave( this.m );
+        __dtor_leave( this.m, this.join );
         // Restore thread context
 …
         // Wake the threads
         for(int i = 0; i < thread_count; i++) {
-                unpark( threads[i] __cfaabi_dbg_ctx2 );
+                unpark( threads[i] );
+        }
         // Everything is ready to go to sleep
-        park( __cfaabi_dbg_ctx );
+        park();
         // We are back, restore the owners and recursions
 …
         // unpark the thread we signalled
-        unpark( signallee __cfaabi_dbg_ctx2 );
+        unpark( signallee );
         //Everything is ready to go to sleep
-        park( __cfaabi_dbg_ctx );
+        park();
 …
                                 // unpark the thread we signalled
-                                unpark( next __cfaabi_dbg_ctx2 );
+                                unpark( next );
                                 //Everything is ready to go to sleep
-                                park( __cfaabi_dbg_ctx );
+                                park();
                                 // We are back, restore the owners and recursions
 …
         //Everything is ready to go to sleep
-        park( __cfaabi_dbg_ctx );
+        park();
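
The join support above marks a joined monitor by setting bit 0 of its `owner` pointer: `__dtor_leave` writes `1 | owner`, the regular entry path aborts when it sees the bit, and `__dtor_enter` recognises `thrd | 1` and restores the plain pointer. The following is a minimal C sketch of that tagging scheme. The types `thread_desc` and `monitor_desc` and the three helper names are hypothetical stand-ins, and the sketch assumes thread descriptors are at least 2-byte aligned so that bit 0 is free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct thread_desc  { int id; };                     /* stand-in for $thread  */
struct monitor_desc { struct thread_desc *owner; };  /* stand-in for $monitor */

/* As in __dtor_leave: tag the owner pointer so later entries can detect a joined monitor. */
static void mark_joined(struct monitor_desc *m) {
    assert(((uintptr_t)m->owner & 0x1) == 0);        /* bit 0 must be free */
    m->owner = (struct thread_desc *)((uintptr_t)m->owner | 0x1);
}

/* As in the regular entry path: a set low bit means the monitor was already joined. */
static bool is_joined(const struct monitor_desc *m) {
    return ((uintptr_t)m->owner & 0x1) != 0;
}

/* As in __dtor_enter: if the owner is exactly "thrd | 1", restore it and report success. */
static bool restore_owner(struct monitor_desc *m, struct thread_desc *thrd) {
    if (m->owner != (struct thread_desc *)((uintptr_t)thrd | 0x1)) return false;
    m->owner = thrd;
    return true;
}
```

Keeping the tag inside the existing pointer avoids adding a separate "joined" field to the monitor, at the cost of requiring every reader of `owner` to mask or check bit 0 first.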

libcfa/src/concurrency/monitor.hfa

-              rae2c27a
+              rc76bd34
         $monitor *    m;
         __monitor_group_t prev;
+        bool join;
 };
-void ?{}( monitor_dtor_guard_t & this, $monitor ** m, void (*func)() );
+void ?{}( monitor_dtor_guard_t & this, $monitor ** m, void (*func)(), bool join );
 void ^?{}( monitor_dtor_guard_t & this );

libcfa/src/concurrency/mutex.cfa

-              rae2c27a
+              rc76bd34
                 append( blocked_threads, kernelTLS.this_thread );
                 unlock( lock );
-                park( __cfaabi_dbg_ctx );
+                park();
+        }
         else {
 …
         this.is_locked = (this.blocked_threads != 0);
         unpark(
-                pop_head( this.blocked_threads ) __cfaabi_dbg_ctx2
+                pop_head( this.blocked_threads )
         );
         unlock( this.lock );
 …
                 append( blocked_threads, kernelTLS.this_thread );
                 unlock( lock );
-                park( __cfaabi_dbg_ctx );
+                park();
+        }
+}
 …
                 owner = thrd;
                 recursion_count = (thrd ? 1 : 0);
-                unpark( thrd __cfaabi_dbg_ctx2 );
+                unpark( thrd );
+        }
         unlock( lock );
 …
         lock( lock __cfaabi_dbg_ctx2 );
         unpark(
-                pop_head( this.blocked_threads ) __cfaabi_dbg_ctx2
+                pop_head( this.blocked_threads )
         );
         unlock( lock );
 …
         while(this.blocked_threads) {
                 unpark(
-                        pop_head( this.blocked_threads ) __cfaabi_dbg_ctx2
+                        pop_head( this.blocked_threads )
                 );
+        }
 …
         append( this.blocked_threads, kernelTLS.this_thread );
         unlock( this.lock );
-        park( __cfaabi_dbg_ctx );
+        park();
+}
 …
         unlock(l);
         unlock(this.lock);
-        park( __cfaabi_dbg_ctx );
+        park();
         lock(l);
+}

libcfa/src/concurrency/preemption.cfa

-              rae2c27a
+              rc76bd34
                 kernelTLS.this_stats = this->curr_cluster->stats;
         #endif
-        __unpark( id, this __cfaabi_dbg_ctx2 );
+        __unpark( id, this );
+}
 …
 static void * alarm_loop( __attribute__((unused)) void * args ) {
         __processor_id_t id;
+        id.full_proc = false;
         id.id = doregister(&id);

libcfa/src/concurrency/thread.cfa

-              rae2c27a
+              rc76bd34
         link.prev = 0p;
         link.preferred = -1;
+        #if defined( __CFA_WITH_VERIFY__ )
+                canary = 0x0D15EA5E0D15EA5E;
+        #endif
         node.next = 0p;
 …
 void ^?{}($thread& this) with( this ) {
+        #if defined( __CFA_WITH_VERIFY__ )
+                canary = 0xDEADDEADDEADDEAD;
+        #endif
         unregister(curr_cluster, this);
         ^self_cor{};
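
The canary added in this changeset is written when a `$thread` is constructed, overwritten when it is destroyed, and checked with `verify` on both sides of each context switch. The following is a minimal C sketch of that lifecycle. The two constants are taken from the diff; `thread_desc`, `thread_ctor`, `thread_dtor`, `canary_ok`, and `verify_canary` are hypothetical names, and the real code only compiles the field in when `__CFA_WITH_VERIFY__` is defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CANARY_LIVE 0x0D15EA5E0D15EA5EULL  /* value stored by the constructor */
#define CANARY_DEAD 0xDEADDEADDEADDEADULL  /* value stored by the destructor  */

/* Stand-in for $thread; only the canary field is modelled. */
struct thread_desc {
    uint64_t canary;
};

static void thread_ctor(struct thread_desc *t) { t->canary = CANARY_LIVE; }
static void thread_dtor(struct thread_desc *t) { t->canary = CANARY_DEAD; }

/* True only for a constructed, not yet destroyed, uncorrupted descriptor. */
static bool canary_ok(const struct thread_desc *t) {
    return t->canary == CANARY_LIVE;
}

/* Stand-in for the verify() calls placed before and after __cfactx_switch. */
static void verify_canary(const struct thread_desc *t) {
    assert(canary_ok(t));
}
```

Poisoning the field in the destructor means a switch to an already destroyed thread fails the same check as a switch to a descriptor whose memory was overwritten.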

libcfa/src/concurrency/thread.hfa

-              rae2c27a
+              rc76bd34
 //----------
 // Park thread: block until corresponding call to unpark, won't block if unpark is already called
-void park( __cfaabi_dbg_ctx_param );
+void park( void );
 //----------
 // Unpark a thread, if the thread is already blocked, schedule it
 //                  if the thread is not yet blocked, signal that it should rerun immediately
-void unpark( $thread * this __cfaabi_dbg_ctx_param2 );
+void unpark( $thread * this );
 forall( dtype T | is_thread(T) )
-static inline void unpark( T & this __cfaabi_dbg_ctx_param2 ) { if(!&this) return; unpark( get_thread( this ) __cfaabi_dbg_ctx_fwd2 );}
+static inline void unpark( T & this ) { if(!&this) return; unpark( get_thread( this ) );}
 //----------
 …
 void sleep( Duration duration );
+//----------
+// join
+forall( dtype T | is_thread(T) )
+T & join( T & this );
 // Local Variables: //
 // mode: c //
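
The comments above describe `park` as blocking unless `unpark` was already called, and kernel.cfa implements this with an atomic ticket: the park path does a fetch-and-subtract, the unpark path a fetch-and-add, and each switches on the old value. The case bodies are elided in this diff, so the following C sketch fills them in under stated assumptions: the ticket rests at 1, an old value of 2 on park means an unpark arrived first, and an old value of 0 on unpark means the target is already blocked. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

enum park_result   { PARK_BLOCKS, PARK_RETURNS_AT_ONCE };
enum unpark_result { UNPARK_SCHEDULES, UNPARK_LEAVES_TOKEN };

/* Stand-in for $thread; 1 is the assumed resting value of the ticket. */
struct thread_desc { atomic_int ticket; };

static void thread_init(struct thread_desc *t) { atomic_init(&t->ticket, 1); }

/* Park side: sequentially consistent fetch-and-subtract, then decide on the old value. */
static enum park_result park_ticket(struct thread_desc *t) {
    int old = atomic_fetch_sub(&t->ticket, 1);
    assert(old == 1 || old == 2);
    return old == 1 ? PARK_BLOCKS : PARK_RETURNS_AT_ONCE;
}

/* Unpark side: sequentially consistent fetch-and-add, then decide on the old value. */
static enum unpark_result unpark_ticket(struct thread_desc *t) {
    int old = atomic_fetch_add(&t->ticket, 1);
    assert(old == 0 || old == 1);
    return old == 0 ? UNPARK_SCHEDULES : UNPARK_LEAVES_TOKEN;
}
```

Whichever order the two calls race in, exactly one of them observes the other's update, so the thread is either scheduled by `unpark` or never blocks in `park`, and the ticket returns to its resting value.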

libcfa/src/exception.h

-              rae2c27a
+              rc76bd34
 // implemented in the .c file either so they all have to be inline.
-trait is_exception(dtype T) {
+trait is_exception(dtype exceptT) {
         /* The first field must be a pointer to a virtual table.
          * That virtual table must be a descendant of the base exception virtual table.
          */
-        void mark_exception(T *);
+        void mark_exception(exceptT *);
         // This is never used and should be a no-op.
 };
-trait is_termination_exception(dtype T | is_exception(T)) {
-        void defaultTerminationHandler(T &);
+trait is_termination_exception(dtype exceptT | is_exception(exceptT)) {
+        void defaultTerminationHandler(exceptT &);
 };
-trait is_resumption_exception(dtype T | is_exception(T)) {
-        void defaultResumptionHandler(T &);
+trait is_resumption_exception(dtype exceptT | is_exception(exceptT)) {
+        void defaultResumptionHandler(exceptT &);
 };
-forall(dtype T | is_termination_exception(T))
-static inline void $throw(T & except) {
+forall(dtype exceptT | is_termination_exception(exceptT))
+static inline void $throw(exceptT & except) {
         __cfaehm_throw_terminate(
                 (exception_t *)&except,
 …
+}
-forall(dtype T | is_resumption_exception(T))
-static inline void $throwResume(T & except) {
+forall(dtype exceptT | is_resumption_exception(exceptT))
+static inline void $throwResume(exceptT & except) {
         __cfaehm_throw_resume(
                 (exception_t *)&except,
 …
+}
-forall(dtype T | is_exception(T))
-static inline void cancel_stack(T & except) __attribute__((noreturn)) {
+forall(dtype exceptT | is_exception(exceptT))
+static inline void cancel_stack(exceptT & except) __attribute__((noreturn)) {
         __cfaehm_cancel_stack( (exception_t *)&except );
+}
-forall(dtype T | is_exception(T))
-static inline void defaultTerminationHandler(T & except) {
+forall(dtype exceptT | is_exception(exceptT))
+static inline void defaultTerminationHandler(exceptT & except) {
         return cancel_stack( except );
+}
-forall(dtype T | is_exception(T))
-static inline void defaultResumptionHandler(T & except) {
+forall(dtype exceptT | is_exception(exceptT))
+static inline void defaultResumptionHandler(exceptT & except) {
         throw except;
+}

libcfa/src/exception.hfa

-              rae2c27a
+              rc76bd34
                 size_t size; \
                 void (*copy)(exception_name * this, exception_name * other); \
-                void (*free)(exception_name & this); \
+                void (*^?{})(exception_name & this); \
                 const char * (*msg)(exception_name * this); \
                 _CLOSE
 …
                 size_t size; \
                 void (*copy)(exception_name parameters * this, exception_name parameters * other); \
-                void (*free)(exception_name parameters & this); \
+                void (*^?{})(exception_name parameters & this); \
                 const char * (*msg)(exception_name parameters * this); \
                 _CLOSE

libcfa/src/heap.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Tue Dec 19 21:58:35 2017
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Thu Sep  3 16:22:54 2020
-// Update Count     : 943
+// Last Modified On : Mon Sep  7 22:17:46 2020
+// Update Count     : 957
 //
 …
                 size_t bsize, oalign;
                 headers( "resize", oaddr, header, freeElem, bsize, oalign );
                 size_t odsize = dataStorage( bsize, oaddr, header ); // data storage available in bucket
                 // same size, DO NOT preserve STICKY PROPERTIES.
-                if ( oalign <= libAlign() && size <= odsize && odsize <= size * 2 ) { // allow 50% wasted storage for smaller size
+                if ( oalign == libAlign() && size <= odsize && odsize <= size * 2 ) { // allow 50% wasted storage for smaller size
                         header->kind.real.blockSize &= -2;                      // no alignment and turn off 0 fill
                         header->kind.real.size = size;                          // reset allocation size
 …
                 size_t odsize = dataStorage( bsize, oaddr, header ); // data storage available in bucket
                 size_t osize = header->kind.real.size;                  // old allocation size
-                bool ozfill = (header->kind.real.blockSize & 2) != 0; // old allocation zero filled
-          if ( unlikely( size <= odsize ) && size > odsize / 2 ) { // allow up to 50% wasted storage
+                bool ozfill = (header->kind.real.blockSize & 2); // old allocation zero filled
+          if ( unlikely( size <= odsize ) && odsize <= size * 2 ) { // allow up to 50% wasted storage
                         header->kind.real.size = size;                          // reset allocation size
                         if ( unlikely( ozfill ) && size > osize ) {     // previous request zero fill and larger ?
 …
                 void * naddr;
-                if ( likely( oalign <= libAlign() ) ) {                 // previous request not aligned ?
+                if ( likely( oalign == libAlign() ) ) {                 // previous request not aligned ?
                         naddr = mallocNoStats( size );                          // create new area
                 } else {
 …
         } // if
         // Attempt to reuse existing storage.
+        // Attempt to reuse existing alignment.
         HeapManager.Storage.Header * header = headerAddr( oaddr );
         bool isFakeHeader = header->kind.fake.alignment & 1 == 1;       // old fake header ?
         if ( unlikely ( ( isFakeHeader &&
                                  (uintptr_t)oaddr % nalign == 0 &&                              // lucky match ?
                                  header->kind.fake.alignment <= nalign &&               // ok to leave LSB at 1
                                  nalign <= 128 )                                                                // not too much alignment storage wasted ?
                         ||   ( (!isFakeHeader) &&                                                       // old real header ( aligned on libAlign ) ?
                                  nalign == libAlign() ) ) ) {                                   // new alignment also on libAlign
                 HeapManager.FreeHeader * freeElem;
                 size_t bsize, oalign;
                 headers( "resize", oaddr, header, freeElem, bsize, oalign );
                 size_t odsize = dataStorage( bsize, oaddr, header ); // data storage available in bucket
+                if ( size <= odsize && odsize <= size * 2 ) { // allow 50% wasted data storage
                         if ( isFakeHeader ) {
+        bool isFakeHeader = header->kind.fake.alignment & 1; // old fake header ?
+        size_t oalign;
+        if ( isFakeHeader ) {
+                oalign = header->kind.fake.alignment & -2;              // old alignment
+                if ( (uintptr_t)oaddr % nalign == 0                             // lucky match ?
+                         && ( oalign <= nalign                                          // going down
+                                  || (oalign >= nalign && oalign <= 256) ) // little alignment storage wasted ?
+                        ) {
+                        headerAddr( oaddr )->kind.fake.alignment = nalign | 1; // update alignment (could be the same)
+                        HeapManager.FreeHeader * freeElem;
+                        size_t bsize, oalign;
+                        headers( "resize", oaddr, header, freeElem, bsize, oalign );
+                        size_t odsize = dataStorage( bsize, oaddr, header ); // data storage available in bucket
+                        if ( size <= odsize && odsize <= size * 2 ) { // allow 50% wasted data storage
                                 headerAddr( oaddr )->kind.fake.alignment = nalign | 1; // update alignment (could be the same)
+                        }
+                        header->kind.real.blockSize &= -2;              // turn off 0 fill
+                        header->kind.real.size = size;                  // reset allocation size
+                        return oaddr;
+                } // if
+                                header->kind.real.blockSize &= -2;              // turn off 0 fill
+                                header->kind.real.size = size;                  // reset allocation size
+                                return oaddr;
+                        } // if
+                } // if
+        } else if ( ! isFakeHeader                                                      // old real header (aligned on libAlign) ?
+                                && nalign == libAlign() ) {                             // new alignment also on libAlign => no fake header needed
+                return resize( oaddr, size );                                   // duplicate special case checks
         } // if
 …
         } // if
+        HeapManager.Storage.Header * header;
+        HeapManager.FreeHeader * freeElem;
+        size_t bsize, oalign;
+        headers( "realloc", oaddr, header, freeElem, bsize, oalign );
+        // Attempt to reuse existing storage.
+        bool isFakeHeader = header->kind.fake.alignment & 1 == 1;       // old fake header ?
+        if ( unlikely ( ( isFakeHeader &&
+                                 (uintptr_t)oaddr % nalign == 0 &&                              // lucky match ?
+                                 header->kind.fake.alignment <= nalign &&               // ok to leave LSB at 1
+                                 nalign <= 128 )                                                                // not too much alignment storage wasted ?
+                        ||   ( (!isFakeHeader) &&                                                       // old real header ( aligned on libAlign ) ?
+                                 nalign == libAlign() ) ) ) {                                   // new alignment also on libAlign
+                if ( isFakeHeader ) {
+        // Attempt to reuse existing alignment.
+        HeapManager.Storage.Header * header = headerAddr( oaddr );
+        bool isFakeHeader = header->kind.fake.alignment & 1; // old fake header ?
+        size_t oalign;
+        if ( isFakeHeader ) {
+                oalign = header->kind.fake.alignment & -2;              // old alignment
+                if ( (uintptr_t)oaddr % nalign == 0                             // lucky match ?
+                         && ( oalign <= nalign                                          // going down
+                                  || (oalign >= nalign && oalign <= 256) ) // little alignment storage wasted ?
+                        ) {
                         headerAddr( oaddr )->kind.fake.alignment = nalign | 1; // update alignment (could be the same)
+                }
+                return realloc( oaddr, size );
+        } // if
+        // change size and copy old content to new storage
+                        return realloc( oaddr, size );                          // duplicate alignment and special case checks
+                } // if
+        } else if ( ! isFakeHeader                                                      // old real header (aligned on libAlign) ?
+                                && nalign == libAlign() )                               // new alignment also on libAlign => no fake header needed
+                return realloc( oaddr, size );                                  // duplicate alignment and special case checks
         #ifdef __STATISTICS__
 …
         #endif // __STATISTICS__
+        HeapManager.FreeHeader * freeElem;
+        size_t bsize;
+        headers( "realloc", oaddr, header, freeElem, bsize, oalign );
+        // change size and copy old content to new storage
         size_t osize = header->kind.real.size;                          // old allocation size
         bool ozfill = (header->kind.real.blockSize & 2) != 0; // old allocation zero filled
+        bool ozfill = (header->kind.real.blockSize & 2);        // old allocation zero filled
         void * naddr = memalignNoStats( nalign, size );         // create new aligned area
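
Note on the hunks above: the in-place reuse test changes from `size > odsize / 2` to `odsize <= size * 2`, and the fake-header check now reads the alignment with `& -2` after testing the low bit. The following is a minimal sketch in plain C (not Cforall) of those two ideas. The function names are hypothetical and do not appear in the changeset; only the predicates mirror the diff, and overflow of `size * 2` is ignored here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Old reuse test: request fits and uses more than half of the bucket. */
static bool reuse_old(size_t size, size_t odsize) {
    return size <= odsize && size > odsize / 2;
}

/* New reuse test: request fits and the bucket is at most twice the request.
   Differs from reuse_old only when odsize is even and size == odsize / 2. */
static bool reuse_new(size_t size, size_t odsize) {
    return size <= odsize && odsize <= size * 2;
}

/* A fake header stores the alignment with the least-significant bit set
   as a marker; alignments are powers of two >= 2, so that bit is free. */
static size_t tag_alignment(size_t nalign) {
    return nalign | 1;
}

/* True when the field carries the fake-header marker bit. */
static bool is_fake_header(size_t field) {
    return (field & 1) != 0;
}

/* Recover the alignment by clearing the marker bit (the diff writes `& -2`). */
static size_t untag_alignment(size_t field) {
    return field & ~(size_t)1;
}
```

With these definitions, `reuse_new(2, 4)` holds while `reuse_old(2, 4)` does not, so a request for exactly half of the available data storage now reuses the bucket, which is what the "allow 50% wasted storage" comment describes.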

libcfa/src/limits.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed Apr  6 18:06:52 2016
 // Last Modified By : Peter A. Buhr
 // Last Modified On : Thu Mar  1 16:22:51 2018
 // Update Count     : 74
+// Last Modified On : Wed Sep 30 22:56:32 2020
+// Update Count     : 76
 //
 …
 // Integral Constants
 const signed char MIN = SCHAR_MIN;
 const unsigned char MIN = 0;
 const short int MIN = SHRT_MIN;
 const unsigned short int MIN = 0;
 const int MIN = INT_MIN;
 const unsigned int MIN = 0;
 const long int MIN = LONG_MIN;
 const unsigned long int MIN = 0;
 const long long int MIN = LLONG_MIN;
 const unsigned long long int MIN = 0;
+signed char MIN = SCHAR_MIN;
+unsigned char MIN = 0;
+short int MIN = SHRT_MIN;
+unsigned short int MIN = 0;
+int MIN = INT_MIN;
+unsigned int MIN = 0;
+long int MIN = LONG_MIN;
+unsigned long int MIN = 0;
+long long int MIN = LLONG_MIN;
+unsigned long long int MIN = 0;
 const signed char MAX = SCHAR_MAX;
 const unsigned char MAX = UCHAR_MAX;
 const short int MAX = SHRT_MAX;
 const unsigned short int MAX = USHRT_MAX;
 const int MAX = INT_MAX;
 const unsigned int MAX = UINT_MAX;
 const long int MAX = LONG_MAX;
 const unsigned long int MAX = ULONG_MAX;
 const long long int MAX = LLONG_MAX;
 const unsigned long long int MAX = ULLONG_MAX;
+signed char MAX = SCHAR_MAX;
+unsigned char MAX = UCHAR_MAX;
+short int MAX = SHRT_MAX;
+unsigned short int MAX = USHRT_MAX;
+int MAX = INT_MAX;
+unsigned int MAX = UINT_MAX;
+long int MAX = LONG_MAX;
+unsigned long int MAX = ULONG_MAX;
+long long int MAX = LLONG_MAX;
+unsigned long long int MAX = ULLONG_MAX;
 // Floating-Point Constants
 const float MIN = FLT_MIN;
 const double MIN = DBL_MIN;
 const long double MIN = LDBL_MIN;
 const float _Complex MIN = __FLT_MIN__ + __FLT_MIN__ * I;
 const double _Complex MIN = DBL_MIN +  DBL_MIN * I;
 const long double _Complex MIN = LDBL_MIN + LDBL_MIN * I;
+float MIN = FLT_MIN;
+double MIN = DBL_MIN;
+long double MIN = LDBL_MIN;
+float _Complex MIN = __FLT_MIN__ + __FLT_MIN__ * I;
+double _Complex MIN = DBL_MIN +  DBL_MIN * I;
+long double _Complex MIN = LDBL_MIN + LDBL_MIN * I;
 const float MAX = FLT_MAX;
 const double MAX = DBL_MAX;
 const long double MAX = LDBL_MAX;
 const float _Complex MAX = FLT_MAX + FLT_MAX * I;
 const double _Complex MAX = DBL_MAX + DBL_MAX * I;
 const long double _Complex MAX = LDBL_MAX + LDBL_MAX * I;
+float MAX = FLT_MAX;
+double MAX = DBL_MAX;
+long double MAX = LDBL_MAX;
+float _Complex MAX = FLT_MAX + FLT_MAX * I;
+double _Complex MAX = DBL_MAX + DBL_MAX * I;
+long double _Complex MAX = LDBL_MAX + LDBL_MAX * I;
 const float PI = (float)M_PI;                                                   // pi
 const float PI_2 = (float)M_PI_2;                                               // pi / 2
 const float PI_4 = (float)M_PI_4;                                               // pi / 4
 const float _1_PI = (float)M_1_PI;                                              // 1 / pi
 const float _2_PI = (float)M_2_PI;                                              // 2 / pi
 const float _2_SQRT_PI = (float)M_2_SQRTPI;                             // 2 / sqrt(pi)
+float PI = (float)M_PI;                                                                 // pi
+float PI_2 = (float)M_PI_2;                                                             // pi / 2
+float PI_4 = (float)M_PI_4;                                                             // pi / 4
+float _1_PI = (float)M_1_PI;                                                    // 1 / pi
+float _2_PI = (float)M_2_PI;                                                    // 2 / pi
+float _2_SQRT_PI = (float)M_2_SQRTPI;                                   // 2 / sqrt(pi)
 const double PI = M_PI;                                                                 // pi
 const double PI_2 = M_PI_2;                                                             // pi / 2
 const double PI_4 = M_PI_4;                                                             // pi / 4
 const double _1_PI = M_1_PI;                                                    // 1 / pi
 const double _2_PI = M_2_PI;                                                    // 2 / pi
 const double _2_SQRT_PI = M_2_SQRTPI;                                   // 2 / sqrt(pi)
+double PI = M_PI;                                                                               // pi
+double PI_2 = M_PI_2;                                                                   // pi / 2
+double PI_4 = M_PI_4;                                                                   // pi / 4
+double _1_PI = M_1_PI;                                                                  // 1 / pi
+double _2_PI = M_2_PI;                                                                  // 2 / pi
+double _2_SQRT_PI = M_2_SQRTPI;                                                 // 2 / sqrt(pi)
 const long double PI = M_PIl;                                                   // pi
 const long double PI_2 = M_PI_2l;                                               // pi / 2
 const long double PI_4 = M_PI_4l;                                               // pi / 4
 const long double _1_PI = M_1_PIl;                                              // 1 / pi
 const long double _2_PI = M_2_PIl;                                              // 2 / pi
 const long double _2_SQRT_PI = M_2_SQRTPIl;                             // 2 / sqrt(pi)
+long double PI = M_PIl;                                                                 // pi
+long double PI_2 = M_PI_2l;                                                             // pi / 2
+long double PI_4 = M_PI_4l;                                                             // pi / 4
+long double _1_PI = M_1_PIl;                                                    // 1 / pi
+long double _2_PI = M_2_PIl;                                                    // 2 / pi
+long double _2_SQRT_PI = M_2_SQRTPIl;                                   // 2 / sqrt(pi)
 const float _Complex PI = (float)M_PI + 0.0_iF;                 // pi
 const float _Complex PI_2 = (float)M_PI_2 + 0.0_iF;             // pi / 2
 const float _Complex PI_4 = (float)M_PI_4 + 0.0_iF;             // pi / 4
 const float _Complex _1_PI = (float)M_1_PI + 0.0_iF;    // 1 / pi
 const float _Complex _2_PI = (float)M_2_PI + 0.0_iF;    // 2 / pi
 const float _Complex _2_SQRT_PI = (float)M_2_SQRTPI + 0.0_iF; // 2 / sqrt(pi)
+float _Complex PI = (float)M_PI + 0.0_iF;                               // pi
+float _Complex PI_2 = (float)M_PI_2 + 0.0_iF;                   // pi / 2
+float _Complex PI_4 = (float)M_PI_4 + 0.0_iF;                   // pi / 4
+float _Complex _1_PI = (float)M_1_PI + 0.0_iF;                  // 1 / pi
+float _Complex _2_PI = (float)M_2_PI + 0.0_iF;                  // 2 / pi
+float _Complex _2_SQRT_PI = (float)M_2_SQRTPI + 0.0_iF; // 2 / sqrt(pi)
 const double _Complex PI = M_PI + 0.0_iD;                               // pi
 const double _Complex PI_2 = M_PI_2 + 0.0_iD;                   // pi / 2
 const double _Complex PI_4 = M_PI_4 + 0.0_iD;                   // pi / 4
 const double _Complex _1_PI = M_1_PI + 0.0_iD;                  // 1 / pi
 const double _Complex _2_PI = M_2_PI + 0.0_iD;                  // 2 / pi
 const double _Complex _2_SQRT_PI = M_2_SQRTPI + 0.0_iD; // 2 / sqrt(pi)
+double _Complex PI = M_PI + 0.0_iD;                                             // pi
+double _Complex PI_2 = M_PI_2 + 0.0_iD;                                 // pi / 2
+double _Complex PI_4 = M_PI_4 + 0.0_iD;                                 // pi / 4
+double _Complex _1_PI = M_1_PI + 0.0_iD;                                // 1 / pi
+double _Complex _2_PI = M_2_PI + 0.0_iD;                                // 2 / pi
+double _Complex _2_SQRT_PI = M_2_SQRTPI + 0.0_iD;               // 2 / sqrt(pi)
 const long double _Complex PI = M_PIl + 0.0_iL;                 // pi
 const long double _Complex PI_2 = M_PI_2l + 0.0_iL;             // pi / 2
 const long double _Complex PI_4 = M_PI_4l + 0.0_iL;             // pi / 4
 const long double _Complex _1_PI = M_1_PIl + 0.0_iL;    // 1 / pi
 const long double _Complex _2_PI = M_2_PIl + 0.0_iL;    // 2 / pi
 const long double _Complex _2_SQRT_PI = M_2_SQRTPIl + 0.0_iL; // 2 / sqrt(pi)
+long double _Complex PI = M_PIl + 0.0_iL;                               // pi
+long double _Complex PI_2 = M_PI_2l + 0.0_iL;                   // pi / 2
+long double _Complex PI_4 = M_PI_4l + 0.0_iL;                   // pi / 4
+long double _Complex _1_PI = M_1_PIl + 0.0_iL;                  // 1 / pi
+long double _Complex _2_PI = M_2_PIl + 0.0_iL;                  // 2 / pi
+long double _Complex _2_SQRT_PI = M_2_SQRTPIl + 0.0_iL; // 2 / sqrt(pi)
 const float E = (float)M_E;                                                             // e
 const float LOG2_E = (float)M_LOG2E;                                    // log_2(e)
 const float LOG10_E = (float)M_LOG10E;                                  // log_10(e)
 const float LN_2 = (float)M_LN2;                                                // log_e(2)
 const float LN_10 = (float)M_LN10;                                              // log_e(10)
 const float SQRT_2 = (float)M_SQRT2;                                    // sqrt(2)
 const float _1_SQRT_2 = (float)M_SQRT1_2;                               // 1 / sqrt(2)
+float E = (float)M_E;                                                                   // e
+float LOG2_E = (float)M_LOG2E;                                                  // log_2(e)
+float LOG10_E = (float)M_LOG10E;                                                // log_10(e)
+float LN_2 = (float)M_LN2;                                                              // log_e(2)
+float LN_10 = (float)M_LN10;                                                    // log_e(10)
+float SQRT_2 = (float)M_SQRT2;                                                  // sqrt(2)
+float _1_SQRT_2 = (float)M_SQRT1_2;                                             // 1 / sqrt(2)
 const double E = M_E;                                                                   // e
 const double LOG2_E = M_LOG2E;                                                  // log_2(e)
 const double LOG10_E = M_LOG10E;                                                // log_10(e)
 const double LN_2 = M_LN2;                                                              // log_e(2)
 const double LN_10 = M_LN10;                                                    // log_e(10)
 const double SQRT_2 = M_SQRT2;                                                  // sqrt(2)
 const double _1_SQRT_2 = M_SQRT1_2;                                             // 1 / sqrt(2)
+double E = M_E;                                                                                 // e
+double LOG2_E = M_LOG2E;                                                                // log_2(e)
+double LOG10_E = M_LOG10E;                                                              // log_10(e)
+double LN_2 = M_LN2;                                                                    // log_e(2)
+double LN_10 = M_LN10;                                                                  // log_e(10)
+double SQRT_2 = M_SQRT2;                                                                // sqrt(2)
+double _1_SQRT_2 = M_SQRT1_2;                                                   // 1 / sqrt(2)
 const long double E = M_El;                                                             // e
 const long double LOG2_E = M_LOG2El;                                    // log_2(e)
 const long double LOG10_E = M_LOG10El;                                  // log_10(e)
 const long double LN_2 = M_LN2l;                                                // log_e(2)
 const long double LN_10 = M_LN10l;                                              // log_e(10)
 const long double SQRT_2 = M_SQRT2l;                                    // sqrt(2)
 const long double _1_SQRT_2 = M_SQRT1_2l;                               // 1 / sqrt(2)
+long double E = M_El;                                                                   // e
+long double LOG2_E = M_LOG2El;                                                  // log_2(e)
+long double LOG10_E = M_LOG10El;                                                // log_10(e)
+long double LN_2 = M_LN2l;                                                              // log_e(2)
+long double LN_10 = M_LN10l;                                                    // log_e(10)
+long double SQRT_2 = M_SQRT2l;                                                  // sqrt(2)
+long double _1_SQRT_2 = M_SQRT1_2l;                                             // 1 / sqrt(2)
 const float _Complex E = M_E + 0.0_iF;                                  // e
 const float _Complex LOG2_E = M_LOG2E + 0.0_iF;                 // log_2(e)
 const float _Complex LOG10_E = M_LOG10E + 0.0_iF;               // log_10(e)
 const float _Complex LN_2 = M_LN2 + 0.0_iF;                             // log_e(2)
 const float _Complex LN_10 = M_LN10 + 0.0_iF;                   // log_e(10)
 const float _Complex SQRT_2 = M_SQRT2 + 0.0_iF;                 // sqrt(2)
 const float _Complex _1_SQRT_2 = M_SQRT1_2 + 0.0_iF;    // 1 / sqrt(2)
+float _Complex E = M_E + 0.0_iF;                                                // e
+float _Complex LOG2_E = M_LOG2E + 0.0_iF;                               // log_2(e)
+float _Complex LOG10_E = M_LOG10E + 0.0_iF;                             // log_10(e)
+float _Complex LN_2 = M_LN2 + 0.0_iF;                                   // log_e(2)
+float _Complex LN_10 = M_LN10 + 0.0_iF;                                 // log_e(10)
+float _Complex SQRT_2 = M_SQRT2 + 0.0_iF;                               // sqrt(2)
+float _Complex _1_SQRT_2 = M_SQRT1_2 + 0.0_iF;                  // 1 / sqrt(2)
 const double _Complex E = M_E + 0.0_iD;                                 // e
 const double _Complex LOG2_E = M_LOG2E + 0.0_iD;                // log_2(e)
 const double _Complex LOG10_E = M_LOG10E + 0.0_iD;              // log_10(e)
 const double _Complex LN_2 = M_LN2 + 0.0_iD;                    // log_e(2)
 const double _Complex LN_10 = M_LN10 + 0.0_iD;                  // log_e(10)
 const double _Complex SQRT_2 = M_SQRT2 + 0.0_iD;                // sqrt(2)
 const double _Complex _1_SQRT_2 = M_SQRT1_2 + 0.0_iD;   // 1 / sqrt(2)
+double _Complex E = M_E + 0.0_iD;                                               // e
+double _Complex LOG2_E = M_LOG2E + 0.0_iD;                              // log_2(e)
+double _Complex LOG10_E = M_LOG10E + 0.0_iD;                    // log_10(e)
+double _Complex LN_2 = M_LN2 + 0.0_iD;                                  // log_e(2)
+double _Complex LN_10 = M_LN10 + 0.0_iD;                                // log_e(10)
+double _Complex SQRT_2 = M_SQRT2 + 0.0_iD;                              // sqrt(2)
+double _Complex _1_SQRT_2 = M_SQRT1_2 + 0.0_iD;                 // 1 / sqrt(2)
 const long double _Complex E = M_El + 0.0_iL;                   // e
 const long double _Complex LOG2_E = M_LOG2El + 0.0_iL;  // log_2(e)
 const long double _Complex LOG10_E = M_LOG10El + 0.0_iL; // log_10(e)
 const long double _Complex LN_2 = M_LN2l + 0.0_iL;              // log_e(2)
 const long double _Complex LN_10 = M_LN10l + 0.0_iL;    // log_e(10)
 const long double _Complex SQRT_2 = M_SQRT2l + 0.0_iL;  // sqrt(2)
 const long double _Complex _1_SQRT_2 = M_SQRT1_2l + 0.0_iL; // 1 / sqrt(2)
+long double _Complex E = M_El + 0.0_iL;                                 // e
+long double _Complex LOG2_E = M_LOG2El + 0.0_iL;                // log_2(e)
+long double _Complex LOG10_E = M_LOG10El + 0.0_iL;              // log_10(e)
+long double _Complex LN_2 = M_LN2l + 0.0_iL;                    // log_e(2)
+long double _Complex LN_10 = M_LN10l + 0.0_iL;                  // log_e(10)
+long double _Complex SQRT_2 = M_SQRT2l + 0.0_iL;                // sqrt(2)
+long double _Complex _1_SQRT_2 = M_SQRT1_2l + 0.0_iL;   // 1 / sqrt(2)
 // Local Variables: //
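
Note on the hunk above: `const` is dropped from every definition here, and the matching `extern` declarations in `limits.hfa` drop it as well. A minimal plain-C sketch of why the two files change together follows. It assumes Cforall follows the C rule that a declaration and its definition must carry the same qualifiers; the names `min_int` and `max_uint` are hypothetical, since C cannot overload `MIN` and `MAX` by type the way Cforall does.

```c
#include <limits.h>

/* Declarations, as they would appear in a header (hypothetical names). */
extern int min_int;
extern unsigned int max_uint;

/* Definitions: the qualifiers match the declarations above. Writing
   `const int min_int = INT_MIN;` here instead would be rejected by a
   C compiler as a conflicting declaration. */
int min_int = INT_MIN;
unsigned int max_uint = UINT_MAX;
```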

libcfa/src/limits.hfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed Apr  6 18:06:52 2016
 // Last Modified By : Peter A. Buhr
 // Last Modified On : Thu Mar  1 16:20:54 2018
 // Update Count     : 13
+// Last Modified On : Wed Sep 30 22:56:35 2020
+// Update Count     : 15
 //
 …
 // Integral Constants
 extern const signed char MIN;
 extern const unsigned char MIN;
 extern const short int MIN;
 extern const unsigned short int MIN;
 extern const int MIN;
 extern const unsigned int MIN;
 extern const long int MIN;
 extern const unsigned long int MIN;
 extern const long long int MIN;
 extern const unsigned long long int MIN;
+extern signed char MIN;
+extern unsigned char MIN;
+extern short int MIN;
+extern unsigned short int MIN;
+extern int MIN;
+extern unsigned int MIN;
+extern long int MIN;
+extern unsigned long int MIN;
+extern long long int MIN;
+extern unsigned long long int MIN;
 extern const signed char MAX;
 extern const unsigned char MAX;
 extern const short int MAX;
 extern const unsigned short int MAX;
 extern const int MAX;
 extern const unsigned int MAX;
 extern const long int MAX;
 extern const unsigned long int MAX;
 extern const long long int MAX;
 extern const unsigned long long int MAX;
+extern signed char MAX;
+extern unsigned char MAX;
+extern short int MAX;
+extern unsigned short int MAX;
+extern int MAX;
+extern unsigned int MAX;
+extern long int MAX;
+extern unsigned long int MAX;
+extern long long int MAX;
+extern unsigned long long int MAX;
 // Floating-Point Constants
 extern const float MIN;
 extern const double MIN;
 extern const long double MIN;
 extern const float _Complex MIN;
 extern const double _Complex MIN;
 extern const long double _Complex MIN;
+extern float MIN;
+extern double MIN;
+extern long double MIN;
+extern float _Complex MIN;
+extern double _Complex MIN;
+extern long double _Complex MIN;
 extern const float MAX;
 extern const double MAX;
 extern const long double MAX;
 extern const float _Complex MAX;
 extern const double _Complex MAX;
 extern const long double _Complex MAX;
+extern float MAX;
+extern double MAX;
+extern long double MAX;
+extern float _Complex MAX;
+extern double _Complex MAX;
+extern long double _Complex MAX;
 extern const float PI;                                                                  // pi
 extern const float PI_2;                                                                // pi / 2
 extern const float PI_4;                                                                // pi / 4
 extern const float _1_PI;                                                               // 1 / pi
 extern const float _2_PI;                                                               // 2 / pi
 extern const float _2_SQRT_PI;                                                  // 2 / sqrt(pi)
+extern float PI;                                                                                // pi
+extern float PI_2;                                                                              // pi / 2
+extern float PI_4;                                                                              // pi / 4
+extern float _1_PI;                                                                             // 1 / pi
+extern float _2_PI;                                                                             // 2 / pi
+extern float _2_SQRT_PI;                                                                // 2 / sqrt(pi)
 extern const double PI;                                                                 // pi
 extern const double PI_2;                                                               // pi / 2
 extern const double PI_4;                                                               // pi / 4
 extern const double _1_PI;                                                              // 1 / pi
 extern const double _2_PI;                                                              // 2 / pi
 extern const double _2_SQRT_PI;                                                 // 2 / sqrt(pi)
+extern double PI;                                                                               // pi
+extern double PI_2;                                                                             // pi / 2
+extern double PI_4;                                                                             // pi / 4
+extern double _1_PI;                                                                    // 1 / pi
+extern double _2_PI;                                                                    // 2 / pi
+extern double _2_SQRT_PI;                                                               // 2 / sqrt(pi)
 extern const long double PI;                                                    // pi
 extern const long double PI_2;                                                  // pi / 2
 extern const long double PI_4;                                                  // pi / 4
 extern const long double _1_PI;                                                 // 1 / pi
 extern const long double _2_PI;                                                 // 2 / pi
 extern const long double _2_SQRT_PI;                                    // 2 / sqrt(pi)
+extern long double PI;                                                                  // pi
+extern long double PI_2;                                                                // pi / 2
+extern long double PI_4;                                                                // pi / 4
+extern long double _1_PI;                                                               // 1 / pi
+extern long double _2_PI;                                                               // 2 / pi
+extern long double _2_SQRT_PI;                                                  // 2 / sqrt(pi)
 extern const float _Complex PI;                                                 // pi
 extern const float _Complex PI_2;                                               // pi / 2
 extern const float _Complex PI_4;                                               // pi / 4
 extern const float _Complex _1_PI;                                              // 1 / pi
 extern const float _Complex _2_PI;                                              // 2 / pi
 extern const float _Complex _2_SQRT_PI;                                 // 2 / sqrt(pi)
+extern float _Complex PI;                                                               // pi
+extern float _Complex PI_2;                                                             // pi / 2
+extern float _Complex PI_4;                                                             // pi / 4
+extern float _Complex _1_PI;                                                    // 1 / pi
+extern float _Complex _2_PI;                                                    // 2 / pi
+extern float _Complex _2_SQRT_PI;                                               // 2 / sqrt(pi)
 extern const double _Complex PI;                                                // pi
 extern const double _Complex PI_2;                                              // pi / 2
 extern const double _Complex PI_4;                                              // pi / 4
 extern const double _Complex _1_PI;                                             // 1 / pi
 extern const double _Complex _2_PI;                                             // 2 / pi
 extern const double _Complex _2_SQRT_PI;                                // 2 / sqrt(pi)
+extern double _Complex PI;                                                              // pi
+extern double _Complex PI_2;                                                    // pi / 2
+extern double _Complex PI_4;                                                    // pi / 4
+extern double _Complex _1_PI;                                                   // 1 / pi
+extern double _Complex _2_PI;                                                   // 2 / pi
+extern double _Complex _2_SQRT_PI;                                              // 2 / sqrt(pi)
 extern const long double _Complex PI;                                   // pi
 extern const long double _Complex PI_2;                                 // pi / 2
 extern const long double _Complex PI_4;                                 // pi / 4
 extern const long double _Complex _1_PI;                                // 1 / pi
 extern const long double _Complex _2_PI;                                // 2 / pi
 extern const long double _Complex _2_SQRT_PI;                   // 2 / sqrt(pi)
+extern long double _Complex PI;                                                 // pi
+extern long double _Complex PI_2;                                               // pi / 2
+extern long double _Complex PI_4;                                               // pi / 4
+extern long double _Complex _1_PI;                                              // 1 / pi
+extern long double _Complex _2_PI;                                              // 2 / pi
+extern long double _Complex _2_SQRT_PI;                                 // 2 / sqrt(pi)
 extern const float E;                                                                   // e
 extern const float LOG2_E;                                                              // log_2(e)
 extern const float LOG10_E;                                                             // log_10(e)
 extern const float LN_2;                                                                // log_e(2)
 extern const float LN_10;                                                               // log_e(10)
 extern const float SQRT_2;                                                              // sqrt(2)
 extern const float _1_SQRT_2;                                                   // 1 / sqrt(2)
+extern float E;                                                                                 // e
+extern float LOG2_E;                                                                    // log_2(e)
+extern float LOG10_E;                                                                   // log_10(e)
+extern float LN_2;                                                                              // log_e(2)
+extern float LN_10;                                                                             // log_e(10)
+extern float SQRT_2;                                                                    // sqrt(2)
+extern float _1_SQRT_2;                                                                 // 1 / sqrt(2)
 extern const double E;                                                                  // e
 extern const double LOG2_E;                                                             // log_2(e)
 extern const double LOG10_E;                                                    // log_10(e)
 extern const double LN_2;                                                               // log_e(2)
 extern const double LN_10;                                                              // log_e(10)
 extern const double SQRT_2;                                                             // sqrt(2)
 extern const double _1_SQRT_2;                                                  // 1 / sqrt(2)
+extern double E;                                                                                // e
+extern double LOG2_E;                                                                   // log_2(e)
+extern double LOG10_E;                                                                  // log_10(e)
+extern double LN_2;                                                                             // log_e(2)
+extern double LN_10;                                                                    // log_e(10)
+extern double SQRT_2;                                                                   // sqrt(2)
+extern double _1_SQRT_2;                                                                // 1 / sqrt(2)
 extern const long double E;                                                             // e
 extern const long double LOG2_E;                                                // log_2(e)
 extern const long double LOG10_E;                                               // log_10(e)
 extern const long double LN_2;                                                  // log_e(2)
 extern const long double LN_10;                                                 // log_e(10)
 extern const long double SQRT_2;                                                // sqrt(2)
 extern const long double _1_SQRT_2;                                             // 1/sqrt(2)
+extern long double E;                                                                   // e
+extern long double LOG2_E;                                                              // log_2(e)
+extern long double LOG10_E;                                                             // log_10(e)
+extern long double LN_2;                                                                // log_e(2)
+extern long double LN_10;                                                               // log_e(10)
+extern long double SQRT_2;                                                              // sqrt(2)
+extern long double _1_SQRT_2;                                                   // 1/sqrt(2)
 extern const float _Complex E;                                                  // e
 extern const float _Complex LOG2_E;                                             // log_2(e)
 extern const float _Complex LOG10_E;                                    // log_10(e)
 extern const float _Complex LN_2;                                               // log_e(2)
 extern const float _Complex LN_10;                                              // log_e(10)
 extern const float _Complex SQRT_2;                                             // sqrt(2)
 extern const float _Complex _1_SQRT_2;                                  // 1 / sqrt(2)
+extern float _Complex E;                                                                // e
+extern float _Complex LOG2_E;                                                   // log_2(e)
+extern float _Complex LOG10_E;                                                  // log_10(e)
+extern float _Complex LN_2;                                                             // log_e(2)
+extern float _Complex LN_10;                                                    // log_e(10)
+extern float _Complex SQRT_2;                                                   // sqrt(2)
+extern float _Complex _1_SQRT_2;                                                // 1 / sqrt(2)
 extern const double _Complex E;                                                 // e
 extern const double _Complex LOG2_E;                                    // log_2(e)
 extern const double _Complex LOG10_E;                                   // log_10(e)
 extern const double _Complex LN_2;                                              // log_e(2)
 extern const double _Complex LN_10;                                             // log_e(10)
 extern const double _Complex SQRT_2;                                    // sqrt(2)
 extern const double _Complex _1_SQRT_2;                                 // 1 / sqrt(2)
+extern double _Complex E;                                                               // e
+extern double _Complex LOG2_E;                                                  // log_2(e)
+extern double _Complex LOG10_E;                                                 // log_10(e)
+extern double _Complex LN_2;                                                    // log_e(2)
+extern double _Complex LN_10;                                                   // log_e(10)
+extern double _Complex SQRT_2;                                                  // sqrt(2)
+extern double _Complex _1_SQRT_2;                                               // 1 / sqrt(2)
 extern const long double _Complex E;                                    // e
 extern const long double _Complex LOG2_E;                               // log_2(e)
 extern const long double _Complex LOG10_E;                              // log_10(e)
 extern const long double _Complex LN_2;                                 // log_e(2)
 extern const long double _Complex LN_10;                                // log_e(10)
 extern const long double _Complex SQRT_2;                               // sqrt(2)
 extern const long double _Complex _1_SQRT_2;                    // 1 / sqrt(2)
+extern long double _Complex E;                                                  // e
+extern long double _Complex LOG2_E;                                             // log_2(e)
+extern long double _Complex LOG10_E;                                    // log_10(e)
+extern long double _Complex LN_2;                                               // log_e(2)
+extern long double _Complex LN_10;                                              // log_e(10)
+extern long double _Complex SQRT_2;                                             // sqrt(2)
+extern long double _Complex _1_SQRT_2;                                  // 1 / sqrt(2)
 // Local Variables: //

libcfa/src/parseargs.cfa

-              rae2c27a
+              rc76bd34
 #include "limits.hfa"
-extern int cfa_args_argc;
-extern char ** cfa_args_argv;
-extern char ** cfa_args_envp;
+extern int cfa_args_argc __attribute__((weak));
+extern char ** cfa_args_argv __attribute__((weak));
+extern char ** cfa_args_envp __attribute__((weak));
 static void usage(char * cmd, cfa_option options[], size_t opt_count, const char * usage, FILE * out)  __attribute__ ((noreturn));
 void parse_args( cfa_option options[], size_t opt_count, const char * usage, char ** & left ) {
-        parse_args(cfa_args_argc, cfa_args_argv, options, opt_count, usage, left );
+        if( 0p != &cfa_args_argc ) {
+                parse_args(cfa_args_argc, cfa_args_argv, options, opt_count, usage, left );
+        }
+        else {
+                char * temp = "";
+                parse_args(0, &temp, options, opt_count, usage, left );
+        }
 }

src/AST/Convert.cpp

-              rae2c27a
+              rc76bd34
         const ast::DeclWithType * visit( const ast::FunctionDecl * node ) override final {
                 if ( inCache( node ) ) return nullptr;
+                // function decl contains real variables that the type must use.
+                // the structural change means function type in and out of decl
+                // must be handled **differently** on convert back to old.
+                auto ftype = new FunctionType(
+                        cv(node->type),
+                        (bool)node->type->isVarArgs
+                );
+                ftype->returnVals = get<DeclarationWithType>().acceptL(node->returns);
+                ftype->parameters = get<DeclarationWithType>().acceptL(node->params);
+                ftype->forall = get<TypeDecl>().acceptL( node->type->forall );
+                visitType(node->type, ftype);
                 auto decl = new FunctionDecl(
                         node->name,
                         Type::StorageClasses( node->storage.val ),
                         LinkageSpec::Spec( node->linkage.val ),
-                        get<FunctionType>().accept1( node->type ),
+                        ftype,
+                        //get<FunctionType>().accept1( node->type ),
                         {},
                         get<Attribute>().acceptL( node->attributes ),
 …
         const ast::Type * visit( const ast::FunctionType * node ) override final {
+                static std::string dummy_paramvar_prefix = "__param_";
+                static std::string dummy_returnvar_prefix = "__retval_";
                 auto ty = new FunctionType {
                         cv( node ),
                         (bool)node->isVarArgs
                 };
-                ty->returnVals = get<DeclarationWithType>().acceptL( node->returns );
-                ty->parameters = get<DeclarationWithType>().acceptL( node->params );
+                auto returns = get<Type>().acceptL(node->returns);
+                auto params = get<Type>().acceptL(node->params);
+                int ret_index = 0;
+                for (auto t: returns) {
+                        // xxx - LinkageSpec shouldn't matter but needs to be something
+                        ObjectDecl * dummy = new ObjectDecl(dummy_returnvar_prefix + std::to_string(ret_index++), {}, LinkageSpec::C, nullptr, t, nullptr);
+                        ty->returnVals.push_back(dummy);
+                }
+                int param_index = 0;
+                for (auto t: params) {
+                        ObjectDecl * dummy = new ObjectDecl(dummy_paramvar_prefix + std::to_string(param_index++), {}, LinkageSpec::C, nullptr, t, nullptr);
+                        ty->parameters.push_back(dummy);
+                }
+                // ty->returnVals = get<DeclarationWithType>().acceptL( node->returns );
+                // ty->parameters = get<DeclarationWithType>().acceptL( node->params );
                 ty->forall = get<TypeDecl>().acceptL( node->forall );
                 return visitType( node, ty );
+        }
         const ast::Type * postvisit( const ast::ReferenceToType * old, ReferenceToType * ty ) {
+        const ast::Type * postvisit( const ast::BaseInstType * old, ReferenceToType * ty ) {
                 ty->forall = get<TypeDecl>().acceptL( old->forall );
                 ty->parameters = get<Expression>().acceptL( old->params );
 …
         ast::Node * node = nullptr;
         /// cache of nodes that might be referenced by readonly<> for de-duplication
-        std::unordered_map< const BaseSyntaxNode *, ast::Node * > cache = {};
+        /// in case that some nodes are dropped by conversion (due to possible structural change)
+        /// use smart pointers in cache value to prevent accidental invalidation.
+        /// at conversion stage, all created nodes are guaranteed to be unique, therefore
+        /// const_casting out of smart pointers is permitted.
+        std::unordered_map< const BaseSyntaxNode *, ast::ptr<ast::Node> > cache = {};
         // Local Utilities:
 …
                 auto it = cache.find( old );
                 if ( it == cache.end() ) return false;
                 node = it->second;
+                node = const_cast<ast::Node *>(it->second.get());
                 return true;
+        }
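
The cache change above swaps raw pointers for owning smart pointers, so a converted node stays alive even if the conversion later drops it from the tree. A minimal sketch of that idea follows, assuming `std::shared_ptr` as a stand-in for the intrusive `ast::ptr` and using the hypothetical names `SketchNode` and `SketchCache`, neither of which appears in the changeset:

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for ast::Node.
struct SketchNode { int id = 0; };

// Hypothetical cache keyed by the address of the old node.
class SketchCache {
    // The map owns each converted node, so lookups never dangle.
    std::unordered_map< const void *, std::shared_ptr< const SketchNode > > cache;
public:
    void emplace( const void * old, std::shared_ptr< const SketchNode > node ) {
        cache.emplace( old, std::move( node ) );
    }
    // Mirrors inCache: returns nullptr on a miss, otherwise a mutable pointer.
    // The const_cast is only safe because every cached node was created
    // non-const and is not shared with any other tree at this stage.
    SketchNode * find( const void * old ) const {
        auto it = cache.find( old );
        if ( it == cache.end() ) return nullptr;
        return const_cast< SketchNode * >( it->second.get() );
    }
};
```

With raw pointers as values, the same lookup would return a dangling pointer as soon as the last external owner released the node.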
 …
         virtual void visit( const FunctionDecl * old ) override final {
                 if ( inCache( old ) ) return;
+                auto paramVars = GET_ACCEPT_V(type->parameters, DeclWithType);
+                auto returnVars = GET_ACCEPT_V(type->returnVals, DeclWithType);
+                auto forall = GET_ACCEPT_V(type->forall, TypeDecl);
+                // function type is now derived from parameter decls instead of storing them
+                auto ftype = new ast::FunctionType((ast::ArgumentFlag)old->type->isVarArgs, cv(old->type));
+                ftype->params.reserve(paramVars.size());
+                ftype->returns.reserve(returnVars.size());
+                for (auto & v: paramVars) {
+                        ftype->params.emplace_back(v->get_type());
+                }
+                for (auto & v: returnVars) {
+                        ftype->returns.emplace_back(v->get_type());
+                }
+                ftype->forall = std::move(forall);
+                visitType(old->type, ftype);
                 auto decl = new ast::FunctionDecl{
                         old->location,
                         old->name,
-                        GET_ACCEPT_1(type, FunctionType),
+                        // GET_ACCEPT_1(type, FunctionType),
+                        std::move(paramVars),
+                        std::move(returnVars),
                         {},
                         { old->storageClasses.val },
 …
                         { old->get_funcSpec().val }
                 };
+                decl->type = ftype;
                 cache.emplace( old, decl );
                 decl->withExprs = GET_ACCEPT_V(withExprs, Expr);
                 decl->stmts = GET_ACCEPT_1(statements, CompoundStmt);
 …
                         cv( old )
                 };
-                ty->returns = GET_ACCEPT_V( returnVals, DeclWithType );
-                ty->params = GET_ACCEPT_V( parameters, DeclWithType );
+                auto returnVars = GET_ACCEPT_V(returnVals, DeclWithType);
+                auto paramVars = GET_ACCEPT_V(parameters, DeclWithType);
+                // ty->returns = GET_ACCEPT_V( returnVals, DeclWithType );
+                // ty->params = GET_ACCEPT_V( parameters, DeclWithType );
+                for (auto & v: returnVars) {
+                        ty->returns.emplace_back(v->get_type());
+                }
+                for (auto & v: paramVars) {
+                        ty->params.emplace_back(v->get_type());
+                }
                 ty->forall = GET_ACCEPT_V( forall, TypeDecl );
                 visitType( old, ty );
+        }
         void postvisit( const ReferenceToType * old, ast::ReferenceToType * ty ) {
+        void postvisit( const ReferenceToType * old, ast::BaseInstType * ty ) {
                 ty->forall = GET_ACCEPT_V( forall, TypeDecl );
                 ty->params = GET_ACCEPT_V( parameters, Expr );
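
When a bare function type is converted back to the old AST, the hunks above synthesize placeholder declarations named `__retval_N` and `__param_N`, because the new type stores only types. The naming scheme can be sketched as follows, assuming simplified hypothetical stand-ins (`SketchType`, `SketchObjectDecl`, `makeDummies`) rather than the real `Type` and `ObjectDecl` classes:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the old-AST classes.
struct SketchType { std::string name; };
struct SketchObjectDecl { std::string name; SketchType type; };

// Wrap each bare type in a placeholder declaration named prefix + index.
std::vector< SketchObjectDecl > makeDummies(
        const std::string & prefix, const std::vector< SketchType > & types ) {
    std::vector< SketchObjectDecl > decls;
    decls.reserve( types.size() );
    int index = 0;
    for ( const SketchType & t : types ) {
        decls.push_back( SketchObjectDecl{ prefix + std::to_string( index++ ), t } );
    }
    return decls;
}
```

For example, `makeDummies( "__param_", types )` over two types yields declarations named `__param_0` and `__param_1`.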

src/AST/Decl.hpp

-              rae2c27a
+              rc76bd34
 class FunctionDecl : public DeclWithType {
 public:
+        std::vector<ptr<DeclWithType>> params;
+        std::vector<ptr<DeclWithType>> returns;
+        // declared type, derived from parameter declarations
         ptr<FunctionType> type;
         ptr<CompoundStmt> stmts;
         std::vector< ptr<Expr> > withExprs;
-        FunctionDecl( const CodeLocation & loc, const std::string & name, FunctionType * type,
+        FunctionDecl( const CodeLocation & loc, const std::string & name,
+                std::vector<ptr<DeclWithType>>&& params, std::vector<ptr<DeclWithType>>&& returns,
                 CompoundStmt * stmts, Storage::Classes storage = {}, Linkage::Spec linkage = Linkage::C,
                 std::vector<ptr<Attribute>>&& attrs = {}, Function::Specs fs = {})
-        : DeclWithType( loc, name, storage, linkage, std::move(attrs), fs ), type( type ),
+        : DeclWithType( loc, name, storage, linkage, std::move(attrs), fs ), params(std::move(params)), returns(std::move(returns)),
           stmts( stmts ) {}
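
With this change the declaration owns the parameter and return variables, and the function type is derived from them. The derivation can be sketched roughly as below, using hypothetical simplified types (`SketchTy`, `SketchVar`, `SketchFuncType`, `deriveType`) and `std::shared_ptr` in place of `ast::ptr`. It is not the real constructor logic:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the new-AST classes.
struct SketchTy { std::string name; };
struct SketchVar {
    std::string name;
    std::shared_ptr< const SketchTy > type;
};
struct SketchFuncType {
    std::vector< std::shared_ptr< const SketchTy > > returns;
    std::vector< std::shared_ptr< const SketchTy > > params;
};

// Build a function type that keeps only the types of the declared variables.
SketchFuncType deriveType( const std::vector< SketchVar > & params,
                           const std::vector< SketchVar > & returns ) {
    SketchFuncType ft;
    ft.params.reserve( params.size() );
    ft.returns.reserve( returns.size() );
    for ( const auto & p : params ) ft.params.emplace_back( p.type );
    for ( const auto & r : returns ) ft.returns.emplace_back( r.type );
    return ft;
}
```

The variable names live only in the declaration, so two functions whose parameters differ in name but not in type derive structurally identical function types.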

src/AST/ForallSubstitutor.hpp

-              rae2c27a
+              rc76bd34
+        }
+        template<typename node_t >
+        std::vector<ptr<node_t>> operator() (const std::vector<ptr<node_t>> & o) {
+                std::vector<ptr<node_t>> n;
+                n.reserve(o.size());
+                for (const node_t * d : o) { n.emplace_back(d->accept(*visitor)); }
+                return n;
+        }
+        /*
         /// Substitute parameter/return type
         std::vector< ptr< DeclWithType > > operator() ( const std::vector< ptr< DeclWithType > > & o ) {
 …
                 return n;
+        }
+        */
 };

src/AST/Fwd.hpp

rae2c27a	rc76bd34
107	107	class QualifiedType;
108	108	class FunctionType;
109		class ReferenceToType;
	109	class BaseInstType;
110	110	template<typename decl_t> class SueInstType;
111	111	using StructInstType = SueInstType<StructDecl>;

src/AST/GenericSubstitution.cpp

rae2c27a	rc76bd34
42	42	private:
43	43	// make substitution for generic type
44		void makeSub( const ReferenceToType * ty ) {
	44	void makeSub( const BaseInstType * ty ) {
45	45	visit_children = false;
46	46	const AggregateDecl * aggr = ty->aggr();

src/AST/Node.cpp

-              rae2c27a
+              rc76bd34
 template class ast::ptr_base< ast::FunctionType, ast::Node::ref_type::weak >;
 template class ast::ptr_base< ast::FunctionType, ast::Node::ref_type::strong >;
 template class ast::ptr_base< ast::ReferenceToType, ast::Node::ref_type::weak >;
 template class ast::ptr_base< ast::ReferenceToType, ast::Node::ref_type::strong >;
+template class ast::ptr_base< ast::BaseInstType, ast::Node::ref_type::weak >;
+template class ast::ptr_base< ast::BaseInstType, ast::Node::ref_type::strong >;
 template class ast::ptr_base< ast::StructInstType, ast::Node::ref_type::weak >;
 template class ast::ptr_base< ast::StructInstType, ast::Node::ref_type::strong >;

src/AST/Pass.hpp

-              rae2c27a
+              rc76bd34
 // | PureVisitor           - makes the visitor pure, it never modifies nodes in place and always
 //                           clones nodes it needs to make changes to
 // | WithTypeSubstitution  - provides polymorphic const TypeSubstitution * env for the
+// | WithConstTypeSubstitution - provides polymorphic const TypeSubstitution * typeSubs for the
 //                           current expression
 // | WithStmtsToAdd        - provides the ability to insert statements before or after the current
 …
 // | WithSymbolTable       - provides symbol table functionality
 // | WithForallSubstitutor - maintains links between TypeInstType and TypeDecl under mutation
+//
+// Other Special Members:
+// | result                - Either a method that takes no parameters or a field. If a method (or
+//                           callable field) get_result calls it, otherwise the value is returned.
 //-------------------------------------------------------------------------------------------------
 template< typename core_t >
 …
         virtual ~Pass() = default;
+        /// Storage for the actual pass.
+        core_t core;
+        /// If the core defines a result, call it if possible, otherwise return it.
+        inline auto get_result() -> decltype( __pass::get_result( core, '0' ) ) {
+                return __pass::get_result( core, '0' );
+        }
         /// Construct and run a pass on a translation unit.
         template< typename... Args >
 …
+        }
+        /// Construct and run a pass on a pointer to extract a value.
+        template< typename node_type, typename... Args >
+        static auto read( node_type const * node, Args&&... args ) {
+                Pass<core_t> visitor( std::forward<Args>( args )... );
+                node_type const * temp = node->accept( visitor );
+                assert( temp == node );
+                return visitor.get_result();
+        }
+        // Versions of the above for older compilers.
         template< typename... Args >
         static void run( std::list< ptr<Decl> > & decls ) {
 …
+        }
+        /// Storage for the actual pass
+        core_t core;
+        template< typename node_type, typename... Args >
+        static auto read( node_type const * node ) {
+                Pass<core_t> visitor;
+                node_type const * temp = node->accept( visitor );
+                assert( temp == node );
+                return visitor.get_result();
+        }
         /// Visit function declarations
 …
 //-------------------------------------------------------------------------------------------------
-/// Keep track of the polymorphic const TypeSubstitution * env for the current expression
 /// If used the visitor will always clone nodes.
 struct PureVisitor {};
+/// Keep track of the polymorphic const TypeSubstitution * typeSubs for the current expression.
 struct WithConstTypeSubstitution {
         const TypeSubstitution * env = nullptr;
+        const TypeSubstitution * typeSubs = nullptr;
 };

src/AST/Pass.impl.hpp

-              rae2c27a
+              rc76bd34
                 __pedantic_pass_assert( expr );
                 const ast::TypeSubstitution ** env_ptr = __pass::env( core, 0);
                 if ( env_ptr && expr->env ) {
                         *env_ptr = expr->env;
+                const ast::TypeSubstitution ** typeSubs_ptr = __pass::typeSubs( core, 0 );
+                if ( typeSubs_ptr && expr->env ) {
+                        *typeSubs_ptr = expr->env;
+                }
 …
                 // These may be modified by subnode but must be restored once we exit this statement.
                 ValueGuardPtr< const ast::TypeSubstitution * > __old_env         ( __pass::env( core, 0) );
+                ValueGuardPtr< const ast::TypeSubstitution * > __old_env         ( __pass::typeSubs( core, 0 ) );
                 ValueGuardPtr< typename std::remove_pointer< decltype(stmts_before) >::type > __old_decls_before( stmts_before );
                 ValueGuardPtr< typename std::remove_pointer< decltype(stmts_after ) >::type > __old_decls_after ( stmts_after  );
 …
                         __pass::symtab::addId( core, 0, func );
                         VISIT(
+                                // parameter declarations are now directly here
+                                maybe_accept( node, &FunctionDecl::params );
+                                maybe_accept( node, &FunctionDecl::returns );
+                                // foralls are still in function type
                                 maybe_accept( node, &FunctionDecl::type );
                                 // function body needs to have the same scope as parameters - CompoundStmt will not enter
 …
                 // These may be modified by subnode but must be restored once we exit this statement.
                 ValueGuardPtr< const ast::TypeSubstitution * > __old_env( __pass::env( core, 0) );
+                ValueGuardPtr< const ast::TypeSubstitution * > __old_env( __pass::typeSubs( core, 0 ) );
                 ValueGuardPtr< typename std::remove_pointer< decltype(stmts_before) >::type > __old_decls_before( stmts_before );
                 ValueGuardPtr< typename std::remove_pointer< decltype(stmts_after ) >::type > __old_decls_after ( stmts_after  );

src/AST/Pass.proto.hpp

-              rae2c27a
+              rc76bd34
         // List of fields and their expected types
         FIELD_PTR( env, const ast::TypeSubstitution * )
+        FIELD_PTR( typeSubs, const ast::TypeSubstitution * )
         FIELD_PTR( stmtsToAddBefore, std::list< ast::ptr< ast::Stmt > > )
         FIELD_PTR( stmtsToAddAfter , std::list< ast::ptr< ast::Stmt > > )
 …
         } // namespace forall
+        template<typename core_t>
+        static inline auto get_result( core_t & core, char ) -> decltype( core.result() ) {
+                return core.result();
+        }
+        template<typename core_t>
+        static inline auto get_result( core_t & core, int ) -> decltype( core.result ) {
+                return core.result;
+        }
+        template<typename core_t>
+        static inline void get_result( core_t &, long ) {}
 } // namespace __pass
 } // namespace ast
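
The three `get_result` overloads above are selected by passing a `char` as a dummy argument. An exact `char` match outranks promotion to `int`, which in turn outranks conversion to `long`, and SFINAE on the trailing return type removes any overload the core cannot support. A self-contained sketch of the same dispatch follows, where the `sketch` namespace and the three core structs are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
    // Rank 1 (exact match for a char argument): `result` is callable.
    template< typename core_t >
    auto get_result( core_t & core, char ) -> decltype( core.result() ) {
        return core.result();
    }
    // Rank 2 (char promotes to int): `result` is a plain data member.
    template< typename core_t >
    auto get_result( core_t & core, int ) -> decltype( core.result ) {
        return core.result;
    }
    // Rank 3 (char converts to long): the core defines no `result`.
    template< typename core_t >
    void get_result( core_t &, long ) {}
}

// Hypothetical pass cores covering the three cases.
struct CountCore  { int count = 3; int result() { return count; } };
struct FlagCore   { bool result = true; };
struct SilentCore {};
```

Calling `sketch::get_result( core, '0' )` therefore invokes the method for `CountCore`, reads the field for `FlagCore`, and returns `void` for `SilentCore`.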

src/AST/Print.cpp

rae2c27a	rc76bd34
270	270	}
271	271
272		void preprint( const ast::ReferenceToType * node ) {
	272	void preprint( const ast::BaseInstType * node ) {
273	273	print( node->forall );
274	274	print( node->attributes );

src/AST/SymbolTable.cpp

-              rae2c27a
+              rc76bd34
                 if ( ! expr->result ) continue;
                 const Type * resTy = expr->result->stripReferences();
                 auto aggrType = dynamic_cast< const ReferenceToType * >( resTy );
+                auto aggrType = dynamic_cast< const BaseInstType * >( resTy );
                 assertf( aggrType, "WithStmt expr has non-aggregate type: %s",
                         toString( expr->result ).c_str() );
 …
+}
+/*
 void SymbolTable::addFunctionType( const FunctionType * ftype ) {
         addTypes( ftype->forall );
 …
         addIds( ftype->params );
+}
+*/
 void SymbolTable::lazyInitScope() {
 …
                 assert( ! params.empty() );
                 // use base type of pointer, so that qualifiers on the pointer type aren't considered.
                 const Type * base = InitTweak::getPointerBase( params.front()->get_type() );
+                const Type * base = InitTweak::getPointerBase( params.front() );
                 assert( base );
                 return Mangle::mangle( base );
 …
                         if ( dwt->name == "" ) {
                                 const Type * t = dwt->get_type()->stripReferences();
                                 if ( auto rty = dynamic_cast<const ReferenceToType *>( t ) ) {
+                                if ( auto rty = dynamic_cast<const BaseInstType *>( t ) ) {
                                         if ( ! dynamic_cast<const StructInstType *>(rty)
                                                 && ! dynamic_cast<const UnionInstType *>(rty) ) continue;

src/AST/SymbolTable.hpp

rae2c27a	rc76bd34
145	145
146	146	/// convenience function for adding all of the declarations in a function type to the indexer
147		void addFunctionType( const FunctionType * ftype );
	147	// void addFunctionType( const FunctionType * ftype );
148	148
149	149	private:

src/AST/Type.cpp

-              rae2c27a
+              rc76bd34
 // --- FunctionType
 FunctionType::FunctionType( const FunctionType & o )
 : ParameterizedType( o.qualifiers, copy( o.attributes ) ), returns(), params(),
 …
 namespace {
         bool containsTtype( const std::vector<ptr<DeclWithType>> & l ) {
+        bool containsTtype( const std::vector<ptr<Type>> & l ) {
                 if ( ! l.empty() ) {
                         return Tuples::isTtype( l.back()->get_type() );
+                        return Tuples::isTtype( l.back() );
+                }
                 return false;
 …
+}
 // --- ReferenceToType
 void ReferenceToType::initWithSub( const ReferenceToType & o, Pass< ForallSubstitutor > & sub ) {
+// --- BaseInstType
+void BaseInstType::initWithSub( const BaseInstType & o, Pass< ForallSubstitutor > & sub ) {
         ParameterizedType::initWithSub( o, sub ); // initialize substitution
         params = sub.core( o.params );            // apply to parameters
+}
 ReferenceToType::ReferenceToType( const ReferenceToType & o )
+BaseInstType::BaseInstType( const BaseInstType & o )
 : ParameterizedType( o.qualifiers, copy( o.attributes ) ), params(), name( o.name ),
   hoistType( o.hoistType ) {
 …
+}
 std::vector<readonly<Decl>> ReferenceToType::lookup( const std::string& name ) const {
+std::vector<readonly<Decl>> BaseInstType::lookup( const std::string& name ) const {
         assertf( aggr(), "Must have aggregate to perform lookup" );
 …
 SueInstType<decl_t>::SueInstType(
         const decl_t * b, CV::Qualifiers q, std::vector<ptr<Attribute>>&& as )
 : ReferenceToType( b->name, q, move(as) ), base( b ) {}
+: BaseInstType( b->name, q, move(as) ), base( b ) {}
 template<typename decl_t>
 …
 TraitInstType::TraitInstType(
         const TraitDecl * b, CV::Qualifiers q, std::vector<ptr<Attribute>>&& as )
 : ReferenceToType( b->name, q, move(as) ), base( b ) {}
+: BaseInstType( b->name, q, move(as) ), base( b ) {}
 // --- TypeInstType
 TypeInstType::TypeInstType( const TypeInstType & o )
 : ReferenceToType( o.name, o.qualifiers, copy( o.attributes ) ), base(), kind( o.kind ) {
+: BaseInstType( o.name, o.qualifiers, copy( o.attributes ) ), base(), kind( o.kind ) {
         Pass< ForallSubstitutor > sub;
         initWithSub( o, sub );      // initialize substitution

src/AST/Type.hpp

-              rae2c27a
+              rc76bd34
 class FunctionType final : public ParameterizedType {
 public:
-        std::vector<ptr<DeclWithType>> returns;
-        std::vector<ptr<DeclWithType>> params;
+//      std::vector<ptr<DeclWithType>> returns;
+//      std::vector<ptr<DeclWithType>> params;
+        std::vector<ptr<Type>> returns;
+        std::vector<ptr<Type>> params;
         /// Does the function accept a variable number of arguments following the arguments specified
 …
 /// base class for types that refer to types declared elsewhere (aggregates and typedefs)
 class ReferenceToType : public ParameterizedType {
+class BaseInstType : public ParameterizedType {
 protected:
         /// Initializes forall and parameters based on substitutor
         void initWithSub( const ReferenceToType & o, Pass< ForallSubstitutor > & sub );
+        void initWithSub( const BaseInstType & o, Pass< ForallSubstitutor > & sub );
 public:
         std::vector<ptr<Expr>> params;
 …
         bool hoistType = false;
         ReferenceToType(
+        BaseInstType(
                 const std::string& n, CV::Qualifiers q = {}, std::vector<ptr<Attribute>> && as = {} )
         : ParameterizedType(q, std::move(as)), params(), name(n) {}
         ReferenceToType( const ReferenceToType & o );
+        BaseInstType( const BaseInstType & o );
         /// Gets aggregate declaration this type refers to
 …
 private:
         virtual ReferenceToType * clone() const override = 0;
+        virtual BaseInstType * clone() const override = 0;
         MUTATE_FRIEND
 };
 …
 // Common implementation for the SUE instance types. Not to be used directly.
 template<typename decl_t>
 class SueInstType final : public ReferenceToType {
+class SueInstType final : public BaseInstType {
 public:
         using base_type = decl_t;
 …
         SueInstType(
                 const std::string& n, CV::Qualifiers q = {}, std::vector<ptr<Attribute>> && as = {} )
         : ReferenceToType( n, q, std::move(as) ), base() {}
+        : BaseInstType( n, q, std::move(as) ), base() {}
         SueInstType(
 …
 /// An instance of a trait type.
-class TraitInstType final : public ReferenceToType {
+class TraitInstType final : public BaseInstType {
 public:
         readonly<TraitDecl> base;
 …
         TraitInstType(
                 const std::string& n, CV::Qualifiers q = {}, std::vector<ptr<Attribute>> && as = {} )
-        : ReferenceToType( n, q, std::move(as) ), base() {}
+        : BaseInstType( n, q, std::move(as) ), base() {}
         TraitInstType(
 …
 /// instance of named type alias (typedef or variable)
-class TypeInstType final : public ReferenceToType {
+class TypeInstType final : public BaseInstType {
 public:
         readonly<TypeDecl> base;
 …
                 const std::string& n, const TypeDecl * b, CV::Qualifiers q = {},
                 std::vector<ptr<Attribute>> && as = {} )
-        : ReferenceToType( n, q, std::move(as) ), base( b ), kind( b->kind ) {}
+        : BaseInstType( n, q, std::move(as) ), base( b ), kind( b->kind ) {}
         TypeInstType( const std::string& n, TypeDecl::Kind k, CV::Qualifiers q = {},
                 std::vector<ptr<Attribute>> && as = {} )
-        : ReferenceToType( n, q, std::move(as) ), base(), kind( k ) {}
+        : BaseInstType( n, q, std::move(as) ), base(), kind( k ) {}
         TypeInstType( const TypeInstType & o );

src/AST/TypeSubstitution.cpp

rae2c27a	rc76bd34
176	176	}
177	177
178		void TypeSubstitution::Substituter::handleAggregateType( const ReferenceToType * type ) {
	178	void TypeSubstitution::Substituter::handleAggregateType( const BaseInstType * type ) {
179	179	GuardValue( boundVars );
180	180	// bind type variables from forall-qualifiers

src/AST/TypeSubstitution.hpp

rae2c27a	rc76bd34
169	169	void previsit( const ParameterizedType * type );
170	170	/// Records type variable bindings from forall-statements and instantiations of generic types
171		void handleAggregateType( const ReferenceToType * type );
	171	void handleAggregateType( const BaseInstType * type );
172	172
173	173	void previsit( const StructInstType * aggregateUseType );

src/Common/Stats/Stats.cc

-              rae2c27a
+              rc76bd34
+        }
+        namespace ResolveTime {
+                bool enabled = false;
+        }
         struct {
                 const char * const opt;
 …
                 { "heap"    , Heap::enabled },
                 { "time"    , Time::enabled },
+                { "resolve" , ResolveTime::enabled },
         };
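
The hunk above registers a new `"resolve"` entry in the name-to-flag option table. The following is a minimal, self-contained sketch of how such a table can be consumed. It is not the code in Stats.cc: the `StatOption` struct, the `statOptions` array name, the `enableStats` function, and the behaviour on unknown names are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>

// Stand-ins for the per-category flags that appear in the diff.
namespace Heap        { bool enabled = false; }
namespace Time        { bool enabled = false; }
namespace ResolveTime { bool enabled = false; }

// Hypothetical named version of the anonymous table struct in the diff.
struct StatOption {
	const char * const opt;   // option name as typed by the user
	bool & enabled;           // flag switched on when the name is seen
};

static StatOption statOptions[] = {
	{ "heap"    , Heap::enabled },
	{ "time"    , Time::enabled },
	{ "resolve" , ResolveTime::enabled },
};

// Enable every category named in a comma-separated list.
// Returns false if any name is not in the table (assumed behaviour).
bool enableStats( const std::string & list ) {
	bool ok = true;
	std::size_t begin = 0;
	while ( begin <= list.size() ) {
		std::size_t end = list.find( ',', begin );
		if ( end == std::string::npos ) end = list.size();
		const std::string name = list.substr( begin, end - begin );
		bool found = false;
		for ( StatOption & o : statOptions ) {
			if ( name == o.opt ) { o.enabled = true; found = true; }
		}
		if ( ! found ) ok = false;
		begin = end + 1;
	}
	return ok;
}
```

With this shape, adding a category only requires a new flag and one table row, which matches the size of the change in the hunk.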

src/Common/module.mk

-              rae2c27a
+              rc76bd34
       Common/ErrorObjects.h \
       Common/Eval.cc \
+      Common/Examine.cc \
+      Common/Examine.h \
       Common/FilterCombos.h \
       Common/Indenter.h \
 …
       Common/Stats/Heap.cc \
       Common/Stats/Heap.h \
+      Common/Stats/ResolveTime.cc \
+      Common/Stats/ResolveTime.h \
       Common/Stats/Stats.cc \
       Common/Stats/Time.cc \

src/Concurrency/Keywords.cc

-              rae2c27a
+              rc76bd34
 #include <string>                         // for string, operator==
+#include <iostream>
+#include "Common/Examine.h"               // for isMainFor
 #include "Common/PassVisitor.h"           // for PassVisitor
 #include "Common/SemanticError.h"         // for SemanticError
 …
 #include "SynTree/Type.h"                 // for StructInstType, Type, PointerType
 #include "SynTree/Visitor.h"              // for Visitor, acceptAll
+#include "Virtual/Tables.h"
 class Attribute;
 namespace Concurrency {
+        inline static std::string getVTableName( std::string const & exception_name ) {
+                return exception_name.empty() ? std::string() : Virtual::vtableTypeName(exception_name);
+        }
         //=============================================================================================
         // Pass declarations
 …
           public:
+                ConcurrentSueKeyword( std::string&& type_name, std::string&& field_name, std::string&& getter_name, std::string&& context_error, bool needs_main, AggregateDecl::Aggregate cast_target ) :
+                  type_name( type_name ), field_name( field_name ), getter_name( getter_name ), context_error( context_error ), needs_main( needs_main ), cast_target( cast_target ) {}
+                ConcurrentSueKeyword( std::string&& type_name, std::string&& field_name,
+                        std::string&& getter_name, std::string&& context_error, std::string&& exception_name,
+                        bool needs_main, AggregateDecl::Aggregate cast_target ) :
+                  type_name( type_name ), field_name( field_name ), getter_name( getter_name ),
+                  context_error( context_error ), vtable_name( getVTableName( exception_name ) ),
+                  needs_main( needs_main ), cast_target( cast_target ) {}
                 virtual ~ConcurrentSueKeyword() {}
 …
                 void handle( StructDecl * );
+                void addVtableForward( StructDecl * );
                 FunctionDecl * forwardDeclare( StructDecl * );
                 ObjectDecl * addField( StructDecl * );
 …
                 const std::string getter_name;
                 const std::string context_error;
+                const std::string vtable_name;
                 bool needs_main;
                 AggregateDecl::Aggregate cast_target;
 …
                 StructDecl   * type_decl = nullptr;
                 FunctionDecl * dtor_decl = nullptr;
+                StructDecl * vtable_decl = nullptr;
         };
 …
                         "get_thread",
                         "thread keyword requires threads to be in scope, add #include <thread.hfa>\n",
+                        "",
                         true,
                         AggregateDecl::Thread
 …
                         "get_coroutine",
                         "coroutine keyword requires coroutines to be in scope, add #include <coroutine.hfa>\n",
+                        "CoroutineCancelled",
                         true,
                         AggregateDecl::Coroutine
 …
                         "get_monitor",
                         "monitor keyword requires monitors to be in scope, add #include <monitor.hfa>\n",
+                        "",
                         false,
                         AggregateDecl::Monitor
 …
                         "get_generator",
                         "Unable to find builtin type $generator\n",
+                        "",
                         true,
                         AggregateDecl::Generator
 …
         private:
-                DeclarationWithType * is_main( FunctionDecl * );
                 bool is_real_suspend( FunctionDecl * );
 …
                         handle( decl );
+                }
+                else if ( !vtable_decl && vtable_name == decl->name && decl->body ) {
+                        vtable_decl = decl;
+                }
+                // Might be able to get rid of is_target.
+                assert( is_target(decl) == (cast_target == decl->kind) );
                 return decl;
+        }
         DeclarationWithType * ConcurrentSueKeyword::postmutate( FunctionDecl * decl ) {
+                if( !type_decl ) return decl;
+                if( !CodeGen::isDestructor( decl->name ) ) return decl;
+                auto params = decl->type->parameters;
+                if( params.size() != 1 ) return decl;
+                auto type = dynamic_cast<ReferenceType*>( params.front()->get_type() );
+                if( !type ) return decl;
+                auto stype = dynamic_cast<StructInstType*>( type->base );
+                if( !stype ) return decl;
+                if( stype->baseStruct != type_decl ) return decl;
+                if( !dtor_decl ) dtor_decl = decl;
+                if ( type_decl && isDestructorFor( decl, type_decl ) )
+                        dtor_decl = decl;
+                else if ( vtable_name.empty() )
+                        ;
+                else if ( auto param = isMainFor( decl, cast_target ) ) {
+                        // This should never trigger.
+                        assert( vtable_decl );
+                        // Should be safe because of isMainFor.
+                        StructInstType * struct_type = static_cast<StructInstType *>(
+                                static_cast<ReferenceType *>( param->get_type() )->base );
+                        assert( struct_type );
+                        declsToAddAfter.push_back( Virtual::makeVtableInstance( vtable_decl, {
+                                new TypeExpr( struct_type->clone() ),
+                        }, struct_type, nullptr ) );
+                }
                 return decl;
+        }
 …
                 if( !dtor_decl ) SemanticError( decl, context_error );
+                addVtableForward( decl );
                 FunctionDecl * func = forwardDeclare( decl );
                 ObjectDecl * field = addField( decl );
                 addRoutines( field, func );
+        }
+        void ConcurrentSueKeyword::addVtableForward( StructDecl * decl ) {
+                if ( vtable_decl ) {
+                        declsToAddBefore.push_back( Virtual::makeVtableForward( vtable_decl, {
+                                new TypeExpr( new StructInstType( noQualifiers, decl ) ),
+                        } ) );
+                // It's only an error if we want a vtable and don't have one.
+                } else if ( ! vtable_name.empty() ) {
+                        SemanticError( decl, context_error );
+                }
+        }
 …
         // Suspend keyword implementation
         //=============================================================================================
-        DeclarationWithType * SuspendKeyword::is_main( FunctionDecl * func) {
-                if(func->name != "main") return nullptr;
-                if(func->type->parameters.size() != 1) return nullptr;
-                auto param = func->type->parameters.front();
-                auto type  = dynamic_cast<ReferenceType * >(param->get_type());
-                if(!type) return nullptr;
-                auto obj   = dynamic_cast<StructInstType *>(type->base);
-                if(!obj) return nullptr;
-                if(!obj->baseStruct->is_generator()) return nullptr;
-                return param;
+        }
         bool SuspendKeyword::is_real_suspend( FunctionDecl * func ) {
                 if(isMangled(func->linkage)) return false; // the real suspend isn't mangled
 …
                 // Is this the main of a generator?
                 auto param = is_main( func );
+                auto param = isMainFor( func, AggregateDecl::Aggregate::Generator );
                 if(!param) return;
 …
+                                        {
                                                 new SingleInit( new AddressExpr( new VariableExpr( monitors ) ) ),
+                                                new SingleInit( new CastExpr( new VariableExpr( func ), generic_func->clone(), false ) )
+                                                new SingleInit( new CastExpr( new VariableExpr( func ), generic_func->clone(), false ) ),
+                                                new SingleInit( new ConstantExpr( Constant::from_bool( false ) ) )
                                         },
                                         noDesignators,
 …
 // tab-width: 4 //
 // End: //

src/GenPoly/InstantiateGeneric.cc

-              rae2c27a
+              rc76bd34
                 InstantiationMap< AggregateDecl, AggregateDecl > instantiations;
                 /// Set of types which are dtype-only generic (and therefore have static layout)
                 ScopedSet< AggregateDecl* > dtypeStatics;
+                std::set<AggregateDecl *> dtypeStatics;
                 /// Namer for concrete types
                 UniqueName typeNamer;
 …
         void GenericInstantiator::beginScope() {
                 instantiations.beginScope();
-                dtypeStatics.beginScope();
+        }
         void GenericInstantiator::endScope() {
                 instantiations.endScope();
-                dtypeStatics.endScope();
+        }

src/InitTweak/InitTweak.cc

-              rae2c27a
+              rc76bd34
                 if ( ftype->params.size() != 2 ) return false;
                 const ast::Type * t1 = getPointerBase( ftype->params.front()->get_type() );
+                const ast::Type * t1 = getPointerBase( ftype->params.front() );
                 if ( ! t1 ) return false;
                 const ast::Type * t2 = ftype->params.back()->get_type();
+                const ast::Type * t2 = ftype->params.back();
                 return ResolvExpr::typesCompatibleIgnoreQualifiers( t1, t2, ast::SymbolTable{} );

src/Parser/lex.ll

-              rae2c27a
+              rc76bd34
  * Created On       : Sat Sep 22 08:58:10 2001
  * Last Modified By : Peter A. Buhr
  * Last Modified On : Sat Feb 15 11:05:50 2020
  * Update Count     : 737
+ * Last Modified On : Tue Oct  6 18:15:41 2020
+ * Update Count     : 743
  */
 …
 #define IDENTIFIER_RETURN()     RETURN_VAL( typedefTable.isKind( yytext ) )
 #ifdef HAVE_KEYWORDS_FLOATXX                                                            // GCC >= 7 => keyword, otherwise typedef
+#ifdef HAVE_KEYWORDS_FLOATXX                                                    // GCC >= 7 => keyword, otherwise typedef
 #define FLOATXX(v) KEYWORD_RETURN(v);
 #else
 …
 __restrict__    { KEYWORD_RETURN(RESTRICT); }                   // GCC
 return                  { KEYWORD_RETURN(RETURN); }
         /* resume                       { KEYWORD_RETURN(RESUME); }                             // CFA */
+ /* resume                      { KEYWORD_RETURN(RESUME); }                             // CFA */
 short                   { KEYWORD_RETURN(SHORT); }
 signed                  { KEYWORD_RETURN(SIGNED); }

src/Parser/parser.yy

-              rae2c27a
+              rc76bd34
 // Created On       : Sat Sep  1 20:22:55 2001
 // Last Modified By : Peter A. Buhr
 // Last Modified On : Thu May 28 12:11:45 2020
 // Update Count     : 4500
+// Last Modified On : Tue Oct  6 18:24:18 2020
+// Update Count     : 4610
 //
 …
 %token OTYPE FTYPE DTYPE TTYPE TRAIT                                    // CFA
 %token SIZEOF OFFSETOF
 // %token RESUME                                                                        // CFA
 %token SUSPEND                                                                  // CFA
+// %token RESUME                                                                                        // CFA
+%token SUSPEND                                                                                  // CFA
 %token ATTRIBUTE EXTENSION                                                              // GCC
 %token IF ELSE SWITCH CASE DEFAULT DO WHILE FOR BREAK CONTINUE GOTO RETURN
 …
 %type<en> conditional_expression                constant_expression                     assignment_expression           assignment_expression_opt
 %type<en> comma_expression                              comma_expression_opt
 %type<en> argument_expression_list_opt          argument_expression                     default_initialize_opt
+%type<en> argument_expression_list_opt  argument_expression                     default_initialize_opt
 %type<ifctl> if_control_expression
 %type<fctl> for_control_expression              for_control_expression_list
 …
 %type<decl> assertion assertion_list assertion_list_opt
 %type<en>   bit_subrange_size_opt bit_subrange_size
+%type<en> bit_subrange_size_opt bit_subrange_size
 %type<decl> basic_declaration_specifier basic_type_name basic_type_specifier direct_type indirect_type
 …
         | '(' aggregate_control '&' ')' cast_expression         // CFA
                 { $$ = new ExpressionNode( build_keyword_cast( $2, $5 ) ); }
-                // VIRTUAL cannot be opt because of look ahead issues
         | '(' VIRTUAL ')' cast_expression                                       // CFA
                 { $$ = new ExpressionNode( new VirtualCastExpr( maybeMoveBuild< Expression >( $4 ), maybeMoveBuildType( nullptr ) ) ); }
 …
         | unary_expression assignment_operator assignment_expression
+                {
                         if ( $2 == OperKinds::AtAssn ) {
                                 SemanticError( yylloc, "C @= assignment is currently unimplemented." ); $$ = nullptr;
                         } else {
+//                      if ( $2 == OperKinds::AtAssn ) {
+//                              SemanticError( yylloc, "C @= assignment is currently unimplemented." ); $$ = nullptr;
+//                      } else {
                                 $$ = new ExpressionNode( build_binary_val( $2, $1, $3 ) );
                         } // if
+//                      } // if
+                }
         | unary_expression '=' '{' initializer_list_opt comma_opt '}'
 …
 typedef_expression:
                 // GCC, naming expression type: typedef name = exp; gives a name to the type of an expression
+                // deprecated GCC, naming expression type: typedef name = exp; gives a name to the type of an expression
         TYPEDEF identifier '=' assignment_expression
+                {
+                        // $$ = DeclarationNode::newName( 0 );                  // unimplemented
+                        SemanticError( yylloc, "Typedef expression is currently unimplemented." ); $$ = nullptr;
+                        SemanticError( yylloc, "Typedef expression is deprecated, use typeof(...) instead." ); $$ = nullptr;
+                }
         | typedef_expression pop ',' push identifier '=' assignment_expression
+                {
+                        // $$ = DeclarationNode::newName( 0 );                  // unimplemented
+                        SemanticError( yylloc, "Typedef expression is currently unimplemented." ); $$ = nullptr;
+                }
+        ;
+//c_declaration:
+//      declaring_list pop ';'
+//      | typedef_declaration pop ';'
+//      | typedef_expression pop ';'                                            // GCC, naming expression type
+//      | sue_declaration_specifier pop ';'
+//      ;
+//
+//declaring_list:
+//              // A semantic check is required to ensure asm_name only appears on declarations with implicit or explicit static
+//              // storage-class
+//       declarator asm_name_opt initializer_opt
+//              {
+//                      typedefTable.addToEnclosingScope( IDENTIFIER );
+//                      $$ = ( $2->addType( $1 ))->addAsmName( $3 )->addInitializer( $4 );
+//              }
+//      | declaring_list ',' attribute_list_opt declarator asm_name_opt initializer_opt
+//              {
+//                      typedefTable.addToEnclosingScope( IDENTIFIER );
+//                      $$ = $1->appendList( $1->cloneBaseType( $4->addAsmName( $5 )->addInitializer( $6 ) ) );
+//              }
+//      ;
+                        SemanticError( yylloc, "Typedef expression is deprecated, use typeof(...) instead." ); $$ = nullptr;
+                }
+        ;
 c_declaration:
 …
                 { $$ = distAttr( $1, $2 ); }
         | typedef_declaration
         | typedef_expression                                                            // GCC, naming expression type
+        | typedef_expression                                                            // deprecated GCC, naming expression type
         | sue_declaration_specifier
+        ;
 …
                 { yyy = true; $$ = AggregateDecl::Union; }
         | EXCEPTION                                                                                     // CFA
+                { yyy = true; $$ = AggregateDecl::Exception; }
+                // { yyy = true; $$ = AggregateDecl::Exception; }
+                { SemanticError( yylloc, "exception aggregate is currently unimplemented." ); $$ = AggregateDecl::NoAggregate; }
+        ;

src/ResolvExpr/CandidateFinder.cpp

-              rae2c27a
+              rc76bd34
                         // mark conversion cost and also specialization cost of param type
                         const ast::Type * paramType = (*param)->get_type();
+                        // const ast::Type * paramType = (*param)->get_type();
                         cand->expr = ast::mutate_field_index(
                                 appExpr, &ast::ApplicationExpr::args, i,
                                 computeExpressionConversionCost(
                                         args[i], paramType, symtab, cand->env, convCost ) );
                         convCost.decSpec( specCost( paramType ) );
+                                        args[i], *param, symtab, cand->env, convCost ) );
+                        convCost.decSpec( specCost( *param ) );
                         ++param;  // can't be in for-loop update because of the continue
+                }
 …
                         if ( targetType && ! targetType->isVoid() && ! funcType->returns.empty() ) {
                                 // attempt to narrow based on expected target type
                                 const ast::Type * returnType = funcType->returns.front()->get_type();
+                                const ast::Type * returnType = funcType->returns.front();
                                 if ( ! unify(
                                         returnType, targetType, funcEnv, funcNeed, funcHave, funcOpen, symtab )
 …
                         std::size_t genStart = 0;
+                        for ( const ast::DeclWithType * param : funcType->params ) {
+                                auto obj = strict_dynamic_cast< const ast::ObjectDecl * >( param );
+                        // xxx - how to handle default arg after change to ftype representation?
+                        if (const ast::VariableExpr * varExpr = func->expr.as<ast::VariableExpr>()) {
+                                if (const ast::FunctionDecl * funcDecl = varExpr->var.as<ast::FunctionDecl>()) {
+                                        // function may have default args only if directly calling by name
+                                        // must use types on candidate however, due to RenameVars substitution
+                                        auto nParams = funcType->params.size();
+                                        for (size_t i=0; i<nParams; ++i) {
+                                                auto obj = funcDecl->params[i].strict_as<ast::ObjectDecl>();
+                                                if (!instantiateArgument(
+                                                        funcType->params[i], obj->init, args, results, genStart, symtab)) return;
+                                        }
+                                        goto endMatch;
+                                }
+                        }
+                        for ( const auto & param : funcType->params ) {
                                 // Try adding the arguments corresponding to the current parameter to the existing
                                 // matches
+                                // no default args for indirect calls
                                 if ( ! instantiateArgument(
+                                        obj->type, obj->init, args, results, genStart, symtab ) ) return;
+                        }
+                                        param, nullptr, args, results, genStart, symtab ) ) return;
+                        }
+                        endMatch:
                         if ( funcType->isVarArgs ) {
                                 // append any unused arguments to vararg pack
 …
                 /// Adds aggregate member interpretations
                 void addAggMembers(
-                        const ast::ReferenceToType * aggrInst, const ast::Expr * expr,
+                        const ast::BaseInstType * aggrInst, const ast::Expr * expr,
                         const Candidate & cand, const Cost & addedCost, const std::string & name
                 ) {
 …
                 void postvisit( const ast::UntypedOffsetofExpr * offsetofExpr ) {
-                        const ast::ReferenceToType * aggInst;
+                        const ast::BaseInstType * aggInst;
                         if (( aggInst = offsetofExpr->type.as< ast::StructInstType >() )) ;
                         else if (( aggInst = offsetofExpr->type.as< ast::UnionInstType >() )) ;

src/ResolvExpr/ConversionCost.cc

rae2c27a	rc76bd34
520	520	return convertToReferenceCost( src, refType, srcIsLvalue, symtab, env, localPtrsAssignable );
521	521	} else {
522		ast::Pass<ConversionCost_new> converter( dst, srcIsLvalue, symtab, env, localConversionCost );
523		src->accept( converter );
524		return converter.core.cost;
	522	return ast::Pass<ConversionCost_new>::read( src, dst, srcIsLvalue, symtab, env, localConversionCost );
525	523	}
526	524	}
…	…
563	561	}
564	562	} else {
565		ast::Pass<ConversionCost_new> converter( dst, srcIsLvalue, symtab, env, localConversionCost );
566		src->accept( converter );
567		return converter.core.cost;
	563	return ast::Pass<ConversionCost_new>::read( src, dst, srcIsLvalue, symtab, env, localConversionCost );
568	564	}
569	565	} else {

src/ResolvExpr/ConversionCost.h

rae2c27a	rc76bd34
88	88	static size_t traceId;
89	89	Cost cost;
	90	Cost result() { return cost; }
90	91
91	92	ConversionCost_new( const ast::Type * dst, bool srcIsLvalue, const ast::SymbolTable & symtab,
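
The two ConversionCost hunks above replace the three-step sequence (construct the pass, call `accept`, read `core.cost`) with a single `ast::Pass<...>::read(...)` call, and the header gains a `result()` accessor for it to return. A minimal sketch of that pattern follows, assuming a drastically simplified pass wrapper. The `Pass`, `Leaf`, and `ScaledCost` types here are hypothetical stand-ins, not the CFA `ast::Pass` implementation.

```cpp
#include <utility>

// Hypothetical, simplified stand-in for a pass wrapper around a core object.
template<typename core_t>
struct Pass {
	core_t core;

	template<typename... Args>
	explicit Pass( Args &&... args ) : core( std::forward<Args>( args )... ) {}

	// Build the pass, run it over the node, and hand back core.result().
	template<typename node_t, typename... Args>
	static auto read( const node_t * node, Args &&... args ) {
		Pass<core_t> visitor( std::forward<Args>( args )... );
		node->accept( visitor );
		return visitor.core.result();
	}
};

// Hypothetical node type with the accept hook that read() relies on.
struct Leaf {
	int value;
	template<typename pass_t>
	void accept( pass_t & pass ) const { pass.core.visit( this ); }
};

// Hypothetical core: computes a value and exposes it through result().
struct ScaledCost {
	int factor;
	int cost = 0;
	explicit ScaledCost( int f ) : factor( f ) {}
	void visit( const Leaf * leaf ) { cost = leaf->value * factor; }
	int result() const { return cost; }
};
```

A caller would then write `Pass<ScaledCost>::read( &leaf, 3 )` instead of naming a local pass object, so the core's internal field no longer has to be read directly at each call site.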

src/ResolvExpr/CurrentObject.cc

-              rae2c27a
+              rc76bd34
         class SimpleIterator final : public MemberIterator {
                 CodeLocation location;
                 readonly< Type > type = nullptr;
+                const Type * type = nullptr;
         public:
                 SimpleIterator( const CodeLocation & loc, const Type * t ) : location( loc ), type( t ) {}
 …
         class ArrayIterator final : public MemberIterator {
                 CodeLocation location;
                 readonly< ArrayType > array = nullptr;
                 readonly< Type > base = nullptr;
+                const ArrayType * array = nullptr;
+                const Type * base = nullptr;
                 size_t index = 0;
                 size_t size = 0;
 …
         MemberIterator * createMemberIterator( const CodeLocation & loc, const Type * type ) {
-                if ( auto aggr = dynamic_cast< const ReferenceToType * >( type ) ) {
+                if ( auto aggr = dynamic_cast< const BaseInstType * >( type ) ) {
                         if ( auto sit = dynamic_cast< const StructInstType * >( aggr ) ) {
                                 return new StructIterator{ loc, sit };
 …
                                         dynamic_cast< const EnumInstType * >( type )
                                                 || dynamic_cast< const TypeInstType * >( type ),
-                                        "Encountered unhandled ReferenceToType in createMemberIterator: %s",
+                                        "Encountered unhandled BaseInstType in createMemberIterator: %s",
                                                 toString( type ).c_str() );
                                 return new SimpleIterator{ loc, type };
 …
                                         DesignatorChain & d = *dit;
                                         PRINT( std::cerr << "____actual: " << t << std::endl; )
-                                        if ( auto refType = dynamic_cast< const ReferenceToType * >( t ) ) {
+                                        if ( auto refType = dynamic_cast< const BaseInstType * >( t ) ) {
                                                 // concatenate identical field names
                                                 for ( const Decl * mem : refType->lookup( nexpr->name ) ) {

src/ResolvExpr/Resolver.cc

-              rae2c27a
+              rc76bd34
 #include "Common/PassVisitor.h"          // for PassVisitor
 #include "Common/SemanticError.h"        // for SemanticError
+#include "Common/Stats/ResolveTime.h"    // for ResolveTime::start(), ResolveTime::stop()
 #include "Common/utility.h"              // for ValueGuard, group_iterate
 #include "InitTweak/GenInit.h"
 …
                 /// Finds deleted expressions in an expression tree
                 struct DeleteFinder_new final : public ast::WithShortCircuiting {
                         const ast::DeletedExpr * delExpr = nullptr;
+                        const ast::DeletedExpr * result = nullptr;
                         void previsit( const ast::DeletedExpr * expr ) {
                                 if ( delExpr ) { visit_children = false; }
                                 else { delExpr = expr; }
+                                if ( result ) { visit_children = false; }
+                                else { result = expr; }
+                        }
                         void previsit( const ast::Expr * ) {
                                 if ( delExpr ) { visit_children = false; }
+                                if ( result ) { visit_children = false; }
+                        }
                 };
 …
         /// Check if this expression is or includes a deleted expression
         const ast::DeletedExpr * findDeletedExpr( const ast::Expr * expr ) {
+                ast::Pass<DeleteFinder_new> finder;
+                expr->accept( finder );
+                return finder.core.delExpr;
+                return ast::Pass<DeleteFinder_new>::read( expr );
+        }
 …
                         const ast::Expr * untyped, const ast::SymbolTable & symtab
                 ) {
+                        return findKindExpression( untyped, symtab );
+                        Stats::ResolveTime::start( untyped );
+                        auto res = findKindExpression( untyped, symtab );
+                        Stats::ResolveTime::stop();
+                        return res;
+                }
         } // anonymous namespace
 …
                 template<typename Iter>
                 inline bool nextMutex( Iter & it, const Iter & end ) {
                         while ( it != end && ! (*it)->get_type()->is_mutex() ) { ++it; }
+                        while ( it != end && ! (*it)->is_mutex() ) { ++it; }
                         return it != end;
+                }
 …
                 const ast::ThrowStmt *       previsit( const ast::ThrowStmt * );
                 const ast::CatchStmt *       previsit( const ast::CatchStmt * );
+                const ast::CatchStmt *       postvisit( const ast::CatchStmt * );
                 const ast::WaitForStmt *     previsit( const ast::WaitForStmt * );
 …
         const ast::CatchStmt * Resolver_new::previsit( const ast::CatchStmt * catchStmt ) {
+                // TODO: This will need a fix for the decl/cond scoping problem.
+                // Until we are very sure this invariant (ifs that move between passes have thenPart)
+                // holds, check it. This allows a check for when to decode the mangling.
+                if ( auto ifStmt = catchStmt->body.as<ast::IfStmt>() ) {
+                        assert( ifStmt->thenPart );
+                }
+                // Encode the catchStmt so the condition can see the declaration.
                 if ( catchStmt->cond ) {
+                        ast::ptr< ast::Type > boolType = new ast::BasicType{ ast::BasicType::Bool };
+                        catchStmt = ast::mutate_field(
+                                catchStmt, &ast::CatchStmt::cond,
+                                findSingleExpression( catchStmt->cond, boolType, symtab ) );
+                        ast::CatchStmt * stmt = mutate( catchStmt );
+                        stmt->body = new ast::IfStmt( stmt->location, stmt->cond, nullptr, stmt->body );
+                        stmt->cond = nullptr;
+                        return stmt;
+                }
+                return catchStmt;
+        }
+        const ast::CatchStmt * Resolver_new::postvisit( const ast::CatchStmt * catchStmt ) {
+                // Decode the catchStmt so everything is stored properly.
+                const ast::IfStmt * ifStmt = catchStmt->body.as<ast::IfStmt>();
+                if ( nullptr != ifStmt && nullptr == ifStmt->thenPart ) {
+                        assert( ifStmt->cond );
+                        assert( ifStmt->elsePart );
+                        ast::CatchStmt * stmt = ast::mutate( catchStmt );
+                        stmt->cond = ifStmt->cond;
+                        stmt->body = ifStmt->elsePart;
+                        // ifStmt should be implicitly deleted here.
+                        return stmt;
+                }
                 return catchStmt;
 …
                                                                 // Check if the argument matches the parameter type in the current
                                                                 // scope
                                                                 ast::ptr< ast::Type > paramType = (*param)->get_type();
+                                                                // ast::ptr< ast::Type > paramType = (*param)->get_type();
                                                                 if (
                                                                         ! unify(
                                                                                 arg->expr->result, paramType, resultEnv, need, have, open,
+                                                                                arg->expr->result, *param, resultEnv, need, have, open,
                                                                                 symtab )
                                                                 ) {
 …
                                                                         ss << "candidate function not viable: no known conversion "
                                                                                 "from '";
                                                                         ast::print( ss, (*param)->get_type() );
+                                                                        ast::print( ss, *param );
                                                                         ss << "' to '";
                                                                         ast::print( ss, arg->expr->result );
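
The CatchStmt previsit/postvisit pair above temporarily folds the catch condition into the body as an IfStmt with a null thenPart, then unfolds it afterwards. A minimal sketch of that encode/decode round trip follows. The Stmt, Block, IfStmt and CatchStmt structs and the encode/decode names are hypothetical stand-ins, not the real ast:: classes, and the condition is kept as a string for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the AST nodes; not the real ast:: classes.
struct Stmt {
    virtual ~Stmt() = default;
};
struct Block : Stmt {
    std::string text;
    explicit Block(std::string t) : text(std::move(t)) {}
};
struct IfStmt : Stmt {
    std::string cond;                 // condition, kept as text for brevity
    std::unique_ptr<Stmt> thenPart;   // left null to mark an encoded catch
    std::unique_ptr<Stmt> elsePart;   // holds the original catch body
};
struct CatchStmt {
    std::string cond;                 // empty means "no condition"
    std::unique_ptr<Stmt> body;
};

// Encode: fold the condition into the body so later visits see both together.
void encode(CatchStmt& c) {
    if (c.cond.empty()) return;
    auto wrapper = std::make_unique<IfStmt>();
    wrapper->cond = std::move(c.cond);
    wrapper->elsePart = std::move(c.body);
    c.body = std::move(wrapper);
    c.cond.clear();
}

// Decode: an IfStmt body with no thenPart is the marker left by encode().
void decode(CatchStmt& c) {
    auto* wrapper = dynamic_cast<IfStmt*>(c.body.get());
    if (wrapper == nullptr || wrapper->thenPart != nullptr) return;
    assert(!wrapper->cond.empty());
    assert(wrapper->elsePart != nullptr);
    c.cond = std::move(wrapper->cond);
    // Take the body out first; assigning to c.body destroys the wrapper.
    std::unique_ptr<Stmt> original = std::move(wrapper->elsePart);
    c.body = std::move(original);
}
```

The null thenPart works as a marker only while the invariant checked in previsit holds: an ordinary IfStmt that is a catch body must always have a thenPart, otherwise decode() would mistake it for an encoded condition.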

src/ResolvExpr/SatisfyAssertions.cpp

-              rae2c27a
+              rc76bd34
                                         if ( ! func ) continue;
                                         for ( const ast::DeclWithType * param : func->params ) {
                                                 cost.decSpec( specCost( param->get_type() ) );
+                                        for ( const auto & param : func->params ) {
+                                                cost.decSpec( specCost( param ) );
+                                        }

src/ResolvExpr/SpecCost.cc

-              rae2c27a
+              rc76bd34
                 void previsit( const ast::FunctionType * fty ) {
                         int minCount = std::numeric_limits<int>::max();
                         updateMinimumPresent( minCount, fty->params, decl_type );
                         updateMinimumPresent( minCount, fty->returns, decl_type );
+                        updateMinimumPresent( minCount, fty->params, type_deref );
+                        updateMinimumPresent( minCount, fty->returns, type_deref );
                         // Add another level to minCount if set.
                         count = toNoneOrInc( minCount );

src/ResolvExpr/Unify.cc

-              rae2c27a
+              rc76bd34
         template< typename Iterator1, typename Iterator2 >
         bool unifyDeclList( Iterator1 list1Begin, Iterator1 list1End, Iterator2 list2Begin, Iterator2 list2End, TypeEnvironment &env, AssertionSet &needAssertions, AssertionSet &haveAssertions, const OpenVarSet &openVars, const SymTab::Indexer &indexer ) {
+        bool unifyTypeList( Iterator1 list1Begin, Iterator1 list1End, Iterator2 list2Begin, Iterator2 list2End, TypeEnvironment &env, AssertionSet &needAssertions, AssertionSet &haveAssertions, const OpenVarSet &openVars, const SymTab::Indexer &indexer ) {
                 auto get_type = [](DeclarationWithType * dwt){ return dwt->get_type(); };
                 for ( ; list1Begin != list1End && list2Begin != list2End; ++list1Begin, ++list2Begin ) {
 …
                                         || flatOther->isTtype()
                         ) {
                                 if ( unifyDeclList( flatFunc->parameters.begin(), flatFunc->parameters.end(), flatOther->parameters.begin(), flatOther->parameters.end(), env, needAssertions, haveAssertions, openVars, indexer ) ) {
                                         if ( unifyDeclList( flatFunc->returnVals.begin(), flatFunc->returnVals.end(), flatOther->returnVals.begin(), flatOther->returnVals.end(), env, needAssertions, haveAssertions, openVars, indexer ) ) {
+                                if ( unifyTypeList( flatFunc->parameters.begin(), flatFunc->parameters.end(), flatOther->parameters.begin(), flatOther->parameters.end(), env, needAssertions, haveAssertions, openVars, indexer ) ) {
+                                        if ( unifyTypeList( flatFunc->returnVals.begin(), flatFunc->returnVals.end(), flatOther->returnVals.begin(), flatOther->returnVals.end(), env, needAssertions, haveAssertions, openVars, indexer ) ) {
                                                 // the original types must be used in mark assertions, since pointer comparisons are used
 …
                 /// returns flattened version of `src`
                 static std::vector< ast::ptr< ast::DeclWithType > > flattenList(
                         const std::vector< ast::ptr< ast::DeclWithType > > & src, ast::TypeEnvironment & env
+                static std::vector< ast::ptr< ast::Type > > flattenList(
+                        const std::vector< ast::ptr< ast::Type > > & src, ast::TypeEnvironment & env
                 ) {
                         std::vector< ast::ptr< ast::DeclWithType > > dst;
+                        std::vector< ast::ptr< ast::Type > > dst;
                         dst.reserve( src.size() );
                         for ( const ast::DeclWithType * d : src ) {
+                        for ( const auto & d : src ) {
                                 ast::Pass<TtypeExpander_new> expander{ env };
                                 // TtypeExpander pass is impure (may mutate nodes in place)
                                 // need to make nodes shared to prevent accidental mutation
                                 ast::ptr<ast::DeclWithType> dc = d->accept(expander);
                                 auto types = flatten( dc->get_type() );
+                                ast::ptr<ast::Type> dc = d->accept(expander);
+                                auto types = flatten( dc );
                                 for ( ast::ptr< ast::Type > & t : types ) {
                                         // outermost const, volatile, _Atomic qualifiers in parameters should not play
 …
                                         // requirements than a non-mutex function
                                         remove_qualifiers( t, ast::CV::Const | ast::CV::Volatile | ast::CV::Atomic );
                                         dst.emplace_back( new ast::ObjectDecl{ dc->location, "", t } );
+                                        dst.emplace_back( t );
+                                }
+                        }
 …
                 /// Creates a tuple type based on a list of DeclWithType
                 template< typename Iter >
                 static ast::ptr< ast::Type > tupleFromDecls( Iter crnt, Iter end ) {
+                static ast::ptr< ast::Type > tupleFromTypes( Iter crnt, Iter end ) {
                         std::vector< ast::ptr< ast::Type > > types;
                         while ( crnt != end ) {
                                 // it is guaranteed that a ttype variable will be bound to a flat tuple, so ensure
                                 // that this results in a flat tuple
                                 flatten( (*crnt)->get_type(), types );
+                                flatten( *crnt, types );
                                 ++crnt;
 …
                 template< typename Iter >
                 static bool unifyDeclList(
+                static bool unifyTypeList(
                         Iter crnt1, Iter end1, Iter crnt2, Iter end2, ast::TypeEnvironment & env,
                         ast::AssertionSet & need, ast::AssertionSet & have, const ast::OpenVarSet & open,
 …
                 ) {
                         while ( crnt1 != end1 && crnt2 != end2 ) {
                                 const ast::Type * t1 = (*crnt1)->get_type();
                                 const ast::Type * t2 = (*crnt2)->get_type();
+                                const ast::Type * t1 = *crnt1;
+                                const ast::Type * t2 = *crnt2;
                                 bool isTuple1 = Tuples::isTtype( t1 );
                                 bool isTuple2 = Tuples::isTtype( t2 );
 …
                                         // combine remainder of list2, then unify
                                         return unifyExact(
                                                 t1, tupleFromDecls( crnt2, end2 ), env, need, have, open,
+                                                t1, tupleFromTypes( crnt2, end2 ), env, need, have, open,
                                                 noWiden(), symtab );
                                 } else if ( ! isTuple1 && isTuple2 ) {
                                         // combine remainder of list1, then unify
                                         return unifyExact(
                                                 tupleFromDecls( crnt1, end1 ), t2, env, need, have, open,
+                                                tupleFromTypes( crnt1, end1 ), t2, env, need, have, open,
                                                 noWiden(), symtab );
+                                }
 …
                         if ( crnt1 != end1 ) {
                                 // try unifying empty tuple with ttype
                                 const ast::Type * t1 = (*crnt1)->get_type();
+                                const ast::Type * t1 = *crnt1;
                                 if ( ! Tuples::isTtype( t1 ) ) return false;
                                 return unifyExact(
                                         t1, tupleFromDecls( crnt2, end2 ), env, need, have, open,
+                                        t1, tupleFromTypes( crnt2, end2 ), env, need, have, open,
                                         noWiden(), symtab );
                         } else if ( crnt2 != end2 ) {
                                 // try unifying empty tuple with ttype
                                 const ast::Type * t2 = (*crnt2)->get_type();
+                                const ast::Type * t2 = *crnt2;
                                 if ( ! Tuples::isTtype( t2 ) ) return false;
                                 return unifyExact(
                                         tupleFromDecls( crnt1, end1 ), t2, env, need, have, open,
+                                        tupleFromTypes( crnt1, end1 ), t2, env, need, have, open,
                                         noWiden(), symtab );
+                        }
 …
+                }
                 static bool unifyDeclList(
                         const std::vector< ast::ptr< ast::DeclWithType > > & list1,
                         const std::vector< ast::ptr< ast::DeclWithType > > & list2,
+                static bool unifyTypeList(
+                        const std::vector< ast::ptr< ast::Type > > & list1,
+                        const std::vector< ast::ptr< ast::Type > > & list2,
                         ast::TypeEnvironment & env, ast::AssertionSet & need, ast::AssertionSet & have,
                         const ast::OpenVarSet & open, const ast::SymbolTable & symtab
                 ) {
                         return unifyDeclList(
+                        return unifyTypeList(
                                 list1.begin(), list1.end(), list2.begin(), list2.end(), env, need, have, open,
                                 symtab );
 …
                         ) return;
                         if ( ! unifyDeclList( params, params2, tenv, need, have, open, symtab ) ) return;
                         if ( ! unifyDeclList(
+                        if ( ! unifyTypeList( params, params2, tenv, need, have, open, symtab ) ) return;
+                        if ( ! unifyTypeList(
                                 func->returns, func2->returns, tenv, need, have, open, symtab ) ) return;
 …
         ast::ptr<ast::Type> extractResultType( const ast::FunctionType * func ) {
                 if ( func->returns.empty() ) return new ast::VoidType{};
                 if ( func->returns.size() == 1 ) return func->returns[0]->get_type();
+                if ( func->returns.size() == 1 ) return func->returns[0];
                 std::vector<ast::ptr<ast::Type>> tys;
                 for ( const ast::DeclWithType * decl : func->returns ) {
                         tys.emplace_back( decl->get_type() );
+                for ( const auto & decl : func->returns ) {
+                        tys.emplace_back( decl );
+                }
                 return new ast::TupleType{ std::move(tys) };
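
The Unify.cc changes above switch the parameter and return lists from declarations to plain types, so helpers such as tupleFromTypes can pass each element straight to flatten without calling get_type(). A rough sketch of that flattening step follows, using a hypothetical miniature Type struct and leaf/tuple helpers in place of the real ast::Type hierarchy.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature type representation; not the real ast::Type hierarchy.
struct Type {
    std::string name;                                // used when the type is a leaf
    std::vector<std::shared_ptr<const Type>> parts;  // elements of a tuple type
    bool isTuple = false;
};
using TypePtr = std::shared_ptr<const Type>;

TypePtr leaf(std::string name) {
    auto t = std::make_shared<Type>();
    t->name = std::move(name);
    return t;
}

TypePtr tuple(std::vector<TypePtr> parts) {
    auto t = std::make_shared<Type>();
    t->isTuple = true;
    t->parts = std::move(parts);
    return t;
}

// Append the leaves of `t` to `out`, expanding nested tuples in order.
void flatten(const TypePtr& t, std::vector<TypePtr>& out) {
    if (t->isTuple) {
        for (const TypePtr& part : t->parts) flatten(part, out);
    } else {
        out.push_back(t);
    }
}

// Build one flat tuple from a range whose elements are already types.
template <typename Iter>
TypePtr tupleFromTypes(Iter crnt, Iter end) {
    std::vector<TypePtr> types;
    for (; crnt != end; ++crnt) flatten(*crnt, types);
    return tuple(std::move(types));
}
```

Given the list int, [char, [double]], bool, this sketch yields the flat tuple [int, char, double, bool], which mirrors the comment in the diff that a ttype variable is bound to a flat tuple.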

src/SymTab/Mangler.cc

-              rae2c27a
+              rc76bd34
                   private:
                         void mangleDecl( const ast::DeclWithType *declaration );
                         void mangleRef( const ast::ReferenceToType *refType, std::string prefix );
+                        void mangleRef( const ast::BaseInstType *refType, std::string prefix );
                         void printQualifiers( const ast::Type *type );
 …
                         GuardValue( inFunctionType );
                         inFunctionType = true;
+                        std::vector< ast::ptr< ast::Type > > returnTypes = getTypes( functionType->returns );
+                        if (returnTypes.empty()) mangleName << Encoding::void_t;
+                        else accept_each( returnTypes, *visitor );
+                        if (functionType->returns.empty()) mangleName << Encoding::void_t;
+                        else accept_each( functionType->returns, *visitor );
                         mangleName << "_";
+                        std::vector< ast::ptr< ast::Type > > paramTypes = getTypes( functionType->params );
+                        accept_each( paramTypes, *visitor );
+                        accept_each( functionType->params, *visitor );
                         mangleName << "_";
+                }
                 void Mangler_new::mangleRef( const ast::ReferenceToType * refType, std::string prefix ) {
+                void Mangler_new::mangleRef( const ast::BaseInstType * refType, std::string prefix ) {
                         printQualifiers( refType );

src/SymTab/Validate.cc

-              rae2c27a
+              rc76bd34
+        }
+        static bool isNonParameterAttribute( Attribute * attr ) {
+                static const std::vector<std::string> bad_names = {
+                        "aligned", "__aligned__",
+                };
+                for ( auto name : bad_names ) {
+                        if ( name == attr->name ) {
+                                return true;
+                        }
+                }
+                return false;
+        }
         Type * ReplaceTypedef::postmutate( TypeInstType * typeInst ) {
                 // instances of typedef types will come here. If it is an instance
 …
                         ret->location = typeInst->location;
                         ret->get_qualifiers() |= typeInst->get_qualifiers();
+                        // attributes are not carried over from typedef to function parameters/return values
+                        if ( ! inFunctionType ) {
+                                ret->attributes.splice( ret->attributes.end(), typeInst->attributes );
+                        } else {
+                                deleteAll( ret->attributes );
+                                ret->attributes.clear();
+                        }
+                        // GCC ignores certain attributes if they arrive by typedef; this mimics that.
+                        if ( inFunctionType ) {
+                                ret->attributes.remove_if( isNonParameterAttribute );
+                        }
+                        ret->attributes.splice( ret->attributes.end(), typeInst->attributes );
                         // place instance parameters on the typedef'd type
                         if ( ! typeInst->parameters.empty() ) {
 …
         /// Replaces enum types by int, and function/array types in function parameter and return
         /// lists by appropriate pointers
+        /*
         struct EnumAndPointerDecay_new {
                 const ast::EnumDecl * previsit( const ast::EnumDecl * enumDecl ) {
 …
+                }
         };
+        */
         /// expand assertions from a trait instance, performing appropriate type variable substitutions
 …
+                }
                 void checkGenericParameters( const ast::ReferenceToType * inst ) {
+                void checkGenericParameters( const ast::BaseInstType * inst ) {
                         for ( const ast::Expr * param : inst->params ) {
                                 if ( ! dynamic_cast< const ast::TypeExpr * >( param ) ) {
 …
 const ast::Type * validateType(
                 const CodeLocation & loc, const ast::Type * type, const ast::SymbolTable & symtab ) {
         ast::Pass< EnumAndPointerDecay_new > epc;
+        // ast::Pass< EnumAndPointerDecay_new > epc;
         ast::Pass< LinkReferenceToTypes_new > lrt{ loc, symtab };
         ast::Pass< ForallPointerDecay_new > fpd{ loc };
         return type->accept( epc )->accept( lrt )->accept( fpd );
+        return type->accept( lrt )->accept( fpd );
+}
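
The Validate.cc hunk above filters typedef-supplied attributes with a name predicate and std::list::remove_if when the typedef is used inside a function type. The sketch below isolates that filtering step under simplifying assumptions: Attribute is a hypothetical value type (the diff stores Attribute pointers, so ownership is not modelled here), and filterTypedefAttributes is an illustrative wrapper name, not a function from the changeset.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for the compiler's Attribute node.
struct Attribute {
    std::string name;
};

// True for attribute names that should not reach a parameter via a typedef.
bool isNonParameterAttribute(const Attribute& attr) {
    static const std::vector<std::string> bad_names = { "aligned", "__aligned__" };
    return std::find(bad_names.begin(), bad_names.end(), attr.name) != bad_names.end();
}

// Drop the filtered attributes, but only when inside a function type.
void filterTypedefAttributes(std::list<Attribute>& attributes, bool inFunctionType) {
    if (inFunctionType) {
        attributes.remove_if(isNonParameterAttribute);
    }
}
```

Keeping the name list in one static table means further attribute names can be added without touching the call site.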

src/Virtual/module.mk

-              rae2c27a
+              rc76bd34
 ###############################################################################
+SRC += Virtual/ExpandCasts.cc Virtual/ExpandCasts.h
+SRC += Virtual/ExpandCasts.cc Virtual/ExpandCasts.h \
+        Virtual/Tables.cc Virtual/Tables.h
+SRCDEMANGLE += Virtual/Tables.cc

tests/.expect/array.txt

rae2c27a	rc76bd34
	1	array.cfa: In function '_X4mainFi___1':
	2	array.cfa:55:9: note: #pragma message: Compiled

tests/.expect/cast.txt

rae2c27a	rc76bd34
	1	cast.cfa: In function '_X4mainFi_iPPKc__1':
	2	cast.cfa:18:9: note: #pragma message: Compiled

tests/.expect/enum.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/expression.txt

rae2c27a	rc76bd34
	1	expression.cfa: In function '_X4mainFi___1':
	2	expression.cfa:89:9: note: #pragma message: Compiled

tests/.expect/forall.txt

rae2c27a	rc76bd34
	1	forall.cfa: In function '_X4mainFi___1':
	2	forall.cfa:218:9: note: #pragma message: Compiled

tests/.expect/heap.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/identFuncDeclarator.txt

rae2c27a	rc76bd34
	1	identFuncDeclarator.cfa: In function '_X4mainFi___1':
	2	identFuncDeclarator.cfa:116:9: note: #pragma message: Compiled

tests/.expect/identParamDeclarator.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/labelledExit.txt

rae2c27a	rc76bd34
	1	labelledExit.cfa: In function '_X4mainFi_iPPKc__1':
	2	labelledExit.cfa:183:9: note: #pragma message: Compiled

tests/.expect/limits.txt

rae2c27a	rc76bd34
	1	limits.cfa: In function '_X4mainFi_iPPKc__1':
	2	limits.cfa:151:9: note: #pragma message: Compiled

tests/.expect/maybe.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/nested-types.txt

rae2c27a	rc76bd34
	1	nested-types.cfa: In function '_X4mainFi___1':
	2	nested-types.cfa:102:9: note: #pragma message: Compiled

tests/.expect/numericConstants.txt

rae2c27a	rc76bd34
	1	numericConstants.cfa: In function '_X4mainFi___1':
	2	numericConstants.cfa:68:9: note: #pragma message: Compiled

tests/.expect/operators.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/result.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/stdincludes.txt

rae2c27a	rc76bd34
	1	stdincludes.cfa: In function '_X4mainFi___1':
	2	stdincludes.cfa:52:9: note: #pragma message: Compiled

tests/.expect/switch.txt

rae2c27a	rc76bd34
	1	switch.cfa: In function '_X4mainFi___1':
	2	switch.cfa:105:9: note: #pragma message: Compiled

tests/.expect/typedefRedef-ERR1.txt

rae2c27a	rc76bd34
1	1	typedefRedef.cfa:4:1 error: Cannot redefine typedef: Foo
2		typedefRedef.cfa:60:1 error: Cannot redefine typedef: ARR
	2	typedefRedef.cfa:59:1 error: Cannot redefine typedef: ARR

tests/.expect/typedefRedef.txt

rae2c27a	rc76bd34
	1	typedefRedef.cfa: In function '_X4mainFi___1':
	2	typedefRedef.cfa:71:9: note: #pragma message: Compiled

tests/.expect/typeof.txt

rae2c27a	rc76bd34
	1	done

tests/.expect/variableDeclarator.txt

rae2c27a	rc76bd34
	1	variableDeclarator.cfa: In function '_X4mainFi_iPPKc__1':
	2	variableDeclarator.cfa:182:9: note: #pragma message: Compiled

tests/.expect/voidPtr.txt

rae2c27a	rc76bd34
	1	done

tests/Makefile.am

-              rae2c27a
+              rc76bd34
 ## Created On       : Sun May 31 09:08:15 2015
 ## Last Modified By : Peter A. Buhr
 ## Last Modified On : Tue Nov 20 11:18:51 2018
 ## Update Count     : 68
+## Last Modified On : Sun Sep 27 19:01:41 2020
+## Update Count     : 84
 ###############################################################################
 …
 # since automake doesn't have support for CFA we have to
 AM_CFLAGS = $(if $(test), 2> $(test), ) \
+        -fdebug-prefix-map=$(abspath ${abs_srcdir})= \
+        -fdebug-prefix-map=/tmp= \
+        -fno-diagnostics-show-caret \
         -g \
         -Wall \
 …
 # adjust CC to current flags
 CC = $(if $(DISTCC_CFA_PATH),distcc $(DISTCC_CFA_PATH) ${ARCH_FLAGS},$(TARGET_CFA) ${DEBUG_FLAGS} ${ARCH_FLAGS})
+CC = LC_ALL=C $(if $(DISTCC_CFA_PATH),distcc $(DISTCC_CFA_PATH) ${ARCH_FLAGS},$(TARGET_CFA) ${DEBUG_FLAGS} ${ARCH_FLAGS})
 CFACC = $(CC)
 …
 # adjusted CC but without the actual distcc call
 CFACCLOCAL = $(if $(DISTCC_CFA_PATH),$(DISTCC_CFA_PATH) ${ARCH_FLAGS},$(TARGET_CFA) ${DEBUG_FLAGS} ${ARCH_FLAGS})
+CFACCLINK = $(CFACCLOCAL) -quiet $(if $(test), 2> $(test), ) $($(shell echo "${@}_FLAGSLD" | sed 's/-\|\//_/g'))
 PRETTY_PATH=mkdir -p $(dir $(abspath ${@})) && cd ${srcdir} &&
 …
 % : %.cfa $(CFACCBIN)
         $(CFACOMPILETEST) -c -o $(abspath ${@}).o
+        $(CFACCLOCAL) $($(shell echo "${@}_FLAGSLD" | sed 's/-\|\//_/g')) $(abspath ${@}).o -o $(abspath ${@})
+        $(CFACCLINK) ${@}.o -o $(abspath ${@})
+        rm $(abspath ${@}).o
 # implicit rule for c++ test
 …
         $(CFACOMPILETEST) -CFA -XCFA -p -c -fsyntax-only -o $(abspath ${@})
-# Use for tests where the make command is expected to succeed but the expected.txt should be compared to stderr
-EXPECT_STDERR = builtins/sync warnings/self-assignment
-$(EXPECT_STDERR): % : %.cfa $(CFACCBIN)
-        $(CFACOMPILETEST) -c -fsyntax-only 2> $(abspath ${@})
 #------------------------------------------------------------------------------
 # CUSTOM TARGET
 #------------------------------------------------------------------------------
+# tests that just validate syntax and compiler output should be compared to stderr
+CFACOMPILE_SYNTAX = $(CFACOMPILETEST) -Wno-unused-variable -Wno-unused-label -c -fsyntax-only -o $(abspath ${@})
+SYNTAX_ONLY_CODE = expression typedefRedef variableDeclarator switch numericConstants identFuncDeclarator forall \
+        limits nested-types stdincludes cast labelledExit array builtins/sync warnings/self-assignment
+$(SYNTAX_ONLY_CODE): % : %.cfa $(CFACCBIN)
+        $(CFACOMPILE_SYNTAX)
+        $(if $(test), cp $(test) $(abspath ${@}), )
 # expected failures
 # use custom target since they require a custom define and custom dependencies
+# use custom target since they require a custom define *and* have a name that doesn't match the file
 alloc-ERROR : alloc.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR1 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR1
+        -cp $(test) $(abspath ${@})
 typedefRedef-ERR1 : typedefRedef.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR1 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR1
+        -cp $(test) $(abspath ${@})
 nested-types-ERR1 : nested-types.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR1 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR1
+        -cp $(test) $(abspath ${@})
 nested-types-ERR2 : nested-types.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR2 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR2
+        -cp $(test) $(abspath ${@})
 raii/memberCtors-ERR1 : raii/memberCtors.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR1 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR1
+        -cp $(test) $(abspath ${@})
 raii/ctor-autogen-ERR1 : raii/ctor-autogen.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR1 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR1
+        -cp $(test) $(abspath ${@})
 raii/dtor-early-exit-ERR1 : raii/dtor-early-exit.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR1 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR1
+        -cp $(test) $(abspath ${@})
 raii/dtor-early-exit-ERR2 : raii/dtor-early-exit.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -DERR2 -c -fsyntax-only -o $(abspath ${@})
+        $(CFACOMPILE_SYNTAX) -DERR2
+        -cp $(test) $(abspath ${@})
 # Exception Tests
 …
         $(CFACCLOCAL) $($(shell echo "${@}_FLAGSLD" | sed 's/-\|\//_/g')) $(abspath ${@}).o -o $(abspath ${@})
+# Linking tests
+# Meta tests to make sure we see linking errors (can't compile with -O2 since it may multiply number of calls)
+linking/linkerror : linking/linkerror.cfa $(CFACCBIN)
+        $(CFACOMPILETEST) -O0 -c -o $(abspath ${@}).o
+        $(CFACCLINK)  -O0 ${@}.o -o $(abspath ${@})
+        rm $(abspath ${@}).o
 #------------------------------------------------------------------------------
 # Other targets

tests/alloc2.cfa

-              rae2c27a
+              rc76bd34
 void test_base( void * ip, size_t size, size_t align) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = (malloc_size(ip) == size) && (malloc_usable_size(ip) >= size) && (malloc_alignment(ip) == align) && ((uintptr_t)ip % align  == 0);
         if (!passed) {
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_fill( void * ip_, size_t start, size_t end, char fill) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = true;
         char * ip = (char *) ip_;
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_fill( void * ip_, size_t start, size_t end, int fill) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = true;
         int * ip = (int *) ip_;
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_fill( void * ip_, size_t start, size_t end, int * fill) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = (memcmp((void*)((uintptr_t)ip_ + start), (void*)fill, end) == 0);
         if (!passed) {
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_fill( void * ip_, size_t start, size_t end, T1 fill) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = true;
         T1 * ip = (T1 *) ip_;
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_fill( void * ip_, size_t start, size_t end, T1 * fill) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = (memcmp((void*)((uintptr_t)ip_ + start), (void*)fill, end) == 0);
         if (!passed) {
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_use( int * ip, size_t dim) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = true;
         for (i; 0 ~ dim) ip[i] = 0xdeadbeef;
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}
 void test_use( T1 * ip, size_t dim) {
         tests_total += 1;
+//      printf("DEBUG: starting test %d\n", tests_total);
         bool passed = true;
         for (i; 0 ~ dim) ip[i].data = 0xdeadbeef;
 …
                 tests_failed += 1;
+        }
+//      printf("DEBUG: done test %d\n", tests_total);
+}

tests/array.cfa

-              rae2c27a
+              rc76bd34
 //                               -*- Mode: C -*-
 //
+//                               -*- Mode: C -*-
+//
 // Cforall Version 1.0.0 Copyright (C) 2016 University of Waterloo
 //
 // The contents of this file are covered under the licence agreement in the
 // file "LICENCE" distributed with Cforall.
 //
+//
 // array.cfa -- test array declarations
 //
+//
 // Author           : Peter A. Buhr
 // Created On       : Tue Feb 19 21:18:06 2019
 // Last Modified By : Peter A. Buhr
 // Last Modified On : Tue Feb 19 21:18:46 2019
 // Update Count     : 1
 //
+// Last Modified On : Sun Sep 27 09:05:40 2020
+// Update Count     : 4
+//
 int a1[];
+int a1[0];
 //int a2[*];
 //double a4[3.0];
 int m1[][3];
+int m1[0][3];
 //int m2[*][*];
 int m4[3][3];
 …
+}
+int main() {}
+int main() {
+        #if !defined(NO_COMPILED_PRAGMA)
+                #pragma message( "Compiled" )   // force non-empty .expect file
+        #endif
+}
 // Local Variables: //

tests/builtins/.expect/sync.txt

rae2c27a	rc76bd34
	1	builtins/sync.cfa: In function '_X4mainFi___1':
	2	builtins/sync.cfa:358:9: note: #pragma message: Compiled

tests/builtins/sync.cfa

-              rae2c27a
+              rc76bd34
         #if defined(__SIZEOF_INT128__)
         { __int128 ret; ret = __sync_fetch_and_nand(vplll, vlll); }
-        { __int128 ret; ret = __sync_fetch_and_nand_16(vplll, vlll); }
         #endif
 …
 int main() {
         return 0;
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}

tests/cast.cfa

-              rae2c27a
+              rc76bd34
 //Dummy main
+int main(int argc, char const *argv[])
+{
+        return 0;
+int main( int argc, char const * argv[] ) {
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}

tests/concurrent/.expect/cluster.txt

rae2c27a	rc76bd34
	1	done

tests/concurrent/cluster.cfa

rae2c27a	rc76bd34
32	32	}
33	33	}
34		~~return 0;~~
	34	printf( "done\n" ); // non-empty .expect file
35	35	}

tests/concurrent/examples/.expect/datingService.txt

rae2c27a	rc76bd34
	1	done

tests/concurrent/examples/datingService.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Mon Oct 30 12:56:20 2017
 // Last Modified By : Peter A. Buhr
 // Last Modified On : Fri Jun 21 11:32:34 2019
 // Update Count     : 38
+// Last Modified On : Sun Sep 27 15:42:25 2020
+// Update Count     : 40
 //
 …
                 if ( girlck[ boyck[i] ] != boyck[ girlck[i] ] ) abort();
         } // for
+        printf( "done\n" );                                                                     // non-empty .expect file
 } // main

tests/concurrent/futures/.expect/basic.txt

rae2c27a	rc76bd34
	1	done

tests/concurrent/futures/basic.cfa

rae2c27a	rc76bd34
91	91	}
92	92	}
	93	printf( "done\n" ); // non-empty .expect file
	94
93	95	}

tests/concurrent/park/.expect/force_preempt.txt

rae2c27a	rc76bd34
	1	done

tests/concurrent/park/.expect/start_parked.txt

rae2c27a	rc76bd34
	1	done

tests/concurrent/park/contention.cfa

-              rae2c27a
+              rc76bd34
                 if(blocked[idx]) {
                         Thread * thrd = __atomic_exchange_n(&blocked[idx], 0p, __ATOMIC_SEQ_CST);
-                        unpark( *thrd __cfaabi_dbg_ctx2 );
+                        unpark( *thrd );
                 } else {
                         Thread * thrd = __atomic_exchange_n(&blocked[idx], &this, __ATOMIC_SEQ_CST);
-                        unpark( *thrd __cfaabi_dbg_ctx2 );
-                        park( __cfaabi_dbg_ctx );
+                        unpark( *thrd );
+                        park();
                 }
         }
 …
                         int idx = myrand() % blocked_size;
                         Thread * thrd = __atomic_exchange_n(&blocked[idx], 0p, __ATOMIC_SEQ_CST);
-                        unpark( *thrd __cfaabi_dbg_ctx2 );
+                        unpark( *thrd );
                         yield( myrand() % 20 );
                 }

tests/concurrent/park/force_preempt.cfa

-              rae2c27a
+              rc76bd34
                 // Unpark this thread, don't force a yield
-                unpark( this __cfaabi_dbg_ctx2 );
+                unpark( this );
                 assert(mask == 0xCAFEBABA);
 …
                 // Park this thread,
                 assert(mask == (id_hash ^ 0xCAFEBABA));
-                park( __cfaabi_dbg_ctx );
+                park();
                 assert(mask == (id_hash ^ 0xCAFEBABA));
 …
                 Waiter waiters[5];
         }
+        printf( "done\n" );                             // non-empty .expect file
 }

tests/concurrent/park/start_parked.cfa

-              rae2c27a
+              rc76bd34
 thread Parker {};
 void main( Parker & ) {
-      park( __cfaabi_dbg_ctx );
+        park();
 }
 int main() {
-      for(1000) {
-            Parker parker;
-            unpark( parker __cfaabi_dbg_ctx2 );
-      }
+        for(1000) {
+                Parker parker;
+                unpark( parker );
+        }
+        printf( "done\n" );                                                                     // non-empty .expect file
 }

tests/enum.cfa

rae2c27a	rc76bd34
26	26	//Dummy main
27	27	int main(int argc, char const *argv[]) {
	28	printf( "done\n" ); // non-empty .expect file
28	29	}

tests/exceptions/.expect/virtual-cast.txt

rae2c27a	rc76bd34
	1	done

tests/exceptions/.expect/virtual-poly.txt

rae2c27a	rc76bd34
	1	done

tests/exceptions/virtual-cast.cfa

rae2c27a	rc76bd34
74	74	free(tri);
75	75	free(top);
	76	printf( "done\n" ); // non-empty .expect file
76	77	}

tests/exceptions/virtual-poly.cfa

rae2c27a	rc76bd34
77	77	mono_poly_test();
78	78	poly_poly_test();
	79	printf( "done\n" ); // non-empty .expect file
79	80	}

tests/expression.cfa

-              rae2c27a
+              rc76bd34
 int main() {
-    int a[3] = { 0, 0, 0 };
-    S s = { 3 }, * ps = &s;
-    [int] t = { 3 };
-    * [int] pt = &t;
-    int i = 1, j = 2;
+        int a[3] = { 0, 0, 0 };
+        S s = { 3 }, * ps = &s;
+        [int] t = { 3 };
+        * [int] pt = &t;
+        int i = 1, j = 2;
-    // operators
+        // operators
-    !i;
-    ~i;
-    +i;
-    -i;
-    *ps;
-    ++ps;
-    --ps;
-    ps++;
-    ps--;
+        !i;
+        ~i;
+        +i;
+        -i;
+        *ps;
+        ++ps;
+        --ps;
+        ps++;
+        ps--;
-    i + j;
-    i - j;
-    i * j;
+        i + j;
+        i - j;
+        i * j;
-    i / j;
-    i % j;
-    i ^ j;
-    i & j;
-    i | j;
-    i < j;
-    i > j;
-    i = j;
+        i / j;
+        i % j;
+        i ^ j;
+        i & j;
+        i | j;
+        i < j;
+        i > j;
+        i = j;
-    i == j;
-    i != j;
-    i << j;
-    i >> j;
-    i <= j;
-    i >= j;
-    i && j;
-    i || j;
-    ps->i;
+        i == j;
+        i != j;
+        i << j;
+        i >> j;
+        i <= j;
+        i >= j;
+        i && j;
+        i || j;
+        ps->i;
-    i *= j;
-    i /= j;
-    i %= j;
-    i += j;
-    i -= j;
-    i &= j;
-    i |= j;
-    i ^= j;
-    i <<= j;
-    i >>= j;
+        i *= j;
+        i /= j;
+        i %= j;
+        i += j;
+        i -= j;
+        i &= j;
+        i |= j;
+        i ^= j;
+        i <<= j;
+        i >>= j;
-    i ? i : j;
+        i ? i : j;
-    // postfix function call
+        // postfix function call
-    (3 + 4)`mary;
-    ({3 + 4;})`mary;
-    [3, 4]`mary;
-`mary;
-    a[0]`mary;
-    a[0]`mary`mary;
-    s{0}`mary;
-    a[3]`jane++;
-    jack(3)`mary;
-    s.i`mary;
-    t.0`mary;
-    s.[i]`mary;
-    ps->i`mary;
-    pt->0`mary;
-    ps->[i]`mary;
-    i++`mary;
-    i--`mary;
-    (S){2}`mary;
-    (S)@{2}`mary;
+        (3 + 4)`mary;
+        ({3 + 4;})`mary;
+        [3, 4]`mary;
+`mary;
+        a[0]`mary;
+        a[0]`mary`mary;
+        s{0}`mary;
+        a[3]`jane++;
+        jack(3)`mary;
+        s.i`mary;
+        t.0`mary;
+        s.[i]`mary;
+        ps->i`mary;
+        pt->0`mary;
+        ps->[i]`mary;
+        i++`mary;
+        i--`mary;
+        (S){2}`mary;
+        (S)@{2}`mary;
+        #if !defined(NO_COMPILED_PRAGMA)
+                #pragma message( "Compiled" )   // force non-empty .expect file
+        #endif
 } // main

tests/forall.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed May  9 08:48:15 2018
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Mar 19 08:29:38 2019
-// Update Count     : 32
+// Last Modified On : Sun Sep 27 08:43:20 2020
+// Update Count     : 35
 //
 …
+}
 forall( otype T ) inline static {
-        int RT9( T ) { T t; }
+        int RT9( T ) { T t; return 3; }
+}
 …
 // w3 g3;
-int main( void ) {}
+int main( void ) {
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}
 // Local Variables: //

tests/heap.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Tue Nov  6 17:54:56 2018
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Sun Aug  9 08:05:16 2020
-// Update Count     : 57
+// Last Modified On : Fri Sep 25 15:21:52 2020
+// Update Count     : 73
 //
 …
                         free( area );
                 } // for
+        } // for
+        // check malloc/resize/free (sbrk)
+        for ( i; 2 ~ NoOfAllocs ~ 12 ) {
+                // initial N byte allocation
+                char * area = (char *)malloc( i );
+                area[0] = '\345'; area[i - 1] = '\345';                 // fill first/penultimate byte
+                // Do not start this loop index at 0 because resize of 0 bytes frees the storage.
+                int prev = i;
+                for ( s; i ~ 256 * 1024 ~ 26 ) {                                // start at initial memory request
+                        if ( area[0] != '\345' || area[prev - 1] != '\345' ) abort( "malloc/resize/free corrupt storage" );
+                        area = (char *)resize( area, s );                       // attempt to reuse storage
+                        area[0] = area[s - 1] = '\345';                         // fill last byte
+                        prev = s;
+                } // for
+                free( area );
+        } // for
+        // check malloc/resize/free (mmap)
+        for ( i; 2 ~ NoOfAllocs ~ 12 ) {
+                // initial N byte allocation
+                size_t s = i + default_mmap_start();                    // cross over point
+                char * area = (char *)malloc( s );
+                area[0] = '\345'; area[s - 1] = '\345';                 // fill first/penultimate byte
+                // Do not start this loop index at 0 because resize of 0 bytes frees the storage.
+                int prev = s;
+                for ( r; s ~ 256 * 1024 ~ 26 ) {                                // start at initial memory request
+                        if ( area[0] != '\345' || area[prev - 1] != '\345' ) abort( "malloc/resize/free corrupt storage" );
+                        area = (char *)resize( area, s );                       // attempt to reuse storage
+                        area[0] = area[r - 1] = '\345';                         // fill last byte
+                        prev = r;
+                } // for
+                free( area );
+        } // for
+        // check malloc/realloc/free (sbrk)
+        for ( i; 2 ~ NoOfAllocs ~ 12 ) {
+                // initial N byte allocation
+                char * area = (char *)malloc( i );
+                area[0] = '\345'; area[i - 1] = '\345';                 // fill first/penultimate byte
+                // Do not start this loop index at 0 because realloc of 0 bytes frees the storage.
+                int prev = i;
+                for ( s; i ~ 256 * 1024 ~ 26 ) {                                // start at initial memory request
+                        if ( area[0] != '\345' || area[prev - 1] != '\345' ) abort( "malloc/realloc/free corrupt storage" );
+                        area = (char *)realloc( area, s );                      // attempt to reuse storage
+                        area[s - 1] = '\345';                                           // fill last byte
+                        prev = s;
+                } // for
+                free( area );
+        } // for
+        // check malloc/realloc/free (mmap)
+        for ( i; 2 ~ NoOfAllocs ~ 12 ) {
+                // initial N byte allocation
+                size_t s = i + default_mmap_start();                    // cross over point
+                char * area = (char *)malloc( s );
+                area[0] = '\345'; area[s - 1] = '\345';                 // fill first/penultimate byte
+                // Do not start this loop index at 0 because realloc of 0 bytes frees the storage.
+                int prev = s;
+                for ( r; s ~ 256 * 1024 ~ 26 ) {                                // start at initial memory request
+                        if ( area[0] != '\345' || area[prev - 1] != '\345' ) abort( "malloc/realloc/free corrupt storage" );
+                        area = (char *)realloc( area, s );                      // attempt to reuse storage
+                        area[r - 1] = '\345';                                           // fill last byte
+                        prev = r;
+                } // for
+                free( area );
         } // for
 …
         } // for
+        // check memalign/resize with align/free
+        amount = 2;
+        for ( a; libAlign() ~= limit ~ a ) {                            // generate powers of 2
+                // initial N byte allocation
+                char * area = (char *)memalign( a, amount );    // aligned N-byte allocation
+                //sout | alignments[a] | area | endl;
+                if ( (size_t)area % a != 0 || malloc_alignment( area ) != a ) { // check for initial alignment
+                        abort( "memalign/resize with align/free bad alignment : memalign(%d,%d) = %p", (int)a, (int)amount, area );
+                } // if
+                area[0] = '\345'; area[amount - 2] = '\345';    // fill first/penultimate byte
+                // Do not start this loop index at 0 because resize of 0 bytes frees the storage.
+                for ( s; amount ~ 256 * 1024 ) {                                // start at initial memory request
+                        area = (char *)resize( area, a * 2, s );        // attempt to reuse storage
+                        //sout | i | area | endl;
+                        if ( (size_t)area % a * 2 != 0 ) {                      // check for initial alignment
+                                abort( "memalign/resize with align/free bad alignment %p", area );
+                        } // if
+                        area[s - 1] = '\345';                                           // fill last byte
+                } // for
+                free( area );
+        } // for
         // check memalign/realloc with align/free
 …
         // checkFreeOn();
         // malloc_stats();
+        printf( "done\n" );                                                                     // non-empty .expect file
+}

tests/identFuncDeclarator.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed Aug 17 08:36:34 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Nov  6 17:56:33 2018
-// Update Count     : 3
+// Last Modified On : Sun Sep 27 08:20:46 2020
+// Update Count     : 5
 //
 …
         int (* (* const f80)(int))();
         int (* const(* const f81)(int))();
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}

tests/identParamDeclarator.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed Aug 17 08:37:56 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Nov  6 17:56:44 2018
-// Update Count     : 3
+// Last Modified On : Fri Sep 25 14:31:08 2020
+// Update Count     : 4
 //
 …
 int main( int argc, char const *argv[] ) {                              // dummy main
         return 0;
+        printf( "done\n" );                                                                     // non-empty .expect file
+}

tests/labelledExit.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed Aug 10 07:29:39 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Wed Feb  5 16:49:48 2020
-// Update Count     : 9
+// Last Modified On : Sun Sep 27 09:01:34 2020
+// Update Count     : 12
 //
 …
 int main( int argc, char const *argv[] ) {
         /* code */
+        #pragma message( "Compiled" )                                           // force non-empty .expect file
+}

tests/limits.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Tue May 10 20:44:20 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Nov  6 17:57:55 2018
-// Update Count     : 8
+// Last Modified On : Sun Sep 27 08:45:43 2020
+// Update Count     : 10
 //
 …
 int main(int argc, char const *argv[]) {
+        //DUMMY
+        return 0;
+        #pragma message( "Compiled" )                                           // force non-empty .expect file
+}

tests/maybe.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Thr May 25 16:02:00 2017
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Thu Jul 20 15:24:07 2017
-// Update Count     : 1
+// Last Modified On : Fri Sep 25 15:13:28 2020
+// Update Count     : 2
 //
 …
         //checkNamedConstructors();
         checkSetters();
+        printf( "done\n" );                             // non-empty .expect file
+}

tests/nested-types.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Mon Jul 9 10:20:03 2018
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Wed Feb 12 18:21:15 2020
-// Update Count     : 3
+// Last Modified On : Sun Sep 27 08:48:59 2020
+// Update Count     : 6
 //
 typedef int N;
 struct A {
   forall(otype T)
   struct N {
     T x;
   };
+        forall(otype T)
+        struct N {
+                T x;
+        };
 };
 struct S {
   struct T {
     int i;
     typedef int Bar;
   };
   T x;
+        struct T {
+                int i;
+                typedef int Bar;
+        };
+        T x;
   // struct U;
   typedef T Bar;
   typedef int Baz;
+        // struct U;
+        typedef T Bar;
+        typedef int Baz;
 };
 …
 int main() {
   // access nested struct
   S.T x;
+        // access nested struct
+        S.T x;
+  {
     struct S {
       int i;
       struct Z {
         double d;
       };
     };
+        {
+                struct S {
+                  int i;
+                  struct Z {
+                    double d;
+                  };
+                };
     S.Z z;   // gets local S
     .S.T y;  // lookup at global scope only
+                S.Z z;                                                                                  // gets local S
+                .S.T y;                                                                                 // lookup at global scope only
     const volatile .S.T q;
+                const volatile .S.T q;
 #if ERR1
     T err1;           // error: no T in scope
+                T err1;                                                                                 // error: no T in scope
 #endif
 #if ERR2
     .Z err2;          // error: no Z in global scope
     .S.Baz.Bar err3;  // error: .S.Baz => int, int is not aggregate and should not appear left of the dot
     .S.Z err4;        // error: no Z in global S
+                .Z err2;                                                                                // error: no Z in global scope
+                .S.Baz.Bar err3;                                                                // error: .S.Baz => int, int is not aggregate and should not appear left of the dot
+                .S.Z err4;                                                                              // error: no Z in global S
 #endif
+  }
+        }
   // U.S un;
+        // U.S un;
   S.Bar y;
   S.Baz x;
   S.T.Bar z;
+        S.Bar y;
+        S.Baz x;
+        S.T.Bar z;
+  // A.N(int) x;  // xxx - should not be an error, but currently is.
+        // A.N(int) x;  // xxx - should not be an error, but currently is.
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}

tests/numericConstants.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed May 24 22:10:36 2017
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Feb  5 08:58:16 2019
-// Update Count     : 5
+// Last Modified On : Sun Sep 27 07:55:22 2020
+// Update Count     : 7
 //
 …
 x_ff.ffp0;                                     // hex real
 x_1.ffff_ffff_p_128_l;
+        #pragma message( "Compiled" )   // force non-empty .expect file
 } // main

tests/operators.cfa

rae2c27a	rc76bd34
31	31	int main(int argc, char const *argv[]) {
32	32	/* code */
33		~~return 0;~~
	33	printf( "done\n" ); // non-empty .expect file
34	34	}
35	35

tests/poly-o-cycle.cfa

rae2c27a	rc76bd34
1		// Check that a cycle of polymorphic ~~data~~ structures can be instancated.
	1	// Check that a cycle of polymorphic otype structures can be instancated.
2	2
3	3	#include <stdio.h>

tests/pybin/tools.py

-              rae2c27a
+              rc76bd34
                 raise
+def is_empty(fname):
+        if not os.path.isfile(fname):
+                return True
+        if os.stat(fname).st_size == 0:
+                return True
+        return False
 def is_ascii(fname):
         if settings.dry_run:
                 print("is_ascii: %s" % fname)
-                return True
+                return (True, "")
         if not os.path.isfile(fname):
-                return False
+                return (False, "No file")
-        code, out = sh("file %s" % fname, output_file=subprocess.PIPE)
+        code, out = sh("file", fname, output_file=subprocess.PIPE)
         if code != 0:
-                return False
+                return (False, "'file EXPECT' failed with code {}".format(code))
         match = re.search(".*: (.*)", out)
         if not match:
-                return False
+                return (False, "Unreadable file type: '{}'".format(out))
-        return match.group(1).startswith("ASCII text")
+        if "ASCII text" in match.group(1):
+                return (True, "")
+        return (False, "File type should be 'ASCII text', was '{}'".format(match.group(1)))
 def is_exe(fname):
 …
                 return None
-        file = open(file, mode)
+        file = open(file, mode, encoding="latin-1") # use latin-1 so all chars mean something.
         exitstack.push(file)
         return file
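
The helpers changed in tests/pybin/tools.py above can be sketched in isolation. `is_empty` follows the added lines directly. `is_ascii_bytes` is a hypothetical stand-in, not the changeset's `is_ascii`: it keeps the new `(ok, reason)` return shape but inspects the bytes itself instead of calling the repository's `sh` wrapper around `file`, so that the sketch runs on its own.

```python
import os
import tempfile


def is_empty(fname):
    """Treat a missing file or a zero-byte file as empty (as in the diff above)."""
    if not os.path.isfile(fname):
        return True
    return os.stat(fname).st_size == 0


def is_ascii_bytes(fname):
    """Hypothetical stand-in for is_ascii: returns (ok, reason).

    Assumption: checks the raw bytes directly rather than running `file`.
    """
    if not os.path.isfile(fname):
        return (False, "No file")
    with open(fname, "rb") as f:
        data = f.read()
    try:
        data.decode("ascii")
    except UnicodeDecodeError as err:
        return (False, "Non-ASCII byte at offset {}".format(err.start))
    return (True, "")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        expect = os.path.join(tmp, "cluster.txt")
        print(is_empty(expect))          # file does not exist yet
        with open(expect, "w") as f:
            f.write("done\n")
        print(is_empty(expect))
        print(is_ascii_bytes(expect))
```

Returning a reason string alongside the boolean lets a caller report why a check failed instead of only that it failed.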

tests/raii/.expect/ctor-autogen.txt

rae2c27a	rc76bd34
	1	done

tests/raii/.expect/init_once.txt

rae2c27a	rc76bd34
	1	done

tests/raii/ctor-autogen.cfa

rae2c27a	rc76bd34
151	151	identity(gcs);
152	152	identity(gcu);
	153	printf( "done\n" ); // non-empty .expect file
153	154	}

tests/raii/init_once.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Tue Jun 14 15:43:35 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Fri Mar 22 13:41:26 2019
-// Update Count     : 4
+// Last Modified On : Fri Sep 25 15:36:39 2020
+// Update Count     : 5
 //
 …
                 static_variable();
+        }
+        printf( "done\n" );                                                                     // non-empty .expect file
+}

tests/result.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Thr May 25 16:50:00 2017
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Thu Jul 20 15:24:12 2017
-// Update Count     : 1
+// Last Modified On : Fri Sep 25 15:22:59 2020
+// Update Count     : 2
 //
 …
         checkGetters();
         checkSetters();
+        printf( "done\n" );                             // non-empty .expect file
+}

tests/stdincludes.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Tue Aug 29 08:26:14 2017
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Nov  6 18:00:53 2018
-// Update Count     : 6
+// Last Modified On : Sun Sep 27 08:51:38 2020
+// Update Count     : 8
 //
 …
 #include <wctype.h>
-int main() {}
+int main() {
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}
 // Local Variables: //

tests/switch.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Tue Jul 12 06:50:22 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Nov  6 18:01:34 2018
-// Update Count     : 37
+// Last Modified On : Sun Sep 27 08:35:02 2020
+// Update Count     : 43
 //
 …
                 j = 5;
         } // choose
+        #pragma message( "Compiled" )                                           // force non-empty .expect file
 } // main

tests/test.py

-              rae2c27a
+              rc76bd34
         test.prepare()
+        # ----------
+        # MAKE
+        # ----------
         # build, skipping to next test on error
         with Timed() as comp_dur:
                 make_ret, _ = make( test.target(), output_file=subprocess.DEVNULL, error=out_file, error_file = err_file )
+        # ----------
+        # RUN
+        # ----------
+        # run everything in a temp directory to make sure core file are handled properly
         run_dur = None
-        # run everything in a temp directory to make sure core file are handled properly
         with tempdir():
                 # if the make command succeeds continue otherwise skip to diff
 …
                 else:
                         if os.stat(out_file).st_size < 1048576:
-                                with open (out_file, "r") as myfile:
+                                with open (out_file, "r", encoding='latin-1') as myfile:  # use latin-1 so all chars mean something.
                                         error = myfile.read()
                         else:
 …
         make('clean', output_file=subprocess.DEVNULL, error=subprocess.DEVNULL)
+        # since python prints stacks by default on a interrupt, redo the interrupt handling to be silent
+        def worker_init():
+                def sig_int(signal_num, frame):
+                        pass
+                signal.signal(signal.SIGINT, sig_int)
+        # create the executor for our jobs and handle the signal properly
+        pool = multiprocessing.Pool(jobs, worker_init)
+        # create the executor for our jobs
+        pool = multiprocessing.Pool(jobs)
         failed = False
-        def stop(x, y):
-                print("Tests interrupted by user", file=sys.stderr)
-                sys.exit(1)
-        signal.signal(signal.SIGINT, stop)
         # for each test to run
 …
                 failed = 0
+                # check if the expected files aren't empty
+                if not options.regenerate_expected:
+                        for t in tests:
+                                if is_empty(t.expect()):
+                                        print('WARNING: test "{}" has empty .expect file'.format(t.target()), file=sys.stderr)
                 # for each build configurations, run the test
                 with Timed() as total_dur:
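
The `encoding='latin-1'` arguments added in tests/test.py and tests/pybin/tools.py carry the comment "use latin-1 so all chars mean something". A minimal sketch of that property, using the hypothetical helper names `decode_output` and `utf8_decodes` (neither is part of the changeset): latin-1 assigns a character to each of the 256 byte values, so decoding cannot fail, while UTF-8 rejects some byte sequences.

```python
def decode_output(raw):
    """Decode arbitrary bytes as latin-1; every byte value maps to one character."""
    return raw.decode("latin-1")


def utf8_decodes(raw):
    """Return True if raw is valid UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    raw = bytes(range(256))                 # every possible byte value once
    text = decode_output(raw)
    print(len(text))                        # 256
    print(text.encode("latin-1") == raw)    # True: the decoding is reversible
    print(utf8_decodes(raw))                # False: e.g. a lone 0x80 is invalid UTF-8
```

The trade-off is that non-ASCII output may display as different characters than the program intended, but reading a test's output file never raises a decoding error.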

tests/typedefRedef.cfa

-              rae2c27a
+              rc76bd34
 typedef int ARR[];
 typedef int ARR[];
+// #ifdef ERR1
+// if a typedef has an array dimension,
+// it can only be redefined to the same dimension
+#ifdef ERR1
+// if a typedef has an array dimension, it can only be redefined to the same dimension
 typedef int ARR[2];
 // #endif
+#endif
 typedef int X;
 …
 int main() {
   typedef int ARR[sz];
+        typedef int ARR[sz];
   // can't redefine typedef which is VLA
+        // can't redefine typedef which is VLA
 #if ERR1
   typedef int ARR[sz];
+        typedef int ARR[sz];
 #endif
   Foo *x;
+        Foo * x;
   typedef struct Bar Foo;
   Foo *y;
+        typedef struct Bar Foo;
+        Foo * y;
+  typedef int *** pt;
+        typedef int *** pt;
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}

tests/typeof.cfa

-              rae2c27a
+              rc76bd34
 int main() {
-    int *v1;
-    typeof(v1) v2;
-    typeof(*v1) v3[4];
-    char *v4[4];
-    typeof(typeof(char *)[4]) v5;
-    typeof (int *) v6;
-    typeof( int ( int, int p ) ) *v7;
-    typeof( [int] ( int, int p ) ) *v8;
-    (typeof(v1)) v2; // cast with typeof
+        int *v1;
+        typeof(v1) v2;
+        typeof(*v1) v3[4];
+        char *v4[4];
+        typeof(typeof(char *)[4]) v5;
+        typeof (int *) v6;
+        typeof( int ( int, int p ) ) *v7;
+        typeof( [int] ( int, int p ) ) *v8;
+        (typeof(v1)) v2; // cast with typeof
+        printf( "done\n" );                             // non-empty .expect file
 }

tests/variableDeclarator.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Wed Aug 17 08:41:42 2016
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Nov  6 18:02:16 2018
-// Update Count     : 2
+// Last Modified On : Sun Sep 27 07:46:17 2020
+// Update Count     : 13
 //
 …
 int (f2);
 int *f3;
 int **f4;
 int * const *f5;
+int * f3;
+int ** f4;
+int * const * f5;
 int * const * const f6;
 int *(f7);
 int **(f8);
 int * const *(f9);
+int * (f7);
+int ** (f8);
+int * const * (f9);
 int * const * const (f10);
 int (*f11);
 int (**f12);
 int (* const *f13);
+int (* f11);
+int (** f12);
+int (* const * f13);
 int (* const * const f14);
 int f15[];
+int f15[0];
 int f16[10];
 int (f17[]);
+int (f17[0]);
 int (f18[10]);
 int *f19[];
 int *f20[10];
 int **f21[];
 int **f22[10];
 int * const *f23[];
 int * const *f24[10];
 int * const * const f25[];
+int * f19[0];
+int * f20[10];
+int ** f21[0];
+int ** f22[10];
+int * const * f23[0];
+int * const * f24[10];
+int * const * const f25[0];
 int * const * const f26[10];
 int *(f27[]);
+int *(f27[0]);
 int *(f28[10]);
 int **(f29[]);
+int **(f29[0]);
 int **(f30[10]);
 int * const *(f31[]);
+int * const *(f31[0]);
 int * const *(f32[10]);
 int * const * const (f33[]);
+int * const * const (f33[0]);
 int * const * const (f34[10]);
 int (*f35)[];
 int (*f36)[10];
 int (**f37)[];
 int (**f38)[10];
 int (* const *f39)[];
 int (* const *f40)[10];
+int (* f35)[];
+int (* f36)[10];
+int (** f37)[];
+int (** f38)[10];
+int (* const * f39)[];
+int (* const * f40)[10];
 int (* const * const f41)[];
 int (* const * const f42)[10];
 int f43[][3];
+int f43[0][3];
 int f44[3][3];
 int (f45[])[3];
+int (f45[0])[3];
 int (f46[3])[3];
 int ((f47[]))[3];
+int ((f47[0]))[3];
 int ((f48[3]))[3];
 int *f49[][3];
 int *f50[3][3];
 int **f51[][3];
 int **f52[3][3];
 int * const *f53[][3];
 int * const *f54[3][3];
 int * const * const f55[][3];
+int * f49[0][3];
+int * f50[3][3];
+int ** f51[0][3];
+int ** f52[3][3];
+int * const * f53[0][3];
+int * const * f54[3][3];
+int * const * const f55[0][3];
 int * const * const f56[3][3];
 int (*f57[][3]);
 int (*f58[3][3]);
 int (**f59[][3]);
 int (**f60[3][3]);
 int (* const *f61[][3]);
 int (* const *f62[3][3]);
 int (* const * const f63[][3]);
+int (* f57[0][3]);
+int (* f58[3][3]);
+int (** f59[0][3]);
+int (** f60[3][3]);
+int (* const * f61[0][3]);
+int (* const * f62[3][3]);
+int (* const * const f63[0][3]);
 int (* const * const f64[3][3]);
 …
 int (f66)(int);
 int *f67(int);
 int **f68(int);
 int * const *f69(int);
+int * f67(int);
+int ** f68(int);
+int * const * f69(int);
 int * const * const f70(int);
 …
 int * const * const (f74)(int);
 int (*f75)(int);
 int (**f76)(int);
 int (* const *f77)(int);
+int (* f75)(int);
+int (** f76)(int);
+int (* const * f77)(int);
 int (* const * const f78)(int);
 int (*(*f79)(int))();
+int (*(* f79)(int))();
 int (*(* const f80)(int))();
 int (* const(* const f81)(int))();
 …
 //int fe2()[];                          // returning an array
 //int fe3()();                          // returning a function
 //int (*fe4)()();                               // returning a function
 //int ((*fe5())())[];                   // returning an array
+//int (* fe4)()();                              // returning a function
+//int ((* fe5())())[];                  // returning an array
+#ifdef __CFA__
 // Cforall extensions
 …
 const * const * int cf6;
 [] int cf15;
+[0] int cf15;
 [10] int cf16;
 [] * int cf19;
+[0] * int cf19;
 [10] * int cf20;
 int **cf21[];
+int ** cf21[0];
 [10] * * int cf22;
 [] * const * int cf23;
+[0] * const * int cf23;
 [10] * const * int cf24;
 [] const * const * int cf25;
+[0] const * const * int cf25;
 [10] const * const * int cf26;
 …
 const * const * [10] int cf42;
 [][3] int cf43;
+[0][3] int cf43;
 [3][3] int cf44;
 [][3] * int cf49;
+[0][3] * int cf49;
 [3][3] * int cf50;
 [][3] * * int cf51;
+[0][3] * * int cf51;
 [3][3] * * int cf52;
 [][3] const * int cf53;
+[0][3] const * int cf53;
 [3][3] * const * int cf54;
 [][3] const * const * int cf55;
+[0][3] const * const * int cf55;
 [3][3] const * const * int cf56;
 …
 *[]*[]* [ *[]*[] int ]( *[]*[] int, *[]*[] int ) v3;
+#endif // __CFA__
 //Dummy main
-int main(int argc, char const *argv[])
-{
-        return 0;
+int main( int argc, char const * argv[] ) {
+        #pragma message( "Compiled" )                                           // force non-empty .expect file
 }

tests/voidPtr.cfa

-              rae2c27a
+              rc76bd34
         if ( ! a ) {
                 abort();
+        }
+        }
+        printf( "done\n" );                             // non-empty .expect file
+}

tests/warnings/.expect/self-assignment.txt

rae2c27a	rc76bd34
24	24	... to:
25	25	reference to signed int
	26	warnings/self-assignment.cfa: In function '_X4mainFi___1':
	27	warnings/self-assignment.cfa:36:9: note: #pragma message: Compiled

tests/warnings/self-assignment.cfa

-              rae2c27a
+              rc76bd34
 // Created On       : Thu Mar 1 13:53:57 2018
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Wed Feb 20 07:56:17 2019
-// Update Count     : 3
+// Last Modified On : Sun Sep 27 09:24:34 2020
+// Update Count     : 6
 //
 …
         s.i = s.i;
         t.s.i = t.s.i;
+        #pragma message( "Compiled" )                   // force non-empty .expect file
+}
 // Local Variables: //
 // tab-width: 4 //
-// compile-command: "cfa dtor-early-exit" //
+// compile-command: "cfa self-assignment.cfa" //
 // End: //

tests/zombies/Rank2.c

-              rae2c27a
+              rc76bd34
-int ?=?( int *, int );
-forall(dtype DT) DT * ?=?( DT **, DT * );
+int ?=?( int &, int );
+forall(dtype DT) DT * ?=?( DT *&, DT * );
 void a() {
 …
         void h( int *null );
         forall( otype T ) T id( T );
-        forall( dtype T ) T *0;
-        int 0;
+//      forall( dtype T ) T *0;
+//      int 0;
         h( id( id( id( 0 ) ) ) );
+}

tests/zombies/Tuple.c

-              rae2c27a
+              rc76bd34
         [ 3, 5 ];
         [ a, b ] = 3;
-//      [ a, b ] = [ 4.6 ];
+        [ a, b ] = [ 4.6 ];
         [ a, b ] = 4.6;
         [ a, b ] = [ c, d ] = [ 3, 5 ];
 …
         [ a, b ] = t1 = [ c, d ];
         [ a, b ] = t1 = t2 = [ c, d ];
-//      t1 = [ 3, 4 ] = [ 3, 4 ] = t1 = [ 3, 4 ];
+        t1 = [ 3, 4 ] = [ 3, 4 ] = t1 = [ 3, 4 ];
         s.[ f1, i.[ f2, f3 ], f4 ] = [ 11, 12, 13, 3.14159 ];
 …
 //      [ a, , b, ] = h( 3, 3, 0, "abc" );                      /* ignore some results */
         sp->[ f4, f1 ] = sp->[ f1, f4 ];
-        printf( "expecting 3, 17, 23, 4; got %d, %d, %d, %d\n", s.[ f4, i.[ f3, f2 ], f1 ] );
+        printf( "expecting 3, 17, 23, 4; got %g, %d, %d, %d\n", s.[ f4, i.[ f3, f2 ], f1 ] );
         rc = 0;
+}

tests/zombies/abstype.c

-              rae2c27a
+              rc76bd34
 // Created On       : Wed May 27 17:56:53 2015
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Tue Jun 14 14:27:48 2016
-// Update Count     : 9
+// Last Modified On : Wed Sep 30 13:55:47 2020
+// Update Count     : 10
 //
 …
+}
-forall( otype T ) lvalue T *?( T * );
+forall( otype T ) T *?( T * );
 int ?++( int * );
 int ?=?( int *, int );

tests/zombies/includes.c

-              rae2c27a
+              rc76bd34
 // Created On       : Wed May 27 17:56:53 2015
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Wed Nov 15 23:06:24 2017
-// Update Count     : 597
+// Last Modified On : Wed Sep 30 13:59:18 2020
+// Update Count     : 598
 //
 …
 #if 1
 #define _GNU_SOURCE
-#include <a.out.h>
-#include <aio.h>
-#include <aliases.h>
-#include <alloca.h>
-#include <ansidecl.h>
-#include <ar.h>
-#include <argp.h>
-#include <argz.h>
-#include <assert.h>
-//#include <bfd.h>
-// #include <bfdlink.h>                         // keyword with
-#include <byteswap.h>
-#include <bzlib.h>
-#include <cblas.h>
-#include <cblas_f77.h>
-#include <complex.h>
-#include <com_err.h>
-#include <cpio.h>
-#include <crypt.h>
-#include <ctype.h>
-#include <curses.h>
-#include <dialog.h>
-#include <dirent.h>
-#include <dis-asm.h>
-#include <dlfcn.h>
-#include <dlg_colors.h>
-#include <dlg_config.h>
-#include <dlg_keys.h>
-#include <elf.h>
-#include <endian.h>
-#include <envz.h>
-#include <err.h>
-#include <errno.h>
-#include <error.h>
-#include <eti.h>
-#include <evdns.h>
-#include <event.h>
+// #include <a.out.h>
+// #include <aio.h>
+// #include <aliases.h>
+// #include <alloca.h>
+// #include <ansidecl.h>
+// #include <ar.h>
+// #include <argp.h>
+// #include <argz.h>
+// #include <assert.h>
+// #include <bfd.h>
+// #include <bfdlink.h>                                                                 // keyword with
+// #include <byteswap.h>
+// #include <bzlib.h>
+// #include <cblas.h>
+// #include <cblas_f77.h>
+// #include <complex.h>
+// #include <com_err.h>
+// #include <cpio.h>
+// #include <crypt.h>
+// #include <ctype.h>
+// #include <curses.h>
+// #include <dialog.h>
+// #include <dirent.h>
+// #include <dis-asm.h>
+// #include <dlfcn.h>
+// #include <dlg_colors.h>
+// #include <dlg_config.h>
+// #include <dlg_keys.h>
+// #include <elf.h>
+// #include <endian.h>
+// #include <envz.h>
+// #include <err.h>
+// #include <errno.h>
+// #include <error.h>
+// #include <eti.h>
+// #include <evdns.h>
+// #include <event.h>
 // #include <evhttp.h>
 // #include <sys/queue.h>
-// #include <evrpc.h>                                   // evrpc.h depends on sys/queue.h
+// #include <evrpc.h>                                                                           // evrpc.h depends on sys/queue.h
 // #include <evutil.h>
 // #include <execinfo.h>
 …
 // #include <fts.h>
 // #include <ftw.h>
 // #include <gconv.h>
 // #include <getopt.h>
 …
 // #include <gshadow.h>
 // #include <gssapi.h>
-// #include <hwloc.h>                                   // keyword thread (setjmp)
+#include <hwloc.h>                                                                              // keyword thread (setjmp)
 // #include <iconv.h>
 // #include <idna.h>

tests/zombies/structMember.cfa

-              rae2c27a
+              rc76bd34
 // C useless declarations
+#ifdef ERROR
         int;
         TD;
 …
         W(int);
         W(int).X;
+#endif // ERROR
 };

tools/gdb/utils-gdb.py

-              rae2c27a
+              rc76bd34
 STACK = []
-# A global variable to keep all system task name
-SysTask_Name = ["uLocalDebuggerReader", "uLocalDebugger", "uProcessorTask", "uBootTask", "uSystemTask",
-"uProcessorTask", "uPthread", "uProfiler"]
 not_supported_error_msg = "Not a supported command for this language"
 …
         return cluster_root
+def get_sched_lock():
+        """
+        Return: gdb.Value of __scheduler_lock
+        """
+        lock = gdb.parse_and_eval('_X16__scheduler_lockPS20__scheduler_RWLock_t_1')
+        if lock.address == 0x0:
+                print('No scheduler lock, program terminated')
+        return lock
+def all_clusters():
+        if not is_cforall():
+                return None
+        cluster_root = get_cluster_root()
+        if cluster_root.address == 0x0:
+                return
+        curr = cluster_root
+        ret = [curr]
+        while True:
+                curr = curr['_X4nodeS26__cluster____dbg_node_cltr_1']['_X4nextPS7cluster_1']
+                if curr == cluster_root:
+                        break
+                ret.append(curr)
+        return ret
+def all_processors():
+        if not is_cforall():
+                return None
+        cfa_t = get_cfa_types()
+        # get processors from registration to the RWlock
+        lock = get_sched_lock()
+        #get number of elements
+        count = lock['_X5readyVj_1']
+        #find all the procs
+        raw_procs = [lock['_X4dataPS21__scheduler_lock_id_t_1'][i]['_X6handleVPS16__processor_id_t_1'] for i in range(count)]
+        # pre cast full procs
+        procs = [p.cast(cfa_t.processor_ptr) for p in raw_procs if p['_X9full_procb_1']]
+        # sort procs by clusters
+        return sorted(procs, key=lambda p: p['_X4cltrPS7cluster_1'])
+def tls_for_pthread(pthrd):
+        prev = gdb.selected_thread()
+        inf = gdb.selected_inferior()
+        thrd = inf.thread_from_thread_handle( pthrd )
+        thrd.switch()
+        tls = gdb.parse_and_eval('&_X9kernelTLSS16KernelThreadData_1')
+        prev.switch()
+        return tls
+def tls_for_proc(proc):
+        return tls_for_pthread(proc['_X13kernel_threadm_1'])
+def thread_for_pthread(pthrd):
+        return tls_for_pthread(pthrd)['_X11this_threadVPS7$thread_1']
+def thread_for_proc(proc):
+        return tls_for_proc(proc)['_X11this_threadVPS7$thread_1']
 def find_curr_thread():
         # btstr = gdb.execute('bt', to_string = True).splitlines()
 …
         # return btstr[0].split('this=',1)[1].split(',')[0].split(')')[0]
         return None
-def all_clusters():
-        if not is_cforall():
-                return None
-        cluster_root = get_cluster_root()
-        if cluster_root.address == 0x0:
-                return
-        curr = cluster_root
-        ret = [curr]
-        while True:
-                curr = curr['_X4nodeS26__cluster____dbg_node_cltr_1']['_X4nextPS7cluster_1']
-                if curr == cluster_root:
-                        break
-                ret.append(curr)
-        return ret
 def lookup_cluster(name = None):
 …
         """Cforall: Display currently known processors
 Usage:
-        info processors                 : print out all the processors in the Main Cluster
-        info processors all             : print out all processors in all clusters
+        info processors                 : print out all the processors
         info processors <cluster_name>  : print out all processors in a given cluster
 """
 …
                 super(Processors, self).__init__('info processors', gdb.COMMAND_USER)
-        def print_processor(self, name, status, pending, address):
-                print('{:>20}  {:>11}  {:>13}  {:>20}'.format(name, status, pending, address))
-        def iterate_procs(self, root, active):
-                if root == 0x0:
-                        return
-                cfa_t = get_cfa_types()
-                curr = root
-                while True:
-                        processor = curr
-                        should_stop = processor['_X12do_terminateVb_1']
+        def print_processor(self, processor):
+                should_stop = processor['_X12do_terminateVb_1']
+                if not should_stop:
+                        midle = processor['_X6$linksS7$dlinks_S9processor__1']['_X4nextS9$mgd_link_Y13__tE_generic___1']['_X4elemPY13__tE_generic__1'] != 0x0
+                        end   = processor['_X6$linksS7$dlinks_S9processor__1']['_X4nextS9$mgd_link_Y13__tE_generic___1']['_X10terminatorPv_1'] != 0x0
+                        status = 'Idle' if midle or end else 'Active'
+                else:
                         stop_count  = processor['_X10terminatedS9semaphore_1']['_X5counti_1']
-                        if not should_stop:
-                                status = 'Active' if active else 'Idle'
-                        else:
-                                status_str  = 'Last Thread' if stop_count >= 0 else 'Terminating'
-                                status      = '{}({},{})'.format(status_str, should_stop, stop_count)
-                        self.print_processor(processor['_X4namePKc_1'].string(),
-                                        status, str(processor['_X18pending_preemptionb_1']), str(processor)
-                                )
-                        curr = curr['_X4nodeS28__processor____dbg_node_proc_1']['_X4nextPS9processor_1']
-                        if curr == root or curr == 0x0:
-                                break
+                        status_str  = 'Last Thread' if stop_count >= 0 else 'Terminating'
+                        status      = '{}({},{})'.format(status_str, should_stop, stop_count)
+                print('{:>20}  {:>11}  {:<7}  {:<}'.format(
+                        processor['_X4namePKc_1'].string(),
+                        status,
+                        str(processor['_X18pending_preemptionb_1']),
+                        str(processor)
+                ))
+                tls = tls_for_proc( processor )
+                thrd = tls['_X11this_threadVPS7$thread_1']
+                if thrd != 0x0:
+                        tname = '{} {}'.format(thrd['self_cor']['name'].string(), str(thrd))
+                else:
+                        tname = None
+                print('{:>20}  {}'.format('Thread', tname))
+                print('{:>20}  {}'.format('TLS', tls))
         #entry point from gdb
 …
                 if not arg:
-                        clusters = [lookup_cluster(None)]
-                elif arg == "all":
                         clusters = all_clusters()
                 else:
 …
                         return
-                cfa_t = get_cfa_types()
-                for cluster in clusters:
-                        print('Cluster: "{}"({})'.format(cluster['_X4namePKc_1'].string(), cluster.cast(cfa_t.cluster_ptr)))
-                        active_root = cluster.cast(cfa_t.cluster_ptr) \
-                                        ['_X5procsS8__dllist_S9processor__1'] \
-                                        ['_X4headPY15__TYPE_generic__1'] \
-                                        .cast(cfa_t.processor_ptr)
-                        idle_root = cluster.cast(cfa_t.cluster_ptr) \
-                                        ['_X5idlesS8__dllist_S9processor__1'] \
-                                        ['_X4headPY15__TYPE_generic__1'] \
-                                        .cast(cfa_t.processor_ptr)
-                        if idle_root != 0x0 or active_root != 0x0:
-                                self.print_processor('Name', 'Status', 'Pending Yield', 'Address')
-                                self.iterate_procs(active_root, True)
-                                self.iterate_procs(idle_root, False)
-                        else:
-                                print("No processors on cluster")
+                procs = all_processors()
+                print('{:>20}  {:>11}  {:<7}  {}'.format('Processor', '', 'Pending', 'Object'))
+                print('{:>20}  {:>11}  {:<7}  {}'.format('Name', 'Status', 'Yield', 'Address'))
+                cl = None
+                for p in procs:
+                        # if this is a different cluster print it
+                        if cl != p['_X4cltrPS7cluster_1']:
+                                if cl:
+                                        print()
+                                cl = p['_X4cltrPS7cluster_1']
+                                print('Cluster {}'.format(cl['_X4namePKc_1'].string()))
+                        # print the processor information
+                        self.print_processor(p)
                 print()
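
The rewritten `info processors` entry point in tools/gdb/utils-gdb.py sorts the processors by cluster and prints a cluster header each time the cluster changes. Below is a minimal sketch of that grouping pattern, runnable outside gdb. The `Proc` tuple, `format_by_cluster`, and the sample data are hypothetical stand-ins for the `gdb.Value` objects the real script reads, and clusters are compared by name here rather than by pointer.

```python
from collections import namedtuple

# Hypothetical stand-in for the gdb.Value of a processor; the real script
# reads mangled fields such as '_X4cltrPS7cluster_1' instead.
Proc = namedtuple('Proc', ['name', 'cluster'])

def format_by_cluster(procs):
    """Return output lines: one header per cluster, then its processors."""
    lines = []
    current = None
    # Sorting first makes processors of the same cluster adjacent.
    for p in sorted(procs, key=lambda p: p.cluster):
        if current != p.cluster:
            if current is not None:
                lines.append('')          # blank line between clusters
            current = p.cluster
            lines.append('Cluster {}'.format(current))
        lines.append('{:>20}'.format(p.name))
    return lines

procs = [Proc('p0', 'Main'), Proc('w1', 'Workers'), Proc('p1', 'Main')]
report = format_by_cluster(procs)
print('\n'.join(report))
```

Because the list is sorted first, the loop only needs to remember the previous cluster, so no per-cluster dictionary is required.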
