add missing accumulation of terminating thread-heap statistics into master heap; check for environment variable CFA_MALLOC_STATS and print heap statistics at program termination