Quote from the above-linked Colfax page:
Case 1: The entire application fits in HBM.

This is the best case scenario. If your application fits in the HBM, then set the configuration mode to Flat mode and follow the numactl instructions in Section 3.1. This usage mode does not require any code modification, and works with any application (written in any language) provided that it does not have special allocators that specifically allocate elsewhere. Note that, although this procedure requires no source code changes, applications could still benefit from general memory optimization. For more on memory traffic optimization, refer to the various online references on optimization such as the HOW Series.

If numactl cannot be used, then using Cache mode could be an alternative. Because the problem fits in the HBM cache, there will only be a few HBM cache misses. HBM cache misses are the primary factor in the performance difference between addressable memory HBM and cache HBM, so using the Cache mode could get close to or even match the performance with Flat mode. However, there is still some inherent overhead associated with using HBM as cache, so if numactl is an option, we recommend to use that method.
Again, though, I've not found a way to control the configuration mode: the only visible NUMA-related subitems in the SuperMicro boot menu are disabled, and 'which numactl' comes up empty.
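
In case it helps anyone in the same spot: without numactl installed, one way to see which mode the machine actually booted in is to read the NUMA topology straight from Linux sysfs. Below is a rough sketch, not a tested procedure for this board. It assumes a Linux kernel with NUMA support exposing /sys/devices/system/node, and the function names (list_numa_nodes, cpuless_nodes) are my own. As I understand it, in Flat mode the HBM usually shows up as a node that has memory but no CPUs, whereas in Cache mode no such node appears.

```python
import os
import re


def _read(path):
    """Return the file's text, or an empty string if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def list_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Map each NUMA node id to its CPU list and total memory in kB."""
    nodes = {}
    if not os.path.isdir(sysfs_root):
        return nodes  # no NUMA information exposed by this kernel
    for entry in sorted(os.listdir(sysfs_root)):
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue  # skip files such as "online" or "possible"
        node_dir = os.path.join(sysfs_root, entry)
        cpulist = _read(os.path.join(node_dir, "cpulist")).strip()
        meminfo = _read(os.path.join(node_dir, "meminfo"))
        total = re.search(r"MemTotal:\s+(\d+)\s*kB", meminfo)
        nodes[int(match.group(1))] = {
            "cpulist": cpulist,
            "mem_total_kb": int(total.group(1)) if total else None,
        }
    return nodes


def cpuless_nodes(nodes):
    """Return ids of nodes with memory but no CPUs (HBM candidates)."""
    return [
        node_id
        for node_id, info in sorted(nodes.items())
        if info["cpulist"] == "" and info["mem_total_kb"]
    ]


if __name__ == "__main__":
    found = list_numa_nodes()
    for node_id, info in sorted(found.items()):
        print(node_id, info["cpulist"] or "(no CPUs)", info["mem_total_kb"], "kB")
    print("CPU-less nodes:", cpuless_nodes(found))
```

If this reports a CPU-less node of roughly the HBM's size, the machine is presumably already in Flat mode and the remaining task is just installing numactl (it is a separate package on most distributions). If only ordinary nodes appear, the HBM is most likely being used as cache.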

