Yes, you can set multiple workers, each of which can have multiple threads.

If I had to guess, I'd say the best performance would come from 2 workers with either 8 or 16 threads each, with each worker pinned to its own CCX on the chip so its threads share an L3 cache. That's only a guess, though, so benchmark a few configurations before settling on one.
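
To make the benchmarking idea concrete, here is a rough Python sketch that lists the worker/thread splits worth trying and divides the CPU IDs into one group per worker. The helper names (`candidate_configs`, `cpu_groups`) are made up for illustration, the count of 16 CPUs is an assumption, and so is the idea that contiguous CPU IDs sit on the same CCX. Check your actual topology (for example with `lscpu` on Linux) before relying on it.

```python
def candidate_configs(total_threads):
    """Return every (workers, threads_per_worker) pair whose product is total_threads."""
    return [
        (workers, total_threads // workers)
        for workers in range(1, total_threads + 1)
        if total_threads % workers == 0
    ]


def cpu_groups(cpu_ids, workers):
    """Split cpu_ids into `workers` contiguous, equally sized groups.

    Assumption: contiguous IDs share a CCX. That is not guaranteed,
    so verify it against your machine's real topology.
    """
    size, remainder = divmod(len(cpu_ids), workers)
    if remainder:
        raise ValueError("CPU count must divide evenly among workers")
    return [cpu_ids[i * size:(i + 1) * size] for i in range(workers)]


if __name__ == "__main__":
    cpus = list(range(16))  # assumed: 16 CPUs, IDs 0..15
    for workers, threads in candidate_configs(len(cpus)):
        groups = cpu_groups(cpus, workers)
        # Launch and time your real workload here for each configuration.
        # On Linux, os.sched_setaffinity(pid, group) is one way to pin a worker.
        print(f"{workers} workers x {threads} threads -> {groups}")
```

Time your real workload under each configuration and keep whichever wins. The 2-worker split is just the starting point I'd try first.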
