2016-09-14, 15:42   #72
Serpentine Vermin Jar
Madpoo
Jul 2014

110011011101₂ Posts
KNL throughput benchmark

Attached is the Knights Landing (KNL) benchmark run I'd asked for. The user in question was awesome to help us out with this.

I was happy to see that it scales really well. There's barely a blip in difference between one worker and 64 workers.

At the 2048K FFT size, a solo single-core worker does 42.34 ms/iter. With all 64 workers going, they still manage an average of somewhere around 44.5 ms/iter each, for an aggregate throughput of 1435.16 iter/sec.

Up at the higher end of the FFT sizes (I only requested 2M-5M to keep the data set to a dull roar... sorry, no FFTs sized for 332M+ exponents, but I could ask):

5120K FFT = 110.82 ms/iter for a single worker, and ~119.5 ms/iter with all 64 going. Total throughput at that size = 534.78 iter/sec.

Attached are the raw results. I was thinking they could be graphed or something to show how each additional running worker affects the throughput of the others, but from my quick glance it's such a gentle curve that it would hopefully (and thankfully) be a boring graph. Only a 5-7% slowdown from 1 worker to 64 workers... yeah, I'll take that any day.
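For anyone who wants to sanity-check the numbers above, here is a rough Python sketch that works backwards from the quoted aggregate iter/sec to the average ms/iter per worker, and from there to the per-worker throughput loss relative to a solo worker. It only uses the figures quoted in this post, not the attached file; the function names are made up for illustration, and it assumes the aggregate figure is simply the sum of 64 equally loaded workers.

```python
# Figures quoted in the post:
# (FFT size, solo ms/iter, aggregate iter/sec with all 64 workers running)
WORKERS = 64
QUOTED = [
    ("2048K", 42.34, 1435.16),
    ("5120K", 110.82, 534.78),
]


def per_worker_ms(aggregate_iter_per_sec, workers=WORKERS):
    """Average ms/iter per worker implied by an aggregate iter/sec figure.

    Assumes every worker contributes equally to the aggregate.
    """
    return 1000.0 * workers / aggregate_iter_per_sec


def throughput_loss_pct(solo_ms, loaded_ms):
    """Percent of per-worker throughput lost going from solo to fully loaded."""
    return 100.0 * (1.0 - solo_ms / loaded_ms)


if __name__ == "__main__":
    for size, solo_ms, aggregate in QUOTED:
        loaded_ms = per_worker_ms(aggregate)
        loss = throughput_loss_pct(solo_ms, loaded_ms)
        print(f"{size}: {loaded_ms:.2f} ms/iter per worker under load, "
              f"{loss:.1f}% per-worker throughput loss vs. solo")
```

Measured as lost per-worker throughput, the two quoted sizes land at roughly 5% and a bit over 7%. Measured as extra time per iteration instead, the larger size comes out closer to 8%, so which figure you quote depends on which way round you take the ratio.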
Attached Files
File Type: gz knl_benchmark.txt.gz (50.2 KB, 81 views)