#1
Mar 2017
11₁₆ Posts
Sadly, I have to bring this up again. I have a computer with Intel HD4000 graphics and want to run mfakto on it. I saw this thread, so I thought it was possible. I have the Intel SDK downloaded onto my computer, but I can't get it to work because the GPU isn't detected as supporting OpenCL. I know there have been some posts about the HD4000 being able to support OpenCL, but from what I see I need to do more things, so my questions are:
1. Where can I get the AMD Catalyst Driver for Intel HD Graphics? (As seen in this thread, post #4.)
2. Where can I get the special Intel HD Graphics 4000 version of mfakto? I clicked on this link, and it brings up a 404 error. (See post #12 of the above thread to see what I am talking about.)
Thanks!

#2
"Forget I exist"
Jul 2009
Dartmouth NS
8,461 Posts
Quote:

#3
Mar 2017
17 Posts
Quote:

#4
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
6160₈ Posts
At the top right of each post, there is a # link, e.g. #4 for this post. You can right-click and copy the link.
Last fiddled with by Mark Rose on 2017-08-25 at 02:21

#5
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1111010010000₂ Posts
Quote:
See http://www.mersenneforum.org/showthr...t=18632&page=7

I think someone earlier in that thread mentioned that he did his own build for an Nvidia card. I contacted the author of mfakto; there is apparently no further development going on.

It's also possible that, even if it works, the throughput on Intel iGPs might not be worth the memory bandwidth or power budget it costs, which could take away from the cpus' productivity in prime95, mprime, or other applications. I did an experiment on a modern laptop with GpuOwl, a new LL test application, and saw it run on the iGP, but the hit in prime95 throughput was about 4 times the estimated iGP throughput. The cause might have been memory bandwidth, since the iGP uses shared memory, or it might have been the package TDP limit throttling the cpu clock rates back. The observed cpu clock rate drop was sufficient to explain the drop in throughput. Either way, it projected _reduced_ total system throughput, to about 5/8, using the HD620 present in a laptop designed or built last year. I have not gone back to try it again to rule out "pilot error". See http://www.mersenneforum.org/showpos...&postcount=176 for more detail.

I have also tried to run GpuOwl on the iGP of an older system but was blocked by an unsupported old OS version there. GpuOwl seemed to do a good job of identifying and reporting which devices are present and OpenCL capable (although it reported two HD620s on a system with one), so it might be useful for testing your setup.

Again, good luck, and let us know how it goes.
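One way to read the two figures in the post above together, as a rough sketch only: assume the 5/8 is measured relative to the cpu-only prime95 throughput, and write that baseline as $C$ and the estimated iGP throughput as $g$ (both symbols are introduced here for illustration and do not appear in the post). A cpu loss of about $4g$ plus an iGP gain of $g$ then gives:

```latex
\begin{align*}
T_{\text{total}} &= C - 4g + g = C - 3g \\
C - 3g &= \tfrac{5}{8}\,C \quad\Longrightarrow\quad g = \tfrac{1}{8}\,C
\end{align*}
```

Under that assumption, the iGP contributed roughly one eighth of the cpu-only throughput while costing about half of it ($4g = C/2$).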

#6
Serpentine Vermin Jar
Jul 2014
2×13×131 Posts
It does occur to me that if HD4000 (and up) Intel GPUs could be harnessed a bit more for Prime95 factoring, that'd be cool. That'd be up to George, I guess, depending on whether he thought there was room in there for improving factoring.

Yeah, GPUs are still awesome at factoring, but then again not everyone has a dedicated GPU; many just have whatever came with their system and/or CPU. And factoring is still a very useful tool for reducing the LL work needed. If some Intel GPU could be used at the same time as the chip was also doing LL work (no idea), it could even be worth running it a bit in the background, moving the needle on how many bits make sense for TF before LL work begins. Shift that whole curve upwards.

The Intel GPUs are unlikely to ever come near what a dedicated GPU does, so I'm just thinking in sheer terms of how many are out there. Of course, running Intel GPU instructions for factoring and all the FP64 for LL on the same CPU could make you hit the TDP a lot faster and force the clock to throttle, even if memory bandwidth wasn't a problem.

#7
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴·3·163 Posts
Quote:
Even if memory contention or a TDP tradeoff would make an igp not useful alongside the cpu cores, there is a niche left for it. On the laptop where I'm composing this message, I have prime95 set to pause when certain other apps run, and throttled otherwise. There's a setting built into Prime95 for that, and another for throttling back to a lower duty cycle, to avoid thermal shutdowns. It might be that the igp is a low enough load that it could run during those times when it is not safe to run the cpus hard on prime95.

The general case is the most complex and demanding one that occurs to me, both for the hardware and for the software author: Prime95 (or an alternate application) running multiple cpu cores and at least one gpu asset, possibly multiple and dissimilar gpus, in parallel, probably on different task types and with different hardware throughputs and tradeoffs, trying to approach optimal overall throughput while maintaining high reliability.

Don't be too willing to write off the HD620; by some measures it's twice the speed of an HD4000. Passmark is not a great indicator of gpu computing performance, but it's a convenient one with numbers available for a huge number of devices. (Passmark for gpus may be single-precision heavy while gpu computing is double precision.) For comparison:

https://www.videocardbenchmark.net/high_end_gpus.html
GTX 1080 Ti      13357
GTX 1080         11993
Radeon RX Vega   11314
GTX 1070         11025
GTX 1060          8714
AMD Fury X        8350
GTX 1050 Ti       5750
GTX 480           4354
Quadro 4000       1999
Quadro 2000       1282

https://www.videocardbenchmark.net/mid_range_gpus.html
Intel HD 620       945

https://www.videocardbenchmark.net/m...ange_gpus.html
Intel HD4000       453
Intel HD3000       313

(The Quadro 4000 and GTX 1050 Ti are much closer in gpu computing performance than that almost 3:1 passmark ratio would imply; on cudalucas 2.06beta, for 43M double checks, about 3.8 vs. 3 days.)

NVIDIA         passmark   43M LL days   product (passmark × days)
GTX 1070          11025           1.5     16538
GTX 1050 Ti        5750           3       17250
GTX 480            4354           2.2      9579
Quadro 4000        1999           3.8      7596
Quadro 2000        1282           8.5     10897
average product 12372; max/min = 2.27

So a rough relative estimate of some igps' ability (big error bars; regard as order-of-magnitude at best) is:
HD620    12372 / 945 = 13+ days
HD4000   12372 / 453 = 27+ days
HD3000   12372 / 313 = 40- days

For reference, the GTX480 can bang out 149M exponents from 72 bits to 76 bits, three a day, or one from 70 to 71 bits in under 7 minutes. So even an igp with 3-10% of that speed would be helpful: a 149M exponent from 70 bits to 71 in roughly under 3.7 to 1.2 hours. It would take a lot of HDs to make a noticeable difference, and there are a lot of them out there and more coming.

George had looked into the HD4000 a bit some time ago, and ran into some obstacles (Intel OpenCL support on Linux, as I recall).

Last fiddled with by kriesel on 2017-08-27 at 15:58
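The estimate in the post above can be reproduced with a few lines of Python. This is only a sketch of the same back-of-envelope method: it assumes that Passmark score times LL days is roughly constant across cards, which the post itself flags as order-of-magnitude at best. The function and variable names are made up for illustration; the numbers are the ones quoted in the post.

```python
# Passmark score and observed days for a 43M LL double check, as quoted in the post.
nvidia = {
    "GTX 1070": (11025, 1.5),
    "GTX 1050 Ti": (5750, 3.0),
    "GTX 480": (4354, 2.2),
    "Quadro 4000": (1999, 3.8),
    "Quadro 2000": (1282, 8.5),
}

# Passmark scores of the Intel igps, as quoted in the post.
igp_passmark = {"HD620": 945, "HD4000": 453, "HD3000": 313}


def mean_product(cards):
    """Average of passmark * days over the reference cards."""
    products = [score * days for score, days in cards.values()]
    return sum(products) / len(products)


def estimate_days(passmark, avg_product):
    """Estimated days for a 43M LL test, assuming passmark * days is constant."""
    return avg_product / passmark


avg = mean_product(nvidia)
estimates = {name: estimate_days(score, avg) for name, score in igp_passmark.items()}

print(f"average product: {avg:.0f}")
for name, days in estimates.items():
    print(f"{name}: about {days:.1f} days")
```

The results match the post's figures of roughly 13, 27 and 40 days. The spread of the products (max/min of about 2.27) is a reminder of how loose the assumption is.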

#8
Mar 2017
17 Posts
Quote:
Last fiddled with by Yoshi24517 on 2017-08-28 at 04:41

#9
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
17220₈ Posts
Quote:
Here are some brief notes I took on the thread earlier, the subset possibly relating to the HD4000 or other Intel igps. The numbers at left are the post numbers in the mfakto thread http://mersenneforum.org/showthread.php?t=15646

585 Intel HD4000 attempt
647 looking grim for HD4000
664 notes on possible coding workaround for HD4000 issue
667 special HD4000 code available
674 HD4000 tune parameters
744-748, 750 HD4000 misses some factors, rate depending on tuning parameters; gridsize=0 required to miss none
1076-1107, 1113-1122 George attempting compile of mfakto for Intel OpenCL; compiler and debugging issues, code edits without using git himself
1124 Intel OpenCL driver download link https://software.intel.com/en-us/vcs...ols/opencl-sdk
1299 mfakto on HD4600; sieve on gpu disabled, marginal effect on prime95
(end)

Last fiddled with by kriesel on 2017-08-28 at 06:59

#10
Mar 2017
17 Posts
I guess I'll have to run clinfo and see if that works. If it does, then I don't know what the problem is.
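For anyone who wants to script that check, here is a minimal sketch. It assumes the `clinfo` utility is installed and on the PATH, and it only filters clinfo's text output for a keyword; the helper names `run_clinfo` and `lines_mentioning` are made up for illustration, and the exact wording of clinfo's output varies between versions.

```python
import shutil
import subprocess


def run_clinfo():
    """Return clinfo's text output, or None if clinfo is not on the PATH."""
    exe = shutil.which("clinfo")
    if exe is None:
        return None
    result = subprocess.run([exe], capture_output=True, text=True, check=False)
    return result.stdout


def lines_mentioning(text, keyword):
    """Return the lines of text that contain keyword, ignoring case."""
    needle = keyword.lower()
    return [line.strip() for line in text.splitlines() if needle in line.lower()]


if __name__ == "__main__":
    output = run_clinfo()
    if output is None:
        print("clinfo not found on PATH")
    else:
        hits = lines_mentioning(output, "Intel")
        print("\n".join(hits) if hits else "no line of clinfo output mentions Intel")
```

If no line mentions Intel, the OpenCL runtime probably does not see the HD4000 at all, which would point at the driver rather than at mfakto.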

#11
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴×3×163 Posts
Quote:
The HD620 will run mfakto 0.15pre6. As the HD4000 is expected to, it produces only 18 GHzd/day of results with prime95 also running, and 20 if prime95 is not running.