[QUOTE=ixfd64;529248]Anyone else having trouble with the [C]-d c[/C] and [C]-d g[/C] options?
On all my computers except a Windows laptop, [C]-d c[/C] gives this error:
[CODE]Error -43 (Invalid build options): clBuildProgram
ERROR: load_kernels(0) failed[/CODE]
I'm unable to get [C]-d g[/C] to work on any computer regardless of operating system. It gives me this error every time:
[CODE]Select device - Error: Only <number> platforms found. Cannot use platform 1000 (bad parameter to option -d).
ERROR: init_CL(3, 10000) failed[/CODE][/QUOTE]
I've not used letters in device selection in mfakto. From the readme file for mfakto 0.15pre5:
[CODE]#########
# 6 FAQ #
#########

Q Does mfakto support multiple GPUs?
A No, but using the commandline option "-d <GPU number>" you should be able
  to specify which GPU to use for each specific mfakto instance. Please read
  the next question, too.

Q Can I run multiple instances of mfakto on the same computer?
A Yes, and in most cases this is necessary to make full use of the GPU(s) if
  sieving with CPU. If the sieve is running on the GPU(default), one instance
  should fully utilize a single GPU.[/CODE]
First digit selects platform, second selects device on that platform. Mfakto help output I saved long ago:
[CODE]mfakto 0.15pre6-Win (64bit build)
mfakto (mfakto 0.15pre6-Win) Copyright (C) 2009-2014,
  Oliver Weihe (o.weihe@t-online.de)
  Bertram Franz (bertramf@gmx.net)
This program comes with ABSOLUTELY NO WARRANTY; for details see COPYING.
This is free software, and you are welcome to redistribute it
under certain conditions; see COPYING for details.

Usage: mfakto [options]
  -h|--help              display this help
  -d <xy>                specify to use OpenCL platform number x and
                         device number y in this program
  -d c                   force using all CPUs
  -d g                   force using the first GPU
  -v <n>                 verbosity level: 0=terse, 1=normal, 2=verbose, 3=debug
  -tf <exp> <min> <max>  trial factor M<exp> from 2^<min> to 2^<max>
                         instead of parsing the worktodo file
  -i|--inifile <file>    load <file> as inifile (default: mfakto.ini)
  -st                    selftest using the optimal kernel per testcase
  -st2                   selftest using all possible kernels

options for debugging purposes
  --timertest            test of timer functions
  --sleeptest            test of sleep functions
  --perftest [<n>]       performance tests, repeat each test <n> times (def: 10)
  --CLtest               test of some OpenCL functions
                         specify -d before --CLtest to test the specified device
[/CODE]
Perhaps -d c and -d g were still in development. The todo.txt lists [CODE]+DONE+ - -d g[/CODE] but does not include the string "-d c".
There's a recommendation in the readme to use clinfo to get the platform & device number list for OpenCL devices on a system.
Using the CPU for OpenCL trial factoring strikes me as a waste. It could be running PRP or P-1 or LL. GPUs are much faster at TF. |
[QUOTE=kriesel;529253]There's a recommendation in the readme to use clinfo to get the platform & device number list for OpenCL devices on a system.[/QUOTE]
This reminds me of another problem: if a number greater than 9 is passed to the [C]-d[/C] parameter, then only the last digit is used as the device number. For example, [C]-d 16[/C] will make mfakto look for device 6 on platform 1. Does this mean mfakto would be unable to fully utilize something like this 10-GPU monster? [url]https://exxactcorp.com/Exxact-TS4-264546-E264546[/url] Or does OpenCL group no more than nine GPUs to a platform? |
[QUOTE=ixfd64;529336]This reminds me of another problem: if a number greater than 9 is passed to the [C]-d[/C] parameter, then only the last digit is used as the device number. For example, [C]-d 16[/C] will make mfakto look for device 6 on platform 1.
Does this mean mfakto would be unable to fully utilize something like this 10-GPU monster? [URL]https://exxactcorp.com/Exxact-TS4-264546-E264546[/URL] Or does OpenCL group no more than nine GPUs to a platform?[/QUOTE]Correct, mfakto limits number of OpenCL usable devices per platform to 9, although there are other limits too. In mfakto.cpp:[CODE]*devnumber %= 10; // use only the last digit as device number, counting from 1[/CODE]mfaktc does apparently support larger device numbers:[CODE]devicenumber = strtol(argv[i+1],&ptr,10);[/CODE]
That system might be power limited first, at 2 kW total power supply output for the chassis. Some rigs use multiple supplies to get around that. There are mining rigs with up to at least 13 PCIe slots. From what I've read, they are limited to several GPUs each of AMD and NVIDIA, so they would run a blend of OpenCL and CUDA apps, such as mfaktO and mfaktC. As I recall, it's a driver limitation in each case, at fewer than 10 devices per driver.
But if you'd send me a loaded system, I'd be glad to test it for you, at length. :cool: |
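For anyone following along, here is a minimal sketch of the digit split described in the posts above. Only the [C]%= 10[/C] step is quoted from mfakto.cpp in this thread. Treating the remaining leading digits as the platform number is an assumption, although it is consistent with [C]-d 16[/C] selecting device 6 on platform 1 and with the "platform 1000" / [C]init_CL(3, 10000)[/C] error reported for [C]-d g[/C]. The names [C]DeviceSpec[/C] and [C]split_device_arg[/C] are made up for illustration and are not mfakto identifiers.

```cpp
// Sketch of the "-d <xy>" split discussed in this thread (not mfakto's actual code).
// Assumption: the platform is whatever remains after dropping the last digit.
struct DeviceSpec {
    int platform;  // OpenCL platform number (the "x" part)
    int device;    // device number on that platform (the "y" part), counting from 1
};

// Split a combined platform/device number into its two parts.
// Only the last decimal digit is kept as the device number.
constexpr DeviceSpec split_device_arg(int xy) {
    return DeviceSpec{xy / 10, xy % 10};
}
```

Under this assumption, [C]split_device_arg(16)[/C] yields platform 1, device 6, and trying to ask for a tenth device on platform 1 by writing 110 yields platform 11, device 0 instead, which is why a single decimal digit cannot address more than nine devices.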
[QUOTE=kriesel;529361]Correct, mfakto limits number of OpenCL usable devices per platform to 9, although there are other limits too.[/QUOTE]
I did some more research and found several sources saying that the AMD drivers don't support more than eight GPUs on Windows. It's not clear whether this is true for newer drivers or other operating systems, but there doesn't seem to be any such limit on Nvidia GPUs. Therefore, the nine-device limit appears to be specific to mfakto and not the OpenCL platform. On that note, I wonder if it would be a good idea to split the platform and device numbers into separate parameters as a way to future-proof mfakto. If so, does anyone know how to test software on a hypothetical machine with 10+ AMD GPUs? I don't have access to any high-end hardware with AMD GPUs and AWS doesn't offer any such instances either. Is there a way to create multiple virtual devices and map them to a single physical device? |
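To make the separate-parameter idea a bit more concrete, here is a rough sketch of what parsing could look like. The [C]<platform>:<device>[/C] syntax is purely hypothetical: mfakto does not accept it as far as this thread shows, and [C]PlatformDevice[/C] and [C]parse_platform_device[/C] are invented names. It needs C++14 or later and does not guard against integer overflow on absurdly long inputs.

```cpp
// Hypothetical parser for a "-d <platform>:<device>" argument (not an mfakto feature).
struct PlatformDevice {
    int platform;  // OpenCL platform number
    int device;    // device number on that platform, no single-digit restriction
    bool ok;       // false if the text was not of the form "<digits>:<digits>"
};

// Read a run of decimal digits starting at s[pos] and advance pos past them.
constexpr int parse_digits(const char* s, int& pos) {
    int value = 0;
    while (s[pos] >= '0' && s[pos] <= '9') {
        value = value * 10 + (s[pos] - '0');
        ++pos;
    }
    return value;
}

constexpr PlatformDevice parse_platform_device(const char* s) {
    int pos = 0;
    const int platform = parse_digits(s, pos);
    if (pos == 0 || s[pos] != ':') return PlatformDevice{0, 0, false};
    ++pos;  // skip the ':' separator
    const int device_start = pos;
    const int device = parse_digits(s, pos);
    if (pos == device_start || s[pos] != '\0') return PlatformDevice{0, 0, false};
    return PlatformDevice{platform, device, true};
}
```

With something like this, [C]1:12[/C] would mean device 12 on platform 1, while malformed input such as [C]16[/C] or [C]1:[/C] is rejected rather than silently reinterpreted.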
In [URL]https://devtalk.nvidia.com/default/topic/1004967/max-number-of-cuda-devices/[/URL] someone from NVIDIA states knowledge of people getting 10-16 NVIDIA GPUs going in one system, and refers to the system BIOS as a possible limitation. Also, dual-GPU cards count as two GPUs, as on a K80.
Here, 11 cards, 18 NVIDIA GPUs, on Linux of some sort: [URL]https://devtalk.nvidia.com/default/topic/649542/18-gpus-in-a-single-rig-and-it-works/[/URL]
[URL]https://community.amd.com/thread/158863[/URL] The GPU limit in Windows 7 is 32 GPUs per OS. Three are reserved for remote services and the primary desktop. Interestingly, it looks like no such hard limit is set in modern Linux systems. Instead, there is a kernel configuration option named "Maximum number of GPUs" under Linux, and we can set it to an arbitrarily large value freely. |
I managed to compile mfakto on Windows on my own and fixed a few issues along the way. I'll update the build instructions when I have time.
@kracker and preda: expect a pull request soon. :smile: |
[QUOTE=ixfd64;529664]I managed to compile mfakto on Windows on my own and fixed a few issues along the way. I'll update the build instructions when I have time.
@kracker and preda: expect a pull request soon. :smile:[/QUOTE] :party: Are you using MSVC or MSYS2? |
I used Visual Studio but did manage to compile mfakto using MinGW as well. I installed MSYS2 but ended up not using it after adding an OS check to the makefile.
For the record, the MinGW package bundled with Code::Blocks didn't work for me, but the official version from SourceForge did. |
The changes are now live in the official mfakto repository. Thanks to kracker for reviewing my pull request. :smile:
I want to update the rest of the documentation next. However, there are a few areas where I would like some input:
[QUOTE]This version is tested to provide correct results. But it is preliminary as it contains test code that results in slightly lower performance. This version is intended to provide information to better optimize the final version. To run this test and help improve mfakto, extract the depot and run on an idle machine
perftestmfakto.cmd
This test will take between one and two hours, during which you should not use the computer - at least nothing that would put measurable load on CPU or GPU.[/QUOTE]
There is no [c]perftestmfakto.cmd[/c] file in the source code. I'm also not aware of any performance throttling in the latest version. Was this message meant for a specific test build?
[QUOTE]- precompiled version is currently only available for 64-bit (built on SuSE 11.4)[/QUOTE]
I wasn't able to find a pre-compiled Linux binary at any mfakto mirror. Unless someone objects, I would like to remove this line.
[QUOTE]Advanced usage (extend the upper limit):
Since mfakto works best on long running jobs you may want to extend the upper TF limit of your assignments a little bit. Take a look how much TF is usually done here: [url]http://www.mersenne.org/various/math.php[/url]
Lets assume that you've received an assignment like this:
Factor=<some hex key>,78467119,65,66
This means that primenet server assigned you to TF M78467119 from 2^65 to 2^66. Take a look at the site noted above, those exponent should be TFed up to 2^71. Primenet will do this in multiple assignments (step by step) but since mfakto runs very fast on modern GPUs you might want to TF up to 2^71 or even 2^72 directly. Just replace the 66 at the end of the line with e.g. 72 before you start mfakto:
e.g. Factor=<some hex key>,78467119,65,72
When you increase the upper limit of your assignments it is important to report the results once you've finished up to the desired level.
(Do not report partially results before!) [/QUOTE]
Does anyone actually do this? I'm pretty sure GPU to 72 and the manual GPU assignments page have made this obsolete.
[QUOTE]- mfakto can find factors outside the given range. E.g. './mfakto.exe -tf 66362159 40 41' has a high change to report 124246422648815633 as a factor. Actually this is a factor of M66362159 but it's size is between 2^56 and 2^57! Of course './mfakto.exe -tf 66362159 56 57' will find this factor, too.[/QUOTE]
I tried to test these two ranges and got an error. From what I gather, mfakto doesn't yet support low limits like mfaktc does. Can anyone confirm?
Any feedback would be appreciated! |
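The factor claim in that last readme excerpt can be checked without mfakto at all, since q divides M_p = 2^p - 1 exactly when 2^p mod q equals 1. The following is a small independent sketch of that check, not anything taken from the mfakto or mfaktc source. The function names are made up, [C]unsigned __int128[/C] is a GCC/Clang extension used here to avoid overflow in the 64-bit products, and the code needs C++14 or later.

```cpp
#include <cstdint>

// Compute 2^exponent mod modulus by binary square-and-multiply.
// unsigned __int128 (GCC/Clang extension) holds the 64x64-bit intermediate products.
constexpr std::uint64_t pow2_mod(std::uint64_t exponent, std::uint64_t modulus) {
    unsigned __int128 result = 1 % modulus;
    unsigned __int128 base = 2 % modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return static_cast<std::uint64_t>(result);
}

// q divides the Mersenne number M_p = 2^p - 1 exactly when 2^p mod q == 1.
constexpr bool divides_mersenne(std::uint64_t p, std::uint64_t q) {
    return q > 1 && pow2_mod(p, q) == 1;
}

// Number of bits needed to write q; a value with bit length b lies in [2^(b-1), 2^b).
constexpr int bit_length(std::uint64_t q) {
    int bits = 0;
    while (q > 0) {
        ++bits;
        q >>= 1;
    }
    return bits;
}
```

If the readme is right, [C]divides_mersenne(66362159, 124246422648815633ULL)[/C] should come out true. [C]bit_length(124246422648815633ULL)[/C] is 57, which matches the stated size between 2^56 and 2^57.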
[QUOTE=ixfd64;529877]
Does anyone actually do this?[/QUOTE]Yes. For example, [URL]https://www.mersenne.org/report_exponent/?exp_lo=405000179&exp_hi=&full=1[/URL] was the result of Factor=405000179,76,81. I generally do a TF in either mfaktc or mfakto from the current bit level to gputo72 limit before a P-1 test on an exponent. It takes a lot of P-1 tests at widely spaced exponents to arrive at a sense of limits versus gpu model and application, such as for determining usable limits and run time scaling for CUDAPm1 in [URL]https://www.mersenneforum.org/showpost.php?p=498672&postcount=8[/URL] [URL]https://www.mersenneforum.org/showpost.php?p=498673&postcount=9[/URL] or the beginnings of the equivalent for GpuOwl in [URL]https://www.mersenneforum.org/showpost.php?p=525955&postcount=17[/URL] and even more for a systematic trial in every million bin as in [URL]https://www.mersenneforum.org/showpost.php?p=501182&postcount=7[/URL] |
Hi everybody!
I have a question about using mfakto (Linux) with Ubuntu 18.04. I started running mfakto about 5 years ago and had it comfortably running with the AMD Catalyst drivers. A couple of months ago, I upgraded to 18.04, and the Catalyst drivers were no longer supported, in favor of the AMDGPU drivers. I've tried a few times to get mfakto running since then, but the best I've done is to get perftest running, and that was on the CPU instead of the GPU. Has anyone successfully gotten this working, and if so, do you have any tips about what I need to do beyond installing the drivers? Thanks!
[CODE]
Ubuntu 18.04.3 LTS
4.15.0-66-generic
Radeon R9 290X
[/CODE] |