1 Attachment(s)
[QUOTE=kladner;395921]Norton Internet Security gave the same complaint. I believe that such flags are based on the application being unknown in the Norton Community database. There are no direct heuristics indicating malware aside from the file having very restricted distribution.[/QUOTE]
Huh, now this is really strange. MFAKTC gets the seal of approval from Norton 360's "File Insight" feature (see attachment), while N360 itself has never flagged it on my PC during a scan. So NPE and NIS dislike it, while N360 and File Insight are OK with it. Maybe these various Symantec applications are maintained by rival teams... :smile: Rodrigo |
[QUOTE=Rodrigo;395967]Huh, now this is really strange. MFAKTC gets the seal of approval from Norton 360's "File Insight" feature (see attachment), while N360 itself has never flagged it on my PC during a scan.
So NPE and NIS dislike it, while N360 and File Insight are OK with it. Maybe these various Symantec applications are maintained by rival teams... :smile: Rodrigo[/QUOTE] I only had problems with the 0.21 files. In the first instance, it gave dire warnings, but allowed me to authorize their use. The next time (I was testing 32bit vs 64bit), it horned in and quarantined the file. Through dogged insistence I managed to make NIS disgorge its prey. I played this game a few times before I beat down Norton's resistance. |
Problem processing worktodo.add
I'm working on adding support for worktodo.add to MISFIT and I came across a problem with mfaktc
During startup of mfaktc, if worktodo.txt has no rows the program exits instead of first pulling in rows from worktodo.add. So if worktodo.txt runs dry, it is impossible to get mfaktc restarted without first manually moving data out of the .add file. I think during startup mfaktc should check for the .add file and process it if it exists. Scott |
[QUOTE=swl551;396777]I'm working on adding support for worktodo.add to MISFIT and I came across a problem with mfaktc
During startup of mfaktc if WorkToDo.txt has no rows the program exits instead of proactively inbounding rows from WorkToDo.add So if workToDo.txt runs dry it is impossible to get it mfaktc restarted without first manually moving data out of the .Add file. I think during startup mfaktc should check for the .add file and process it if it exists. Scott[/QUOTE] Does the program not have an emergency dump-from-staging-file routine? |
[QUOTE=TheMawn;396778]Does the program not have an emergency dump-from-staging-file routine?[/QUOTE]
The Judger would have to answer, but it appears it does not read from the .add file in an "Emergency" |
Hi Scott (others as well)!
[LIST][*]Always add worktodo.add to worktodo.txt on startup? Yes, why not (read: good idea, I'll do this in the next release).[*]Add worktodo.add to worktodo.txt on [I]"emergency"[/I]? What is an [I]"emergency"[/I]? Processed everything from worktodo.txt? Well, I don't feel comfortable with "add worktodo.add to worktodo.txt" in that case; this would break the whole idea of worktodo.add. Imagine only one exponent left in worktodo.txt and StopAfterFactor=2 (mfaktc.ini): while you edit worktodo.add, a factor is found... Same as editing worktodo.txt, isn't it?[/LIST] Oliver |
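For anyone wanting to picture the "merge on startup" idea from the post above, here is a minimal sketch in plain C. It is not mfaktc's actual code: the helper name merge_add_file() is hypothetical, and it assumes the existing worktodo.txt already ends with a newline and that nothing else is editing either file while it runs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT mfaktc's real implementation: append every
 * line of add_path to txt_path, then delete add_path.
 * Returns 0 on success (also when no .add file exists), -1 on error. */
static int merge_add_file(const char *txt_path, const char *add_path)
{
    FILE *add = fopen(add_path, "r");
    if (add == NULL)
        return 0;                      /* nothing staged, nothing to do */

    FILE *txt = fopen(txt_path, "a");  /* "a" creates the file if missing */
    if (txt == NULL) {
        fclose(add);
        return -1;
    }

    char buf[512];
    int ok = 1;
    int ends_with_newline = 1;
    while (ok && fgets(buf, sizeof buf, add) != NULL) {
        size_t len = strlen(buf);
        ends_with_newline = (len > 0 && buf[len - 1] == '\n');
        ok = (fputs(buf, txt) >= 0);
    }
    /* Terminate a final staged line that lacked its newline. */
    if (ok && !ends_with_newline)
        ok = (fputc('\n', txt) != EOF);

    ok = !ferror(add) && ok;
    fclose(add);
    ok = (fclose(txt) == 0) && ok;

    /* Only drop the staging file once its content was appended. */
    if (!ok)
        return -1;
    return (remove(add_path) == 0) ? 0 : -1;
}
```

Calling merge_add_file("worktodo.txt", "worktodo.add") once before the work file is first read would cover the "worktodo.txt ran dry" case described earlier in the thread. It deliberately does nothing about the mid-run "emergency" case.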
[QUOTE=swl551;396788]The Judger would have to answer, but it appears it does not read from the .add file in an "Emergency"[/QUOTE]
No, I was talking about Misfit. Does it not dump whatever is in the staging file if worktodo.txt falls below a certain threshold? |
[QUOTE=TheMawn;396819]No, I was talking about Misfit. Does it not dump whatever is in the staging file if worktodo.txt falls below a certain threshold?[/QUOTE]
I am working to implement support for .add where misfit will not load directly into live work files. |
When MISFIT detects a stalled instance, would it be appropriate for it to try to determine if the worktodo.txt is empty, and if it is, automatically transfer work (possibly from worktodo.add) and automatically attempt to restart the instance?
This is assuming the control codes are correctly jiggered. EDIT: I make this suggestion assuming the lack of any better way for MISFIT to determine if that is the reason mfaktx died. EDIT: Again I don't know if this is the kind of functionality we want on the MISFIT level. Perhaps this is something better off being fixed at the mfaktx level (i.e. first thing when the program is run, merge worktodo.add with worktodo.txt) |
[QUOTE=TheMawn;396849]When MISFIT detects a stalled instance, would it be appropriate for it to try to determine if the worktodo.txt is empty, and if it is, automatically transfer work (possibly from worktodo.add) and automatically attempt to restart the instance?
This is assuming the control codes are correctly jiggered. EDIT: I make this suggestion assuming the lack of any better way for MISFIT to determine if that is the reason mfaktx died. EDIT: Again I don't know if this is the kind of functionality we want on the MISFIT level. Perhaps this is something better off being fixed at the mfaktx level (i.e. first thing when the program is run, merge worktodo.add with worktodo.txt)[/QUOTE] A primary goal of MISFIT is to never allow your installations to run out of work. It is possible to configure MISFIT and let everything run for months without human intervention (wait... install those Windows patches every month!). If work is running out, you have not configured MISFIT "correctly" -- use the work calculator. As for restarting stalled instances: the ONLY time I have had instances stall is due to overclocking, and upon restart I found that the clock speed is always a paltry 420 MHz. If MISFIT restarted the instance instead of raising an alarm, you could be running crippled and not know it. Also, if you have lots of stalls, you have something misconfigured with your card or a defective card. It is possible for MISFIT to do more than it does, but coding is a lot of work...... |
Fair enough. To be clear, I am having no issues at all. Haven't had to muck around with the GPUs in weeks. I just saw you mention you were working with MISFIT when you encountered the issue where the empty worktodo.txt would prevent mfaktx from ever running and I was just bouncing ideas around in case you were actively trying to add some functionality to MISFIT to deal with that situation.
It's perfectly fine that you don't. In fact in my (very limited) coding experience, I find the "niche" cases to not be worth dealing with. |
What is the advantage of mfaktc .21 over .20 please?
What is the advantage of 6.5 CUDA over 4.2? Using a GTX 460 running 7.0 (compute capability 2.1) and a 640 running 6.0 (compute capability 3.0). Thanks. |
[QUOTE=vsuite;399291]What is the advantage of mfaktc .21 over .20 please?[/QUOTE][code]version 0.21 (2015-02-17)
- added support for Wagstaff numbers: (2^p + 1)/3
- added support for "worktodo.add"
- enabled GPU sieving on CC 1.x GPUs
- dropped lower limit for exponents from 1,000,000 to 100,000
- rework selftest (-st and -st2), both now test ALL testcases, -st narrowed
  the searchspace (k_min < k_factor < k_max) to speedup the selftest.
- added random offset for selftest, this might detect bugs in sieve code
  which a static offset wouldn't find because we always test the same value.
- fixed a bug where mfaktc runs out of shared memory (GPU sieve), might be
  the cause for some reported (but never reproduced?) crashes.
  This occurs when you
  - have a GPU with relative small amount of shared memory
  - have a LOW value for GPUSievePrimes
  - have a BIG value for GPUSieveSize
- fixed a bug when GPUSieveProcessSize is set to 24 AND GPUSieveSize is not
  a multiple of 3 there was a relative small chance to ignore a factor.
- fixed a bug in SievePrimesAdjust causing SievePrimes where lowered to
  SievePrimesMin for very short running jobs
- added missing dependencies to Windows Makefiles
- (possible) speedups
  - funnel shift for CC 3.5 and above
  - slightly faster integer division for barrett_76,77,79 kernels
- lots of cleanups and removal of duplicate code
- print per-kernel-stats for selftest "-st" and "-st2"[/code] |
[QUOTE=James Heinrich;399293][code] - slightly faster integer division for barrett_76,77,79 kernels[/code][/QUOTE]
That works out to about 1.5% ± 0.5% from what I've seen. |
[QUOTE=James Heinrich;399293]- enabled GPU sieving on CC 1.x GPUs
- dropped lower limit for exponents from 1,000,000 to 100,000[/QUOTE]The major feature addition for me was GPU sieving below 2[sup]64[/sup], but that didn't make it to the changelog for some reason. |
Thanks much.
My XP Core 2 Quad reports the .21 win32 app as not a valid Win32 application. [Also the 5.5, 6.0 and 6.5 CudaLucas, but not the 5.0 or 4.2 CudaLucas.] It should not be filesize. |
Download again?
[url]http://download.mersenne.ca/mfaktc/mfaktc-0.21/mfaktc-0.21.win.cuda65.zip[/url] |
Thanks again.
|
mfaktc 0.21 selftest fails, CUDA 7.0 on Linux
After compiling 0.21 from source with Cuda toolkit 7.0, I consistently get this self-test failure (always the same number of tests passed/failed at the end). Any hints are appreciated, thanks.
[CODE]mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           30s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
  AllowSleep                no
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  7.0
  CUDA runtime version      7.0
  CUDA driver version       7.0

CUDA device info
  name                      GeForce GTX 980
  compute capability        5.2
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 16
  CUDA cores per MP         128
  CUDA cores - total        2048
  clock rate (CUDA cores)   1215MHz
  memory clock rate:        3505MHz
  memory bus width:         256 bit

Automatic parameters
  threads per grid          1048576
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

running a simple selftest...
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M49635893
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M51375383
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M47644171
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M51038681
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53076719
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M53123843
  no factor found
ERROR: selftest failed for M3321928703
  no factor found
ERROR: selftest failed for M3321928703
  no factor found
ERROR: selftest failed for M3321928703
  no factor found
ERROR: selftest failed for M3321928703
  no factor found
ERROR: selftest failed for M3321931973
  no factor found
ERROR: selftest failed for M3321931973
  no factor found
ERROR: selftest failed for M3321928619
  no factor found
ERROR: selftest failed for M3321928619
  no factor found

Selftest statistics
  number of tests           107
  successfull tests         51
  no factor found           56

selftest FAILED!
  random selftest offset was: 12734519[/CODE] |
It looks like you may be building with code for another card. You need to make sure the makefile has the proper "arch=" and "code=" lines.
[STRIKE]What video card are you using? This will determine what you should have there.[/STRIKE] Duh, it was in your post. Make sure you have
--generate-code arch=compute_50,code=sm_50
You could also use compute_52,code=sm_52, as that will work with a 980. |
Hi (and sorry for late reply),
[QUOTE=preda;399753]After compiling 0.21 from source with Cuda toolkit 7.0, I consistently get this self-test failure (always the same number of tests passed/failed at the end). Any hints are appreciated, thanks. [CODE]mfaktc v0.21 (64bit built) [...] GPU Sieving enabled [...] [/CODE][/QUOTE] to debug this issue can you disable GPU sieving (mfaktc.ini) and rerun? Which CUDA version? Which driver version? Any chance to test CUDA toolkit 6.5? Oliver |
[QUOTE=TheJudger;400251]to debug this issue can you disable GPU sieving (mfaktc.ini) and rerun?
Which CUDA version? Which driver version? Any chance to test CUDA toolkit 6.5? Oliver[/QUOTE] I get the same error here (mfaktc 0.20 and 0.21/CUDA toolkit 7.0/Driver versions 346.47, 346.59 and 349.16/Debian 8 and CentOS 7) on a Maxwell card (GTX 970/compute_52/sm_52) but [B]not[/B] on a Kepler card (GTX 650/compute_30/sm_30). Disabling GPU sieving doesn't help. - A compiled binary of mfaktc 0.20 (CUDA toolkit 6.5/Debian 7) worked without problems on the GTX 970 (compute_52/sm_52). - Compiled binaries of mfaktc 0.20 and 0.21 (CUDA toolkit 6.5/CentOS 7) work without problems on the GTX 970 (compute_52/sm_52). - The binary from mersenne.ca (downloading from [URL="http://www.mersenneforum.org"]www.mersenneforum.org[/URL] is blocked) works without problems. |
Hi Ralf,
can you run the problematic binary with "-st" (just a few seconds) and tell me whether it fails for specific kernels or does it fail all/"random"? Oliver |
[QUOTE=TheJudger;402368]Hi Ralf,
can you run the problematic binary with "-st" (just a few seconds) and tell me whether it fails for specific kernels or does it fail all/"random"? Oliver[/QUOTE] Here is the partial output from a -st run of mfaktc 0.21 on a GTX 970 (Debian 8.0/CUDA Toolkit 7.0).
[CODE]
Selftest statistics
  number of tests           26192
  successfull tests         15238
  no factor found           10954

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |       0 |      0
  71bit_mul24        |    2586 |      0
  75bit_mul32        |    1021 |   1661
  95bit_mul32        |    1024 |   1843
  barrett76_mul32    |    1096 |      0
  barrett77_mul32    |    1114 |      0
  barrett79_mul32    |       0 |   1153
  barrett87_mul32    |    1066 |      0
  barrett88_mul32    |    1069 |      0
  barrett92_mul32    |       0 |   1084
  75bit_mul32_gs     |     997 |   1423
  95bit_mul32_gs     |     999 |   1598
  barrett76_mul32_gs |    1079 |      0
  barrett77_mul32_gs |    1096 |      0
  barrett79_mul32_gs |       0 |   1130
  barrett87_mul32_gs |    1044 |      0
  barrett88_mul32_gs |    1047 |      0
  barrett92_mul32_gs |       0 |   1062

selftest FAILED!
  random selftest offset was: 9507477
[/CODE]Additional Makefile target for compute_52/sm_52
[CODE]
Selftest statistics
  number of tests           26192
  successfull tests         15238
  no factor found           10954

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |       0 |      0
  71bit_mul24        |    2586 |      0
  75bit_mul32        |    1021 |   1661
  95bit_mul32        |    1024 |   1843
  barrett76_mul32    |    1096 |      0
  barrett77_mul32    |    1114 |      0
  barrett79_mul32    |       0 |   1153
  barrett87_mul32    |    1066 |      0
  barrett88_mul32    |    1069 |      0
  barrett92_mul32    |       0 |   1084
  75bit_mul32_gs     |     997 |   1423
  95bit_mul32_gs     |     999 |   1598
  barrett76_mul32_gs |    1079 |      0
  barrett77_mul32_gs |    1096 |      0
  barrett79_mul32_gs |       0 |   1130
  barrett87_mul32_gs |    1044 |      0
  barrett88_mul32_gs |    1047 |      0
  barrett92_mul32_gs |       0 |   1062

selftest FAILED!
  random selftest offset was: 5153388
[/CODE] |
Hi Ralf,
while you wrote this I was able to reproduce it on my system, too (GTX 980, CUDA 7.0, mfaktc 0.22-pre2). I see [B]exactly[/B] the same numbers of failed and passed selftests (except for 71bit_mul24, which is obvious because this kernel is removed in 0.22), so at least the issue is easy to reproduce (and static). -- edit -- [CODE]#define DEBUG_GPU_MATH[/CODE] doesn't show anything... [I]"interesting"[/I] -- edit -- Oliver |
Hi,
did some tests with the barrett79 kernel... seems that inside the main loop the differences are, mostly integer here. Comparing PTX output of CUDA 6.5 vs. 7.0 isn't fun at all... Oliver |
Testcase: M332195503 from 2[SUP]64[/SUP] to 2[SUP]65[/SUP] (there is a known factor in this range: 23099992436515618207), hacked the code to force barrett79 usage.
Left side: CUDA 6.5, right side: CUDA 7.0
[CODE]Mooh                                                 Mooh
u = 0xCC6E77DC 0718B873 DABCC754                     u = 0xCC6E77DC 0718B873 DABCC754
main loop start                                      main loop start
tmp96 = 0x00000000 4B0C159B 8DA668D7                 tmp96 = 0x00000000 4B0C159B 8DA668D7
a = 0x00000000 4B0C159B 8DA668D7                   | a = 0x00000000 4B0C159[B][COLOR="Red"]A[/COLOR][/B] 8DA668D7
b = 0x00000000 1600153B 2D67ACC9 F13EF602 F7C36491 | b = 0x00000000 1600153A 974F8193 D5F22454 F7C36491
[/CODE] [B][COLOR="Red"]WTF?[/COLOR][/B] Srsly? Reminds me of [URL="http://mersenneforum.org/showthread.php?p=306728&highlight=carry#post306728"]this[/URL]... Oliver |
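As background for why a "no factor found" on a testcase with a known factor is a hard failure: q divides the Mersenne number 2^p - 1 exactly when 2^p mod q equals 1, so any kernel that gets the modular arithmetic right must report the known factor. The sketch below shows that check in plain C. It is only an illustration: the function name is made up, and it is restricted to q below 2^32 so that every product fits in 64 bits, whereas the mfaktc kernels discussed here work with much wider multi-word values.

```c
#include <stdint.h>

/* Illustrative only (hypothetical name, not taken from mfaktc):
 * returns 1 if q divides 2^p - 1, else 0.
 * Requires 0 < q < 2^32 so that intermediate products fit in 64 bits. */
static int divides_mersenne(uint32_t p, uint32_t q)
{
    uint64_t result = 1 % q;   /* running value of 2^(bits seen so far) mod q */
    uint64_t base = 2 % q;     /* 2^(2^i) mod q */

    /* Right-to-left binary exponentiation. */
    while (p > 0) {
        if (p & 1)
            result = result * base % q;
        base = base * base % q;
        p >>= 1;
    }
    return result == 1 % q;
}
```

For example, 2^11 - 1 = 2047 = 23 * 89, so divides_mersenne(11, 23) and divides_mersenne(11, 89) both return 1, while divides_mersenne(29, 7) returns 0.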
OK, indeed a bug with CUDA 7.0 (and/or drivers).
My latest development version runs a small check for this.
CUDA 6.5 + 346.72:
[CODE]./mfaktc.exe -v 2
mfaktc v0.22-pre3 (64bit built)
[...]
CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       7.0
[...]
check_subcc_bug()
  input:  mystuff->h_RES[2..0] = 0x33333333 22222222 11111111
  output: mystuff->h_RES[5..3] = 0x33333333 22222222 11111111
  passed, output == input
[...][/CODE]
CUDA 7.0 + 346.72:
[CODE]./mfaktc.exe
mfaktc v0.22-pre3 (64bit built)
[...]
CUDA version info
  binary compiled for CUDA  7.0
  CUDA runtime version      7.0
  CUDA driver version       7.0
[...]
check_subcc_bug()
  input:  mystuff->h_RES[2..0] = 0x33333333 22222222 11111111
  output: mystuff->h_RES[5..3] = 0x33333333 22222221 11111111
  ERROR: output != input
  could be caused by bad software environment (CUDA toolkit and/or graphics driver)
  Known bad:
  - CUDA 5.0.7RC + 302.06.03 with all supported GPUs
    fixed by driver update after reported this issue to nvidia
  - CUDA 7.0 + 346.47, 346.59, 346.72 and 349.16 346.72 with Maxwell GPUs
[...]
[/CODE]
[I]check_subcc_bug()[/I] is silent unless[LIST][*]verbosity is greater or equal 2[*]sub.cc bug detected[/LIST] Oliver |
So, Oliver, what do we do? What should we avoid? Who should avoid what?
Do we wait for you to fix it? |
[QUOTE=firejuggler;402552]So, Oliver, what do we do? What should we avoid? Who should avoid what?[/QUOTE]
So it affects CUDA 7.0 + Maxwell class GPUs, just don't use this combination (if you try to do so the builtin selftest will deny productive usage). Right now I see no benefit of CUDA 7.0 over 6.5. [QUOTE=firejuggler;402552]Do we wait for you to fix it?[/QUOTE] Unless nvidia employs me and I learn how to write graphics drivers -> no (nvidia needs to fix!) Oliver |
so, avoid 960/970/980/titan and 7.0 cuda drivers, got it.
|
[QUOTE=firejuggler;402575]so, avoid 960/970/980/titan and 7.0 cuda drivers, got it.[/QUOTE]
CUDA 7.0 capable drivers seem to be OK, mfaktc binary compiled with [B]CUDA Toolkit 7.0[/B] triggers the bug (inside the driver?) At least for Titan X it will be impossible(?) to find a driver not capable of CUDA 7.0. To make a long story short: use CUDA 6.5 binaries and you don't have to care about this bug. Oliver |
[QUOTE=firejuggler;402575]so, avoid 960/970/980/titan [COLOR=Red]X[/COLOR] and 7.0 cuda drivers, got it.[/QUOTE]
FTFY. The plain titans and blacks are ok (in fact, my titan is faster with 5.5, no reason to use 7.0) [edit: just to clarify, titan Z should also be ok, as it is in fact two plain titans fused together] |
GTX 960/970/980 and Titan X are OK, too (indeed they are very nice cards!).
Just don't use CUDA Toolkit 7.0 when compiling mfaktc. Oliver |
subcc bug in cuda 7.0
This is a bug confirmed by Nvidia in CUDA Toolkit 7.0, they have a fix that will be released in the next toolkit release after 7.0.
The bug concerns sub with carry -- the carry being sometimes set when it shouldn't. I suspect it may be wrong assembler generated by ptxas (as the PTX is correct). The workaround is to compile with CUDA toolkit 6.5. |
Where/when did they confirm? My bug report's last action was just
[QUOTE]Status changed from "Open - pending review" to "Open - in progress"[/QUOTE] I'm not even sure whether this is a bug in the CUDA toolkit or the driver itself, last time they fixed it with an updated driver. Oliver |
subcc bug in cuda 7.0
My bug report to them is here: [url]https://developer.nvidia.com/nvbugs/cuda/edit/1642061[/url] but I suppose only I can view that bug (unfortunately).
Here's a brief on that bug report:
The code below generates wrong behavior for the multiword subtraction with borrow sub() routine, where a spurious borrow takes place. Basically: {0, 0, 0, 0, 1, 0} - {0, 0, 0, 0, 0, 0} returns {0, 0xffffffff, 0xffffffff, 0xffffffff, 0, 0}.
Here is the source for reproduction: [url]http://pastebin.com/Ab7d8YAh[/url]
compile with: nvcc -std=c++11 --gpu-architecture sm_50 bug.cu
I ran it on GTX 980.
On that bug this is the last update from Nvidia: "This issue has been fixed in our development versions, and the fix would be available for you in the next CUDA release that following of 7.0. Thanks again for the reporting!" |
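To make the expected behaviour concrete, here is a small host-side reference for multiword subtraction with borrow, written in plain C. It is not the pastebin reproducer and not mfaktc code; the function name sub_words() and the least-significant-word-first layout are assumptions made for this sketch. On the GPU the same chain is normally expressed with the PTX sub.cc/subc instructions, which is where the spurious borrow shows up.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference sketch (hypothetical name): res = a - b over n 32-bit words,
 * least significant word first. Returns the final borrow: 1 if a < b,
 * otherwise 0. */
static uint32_t sub_words(uint32_t *res, const uint32_t *a,
                          const uint32_t *b, size_t n)
{
    uint32_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        /* Do the word subtraction in 64 bits; if it wraps below zero,
         * the top bit of d is set, and that is the borrow for word i+1. */
        uint64_t d = (uint64_t)a[i] - b[i] - borrow;
        res[i] = (uint32_t)d;
        borrow = (uint32_t)(d >> 63);
    }
    return borrow;
}
```

With a = {0, 0, 0, 0, 1, 0} and b all zeros, a correct implementation returns a unchanged with a final borrow of 0. The output quoted in the bug report is what you get if a borrow is wrongly raised at word 1 and then propagates up until it is absorbed by the 1 in word 4.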
There is usually some display lag when mfaktc is running. However, my computer is running a lot smoother now even though I did not change any mfaktc settings. Could this be related to the latest driver update? Anyone else seeing the same thing?
|
[QUOTE=ixfd64;402917]There is usually some display lag when mfaktc is running. However, my computer is running a lot smoother now even though I did not change any mfaktc settings. Could this be related to the latest driver update? Anyone else seeing the same thing?[/QUOTE]
Lag still comes and goes for me, with driver 350.12. If I want the screen to be responsive enough for playing video or editing images, I restart the 580 that runs the display with something like 'GPUSieveSize=40'. |
subcc bug persists in CUDA 7.5 RC
CUDA 7.5 RC is available for registered developers.
Using [B]cuda_7.5.7_rc_linux.run[/B] (includes driver 352.07 and nvcc 7.5.6) the problem is [B][U]NOT fixed[/U][/B]. So for now:[LIST][*]don't use CUDA toolkit 7.0 for [B][U]compiling[/U][/B] mfaktc[*]don't use CUDA toolkit 7.5 RC for [B][U]compiling[/U][/B] mfaktc[/LIST] I repeat myself: using a CUDA 7 capable [B]driver[/B] is okay for mfaktc! Oliver |
Hi everyone,
just a little update: It is not only "the subcc" bug, there are at least two issues with CUDA 7.0 and 7.5RC in regard to mfaktc. Nvidia told me that the subcc stuff is fixed in their internal build and will be included in the final CUDA 7.5 package. Nvidia told me that they have fixed the other issue(s) in their internal build, too. Unfortunately this fix won't be included in CUDA 7.5... They'll fix this in a "feature release". So my previous [URL="http://www.mersenneforum.org/showpost.php?p=405539&postcount=2562"]post[/URL] is still valid and there is a high chance that CUDA 7.5 (final) needs to be added to the list. :sad: The good news is that there is (right now) no need for CUDA 7.x in mfaktc:[LIST][*]no new GPUs supported by CUDA 7.x.[*]I didn't notice any stuff which might increase the performance of mfaktc (such as the "funnel shift" in CUDA 5.0)[/LIST] Oliver |
Hi, can someone tell me why CUDA 6.5 mfaktc doesn't launch on Windows XP 32bit? It says it's an invalid Win32 application. I tried to redownload it but it still doesn't launch, the CUDA 4.2 version launches normally.
Specs: AMD Athlon 64 3200+ Venice OCed to 2.3GHz, 1.5GB RAM, GeForce GT 630 |
Missing libraries? Do you have cudart 6.5 dlls?
Try opening a command prompt and launch the program from there, which may allow you to see the real error (before the window disappears). |
[url]http://imgur.com/xYNZCbU[/url] This is what happens when I open it.
[url]http://imgur.com/UD0mELG[/url] When I close it it says access denied. |
[QUOTE=LaurV;407431]Missing libraries? Do you have cudart 6.5 dlls?
Try opening a command prompt and launch the program from there, which may allow you to see the real error (before the window disappears).[/QUOTE] When I open it from the command prompt it says that it's not a valid win32 application, when I hit OK it says in the command prompt that access is denied. |
[QUOTE=TheDomis;408101]When I open it from the command prompt it says that it's not a valid win32 application, when I hit OK it says in the command prompt that access is denied.[/QUOTE]
Perhaps need to run cmd.exe as administrator? |
OK I figured it out, apparently Win XP is unable to run VS2012 applications, it has to be compiled with VS2010 or .NET 4.0 I think. .NET 4.5 is unsupported in Win XP.
|
1 Attachment(s)
If anyone has an old rig with WinXP but cannot run the latest version of mfaktc with CUDA 6.5, I have just compiled mfaktc-0.21 x86 CUDA 6.5 with VS2010 instead of VS2012, so it will run on WinXP. I couldn't compile a x64 version because this PC is running 32-bit Windows. Enjoy.
|
You can also create applications for Windows XP using Visual Studio 2013. Check this [URL="https://www.visualstudio.com/en-us/products/visual-studio-2013-compatibility-vs.aspx"]compatibility list for VS 2013[/URL].
|
Yes, but you can't create or launch .NET 4.5 applications in Windows XP which I believe was the culprit.
|
[QUOTE=TheDomis;408103]OK I figured it out, apparently Win XP is unable to run VS2012 applications, it has to be compiled with VS2010 or .NET 4.0 I think. .NET 4.5 is unsupported in Win XP.[/QUOTE]
Thanks for solving it. I was getting that problem with WINXP and several of the win32 downloads (mfaktc and possibly also cudalucas). |
[QUOTE=TheJudger;402624]GTX 960/970/980 and Titan X are OK, too (indeed they are very nice cards!).
Just don't use CUDA Toolkit 7.0 when compiling mfaktc. Oliver[/QUOTE] I installed a Titan Z and compiled a CUDA 7.5 version. Will the built-in self test check if everything is working now? |
Hi Jerry,
[QUOTE=flashjh;410558]I installed a Titan Z and compiled a CUDA 7.5 version. Will the built-in self test check if everything is working now?[/QUOTE] please stay with CUDA [B][U]6.5[/U][/B]. CUDA 7.0 and CUDA 7.5 are broken for mfaktc (nvidia confirmed that it is a CUDA toolkit fault and will be fixed in future versions. Actually they fixed the subcc bug again in CUDA 7.5, but there are additional issues). On the other hand your Titan Z is a Kepler chip; the issues mentioned above affect "only" the Maxwell generation. The good news is that even the "simple selftest" executed each time on startup fails on Maxwell with CUDA 7.0/7.5. Oliver |
Thanks. The 7.5 version is faster, I just didn't want to use it with the Z without double checking with you. I'm going to build a few others and do some testing.
Also, though CUDA 7.5 (possibly lower) supports MSVS 2013, the programs do not run when compiled. 2012 does work. Any thoughts? |
Hi Jerry,
for all builds I recommend running the long selftest (mfaktc.exe -st) on at least the target architecture. If it passes the long selftest on your Kepler card(s) it should be OK. But always keep in mind that once you upgrade to a Maxwell card the binary will fail. I haven't spent much time on testing mfaktc with CUDA 7.0/7.5 because of the Maxwell issues. Do you have more details than "doesn't work" in regard to MSVS 2013? Oliver |
[QUOTE=TheJudger;410672]Do you have more details than "doesn't work" in regard to MSVS 2013?
Oliver[/QUOTE] Sorry Oliver, I knew better than that! Here is the output of a self test. This happens after building with MSVS 2013 on the 580 and the Titan Z:
[CODE]########## testcase 1/2867 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59 GHz-days)
Using GPU kernel "75bit_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Sep 17 18:40 |  3387  0.1% |  0.001    n.a. |      n.a.    82485    n.a.%
ERROR: cudaGetLastError() returned 8: invalid device function[/CODE]
If I try to just run an exponent instead of the self test, it just errors:
[CODE]running a simple selftest...
ERROR: cudaGetLastError() returned 8: invalid device function[/CODE]
I had the same problem with 2013 on CUDA 6.5, but I never did much with it since 2012 works. I don't want anyone to spend time fixing it (at least for now), just didn't know if you had seen anything similar. Jerry |
Hi Jerry,
one possibility to trigger "invalid device function" is the wrong compute capability. But I guess you already know this and did a proper build with e.g. CC 1.1, 2.0, 3.0, 3.5 and 5.0. Oliver |
Yes, used the same makefile for 2013 and 2012. CUDA 7.5 does not support 1.x anymore, not that it matters here.
|
Hi Jerry,
[LIST][*]do you have "--ptxas-options=-v" set in Makefile? If so any differences in output (MSVC 2012 vs. 2013) during compile?[*]can you try to compile e.g. only for CC 2.0 and test it on your GTX 580 (or only CC 3.5 for your TITAN Z)?[/LIST] Oliver |
Just now, when I submitted some TF results, all the entries had the red line seen below-
[QUOTE]processing: TF no-factor for [URL="http://www.mersenne.org/M75535921"]M75535921[/URL] (2[sup]74[/sup]-2[sup]75[/sup])
[COLOR=Red][B]Notice: Undefined index: log_anyway in C:\inetpub\www\manual_result\manual_result.inc.php on line 120 [/B][/COLOR]
CPU credit is 50.6520 GHz-days.[/QUOTE]Does this indicate something misconfigured? Is it me or the server? There is no "\www" directory in inetpub on my machine, only "\wwwroot". |
My bad, should be fixed now. The problem should not affect any of the results you submitted.
|
[QUOTE=kladner;429137]Is it me or the server? [/QUOTE]
It's the server. Think about it -- it will be a security nightmare if the server could just reach into your hard disk just like that! |
[QUOTE=Prime95;429142]My bad, should be fixed now. The problem should not affect any of the results you submitted.[/QUOTE]
The credit and completion came through, no problem. [U] axn[/U]- I figured that was the case, but it seemed tactless to state that as fact. |
I was factoring an exponent from 73 to 74 bits when the computer completely froze, requiring a hard reboot. After starting up mfaktc again, I noticed that the factoring had started over, suggesting that the save file had been corrupted. I thought about uploading a copy of it here in case someone wanted to investigate the issue, but the file had already been overwritten. Just putting this out there.
On the subject of which, maybe it would be a good idea to give mfaktc the ability to create backup save files like Prime95 does? |
Hi,
[QUOTE=ixfd64;431080]I was factoring an exponent from 73 to 74 bits when the computer completely froze, requiring a hard reboot. After starting up mfaktc again, I noticed that the factoring had started over, suggesting that the save file had been corrupted. I thought about uploading a copy of it here in case someone wanted to investigate the issue, but the file had already been overwritten. Just putting this out there. On the subject of which, maybe it would be a good idea to give mfaktc the ability to create backup save files like Prime95 does?[/QUOTE] Short answer: No! Longer answer: No! The checkpoints are most likely an atomic write to the filesystem (actually fopen(), a single fprintf() (less than 512 bytes) and a fclose()). Because the fprintf() is atomic it is very unlikely that this will yield a corrupted checkpoint. It could be an empty checkpoint file but that isn't very likely, either. If such things happen I would fix the computer before doing anything useful. Maybe there was just no checkpoint because prior to the system lockup there wasn't much work done on that step? Oliver |
[QUOTE=TheJudger;431306]The checkpoints are most likely a atomic write to the filesystem (actually fopen(), a single fprintf() (less than 512 bytes) and a fclose()). Because the fprintf() is atomic it is very unlikely that this will yield a corrupted checkpoint. It could be an empty checkpoint file but that isn't very likely, too.[/QUOTE]
That doesn't sound atomic. If something goes wrong between fopen and fprintf, or more likely, if the OS hasn't actually propagated the write from memory to disk even after fopen->fprintf->fclose has completed, you'll end up with an empty checkpoint. It is rare, but can happen, even if everything is working exactly as expected. Hence the advantage of multiple checkpoint files -- even if one fails, the other one(s) will be there, and loss of work can be minimized. |
[QUOTE=TheJudger;431306]atomic write to the filesystem (actually fopen(), a single fprintf() (less than 512 bytes) and a fclose())[/QUOTE][QUOTE=axn;431312]That doesn't sound atomic.[/QUOTE]A relatively minor change of writing to [i]Mxxxxx.tmp[/i] and then renaming over [i]Mxxxxx.ckp[/i] after the write is complete would atomize it. You'll either have the new checkpoint, or in worst case if a crash happens during the checkpoint-write process you'll have the previous checkpoint and a temp file (which may or may not be correctly written).
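The temp-file-and-rename idea can be sketched in C roughly as follows. This is only an illustration, not mfaktc code: the function name checkpoint_write_atomic and the ".tmp" suffix handling are assumptions made for the example, and the flush-to-disk step is only described in a comment because it is platform specific.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of mfaktc): write `line` to "<final_name>.tmp"
 * and then rename the temp file over `final_name`.
 * Returns 0 on success, -1 on any failure. */
int checkpoint_write_atomic(const char *final_name, const char *line)
{
    char tmp_name[256];
    FILE *f;
    int n;

    assert(final_name != NULL && line != NULL);

    n = snprintf(tmp_name, sizeof tmp_name, "%s.tmp", final_name);
    if (n < 0 || (size_t)n >= sizeof tmp_name)
        return -1;                      /* file name too long */

    f = fopen(tmp_name, "w");
    if (f == NULL)
        return -1;

    /* fflush() only hands the data to the OS. A real implementation would
     * also ask the OS to commit it to disk here (fsync on POSIX,
     * _commit or FlushFileBuffers on Windows). */
    if (fputs(line, f) == EOF || fflush(f) == EOF) {
        fclose(f);
        remove(tmp_name);
        return -1;
    }
    if (fclose(f) == EOF) {
        remove(tmp_name);
        return -1;
    }

    /* POSIX rename() replaces an existing target atomically. The Windows C
     * runtime rename() fails if the target exists, so a Windows build would
     * need something like MoveFileEx with MOVEFILE_REPLACE_EXISTING. */
    if (rename(tmp_name, final_name) != 0) {
        remove(tmp_name);
        return -1;
    }
    return 0;
}
```

With this shape, a crash before the rename leaves the previous checkpoint untouched, and a crash after it leaves the new one; the temp file is the only thing that can be half-written.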
|
This is not rocket science, let's keep it simple. One [B]could[/B] argue that if you can't write a simple checkpoint reliably on your machine you won't trust the main calculation, either.
Oliver |
[QUOTE=TheJudger;431306]Maybe there was just no checkpoint because prior the system lockup there wasn't much work done on that step?[/QUOTE]
When I started up mfaktc before the crash, the assignment was already about 37% done, so I'm pretty sure there was a checkpoint. On the subject of which, is there any way to tell mfaktc to start at a certain class other than by hacking the save file? |
[QUOTE=ixfd64;431328]On the subject of which, is there any way to tell mfaktc to start at a certain class other than by hacking the save file?[/QUOTE]
Except hacking the code? No (for obvious reasons). Oliver |
Windows 7 had a nice feature called “Previous Versions” (Windows 8 and later replaced it with something called “File History”, which is not as good).
You just right click on a file and you can see or restore a previous version. This functionality is also usually available with NAS configured to make automatic snapshots. Or, at the very least, checkpoint files can be restored from daily backups, manually. Last time I checked, mfaktc was open source. The encryption routine is just a few lines long. Modifying checkpoint files is useful when one wants to split a bit level among several GPUs. |
Hi,
[QUOTE=TObject;431343]Last time I checked, mfaktc was open source. The encryption routine is just a few lines long. Modifying checkpoint files is useful when one wants to split a bit level among several GPUs.[/QUOTE] it is [B]not encryption[/B], it is just a [B]checksum[/B]. Is there any good reason why you would split a single assignment through multiple GPUs on a regular basis? I'm afraid this discussion leads to a "howto forge false results" even if not intended by you. Oliver |
We are with Oliver here.
We used mfaktc for years and never had problems with checkpoint files. [edit: we do checkpoint every 30 minutes, or so] Also, if really needed, for assignment that would take ages, splitting one expo over many cards is no problem, one simple pari or perl script can create the checkpoint file to start with some predetermined class. [edit: you still have to watch them to know when to stop each of them, except the last who stops by itself after the last class] |
0.21 for a Mac
Has anyone compiled mfaktc 0.21 for a Mac?
|
I compiled mfaktc using CUDA 8.0 and compute 6.1 for the Pascal 1080 Founders Edition, but it looks like the bugs in 7.0 and 7.5 are still present. 43/107 self tests failed.
|
[QUOTE=airsquirrels;435955]I compiled mfaktc using CUDA 8.0 and computer 6.1 for the pascal 1080 Founders Edition, but it looks like the bugs in 7.0 and 7.5 are still present. 43/107 self tests failed.[/QUOTE]
You're talking about CUDA 8.0 RC I guess... right and wrong. At least they fixed the sub_cc bug on Maxwell, which allows me to analyze the remaining bug(s)... Oliver P.S. reopened the bug report already |
Can you run an unmodified mfaktc 0.21 selftest (mfaktc.exe -st) on your Pascal GPU?
CUDA 8.0RC doesn't look that bad for me on my GTX980 (compute 5.2):
[CODE]
kernel             | success |   fail
-------------------+---------+-------
UNKNOWN kernel     |       0 |      0
75bit_mul32        |       0 |   2682
95bit_mul32        |       0 |   2867
barrett76_mul32    |    1096 |      0
barrett77_mul32    |    1114 |      0
barrett79_mul32    |    1153 |      0
barrett87_mul32    |    1066 |      0
barrett88_mul32    |    1069 |      0
barrett92_mul32    |    1084 |      0
75bit_mul32_gs     |       0 |   2420
95bit_mul32_gs     |       0 |   2597
barrett76_mul32_gs |    1079 |      0
barrett77_mul32_gs |    1096 |      0
barrett79_mul32_gs |    1130 |      0
barrett87_mul32_gs |    1044 |      0
barrett88_mul32_gs |    1047 |      0
barrett92_mul32_gs |    1062 |      0
[/CODE]
Much better than 7.0 and 7.5 with the subcc bug... Oliver |
Here is 8.0 with compute 6.1, I will redo with the same compute as you:
UPDATE: compute less than 6.1 throws 'ERROR: cudaGetLastError() returned 8: invalid device function'
[CODE]
Selftest statistics
  number of tests           26192
  successfull tests         13434
  no factor found           12758

kernel             | success |   fail
-------------------+---------+-------
UNKNOWN kernel     |       0 |      0
71bit_mul24        |    2586 |      0
75bit_mul32        |       0 |   2682
95bit_mul32        |       0 |   2867
barrett76_mul32    |    1096 |      0
barrett77_mul32    |    1114 |      0
barrett79_mul32    |    1153 |      0
barrett87_mul32    |    1066 |      0
barrett88_mul32    |    1069 |      0
barrett92_mul32    |    1084 |      0
75bit_mul32_gs     |       0 |   2420
95bit_mul32_gs     |       0 |   2597
barrett76_mul32_gs |    1079 |      0
barrett77_mul32_gs |    1096 |      0
barrett79_mul32_gs |       0 |   1130
barrett87_mul32_gs |    1044 |      0
barrett88_mul32_gs |    1047 |      0
barrett92_mul32_gs |       0 |   1062
[/CODE] |
:sad: :sad: :sad: :sad: :sad:
Looks like they (nvidia) have a big issue with subcc... looks like the same bug is not fixed yet for Pascal... Nvidia doesn't like me/mfaktc! Oliver |
Hi,
thanks to David we know[LIST][*]that you can't run mfaktc on Pascal (GTX 1070/1080) [B][U]today[/U][/B] (likely a bug in CUDA 8.0RC)[*]that the issue is the same as with Maxwell (but with Maxwell you can go for CUDA 6.5) (CUDA 7.0 and 7.5 have even worse bugs related to [I]subcc[/I]).[/LIST] For now [B][U]I guess[/U][/B] that nvidia didn't fix the subcc bug completely. :sad: [B][U]For now[/U][/B] I can't recommend buying a Pascal GPU if the (only) purpose is running mfaktc! That is sad because the performance numbers are really sweet... over 1THz equivalent (David's GTX 1080) with less than 200W. That is more than 5GHz equivalent per watt! Oliver |
[B][I][U]should[/U][/I][/B] be fixed in final CUDA 8.0.
Oliver |
[QUOTE=TheJudger;436892][B][I][U]should[/U][/I][/B] be fixed in final CUDA 8.0.
Oliver[/QUOTE] :bow: |
failwell enhancement for checkpoint write error
It would be useful following a 'WARNING, could not write checkpoint file "M#########.ckp"' for the checkpoint to be output to stdout so that the file can be manually created if necessary. Ideally this could be enabled for every checkpoint with the introduction of an additional mfaktc.ini parameter or a new switch.
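A minimal sketch of what such a fallback could look like in C. This is not an existing mfaktc feature or option: the function name checkpoint_write_or_echo and the "checkpoint contents:" prefix are made up for the illustration; only the warning text is taken from the message quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fallback (not mfaktc code): try to write `line` to `filename`;
 * if that fails, print the warning plus the checkpoint line to `fallback`
 * (for example stdout) so it can be saved by hand.
 * Returns 0 if the file was written, 1 if the line was echoed instead. */
int checkpoint_write_or_echo(const char *filename, const char *line, FILE *fallback)
{
    FILE *f;

    assert(filename != NULL && line != NULL && fallback != NULL);

    f = fopen(filename, "w");
    if (f != NULL) {
        int ok = (fputs(line, f) != EOF);
        if (fclose(f) == EOF)
            ok = 0;
        if (ok)
            return 0;
    }

    fprintf(fallback, "WARNING, could not write checkpoint file \"%s\"\n", filename);
    fprintf(fallback, "checkpoint contents: %s\n", line);
    return 1;
}
```

Echoing on every checkpoint, as suggested, would then only need a setting that calls the two fprintf lines unconditionally.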
[QUOTE=LaurV;431381]We are with Oliver here. We used mfaktc for years and never had problems with checkpoint files. [edit: we do checkpoint every 30 minutes, or so] Also, if really needed, for assignment that would take ages, splitting one expo over many cards is no problem, one simple pari or perl script can create the checkpoint file to start with some predetermined class. [edit: you still have to watch them to know when to stop each of them, except the last who stops by itself after the last class][/QUOTE] Do you have the script available? Or are you able to generate the checksum for class 1808 of a multi-bit range factoring from 2^77 to 2^81 for M332347303? |
Can you post the contents of [U]any[/U] checkpoint file? (make one, copy paste the text here).
So I can adjust the checksum to match yours: the calculation differs between mfaktc versions. I don't have access to my computer at home right now (something is wrong there, I am at work, on lunch break), but I can put together a small C program to do that, by shamelessly copying from Oliver's code, which is public, on the web. In "checkpoint.c" in the mfaktc distribution, the first two functions are all you need: ctrl+c, ctrl+v in your favorite IDE, then add a "main" and here you go.
[CODE]
#include "stdafx.h"
#include <stdio.h>
#include <string.h>   /* strlen() */
#include <tchar.h>    /* _tmain, _TCHAR */
#include <conio.h>    /* _getch() */

#define NUM_CLASSES 4620
#define MFAKTC_VERSION "0.20"

/* generates a CRC-32 like checksum of the string */
unsigned int checkpoint_checksum(char *string, int chars)
{
  unsigned int chksum = 0;
  int i, j;

  for (i = 0; i < chars; i++) {
    for (j = 7; j >= 0; j--) {   /* feed the bits of each char, MSB first */
      if ((chksum >> 31) == (((unsigned int)(string[i] >> j)) & 1)) {
        chksum <<= 1;
      } else {
        chksum = (chksum << 1) ^ 0x04C11DB7;
      }
    }
  }
  return chksum;
}

/* writes the checkpoint file */
void checkpoint_write(unsigned int exp, int bit_min, int bit_max, int cur_class, int num_factors)
{
  FILE *f;
  char buffer[100], filename[20];
  unsigned int i;

  sprintf_s(filename, "M%u.ckp", exp);
  fopen_s(&f, filename, "w");
  if (f == NULL) {
    printf("WARNING, could not write checkpoint file \"%s\"\n", filename);
  } else {
    /* the checksum covers everything before the final hex field */
    sprintf_s(buffer, "%u %d %d %d %s: %d %d", exp, bit_min, bit_max, NUM_CLASSES, MFAKTC_VERSION, cur_class, num_factors);
    i = checkpoint_checksum(buffer, (int)strlen(buffer));
    fprintf(f, "%u %d %d %d %s: %d %d %08X", exp, bit_min, bit_max, NUM_CLASSES, MFAKTC_VERSION, cur_class, num_factors, i);
    fclose(f);
  }
}

int _tmain(int argc, _TCHAR* argv[])
{
  unsigned int exp;
  int bmin, bmax, cls;
  char ch;

  printf("Exponent : ");      scanf_s("%u", &exp);
  printf("From bitlevel : "); scanf_s("%d", &bmin);
  printf("To bitlevel : ");   scanf_s("%d", &bmax);
  printf("Current class : "); scanf_s("%d", &cls);

  checkpoint_write(exp, bmin, bmax, cls, 0); //assume no factors were found by former runs

  printf("\nDone. Use it at your own risk...\nPress a key to exit.");
  ch = _getch();
  return 0;
}
[/CODE]
Assuming you can't compile, and assuming my code is right, and assuming you use version 0.20 of the code, this is what is generated for your data: [CODE]332347303 77 81 4620 0.20: 1808 0 A60FF311[/CODE] |
[QUOTE=LaurV;438881]Can you post the contents of [U]any[/U] checkpoint file? (make one, copy paste the text here).
So I can adjust the checksum to match yours, there are different calculus for different mfaktc versions. [/QUOTE] Thanks LaurV, I didn't consider the checksum might differ between versions, since they don't with prime95. Here is an earlier checkpoint: [CODE]M332347303 77 81 4620 0.21: 1805 0 57B2DB5F[/CODE] [QUOTE=LaurV;438881]Assuming you can't compile, and assuming my code is right, and assuming you use version 0.20 of the code, this is what is generated for your data: [CODE]332347303 77 81 4620 0.20: 1808 0 A60FF311[/CODE][/QUOTE] I did test the checkpoint you generated, but as you noted, since I'm using a different version (0.21), mfaktc didn't recognise it. |
FYI, I have completed the lost TF work and the checkpoint reads:
[CODE]M332347303 77 81 4620 0.21: 1808 0 5FCDA1FC[/CODE] |
Ok, sorry I didn't have time to revisit this topic; in fact I didn't consider it a priority anymore, because I saw you had redone the work anyhow. Just to tie up the loose ends, here is the code that does the checksum for version 0.21, also copied from Oliver's code, which is available on the web. I only replaced scanf/open/etc. with their "safe" versions to keep VC++ from making a big scandal of it...
Version 0.21 added that "M" in front, to distinguish it from "W" when mfaktc is used for Wagstaff numbers; hence the difference in the file. This code [U]does[/U] generate checksums as you expect (and matching what you posted here, I tested it). To generate checksums for Wagstaff numbers, you have to modify the define (or define WAGSTAFF).
[CODE]
#include "stdafx.h"
#include <stdio.h>
#include <string.h>   /* strlen() */
#include <tchar.h>    /* _tmain, _TCHAR */
#include <conio.h>    /* _getch() */

#define NUM_CLASSES 4620
#define MFAKTC_VERSION "0.21"

#ifdef WAGSTAFF
#define NAME_NUMBERS "W"
#else /* Mersennes */
#define NAME_NUMBERS "M"
#endif

/* generates a CRC-32 like checksum of the string */
unsigned int checkpoint_checksum(char *string, int chars)
{
  unsigned int chksum = 0;
  int i, j;

  for (i = 0; i < chars; i++) {
    for (j = 7; j >= 0; j--) {   /* feed the bits of each char, MSB first */
      if ((chksum >> 31) == (((unsigned int)(string[i] >> j)) & 1)) {
        chksum <<= 1;
      } else {
        chksum = (chksum << 1) ^ 0x04C11DB7;
      }
    }
  }
  return chksum;
}

/* writes the checkpoint file */
void checkpoint_write(unsigned int exp, int bit_min, int bit_max, int cur_class, int num_factors)
{
  FILE *f;
  char buffer[100], filename[20];
  unsigned int i;

  sprintf_s(filename, "%s%u.ckp", NAME_NUMBERS, exp);
  fopen_s(&f, filename, "w");
  if (f == NULL) {
    printf("WARNING, could not write checkpoint file \"%s\"\n", filename);
  } else {
    /* the checksum covers everything before the final hex field */
    sprintf_s(buffer, "%s%u %d %d %d %s: %d %d", NAME_NUMBERS, exp, bit_min, bit_max, NUM_CLASSES, MFAKTC_VERSION, cur_class, num_factors);
    i = checkpoint_checksum(buffer, (int)strlen(buffer));
    fprintf(f, "%s%u %d %d %d %s: %d %d %08X", NAME_NUMBERS, exp, bit_min, bit_max, NUM_CLASSES, MFAKTC_VERSION, cur_class, num_factors, i);
    fclose(f);
  }
}

//=======================================================
int _tmain(int argc, _TCHAR* argv[])
{
  unsigned int exp;
  int bmin, bmax, cls;
  char ch;

  printf("Exponent : ");      scanf_s("%u", &exp);
  printf("From bitlevel : "); scanf_s("%d", &bmin);
  printf("To bitlevel : ");   scanf_s("%d", &bmax);
  printf("Current class : "); scanf_s("%d", &cls);

  checkpoint_write(exp, bmin, bmax, cls, 0); //assume no factors were found by former runs

  printf("\nDone. Use it at your own risk...\nPress a key to exit.");
  ch = _getch();
  return 0;
}
[/CODE]
[CODE]M332347303 77 81 4620 0.21: 1808 0 5FCDA1FC[/CODE] |
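For anyone not on VC++, the checking side of the same idea can be sketched in portable C. The bit loop is the one from the posted checkpoint_checksum(); the function checkpoint_line_ok() and its parsing rule (the last space-separated field is the 8-digit hex checksum of everything before that final space) are assumptions read off the posted checkpoint_write(), not code taken from mfaktc.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shift/xor loop as the posted checkpoint_checksum(), written with a
 * fixed 32-bit type so it behaves the same on any platform. */
uint32_t checkpoint_checksum(const char *s, size_t n)
{
    uint32_t chksum = 0;
    size_t i;
    int j;

    for (i = 0; i < n; i++) {
        for (j = 7; j >= 0; j--) {      /* bits of each char, MSB first */
            uint32_t bit = ((unsigned char)s[i] >> j) & 1u;
            if ((chksum >> 31) == bit)
                chksum = (uint32_t)(chksum << 1);
            else
                chksum = (uint32_t)(chksum << 1) ^ 0x04C11DB7u;
        }
    }
    return chksum;
}

/* Hypothetical checker: returns 1 if the last field of `line` is an 8-digit
 * hex value equal to the checksum of the text before the final space. */
int checkpoint_line_ok(const char *line)
{
    const char *sep;
    size_t i;

    assert(line != NULL);

    sep = strrchr(line, ' ');
    if (sep == NULL || strlen(sep + 1) != 8)
        return 0;
    for (i = 1; i <= 8; i++)
        if (!isxdigit((unsigned char)sep[i]))
            return 0;

    return (uint32_t)strtoul(sep + 1, NULL, 16)
           == checkpoint_checksum(line, (size_t)(sep - line));
}
```

The two 0.21 lines posted in this thread (ending in 57B2DB5F and 5FCDA1FC) are the natural inputs to try with checkpoint_line_ok(); if the sketch really matches the posted routine, both should be accepted.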
Feature request: -tf extension, resume bit-range from particular class
Feature request:
Expansion of the -tf switch to include support for beginning from a particular class. This feature has at least two real-world applications:[LIST=1][*]Resuming from the last checked class following a checksum write error [*]Resuming from a particular class following the successful discovery of a factor, in order to complete the bit-range[/LIST] Additionally, if it is trivial to implement, then the ability to resume from the bit-range and class in which a factor exists. I'm not sure how this would work alongside compound factors. An example of this usage would be when attempting to complete any remaining factorisation of an exponent such as [URL="http://www.mersenne.org/report_exponent/?exp_lo=9100919&full=1"]M9100919[/URL], where no bit-ranges have been included with factor submissions. |
Hi,
[QUOTE=mattmill30;439953] Resuming from a particular class following the successful discovery of a factor, in order to complete the bit-range[/QUOTE] Short: not possible! Long: not possible, because we don't know which application reported the factor, which settings were used, etc. Prime95 splits the search space in residue classes mod 96(?) over the factor candidates (FCs) while mfaktc can do residue classes mod 420 or 4620 over the k in FC = 2kp+1. Oliver |
[QUOTE=TheJudger;439954]Prime95 splits the search space in residue classes mod 96(?)[/QUOTE]
mod 120 |
Thank you for the correction. It was too late yesterday. I know the numbers for mfaktc and I know Prime95 uses somewhat fewer residue classes, but I had the wrong number in my mind.
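To make the two splittings concrete, here is a small C sketch that simply counts classes. It assumes the usual filters for factors q of a Mersenne number M(p): q must be 1 or 7 mod 8, and q must not be divisible by the small primes dividing the class modulus. The thread does not spell these filters out, so treat them, and both function names, as assumptions of the example. It also assumes p is not divisible by 3, 5, 7 or 11, which holds for any prime exponent above 11.

```c
#include <assert.h>

/* Count the classes k mod num_classes (420 or 4620) that can still contain
 * factor candidates q = 2*k*p + 1 after the assumed filters:
 *   q mod 8 is 1 or 7, and q is not divisible by 3, 5, 7 (and 11 for 4620). */
int count_surviving_classes(unsigned int p, int num_classes)
{
    static const unsigned int small_primes[] = {3u, 5u, 7u, 11u};
    unsigned int n, k, i;
    int count = 0;

    assert(num_classes == 420 || num_classes == 4620);
    n = (unsigned int)num_classes;

    for (k = 0; k < n; k++) {
        /* work with reduced residues so 2*k*p never overflows */
        unsigned int q8 = (2u * (k % 8u) * (p % 8u) + 1u) % 8u;
        int keep = (q8 == 1u || q8 == 7u);

        for (i = 0; keep && i < 4u; i++) {
            unsigned int r = small_primes[i];
            if (n % r != 0u)
                continue;               /* 11 only matters for 4620 classes */
            if ((2u * (k % r) * (p % r) + 1u) % r == 0u)
                keep = 0;
        }
        count += keep;
    }
    return count;
}

/* Count residues of the factor candidate itself mod 120 = 8*3*5 that pass
 * the same kind of filter (the "mod 120" splitting mentioned for Prime95). */
int count_fc_residues_mod120(void)
{
    int r, count = 0;

    for (r = 0; r < 120; r++)
        if ((r % 8 == 1 || r % 8 == 7) && r % 3 != 0 && r % 5 != 0)
            count++;
    return count;
}
```

Under these assumptions the arithmetic gives 96 surviving classes out of 420, 960 out of 4620, and 16 candidate residues mod 120. Because one scheme classifies k and the other classifies the candidate itself, a class number from one program does not translate directly into the other.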
|
ERROR: cudaGetLastError() returned 8: invalid device function
Hi,
I've encountered an error; can anyone help me solve it? Thanks!
D:\mfaktc>mfaktc-win-64.exe
mfaktc v0.21 (64bit built)
Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled
Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           30s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
  AllowSleep                no
  TimeStampInResults        no
CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       8.0
CUDA device info
  name                      GeForce GTX 1070
  compute capability        6.1
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 15
  clock rate (CUDA cores)   1708MHz
  memory clock rate:        4004MHz
  memory bus width:         256 bit
Automatic parameters
  threads per grid          983040
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144
running a simple selftest...
ERROR: cudaGetLastError() returned 8: invalid device function
D:\mfaktc> |
I tried this on my 1080 and I get the error
"ERROR: cudaGetLastError() returned 8: invalid device function". Can you upgrade the program to function with Pascal and the new CUDA architecture? |
You must compile for the specific compute version and CUDA version of the card you are using. In this case the 8.0 RC and compute 6.1. Each generation of GPUs requires a separate build
|
[QUOTE=airsquirrels;442058]You must compile for the specific compute version and CUDA version of the card you are using. In this case the 8.0 RC and compute 6.1. Each generation of GPUs requires a separate build[/QUOTE]
??? I have used old binaries with my 750Ti |
[QUOTE=henryzz;442060]???
I have used old binaries with my 750Ti[/QUOTE] Some of the older cards/CUDA versions supported multiple compute versions and architectures, but Maxwell and Pascal both seem to require specific builds. |
[QUOTE=airsquirrels;442058]You must compile for the specific compute version and CUDA version of the card you are using. In this case the 8.0 RC and compute 6.1. Each generation of GPUs requires a separate build[/QUOTE]
Can you please tell me how to do this for Windows x64? |