My website is also updated with Brian's latest binary.
Jeff. |
[QUOTE=Brian Gladman;346581]I have found the issue - I was compiling for a UNICODE character set so the ASCII file names were not being recognised. I will shortly pass a corrected version to Jeff.
Thanks to WraithX for his help in finding this. I have added the x64 i7 version below as a ZIP file.
Brian[/QUOTE]
Hello Brian,
I'm glad you were able to find and fix this problem. However, I have run into a new problem. I first started off by running your new binary on L1911 C167. It read in all relations and was able to create a matrix. However, there seems to be something wrong with the multi-threaded matrix solving. On L1911 I used "-t 16", and I got this error message:
[CODE]
commencing linear algebra
read 8834485 cycles
cycles contain 25231340 unique relations
read 25231340 relations
using 20 quadratic characters above 2147480922
building initial matrix
memory use: 3433.3 MB
read 8834485 cycles
matrix is 8834307 x 8834485 (2692.4 MB) with weight 833609903 (94.36/col)
sparse part has weight 599790147 (67.89/col)
filtering completed in 2 passes
matrix is 8830805 x 8830983 (2692.1 MB) with weight 833487913 (94.38/col)
sparse part has weight 599756073 (67.91/col)
matrix starts at (0, 0)
matrix is 8830805 x 8830983 (2692.1 MB) with weight 833487913 (94.38/col)
sparse part has weight 599756073 (67.91/col)
saving the first 48 matrix rows for later
matrix includes 64 packed rows
matrix is 8830757 x 8830983 (2587.2 MB) with weight 664939915 (75.30/col)
sparse part has weight 589918881 (66.80/col)
using block size 8192 and superblock size 1966080 for processor cache size 20480 kB
commencing Lanczos iteration (16 threads)
memory use: 2110.1 MB
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
[/CODE]
And then msieve was still running, but using 0% cpu. I tried this out with the C120 and ran into a similar problem. If you run the C120 with the command:
msieve -v -nc -t 2
It will read in all relations, build a matrix, it will say "commencing Lanczos iteration" but then it will just sit there, not doing anything. When you have some time, can you try to troubleshoot this new issue? |
Note that SVN920 contains a fix for a nasty problem with MPI+threads, and I can't say for sure that threads-only code is unaffected by the bug.
|
[QUOTE=WraithX;346648]Hello Brian,
I'm glad you were able to find and fix this problem. However, I have run into a new problem. I first started off by running your new binary on L1911 C167. It read in all relations and was able to create a matrix. However, there seems to be something wrong with the multi-threaded matrix solving. On L1911 I used "-t 16", and I got this error message:
[CODE]
commencing linear algebra
read 8834485 cycles
cycles contain 25231340 unique relations
read 25231340 relations
using 20 quadratic characters above 2147480922
building initial matrix
memory use: 3433.3 MB
read 8834485 cycles
matrix is 8834307 x 8834485 (2692.4 MB) with weight 833609903 (94.36/col)
sparse part has weight 599790147 (67.89/col)
filtering completed in 2 passes
matrix is 8830805 x 8830983 (2692.1 MB) with weight 833487913 (94.38/col)
sparse part has weight 599756073 (67.91/col)
matrix starts at (0, 0)
matrix is 8830805 x 8830983 (2692.1 MB) with weight 833487913 (94.38/col)
sparse part has weight 599756073 (67.91/col)
saving the first 48 matrix rows for later
matrix includes 64 packed rows
matrix is 8830757 x 8830983 (2587.2 MB) with weight 664939915 (75.30/col)
sparse part has weight 589918881 (66.80/col)
using block size 8192 and superblock size 1966080 for processor cache size 20480 kB
commencing Lanczos iteration (16 threads)
memory use: 2110.1 MB
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
line 225 - Failed to obtain a task from the jobs queue.
line 274 - Warning an error has occurred when trying to obtain a worker task.
line 275 - The worker thread has exited.
[/CODE]
And then msieve was still running, but using 0% cpu. I tried this out with the C120 and ran into a similar problem. If you run the C120 with the command:
msieve -v -nc -t 2
It will read in all relations, build a matrix, it will say "commencing Lanczos iteration" but then it will just sit there, not doing anything. When you have some time, can you try to troubleshoot this new issue?[/QUOTE]
Hi David,
I can confirm that it hangs but I have no idea why. It is waiting on line 555 of thread.c:
if (pthread_mutex_lock(&(pool->mutex))) {
but the lock never gets released. Maybe Jason might have an idea about why this is happening.
Brian |
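The symptom described here (a thread blocked forever on a mutex that nobody releases) can be reproduced in miniature. The sketch below is only an analogue in Python, not msieve's thread.c: the names `holder`, `try_lock` and `pool_mutex` are hypothetical, and the cause of the real hang is not known from this thread. It shows how a bounded wait turns a silent hang into a visible failure; in C the comparable POSIX call would be `pthread_mutex_timedlock`.

```python
import threading


def holder(lock: threading.Lock) -> None:
    # Acquire the lock and return without releasing it. This stands in for
    # a worker that leaves on an error path while still holding the mutex.
    lock.acquire()


def try_lock(lock: threading.Lock, timeout_s: float) -> bool:
    # Wait for the lock for at most timeout_s seconds instead of forever.
    got = lock.acquire(timeout=timeout_s)
    if got:
        lock.release()
    return got


pool_mutex = threading.Lock()
worker = threading.Thread(target=holder, args=(pool_mutex,))
worker.start()
worker.join()  # the worker has exited, but the lock is still held

acquired = try_lock(pool_mutex, timeout_s=0.2)
print("acquired" if acquired else "timed out: lock was never released")
```

With an unbounded `acquire()` the last call would block indefinitely at 0% CPU, which matches what was observed after "commencing Lanczos iteration".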
Oh dear :sad:
[code]
=> "..//msieve" -s w64_351.dat -l ggnfs.log -i w64_351.ini -v -nf w64_351.fb -t 6 -nc1
Msieve v. 1.52 (SVN 917)
Fri Jul 19 12:22:58 2013
random seeds: 32b6b2c8 7c7747f5
factoring 47179640899638361901376647020473549225378524025587881178401036218602944031199204750028727953293287374137883311870909006957005666750423557281003600949961871343428044757397222781318592936824461360214323 (200 digits)
no P-1/P+1/ECM available, skipping
...
begin with 15463094 relations and 16277530 unique ideals
reduce to 15397308 relations and 15260488 ideals in 8 passes
max relations containing the same ideal: 185
relations with 0 large ideals: 4782
relations with 1 large ideals: 174
relations with 2 large ideals: 4337
relations with 3 large ideals: 50509
relations with 4 large ideals: 321406
relations with 5 large ideals: 1218626
relations with 6 large ideals: 2854997
relations with 7+ large ideals: 10942477
commencing 2-way merge
reduce to 9133936 relation sets and 8997117 unique ideals
ignored 1 oversize relation sets
commencing full merge
Return value 4. Terminating...
[1]+ Exit 255 ../factMsieve.pl w64_351.poly
[/code]
This on a 64-bit Fedora system. I'll first see whether the latest SVN (923) is any better. If not, I'll revert to 900 which seems pretty solid.
Paul |
[QUOTE=Brian Gladman;346652]I can confirm that it hangs but I have no idea why. It is waiting on line 555 of thread.c:
if (pthread_mutex_lock(&(pool->mutex))) {
but the lock never gets released[/QUOTE]
Hi Brian,
What version of the msieve svn were you using for this test? Have you tried svn923?
Also, speaking of svn versions, I have noticed that several of your binaries do not have svn information in them. (when you run msieve, the svn version is printed next to the msieve version) Are you just visiting the SourceForge web page and downloading a snapshot? If so, this method does not get the svn version info into msieve. If you download msieve via svn, it will put the svn info into msieve.
If you don't have an svn client, I would recommend the win32svn on SourceForge, here: [url]http://sourceforge.net/projects/win32svn/[/url]
I don't download their installer, I just download a zip ([URL="http://sourceforge.net/projects/win32svn/files/1.8.0/apache22/svn-win32-1.8.0.zip/download"]svn-1.8.0[/URL]) that contains all the relevant exe's and dll's and I put them into a win32svn folder in Program Files. Then I put that win32svn\bin into the path and you can use svn from the command line.
The command to get the latest trunk of msieve is:
svn checkout svn://svn.code.sf.net/p/msieve/code/trunk <directory>
Where <directory> is where you want to store the latest svn. You can run this command whenever you want, and any changes on SourceForge will change the contents of that <directory>.
I put that command inside of a bat file, say in an msieve directory like X:\msieve\checkout.bat Then I set <directory> = svn. This will download the latest changes from SourceForge and put them into X:\msieve\svn\ , and then when I want to compile a version, or make changes and compile a version, I copy that folder over to a new directory and I label it with the version of svn that it is, like X:\msieve\svn_0917\
All that to say: I was wondering would you be willing to change how you download msieve so that we can have svn version information in the binaries you create? |
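The checkout-then-copy routine described above could be scripted roughly as follows. This is a sketch under assumptions, not the poster's actual bat file: the function names are hypothetical, and it assumes the `svn` and `svnversion` command-line tools are on the path. The trunk URL and the `svn_0917` naming style are taken from the post.

```python
import re
import shutil
import subprocess
from pathlib import Path

# Trunk URL as quoted in the post above.
TRUNK_URL = "svn://svn.code.sf.net/p/msieve/code/trunk"


def parse_svnversion(text: str) -> int:
    """Return the newest revision from svnversion output such as '917', '803M' or '900:917M'."""
    match = re.search(r"(\d+)[MSP]*$", text.strip())
    if match is None:
        raise ValueError(f"no revision number in {text!r}")
    return int(match.group(1))


def snapshot_name(revision: int) -> str:
    """Build a directory label in the svn_0917 style used in the post."""
    return f"svn_{revision:04d}"


def checkout_and_snapshot(base: Path) -> Path:
    """Update base/svn from trunk, then copy it to base/svn_NNNN (hypothetical helper)."""
    work = base / "svn"
    subprocess.run(["svn", "checkout", TRUNK_URL, str(work)], check=True)
    out = subprocess.run(["svnversion", str(work)], check=True,
                         capture_output=True, text=True).stdout
    dest = base / snapshot_name(parse_svnversion(out))
    if not dest.exists():  # keep an existing snapshot untouched
        shutil.copytree(work, dest)
    return dest
```

Calling `checkout_and_snapshot(Path(r"X:\msieve"))` would refresh `X:\msieve\svn` and, at revision 917, leave a copy in `X:\msieve\svn_0917`. Copying a real working copy (rather than a web snapshot) keeps the svn metadata that the build uses to print the revision.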
[QUOTE=WraithX;346740]Hi Brian,
What version of the msieve svn were you using for this test? Have you tried svn923?
Also, speaking of svn versions, I have noticed that several of your binaries do not have svn information in them. (when you run msieve, the svn version is printed next to the msieve version) Are you just visiting the SourceForge web page and downloading a snapshot? If so, this method does not get the svn version info into msieve. If you download msieve via svn, it will put the svn info into msieve.
All that to say: I was wondering would you be willing to change how you download msieve so that we can have svn version information in the binaries you create?[/QUOTE]
I am using the msieve SVN - the automatic SVN output doesn't work in Visual Studio.
Brian |
[QUOTE=xilman;346735]Oh dear :sad:
[code]
=> "..//msieve" -s w64_351.dat -l ggnfs.log -i w64_351.ini -v -nf w64_351.fb -t 6 -nc1
Msieve v. 1.52 (SVN 917)
Fri Jul 19 12:22:58 2013
random seeds: 32b6b2c8 7c7747f5
factoring 47179640899638361901376647020473549225378524025587881178401036218602944031199204750028727953293287374137883311870909006957005666750423557281003600949961871343428044757397222781318592936824461360214323 (200 digits)
no P-1/P+1/ECM available, skipping
...
begin with 15463094 relations and 16277530 unique ideals
reduce to 15397308 relations and 15260488 ideals in 8 passes
max relations containing the same ideal: 185
relations with 0 large ideals: 4782
relations with 1 large ideals: 174
relations with 2 large ideals: 4337
relations with 3 large ideals: 50509
relations with 4 large ideals: 321406
relations with 5 large ideals: 1218626
relations with 6 large ideals: 2854997
relations with 7+ large ideals: 10942477
commencing 2-way merge
reduce to 9133936 relation sets and 8997117 unique ideals
ignored 1 oversize relation sets
commencing full merge
Return value 4. Terminating...
[1]+ Exit 255 ../factMsieve.pl w64_351.poly
[/code]
This on a 64-bit Fedora system. I'll first see whether the latest SVN (923) is any better. If not, I'll revert to 900 which seems pretty solid.
Paul[/QUOTE]
Paul,
On the off chance this might help, I was also getting a hang in the full merge and the fix turned out to be deleting the "-march=core2" in the Makefile. (The machine was an AMD Phenom II).
Jonathan |
[QUOTE=xilman;346735]Oh dear :sad:
[code]
=> "..//msieve" -s w64_351.dat -l ggnfs.log -i w64_351.ini -v -nf w64_351.fb -t 6 -nc1
Msieve v. 1.52 (SVN 917)
...
reduce to 9133936 relation sets and 8997117 unique ideals
ignored 1 oversize relation sets
commencing full merge
Return value 4. Terminating...
[1]+ Exit 255 ../factMsieve.pl w64_351.poly
[/code]
This on a 64-bit Fedora system. I'll first see whether the latest SVN (923) is any better. If not, I'll revert to 900 which seems pretty solid.
Paul[/QUOTE]
Neither 923 nor 900 were any better. 803M has this to report, which may be informative.
[code]
Msieve v. 1.51 (SVN 803M)
Fri Jul 19 18:14:19 2013
random seeds: d6e6a838 ac24fba7
...
commencing linear algebra
read 4420474 cycles
cycles contain 13751353 unique relations
read 13751353 relations
using 20 quadratic characters above 536870564
building initial matrix
memory use: 1741.8 MB
read 4420474 cycles
matrix is 4415335 x 4420474 (1291.9 MB) with weight 382224019 (86.47/col)
sparse part has weight 294455891 (66.61/col)
filtering completed in 2 passes
matrix is 4394925 x 4153193 (1279.9 MB) with weight 381288711 (91.81/col)
sparse part has weight 293985414 (70.79/col)
matrix starts at (0, 0)
matrix is 4394925 x 4153193 (1279.9 MB) with weight 381288711 (91.81/col)
sparse part has weight 293985414 (70.79/col)
matrix needs more columns than rows; try adding 2-3% more relations
[/code]
I'll do some more sieving and try the latest SVN as the Lanczos is much better than at 803.
Paul |
[QUOTE=jcrombie;346747]Paul,
On the off chance this might help, I was also getting a hang in the full merge and the fix turned out to be deleting the "-march=core2" in the Makefile. (The machine was an AMD Phenom II).
Jonathan[/QUOTE]
Thanks. I just RTFgccM and saw the -march=native option. Although I'm on a Phenom II like you, and could have tried something more specific, perhaps jasonp may wish to consider this for the default option in Makefile.
Let's see what happens after more sieving has taken place.
Paul |
This has come up fairly often. -march=native has only been available since around gcc v4.2, which sounds old but is still newer than the gcc used by default in many distributions; otherwise I would have changed the canonical command line arg a long time ago. It is still not available in the Apple version of gcc I have access to.
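One way a build script could act on this is to pick the flag from the compiler version. The sketch below is hypothetical and not part of msieve's Makefile: the function names are made up, the 4.2 threshold is taken from the "around v4.2" remark above, and the fallback of no -march flag at all follows the fix reported earlier in the thread (deleting "-march=core2"). The version string is assumed to come from something like `gcc -dumpversion`.

```python
import re
from typing import Tuple


def parse_gcc_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.1.2' or '12'."""
    match = re.search(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def march_flag(gcc_version_text: str, fallback: str = "") -> str:
    """Return -march=native when the compiler is new enough, else the fallback."""
    # Assumption: 4.2 is the cut-off, per the "around v4.2" remark in the thread.
    if parse_gcc_version(gcc_version_text) >= (4, 2):
        return "-march=native"
    return fallback


for version in ("4.1.2", "4.2.1", "12"):
    print(version, repr(march_flag(version)))
```

A version check alone would not cover vendor builds that lack the option despite a newer version number, so a real build would probably need to test-compile with the flag as well.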
At least the matrix starts off with the correct aspect ratio, so it's less likely to be the error that others have reported in much larger jobs. |
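The aspect-ratio remark can be checked directly against the log. The sketch below pulls the "matrix is R x C" lines out of msieve output and reports columns minus rows; it assumes the first number is the row count and the second the column count, which is consistent with the "needs more columns than rows" message. The function name is hypothetical, and the sample text is copied from the 803M log above.

```python
import re

MATRIX_RE = re.compile(r"matrix is (\d+) x (\d+)")


def excess_columns(log_text: str) -> list:
    """Return columns minus rows for each 'matrix is R x C' line, in order."""
    return [int(cols) - int(rows) for rows, cols in MATRIX_RE.findall(log_text)]


log = """matrix is 4415335 x 4420474 (1291.9 MB) with weight 382224019 (86.47/col)
filtering completed in 2 passes
matrix is 4394925 x 4153193 (1279.9 MB) with weight 381288711 (91.81/col)"""

excess = excess_columns(log)
print(excess)  # [5139, -241732]
```

The matrix starts with 5139 more columns than rows, but after filtering it is 241732 columns short, which is the point at which msieve asks for more relations.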