mersenneforum.org  

Old 2010-01-23, 09:37   #100
BigBrother
 
Feb 2005
The Netherlands

218₁₀ Posts

Quote:
Originally Posted by moebius View Post
This basically works, but I had to compile with the /TP option because of the error message
referring to unresolved external "_sieve_candidates" in ......


Now it still reports the following "syntax error" in mfaktc.cu.
..
I had to replace "inline" with "__inline" on three functions in sieve.c, but that's not related to sieve_candidates(). You do have "sieve.h" included in mfaktc.cu?
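
For anyone hitting the same link error: one common cause of an unresolved external for a C function is a linkage mismatch, where one file is compiled as C and another as C++. I can't tell from the message alone whether that is what happened here. The sketch below shows the usual guard for a shared header, plus a macro for the `inline` keyword that older MSVC C compilers reject. The `sieve_candidates` prototype, `SIEVE_INLINE` and `sieve_is_odd` are hypothetical and only there for illustration; the real sieve.h may look different.

```c
/* sieve.h (sketch only; signature and helper names are hypothetical) */
#ifndef SIEVE_H
#define SIEVE_H

#ifdef __cplusplus
extern "C" {            /* give the declarations C linkage when included from C++ */
#endif

/* Hypothetical prototype: the real sieve_candidates() may take other arguments. */
void sieve_candidates(int count, unsigned int *candidates);

#ifdef __cplusplus
}
#endif

/* Older MSVC C compilers know __inline but not the C99 keyword inline. */
#if defined(_MSC_VER) && !defined(__cplusplus)
#define SIEVE_INLINE static __inline
#else
#define SIEVE_INLINE static inline
#endif

/* Hypothetical helper showing the macro in use. */
SIEVE_INLINE int sieve_is_odd(unsigned int n)
{
    return (int)(n & 1u);
}

#endif /* SIEVE_H */
```

With a guard like this the header can be included from both the .c and the .cu file without changing how either one is compiled.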
Old 2010-01-23, 12:10   #101
henryzz
Just call me Henry
 
"David"
Sep 2007
Liverpool (GMT/BST)

3·23·89 Posts

Quote:
Originally Posted by TheJudger View Post
Hi David,

No, the current version doesn't support exponents <1M. But this limit is artificial. Actually I haven't even tried such low exponents.


Is there any usage for TF on such low exponents?


Oliver
AFAIK trial factoring is the only way that people can prove there are no factors below a certain limit.
I think with the speeds your program can reach, people may want to extend the limits of the small exponents so we can finally say all candidates are factored up to 64 bits.
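
To make the idea concrete, here is a minimal sketch of trial factoring on the CPU. It is not mfaktc's code. It relies on the known facts that any factor q of 2^p-1 (p an odd prime) has the form q = 2kp+1 with q ≡ ±1 (mod 8), and that q divides 2^p-1 exactly when 2^p mod q = 1. The function names are made up, and candidates are capped below 2^32 so that plain 64-bit arithmetic cannot overflow.

```c
#include <stdint.h>

/* 2^p mod q by square-and-multiply; requires q < 2^32 so products fit in 64 bits. */
static uint64_t pow2_mod(uint32_t p, uint64_t q)
{
    uint64_t result = 1 % q;
    uint64_t base = 2 % q;
    while (p) {
        if (p & 1)
            result = result * base % q;
        base = base * base % q;
        p >>= 1;
    }
    return result;
}

/* Smallest k in [1, k_max] for which q = 2kp+1 divides 2^p - 1, or 0 if none
   is found before q reaches 2^32. */
static uint64_t find_factor_k(uint32_t p, uint64_t k_max)
{
    for (uint64_t k = 1; k <= k_max; k++) {
        uint64_t q = 2 * k * p + 1;
        if (q >> 32)
            break;                      /* outside the range this sketch supports */
        unsigned r = (unsigned)(q & 7); /* factors must be 1 or 7 mod 8 */
        if (r != 1 && r != 7)
            continue;
        if (pow2_mod(p, q) == 1)
            return k;
    }
    return 0;
}
```

For example, `find_factor_k(11, 10)` returns 1, because 23 = 2·1·11+1 divides 2^11-1 = 2047. Running every k up to a bound and finding nothing is what proves there is no factor below that bound.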
Old 2010-01-23, 16:51   #102
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

11067₁₀ Posts

If the exponent is, say, 43,333,333, would not all factors be greater than 1M? Or are you talking about the k? On a 100M digit number you are talking about a bit depth of 49.2 for that k, which is virtually nothing.

And for 50,000,000 (which is the bottom end of the current TF range) it is a bit depth of 46.5 for k=1,000,000.
Everything reached that level a long time ago.

And further, looking at Chasall's data, the only area where there are any expos below 60 bits is down in the below 20,000.
And everything below 8,300,000 is at 64.
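
The bit depths quoted above come from the size of the candidate q = 2kp+1. A rough sketch of that arithmetic, using whole bits only (floor of log2) to stay in integer math; the function names are made up for this example:

```c
#include <stdint.h>

/* floor(log2(n)) for n > 0. */
static unsigned floor_log2_u64(uint64_t n)
{
    unsigned bits = 0;
    while (n >>= 1)
        bits++;
    return bits;
}

/* floor(log2(2*k*p + 1)); the caller must keep 2*k*p + 1 below 2^64. */
static unsigned candidate_bit_level(uint64_t p, uint64_t k)
{
    return floor_log2_u64(2 * k * p + 1);
}
```

`candidate_bit_level(50000000, 1000000)` gives 46, and an exponent of about 332,192,831 (roughly where 100M-digit numbers start) with the same k gives 49, matching the 46.5 and 49.2 figures once the fractional part is dropped.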

Last fiddled with by Uncwilly on 2010-01-23 at 17:16
Old 2010-01-23, 20:58   #103
S485122
 
"Jacob"
Sep 2006
Brussels, Belgium

2×977 Posts

Quote:
Originally Posted by Uncwilly View Post
And everything below 8,300,000 is at 64.
And everything ABOVE 8,300,000 is at 64.
Old 2010-01-23, 21:44   #104
TheJudger
 
"Oliver"
Mar 2005
Germany

5·223 Posts

Hi,

the limit is the exponent itself.
I don't know if exponents well below 1M work or not. I think everything above 1000 is OK but I haven't checked. I thought that no TF is needed in this range, so I have chosen the easiest way: don't allow low exponents (without knowing whether they work or not). Exponents below 33 _WILL_ fail! ;)

The k's are limited to 1..(2^64-2^24) which is not really a limit since the factor limit of 2^71 kicks in first for exponents above 2^6 (64).
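
Spelled out, with q = 2kp+1 as the factor candidate:

```latex
q = 2kp + 1 < 2^{71}
\;\Longrightarrow\;
k < \frac{2^{71} - 1}{2p} < \frac{2^{70}}{p},
\qquad
\frac{2^{70}}{p} < 2^{64} \iff p > 2^{6}.
```

So for every exponent above 64 the 2^71 bound on q caps k before the 64-bit range of k does.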

Oliver

Last fiddled with by TheJudger on 2010-01-23 at 21:44
Old 2010-01-24, 00:11   #105
kjaget
 
Jun 2005

3×43 Posts

I'm running the .03 code ported to Windows (thanks for the hints provided in the thread). It ran through the self-test and also found a few of the factors I've found recently using P95. Given that amount of testing, and that it's a few percent faster than the previous binary posted, I figured I'd put it up here.

mfaktc-hack.zip

It's doing ~27M/sec on an 8800GT fed by 1 thread of a 3.3GHz C2D 4400. I'm running P95 on the other core, so that's probably slowing me a bit. I did try various options in params.h and nothing sped it up (lots of options did slow it down, though). So I guess that 1 core is enough to keep my card saturated. Time to speed up the GPU code again :)

I've added code to have this run at idle priority so it shouldn't kill the responsiveness of the machine it's loading up. Otherwise it should be the same as the source code posted.
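
The change itself isn't shown in this post, so the following is only a guess at what such a tweak can look like, not the code in the zip. On Windows it uses `SetPriorityClass` with `IDLE_PRIORITY_CLASS`; the POSIX branch and the function name `run_at_idle_priority` are my own additions for illustration.

```c
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/resource.h>
#endif

/* Drop the current process to the lowest scheduling priority.
   Returns 0 on success, -1 on failure. */
static int run_at_idle_priority(void)
{
#ifdef _WIN32
    return SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS) ? 0 : -1;
#else
    /* POSIX fallback: nice value 19 is the weakest priority on Linux. */
    return setpriority(PRIO_PROCESS, 0, 19) == 0 ? 0 : -1;
#endif
}
```

Calling this once at startup, before the main loop, would be enough, since the priority applies to the whole process.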

Last fiddled with by kjaget on 2010-01-24 at 00:17
Old 2010-01-24, 00:53   #106
TheJudger
 
"Oliver"
Mar 2005
Germany

5·223 Posts

Hi kjaget,

27M/s is exactly what I would expect for an 8800GT.
3.3GHz C2D? What is your SIEVE_PRIMES setting? My default setting of 50000? Try values around 100000. This might reduce the overall runtime (while the raw rate should remain at 27M/s).

Oliver

Last fiddled with by TheJudger on 2010-01-24 at 00:57
Old 2010-01-24, 03:52   #107
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

3·7·17·31 Posts

Quote:
Originally Posted by S485122 View Post
And everything ABOVE 8,300,000 is at 64.
oops
Old 2010-01-24, 14:16   #108
kjaget
 
Jun 2005

10000001₂ Posts

Quote:
Originally Posted by TheJudger View Post
Hi kjaget,

27M/s is exactly what I would expect for an 8800GT.
3.3GHz C2D? What is your SIEVE_PRIMES setting? My default setting of 50000? Try values around 100000. This might reduce the overall runtime (while the raw rate should remain at 27M/s).

Oliver
Thanks - I get it now. You're balancing how much you can sieve out on the CPU side while still keeping the GPU busy. So you have to look at both the rate and the time per class.

Upping to 100,000 gave an 8% speedup. Going to 125,000 gave a slight improvement on top of that, but nothing major. So it looks like things are working just as you expect.

So far the testing is still going well. I'm finding factors with this code that I've found previously using P95 and no false positives so far.
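
For anyone wondering why more sieve primes help even though the raw rate stays the same, here is a rough model, not the mfaktc siever. It assumes each odd sieve prime r removes about 1/r of the remaining candidates, so the fraction the GPU still has to test is about the product of (1 - 1/r). The function names are made up for this sketch.

```c
/* Primality by trial division; n must be odd and >= 3. */
static int is_prime_odd(unsigned long n)
{
    for (unsigned long d = 3; d * d <= n; d += 2)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Estimated fraction of candidates left after sieving with the first
   n_primes odd primes (3, 5, 7, ...), modelled as prod(1 - 1/r). */
static double surviving_fraction(unsigned long n_primes)
{
    double frac = 1.0;
    unsigned long found = 0;
    for (unsigned long c = 3; found < n_primes; c += 2) {
        if (is_prime_odd(c)) {
            frac *= 1.0 - 1.0 / (double)c;
            found++;
        }
    }
    return frac;
}
```

Comparing `surviving_fraction(50000)` with `surviving_fraction(100000)` gives a feel for how many fewer candidates reach the GPU, at the price of more CPU work per class. That is the balance described above.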
Old 2010-01-24, 16:43   #109
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

5·7·139 Posts
Pointers needed

Hi guys,

Would you please direct me to what is needed to (compile and) run CUDA-aware software?

- Is it enough to download and install CUDA SDK to have the proper libraries installed?
- Is the documentation available on the Nvidia website enough?
- Are the executables in this thread directory- or installation-dependent?
- Is a routine that checks the board model and optimizes the code's parameters for each board version on the to-do list?

I will be more than pleased to run tests and benchmarks on my system (i5 750 @2.66 GHz, 4 cores, GTX 275) as soon as my platform is CUDA-ready.

Thank you for your help

Luigi

Last fiddled with by ET_ on 2010-01-24 at 16:46
Old 2010-01-24, 18:00   #110
TheJudger
 
"Oliver"
Mar 2005
Germany

2133₈ Posts

Hi Luigi,

you're talking about Linux, right?

Go here: http://www.nvidia.com/object/cuda_get.html
You'll need the CUDA driver (display driver) and the CUDA toolkit. The CUDA SDK is optional (some code examples, verify your installation, ...)

I recommend a 64-bit Linux which is supported by CUDA. I'm using openSUSE 11.1 (11.2 isn't supported by CUDA right now and has some problems with it due to gcc 4.4.x).

- install the driver
- install the toolkit
- set up some environment variables (as noted by the toolkit)

Documentation is available as well on the nvidia website.

The executables just need some shared libraries (e.g. cudart); on a properly configured system it doesn't matter which directory they are in.

Automatic parameter selection is at least on my wishlist but not a major objective at the moment. Actually the most interesting parameter is SIEVE_PRIMES. It depends on the exponent as well.
M66xxxxxx => my GPU can do ~54M/sec
M3312xxxxxx => my GPU can do ~41M/sec
The siever does NOT depend on the exponent size! So as the exponents get bigger it removes the pressure from the siever and more sieving is possible (increase SIEVE_PRIMES).


Oliver
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
