mersenneforum.org

gd_barnes 2010-06-28 21:26

Sieving drive Riesel base 6 n=1M-2M
 
This is a sieving drive for the final 2 k's remaining on Riesel base 6 for n=1M-2M. Lennart has kindly gotten us started. The k's remaining and their weights are:
k=1597; weight 272
k=36772; weight 583

This is a very exciting base where k's have fallen quickly. We now have a very good chance of proving it over the next 1-2 years.

The optimum sieve depth is probably in the P=170T-175T range. We will be using sr2sieve, but because the 2 k's differ so much in weight, we may eventually sieve them to different depths using sr1sieve. It is recommended that you run 64-bit sr2sieve on a 64-bit machine. Let us know if you need the executable or more detailed instructions on using it. Here is an example of the command to execute at the command prompt:

sr2sieve -p 50e12 -P 60e12 -i sieve-riesel-base6-1M-2M.txt

The above would be if you were sieving P=50T-60T. The file named after the "-i" switch is the actual name of the file attached below. Feel free to rename it to something shorter if you want, or use the older "srwork" convention where you don't have to specify a file name.
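If you want to spread a reservation over several cores by hand, one way is to write out one command line per P=1T piece. The following is only a small sketch: the helper name sieve_commands is made up for illustration, and it uses nothing beyond the -p, -P and -i switches shown in the example above.

```python
def sieve_commands(start_t, end_t, sieve_file="sieve-riesel-base6-1M-2M.txt"):
    """Return one sr2sieve command line per P=1T piece of [start_t, end_t).

    start_t and end_t are given in units of 1T (1e12), matching how
    ranges are reserved in this drive. The helper is hypothetical; it
    only formats command lines and does not run anything.
    """
    commands = []
    for low in range(start_t, end_t):
        commands.append(f"sr2sieve -p {low}e12 -P {low + 1}e12 -i {sieve_file}")
    return commands


if __name__ == "__main__":
    # Example: a reservation of P=50T-53T split into three 1T pieces.
    for line in sieve_commands(50, 53):
        print(line)
```

Each printed line can then be started in its own command prompt, one per core.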

When complete, you should have a factors.txt file. Just post the file here in this thread or if it is too big, please Email the file to me at:
gbarnes017 at gmail dot com

A P=1T range should take ~1-2 CPU days on one core of a modern 64-bit machine. Please reserve ranges in multiples of P=1T and plan to reserve no more than ~7 days of work at a time. When making reservations, please post your estimated completion date. This can be seen in sr2sieve about one minute after you start your sieve.

Attached is the latest sieve file. All factors up to P=100T have been removed. We will remove additional factors as the drive progresses to slightly speed up sieving.

Reservations:
[code]
P-range      reserved by   status     est. completion date
  0T- 40T    Lennart       complete
 40T- 45T    Flatlander    complete
 45T- 70T    Lennart       complete
 70T- 80T    Dougal        complete
 80T-180T    gd_barnes     complete
[/code]

[B]The sieving drive is now complete.[/B]

All help is greatly appreciated! :smile:


Thank you,
Gary

Lennart 2010-06-28 22:29

[quote=gd_barnes;220098]Would anyone like to get a sieving drive started for n=1M-2M? Although sieving the 2 k's together using sr2sieve should be about as fast as sieving them separately using sr1sieve, I think sieving them separately would be better because they will have much different optimum sieve depths. From the beginning of this drive, k=1597 has been by far the lowest weight k. The weight of the final 2 k's now remaining is:
k=1597: 272
k=36772: 583[/quote]


I can start the sieve.

Lennart

gd_barnes 2010-06-28 22:40

[quote=Lennart;220104]I can start the sieve.

Lennart[/quote]

Great, after you have sieved to some nominal depth, perhaps P=1T or something like that, just send me the 2 files and I'll start a formal drive.

k=1597 would be higher priority right now. When the sieving for it is complete, we can start searching n>1M.

Lennart 2010-06-29 21:40

[quote=gd_barnes;220108]Great, after you have sieved to some nominal depth, perhaps P=1T or something like that, just send me the 2 files and I'll start a formal drive.

k=1597 would be higher priority right now. When the sieving for it is complete, we can start searching n>1M.[/quote]


I will soon be done to 13T.

I use sr2sieve 1.8.11; it is a little faster when you have a few sequences.

Lennart

Lennart 2010-06-29 23:24

Here is the file in abcd format, sieved to 13T.

I reserve 13T-22T ( they are started )

Lennart

mdettweiler 2010-06-29 23:54

I've split off the posts regarding n=1M-2M sieving to a new thread so they don't clutter up the main drive thread. Gary, I'll leave it up to you to set up a reservations table and whatnot since I don't know exactly at what point you were going to start that.

philmoore 2010-06-30 03:36

I really don't understand the logic of sieving these two sequences separately. The optimum sieving depth for sieving together should be greater than the optimum depth for separate sieving, shouldn't it?

gd_barnes 2010-06-30 05:58

[quote=philmoore;220240]I really don't understand the logic of sieving these two sequences separately. The optimum sieving depth for sieving together should be greater than the optimum depth for separate sieving, shouldn't it?[/quote]

No. In theory it should be the same. With recent versions of sr2sieve, it has been shown that 2 instances of sr1sieve will give about the same total throughput as 2 instances of sr2sieve. Although here, Lennart is saying that he is getting slightly faster with sr2sieve. And that is only a recent phenomenon. In the past, 2 instances of sr1sieve far outpaced 2 instances of sr2sieve. It was only at 3 k's that 3 instances of sr2sieve became faster than 3 instances of sr1sieve.

I'd like to run a few tests myself because the calculations are quite complex with 2 extremely different weights. Since these remaining k's are so different in weight (one is roughly double the weight of the other), I was advocating sieving them separately.

Personally, I would like to run them separately at some point but for the time being, we'll go ahead and run them together. Perhaps once we reach P=~50T or so, we should do a detailed analysis of what works best. In other words, we may find that THIS:

low-weight k sieved to P=300T
high-weight k sieved to P=500T

is more effective in terms of total sieving and testing CPU time than this:

Both k's sieved to P=400T.

It will be a tricky calculation but one that I feel is worth it at such a high n-level.

One other big reason that I was thinking we would like to sieve them separately is that the high weight k is only tested to n=600K right now whereas the low weight k is already at n=1M. We may not need the high weight k sieve file at all.

Lennart, because we don't have k=36772 searched to n=1M yet, what do you think about separating them at about P=50T and only sieve k=1597 for a while using sr1sieve?

Based on the circumstances, what do others think?

Lennart 2010-06-30 07:13

Reserving 22T-40T


I shall start a test on the low k and compare p/sec.

at 13T 1M-2M I have

k 1597 5921 candidates left
k 36772 12229 left.

I will now start a test on sr1sieve on each k and give you p/sec and one with both on sr2sieve

I will use one computer and use -t8

Lennart

gd_barnes 2010-06-30 07:44

I have done a calculation running sr1sieve vs. sr2sieve:

Set up: Intel I7 @ 2.94 GHz. 3 other LLR/PFGW processes running in background so as to not affect the throughput in any way.

Test 1:
Sr2sieve on both k's for P=1G-5G. Time: 349 secs.

Test 2:
Sr1sieve on k=1597 for P=1G-5G. Time: 155 secs.
Sr1sieve on k=36772 for P=1G-5G. Time: 202 secs.
Total time: 357 secs.

Sr2sieve is 357 / 349 - 1 = ~2.3% faster.

Sr1sieve tests were run in batch so as to avoid any extra CPU time from manually starting each of them.
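For anyone who wants to repeat the comparison with their own timings, the arithmetic is just the ratio of the two totals. A minimal sketch, using the times from the two tests above (the function name is made up for illustration):

```python
def sr2sieve_advantage(sr1sieve_times, sr2sieve_time):
    """Fractional speedup of one sr2sieve run over separate sr1sieve runs.

    sr1sieve_times: seconds taken by sr1sieve for each k over the P-range.
    sr2sieve_time:  seconds taken by sr2sieve for all k's over the same range.
    """
    return sum(sr1sieve_times) / sr2sieve_time - 1


if __name__ == "__main__":
    # Test 2 (155 s + 202 s = 357 s) against Test 1 (349 s).
    advantage = sr2sieve_advantage([155, 202], 349)
    print(f"sr2sieve is {advantage:.1%} faster")  # 2.3%
```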

One could conclude this intuitively if one knew that sr1sieve and sr2sieve use the exact same types of calculations, which they apparently do now. Why? Because the further apart the weights are, the greater difference in time taken between the 2 and the less efficient it becomes to sieve the lower weight k. That is, if there were only 10 terms remaining on k=1597 and all remaining terms were on k=36772, it would be highly inefficient to sieve only the 10 terms. BUT...it was not clear to me if the 2 programs were using the exact same processes. I am now convinced of such. Therefore: It always makes sense to run sr2sieve to at least the optimal depth of the lower weight k, if that depth is calculated while running sr1sieve -and then- run sr1sieve on the higher weight k until it is at optimal using that program.

Conclusion: We should definitely use sr2sieve up to some, as of yet, unknown depth. That depth should be based on 3 things:

1. The optimal depth of k=1597 run only by itself using sr1sieve.
2. The chance of k=36772 having a prime for n=600K-1M.
3. Can we have and will we be ready to start testing k=1597 for n>1M before testing of k=36772 is complete to n=1M?

#'s 1 and 2 can be calculated, although deciding how the chance in #2 should reduce the optimal depth of running both k's together is not very clear. #3 is very tough to decide upon.

This seems unlikely to happen but:
If we cannot have k=1597 fully sieved for n=1M-2M before k=36772 testing is done to n=1M, it makes sense to just sieve both k's using sr2sieve to nearly the optimal depth of k=1597 (computing its optimum using sr1sieve), and then run k=36772 to its optimal depth running sr1sieve. That's the way I would approach it if k=36772 was already tested to n=1M.

Lennart, as of yet, I haven't turned it into a formal sieving drive. When I do, I'll edit the 1st posting for reserved ranges and the such and sticky the thread. If you would just like to keep on running both k's with sr2sieve up to P=50T or 100T as your resources allow, I think that is quite reasonable. We don't have any huge demand for testing k=1597 for n>1M at this moment.

One more thing: Because max n / min n = 2, as demonstrated by VBCurtis at RPS, no ranges should be broken off. We'll sieve the entire n-range to the same depth (for each k, although different depths for the 2 k's) and it should be sieved until the removal rate equals the test time of a candidate at approximately 55-60% of the n-range, i.e. n=1.55M-1.6M. I have an exact calculation on that from some work I've done on some of my personal efforts that I can demonstrate later on.
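The stopping rule in the paragraph above can be put into a few lines. This is only a rough sketch, not the exact calculation mentioned: the function names are made up, the timing numbers in the example are invented placeholders, and 55 is simply the low end of the 55-60% quoted.

```python
def reference_n(n_min, n_max, percent):
    """The n at `percent` of the way through the n-range (integer arithmetic)."""
    return n_min + (n_max - n_min) * percent // 100


def keep_sieving(seconds_per_factor, seconds_per_test):
    """True while sieving removes a candidate faster than testing one would.

    seconds_per_factor: average sieve time per factor found at the current depth.
    seconds_per_test:   time for one primality test of a candidate at reference_n.
    """
    return seconds_per_factor < seconds_per_test


if __name__ == "__main__":
    n_ref = reference_n(1_000_000, 2_000_000, 55)
    print(n_ref)  # 1550000

    # Placeholder timings, NOT measurements from this drive:
    print(keep_sieving(seconds_per_factor=3000, seconds_per_test=5000))  # True
    print(keep_sieving(seconds_per_factor=6000, seconds_per_test=5000))  # False
```

In practice you would time one test at n near the reference value and compare it with the factor rate that the sieve reports.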


Gary

Lennart 2010-06-30 08:14

Here are my timings.

sr1sieve 1.41
1597 sr1sieve 133514792 p/sec
36772 sr1sieve 111216565 p/sec

Both k using sr2sieve 1.8.11

71742478 p/sec

Lennart

gd_barnes 2010-06-30 08:24

[quote=Lennart;220258]Here are my timings.

sr1sieve 1.41
1597 sr1sieve 133514792 p/sec
36772 sr1sieve 111216565 p/sec

Both k using sr2sieve 1.8.11

71742478 p/sec

Lennart[/quote]

I am also using 1.4.1 and 1.8.11. That's strange. The average of the 2 k's divided by 2, ~122M/2 = ~61M, makes sr2sieve ~17% faster for you. For me, they were identical, i.e.

Sr1sieve:
1597 26M P/sec
36772 20M P/sec
average 23M. Adjust for comparison to sr2sieve: 23M / 2 = 11.5M.

Sr2sieve on both k's:
11.5M P/sec

This was only the 4th core running out of 8 cores on an I7 so as to not get any slowdown or interruption from other processes. It was at a sieve depth of P=4G so would be quite a bit slower than P=20T or wherever you happen to be at. Adjust higher for that and lower for the more cores and it probably comes in close to yours.

In order to determine which one was better, I had to do the more detailed process, which showed sr2sieve as ~2% faster.

I might suggest that you do the type of test that I did. Sometimes a snapshot of the P-rate can be inaccurate. My 10 mins. test on both of them eliminated any possibility of a temporary process affecting anything too much.

Later on, I'll test it with all 8 cores running sieving. I cannot imagine that will have an impact on the percentage of difference between the 2 programs since they apparently use the same calculations / processes, but one never knows.
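The p/sec comparison used in this post (average the sr1sieve rates, divide by 2, compare with the sr2sieve rate) can be checked like this. It is a sketch of that rule of thumb only, with a made-up function name; as noted above, a snapshot of the P-rate is less reliable than a timed run.

```python
def sr2sieve_gain_from_rates(sr1sieve_rates, sr2sieve_rate):
    """Fractional gain of sr2sieve over separate sr1sieve runs, from p/sec rates.

    Follows the rule of thumb in this thread for 2 k's: take the average
    sr1sieve rate and divide it by the number of k's to get the rate that
    sr2sieve has to beat.
    """
    count = len(sr1sieve_rates)
    break_even = sum(sr1sieve_rates) / count / count
    return sr2sieve_rate / break_even - 1


if __name__ == "__main__":
    # Lennart's figures: sr1sieve on each k, then sr2sieve on both.
    lennart = sr2sieve_gain_from_rates([133514792, 111216565], 71742478)
    print(f"{lennart:.1%}")  # 17.3%

    # Gary's figures: 26M and 20M for sr1sieve, 11.5M for sr2sieve.
    gary = sr2sieve_gain_from_rates([26e6, 20e6], 11.5e6)
    print(f"{gary:.1%}")  # 0.0%
```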


Gary

gd_barnes 2010-06-30 08:33

[quote=Lennart;220251]
I will use one computer and use -t8

Lennart[/quote]


I forgot to ask. What is -t8 ? I don't see it talked about in the help or readme. Should I be using that on an 8-core I7 ?

Perhaps that is the speedup that you are getting for sr2sieve and I am not getting it.

Lennart 2010-06-30 08:58

[quote=gd_barnes;220260]I forgot to ask. What is -t8 ? I don't see it talked about in the help or readme. Should I be using that on an 8-core I7 ?

Perhaps that is the speedup that you are getting for sr2sieve and I am not getting it.[/quote]

I start srxsieve like this.

./sr2sieve -p22e12 -P23e12 -iinput.txt -ffactors.txt -q -t8

-t8 means it will use all 8 cores when I start this. I don't need to start one instance on each core. I only start one instance and it uses all 8 cores.

On a quad you use -t4 and on a duo -t2

Sometimes I work on a computer and need some CPU power for other work; then I start it with -t7 to leave one core free.

Lennart

gd_barnes 2010-06-30 09:15

All of that is what I use except the -f and -t8 switches. The default is for it to write factors to factors.txt so I don't use that unless I'm sieving multiple bases in batch and need different file names.

Thanks. I'll try the -t8 switch and see what happens. I thought the I7 would utilize all 8 cores anyway unless I tell it specific CPUs to use (which I do not) but the switch is worth a try.

Lennart 2010-06-30 09:26

[quote=gd_barnes;220263]All of that is what I use except the -f and -t8 switches. The default is for it to write factors to factors.txt so I don't use that unless I'm sieving multiple bases in batch and need different file names.

Thanks. I'll try the -t8 switch and see what happens. I thought the I7 would utilize all 8 cores anyway unless I tell it specific CPUs to use (which I do not) but the switch is worth a try.[/quote]

When I run sr2sieve on both k's on an i7 with -t8, I usually get 77-80M p/sec,
so I think you should get about 70-80M p/sec if you run 8 cores on an i7.

Lennart

EDIT: This is only working on Linux.

Flatlander 2010-06-30 11:16

Taking 40-45T

(ETA 6th July)

henryzz 2010-06-30 15:43

[quote=gd_barnes;220263]All of that is what I use except the -f and -t8 switches. The default is for it to write factors to factors.txt so I don't use that unless I'm sieving multiple bases in batch and need different file names.

Thanks. I'll try the -t8 switch and see what happens. I thought the I7 would utilize all 8 cores anyway unless I tell it specific CPUs to use (which I do not) but the switch is worth a try.[/quote]
The -t switch only works on Linux, and AFAIK your i7 runs Windows.

gd_barnes 2010-06-30 17:48

[quote=henryzz;220296]The -t switch only works for linux and AFAIK your i7 is windows.[/quote]

That is correct. I also confirmed that both sr1sieve/sr2sieve as well as all other programs automatically use all 8 CPUs so I doubt any switch will make any difference.

I will test it with all 8 cores sieving as soon as a couple of things finish up within 2-3 days.

All of the rest of my machines are Linux quads; only 5 of which are good sievers. But the I7 smokes them all for sieving throughput.

mdettweiler 2010-06-30 18:03

[quote=gd_barnes;220307]That is correct. I also confirmed that both sr1sieve/sr2sieve as well as all other programs automatically use all 8 CPUs so I doubt any switch will make any difference.[/quote]
Eh...excuse me? Are you saying that running ONE instance of sr1sieve/sr2sieve fills up ALL 8 of your i7's cores automatically? I would think that you'd need to run 8 instances in order to do that.

What the -t switch does is have the program do [i]automatic[/i] multithreading. That is, it handles splitting up the range into small chunks itself, distributes them to the specified # of cores, collects the results, etc. The result is a small decrease in efficiency over running separate instances (due to overhead from communication between the cores), though often the gain in human-time savings is worth it.

gd_barnes 2010-06-30 18:52

[quote=mdettweiler;220313]Eh...excuse me? Are you saying that running ONE instance of sr1sieve/sr2sieve fills up ALL 8 of your i7's cores automatically? I would think that you'd need to run 8 instances in order to do that.

What the -t switch does is have the program do [I]automatic[/I] multithreading. That is, it handles splitting up the range into small chunks itself, distributes them to the specified # of cores, collects the results, etc. The result is a small decrease in efficiency over running separate instances (due to overhead from communication between the cores), though often the gain in human-time savings is worth it.[/quote]

Excuse yourself. :-) Did I [I]say[/I] that it [I]fills[/I] them up? lol No. I said that it [I]uses[/I] them. It utilizes all 8 CPUs as it needs. When I click on Task Manager and pull up the properties of sr1sieve or sr2sieve, it shows that all 8 CPUs are checked even though I have not used the -t switch. Clearly that is the Windows default for all programs; well, at least the ones that we use for prime searching because LLR, PFGW, and srxsieve all default to "using" all 8 CPUs for each instance.

I don't understand why this -t switch is even needed for Windows. What it appears to do is what Windows does automatically; that is utilize all 8 cores. Why is it needed? Perhaps Linux is not as sophisticated.

mdettweiler 2010-06-30 19:03

[quote=gd_barnes;220329]Excuse yourself. :-) Did I [I]say[/I] that it [I]fills[/I] them up? lol No. I said that it [I]uses[/I] them. It utilizes all 8 CPUs as it needs. When I click on Task Manager and pull up the properties of sr1sieve or sr2sieve, it shows that all 8 CPUs are checked even though I have not used the -t switch. Clearly that is the Windows default for all programs; well, at least the ones that we use for prime searching because LLR, PFGW, and srxsieve all default to "using" all 8 CPUs for each instance.

I don't understand why this -t switch is even needed for Windows. What it appears to do is what Windows does automatically; that is utilize all 8 cores. Why is it needed? Perhaps Linux is not as sophisticated.[/quote]
Ah, I think I see where the misunderstanding is coming in. The -t switch does not specify [I]which[/I] core that instance of sr*sieve should use, but rather [I]how many[/I] cores that instance should use. That is, if you use -t 2, then one instance of sr*sieve will fill up two cores (just like two instances without the -t switch).

On either Windows or Linux, one can still use the "old fashioned" method of dividing up p-ranges over multiple cores manually and running separate instances. However, this offers an alternative that automates that process somewhat.

Lennart 2010-06-30 20:38

Reserving 45T-46T Lennart

Note: In ~3 hr I will upload a new file sieved to 40T.

Lennart 2010-06-30 22:51

Reserving 46T-70T Lennart

Lennart 2010-06-30 23:39

Here is the new sieve file, sieved to 40T.


Lennart

gd_barnes 2010-07-01 04:20

[quote=mdettweiler;220331]Ah, I think I see where the misunderstanding is coming in. The -t switch does not specify [I]which[/I] core that instance of sr*sieve should use, but rather [I]how many[/I] cores that instance should use. That is, if you use -t 2, then one instance of sr*sieve will fill up two cores (just like two instances without the -t switch).

On either Windows or Linux, one can still use the "old fashioned" method of dividing up p-ranges over multiple cores manually and running separate instances. However, this offers an alternative that automates that process somewhat.[/quote]

I'm still confused. So if you specify -t8 and there are no other processes running on the machine, one instance of either sr1sieve or sr2sieve will run 8 times as fast because it FILLS UP all 8 cores? That seems to be what you are implying but it seems incorrect to me.

I still don't get it. Don't you have to run 8 instances of srxsieve? Or can you just run one instance and have it run 8 times as fast by using the -t8 switch? I would test this myself except my I7 is busy with 3 things that I don't want to stop right now.

(Well, technically, it would run ~5-6 times as fast since the equivalent of 5-6 cores is the full hyperthreading equivalent.)

mdettweiler 2010-07-01 04:46

[quote=gd_barnes;220361]I'm still confused. So if you specify -t8 and there are no other processes running on the machine, one instance of either sr1sieve or sr2sieve will run 8 times as fast because it FILLS UP all 8 cores? That seems to be what you are implying but it seems incorrect to me.

I still don't get it. Don't you have to run 8 instances of srxsieve? Or can you just run one instance and have it run 8 times as fast by using the -t8 switch? I would test this myself except my I7 is busy with 3 things that I don't want to stop right now.

(Well, technically, it would run ~5-6 times as fast since the equivalent of 5-6 cores is the full hyperthreading equivalent.)[/quote]
You are correct--running with -t 8 fills up all 8 cores and utilizes them to run (theoretically--in real life there's a slight performance hit) 8 times as fast.

Normally when you want to run a range of size x on y cores, you divide it into y chunks of size x/y, and run one instance of sr*sieve on each core, each running one of the smaller chunks. Each sr*sieve process is a rather straightforward process that can only talk to one core at a time, referred to as "single-threaded" in programming parlance. But when the -t flag is used, it goes into "multi-threaded" mode: when you use -t y, for instance, it splits into y+1 "subprocesses", one coordinating process and y workers. The coordinating process takes a small chunk of work (say, a p-range of 32000) and divides it up into y chunks, of p=32000/y size apiece. Each worker thread is given one of these. When it completes, it reports back to the main thread and tells it which factors it found. After each worker has finished its respective chunk, the coordinating process divides up the next p=32000 chunk similarly and begins again. Etc., etc. until the overall p-range is done.

The end effect is that (for instance) when sr*sieve is run with -t 4 on a quad, it keeps all four cores busy, and runs about four times as fast as a "regular" single instance would. The small performance hit I mentioned before is the trade-off to be considered against the extra effort of dividing up p-ranges manually and running 4 separate single-threaded instances of sr*sieve.
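To make the splitting concrete, here is a small illustration of the scheme as described above: the coordinator takes one block of the p-range at a time and cuts it into one chunk per worker. This is only a sketch of the description, not the actual sr*sieve code; the function names are made up, and the block size of 32000 is just the example figure used above.

```python
def split_block(block_start, block_size, workers):
    """Cut [block_start, block_start + block_size) into `workers` contiguous chunks."""
    base, extra = divmod(block_size, workers)
    chunks = []
    low = block_start
    for i in range(workers):
        # Spread any remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        chunks.append((low, low + size))
        low += size
    return chunks


def schedule(p_min, p_max, workers, block_size=32000):
    """Yield, block by block, the chunks the coordinator would hand out."""
    low = p_min
    while low < p_max:
        size = min(block_size, p_max - low)
        yield split_block(low, size, workers)
        low += size


if __name__ == "__main__":
    # One block of 32000 shared between 4 workers: 8000 apiece.
    print(split_block(0, 32000, 4))
    # Two blocks' worth of work handed out in two rounds.
    for round_chunks in schedule(0, 64000, 4):
        print(round_chunks)
```

The wait at the end of each round, until the slowest worker has reported back, is one place where the small performance hit could come from.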

Does that make sense now? :smile:

gd_barnes 2010-07-01 04:49

[quote=mdettweiler;220363]Does that make sense now? :smile:[/quote]

No, not at all. Please reword it completely.

:missingteeth::missingteeth:

Thanks for the detailed explanation. I will utilize it in the future if the performance hit isn't too large.

mdettweiler 2010-07-01 05:06

[quote=gd_barnes;220364]No, not at all. Please reword it completely.

:missingteeth::missingteeth:

Thanks for the detailed explanation. I will utilize it in the future if the performance hit isn't too large.[/quote]
Note that this only works under Linux because Geoff (the guy who wrote sr*sieve) hasn't yet figured out how to get multithreading to work on Windows. Unfortunately, IIRC he's essentially put sr*sieve development on hold lately for lack of time--so this may not be remedied any time soon.

Of late, a possible replacement for srsieve has emerged in the form of ppsieve, which is based on tpsieve (a twin prime sieve used over at TPS that's the current state of the art and which is in turn based on sr1sieve). Currently it only supports k*2^n+-1 (so only base 2 and power-of-2 bases), but I believe it supports multithreading on both Windows and Linux. I'm not sure how it compares to sr2sieve speed-wise but I believe it's at least about as fast. There's also a GPU version in beta that PrimeGrid has used to great effect with their Proth Prime Search recently. (I mentioned this before to you in an email when we were discussing GPUs--this would be useful at NPLB but of limited use for CRUS.)

Lennart 2010-07-01 20:53

Here are all factors for 45T-70T

Lennart

gd_barnes 2010-07-14 10:39

Chris,

Can you post your factors for P=40T-45T? I'll then remove all factors to P=70T from the file. Thanks.

Flatlander 2010-07-14 11:26

1 Attachment(s)
Sorry, I thought I had.

" Just post the file here in this thread or if it is too big..." lol

gd_barnes 2010-07-15 05:47

All factors to P=70T have now been removed from the sieve file in the 1st post.

Dougal 2010-08-07 18:03

Reserving 70-75T, should be done by Friday.

Dougal 2010-08-14 09:44

70-75T complete, 34 factors

[ATTACH]5564[/ATTACH]

taking 75-80T

Dougal 2010-08-19 16:50

75-80T complete, 47 factors
[ATTACH]5581[/ATTACH]

gd_barnes 2010-10-14 04:10

We're closing in on completion of R6 to n=1M so it's time to get the sieving moving again.

Reserving P=80T-90T.

gd_barnes 2010-10-16 08:12

P=80T-90T is complete.

Reserving P=90T-100T.

I've calculated optimum sieve depth somewhere near the P=170T-175T range. It is lower than originally expected due to the recent increased speed of LLR/PFGW.

Any help would be greatly appreciated to complete the sieving shortly after we complete testing to n=1M. I'm able to sieve about P=1T per day per core.

I've attached an updated file with factors removed up thru P=90T in the 1st posting.

gd_barnes 2010-10-18 07:07

P=90T-100T is complete.

gd_barnes 2011-01-11 06:45

It's time to put all 8 cores of the I7 to heavy sieving use:

Reserving P=100T-180T.

ETC is ~Jan. 21st.
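For anyone estimating their own completion date when reserving, the estimate is range size divided by total throughput. A minimal sketch, assuming the roughly P=1T per day per core rate mentioned earlier in the thread (the function name is made up for illustration):

```python
import math
from datetime import date, timedelta


def estimated_completion(start, range_t, cores, t_per_core_day=1.0):
    """Estimated completion date for a reservation.

    start:          date the sieving starts
    range_t:        size of the reserved P-range in units of 1T
    cores:          number of cores sieving
    t_per_core_day: throughput in T per core per day (assumed ~1 here)
    """
    days = math.ceil(range_t / (cores * t_per_core_day))
    return start + timedelta(days=days)


if __name__ == "__main__":
    # P=100T-180T on 8 cores, started 2011-01-11.
    print(estimated_completion(date(2011, 1, 11), 80, 8))  # 2011-01-21
```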

Afaik, this will be all the sieving that is needed here but I'll need to check that when I'm complete and have removed all of the factors.

Next I'll get hacking on continued sieving of the Riesel-Sierp base 2 even k's for n=1M-2.5M followed by Riesel-Sierp base 16 for n=250K-500K.

gd_barnes 2011-01-22 08:10

P=100T-180T is complete. Final calculations showed that the optimum sieve depth is in the P=170T-175T range so the sieving drive is now complete.

Thanks to all who contributed! :smile:

