View Single Post
Old 2020-04-24, 03:18   #4
rogue's Avatar
Apr 2003
Between here and the

22×33×5×11 Posts

Originally Posted by Citrix View Post
For #1) I was referring to all the various srXsieveY versions. They try to implement various different algorithms/optimizations that most users do not need or unnecessarily makes the program slow. I agree with making the code as straight forward as possible.
Which is why most of that is gone in srsieve2.

We need two versions/logic paths of srsieve2 - one for heavy weight and large N ranges searches and one for light weight and small N range searches. These require different optimizations. One size fits all does not work here.

Most projects/people need the light weight/small N range version but srsieve was designed for heavy weight/large N sequences. All the CRUS work is low weight.

For a heavy weight and large N range the overhead might be justified for optimizations but not for the light weight k and smaller N range as it only slows the program down.

There are 3 main optimizations/algorithms that need to be tinkered:
1. Deciding on the best Q
2. Power residue
3. Legendre symbols
Note:- There is overlap between 2 and 3.
The Legendre table is easy to generate, but for larger k (> 1e9) it can take more time and a lot of memory to build the table.

Optimizations will be different when many k's are being sieved simultaneously.
Optimizations will be different for 2-3k sieve and when 100k are being sieved. These require different logic paths.

The program needs to have different logic optimization paths based on number of k and size of range and weight of k.

#3) I think we are taking about different things. Even though I called it a hashmap - I meant more of a bitmap. Can you explain what you implemented in the past.
What is implemented in srsieve2 is a C++ friendly version of what was in srsieve. It is very likely that something faster could be written. I don't recall what I coded, only that it was slower. I never dug into why.

#4) This would be easy to implement for 1-2 k and low weight/small N range as memory requirements are minimal.
(I am not sure how to do this for large number of k, with large N range and heavy weight K using the BSGS on GPU).

I will stick with improving srsieve2.exe first. We might be able to make other sieves faster as well (I haven't looked at the code).

Anyone else has any suggestions?
AFAIAC, srsieve2 is intended to replace srsieve, sr1sieve, and sr2sieve. It is faster than srsieve for most ranges I have worked with. It is about as fast as sr2sieve (when not using Legendre tables). I suspect that it can be faster than sr2sieve without all of the complexities of sr2sieve. Granted I have the advantage of only targeting 64-bit CPUs and have AVX at my disposal or most modern CPUs, although AVX only gives a speed boost to certain sieves. I haven't played with AVX2 since I have one computer (at most) that supports it.

I'm thinking that a faster implementation of the HashTable class would be a good start.
rogue is offline   Reply With Quote