The most efficient approach would be to simply grab whatever submissions any user wanted to make, stuff them directly into the to-be-processed queue, and tell the user "thanks for your data, it will be processed soon". But users tend to want to see that what they submitted was actually processed and what credit they got and all that. And, of course, if there are any errors then it's hard to notify the user if submissions are batch-processed later.
|
[QUOTE=James Heinrich;381033]The most efficient approach would be to simply grab whatever submissions any user wanted to make, stuff them directly into the to-be-processed queue, and tell the user "thanks for your data, it will be processed soon". But users tend to want to see that what they submitted was actually processed and what credit they got and all that. And, of course, if there are any errors then it's hard to notify the user if submissions are batch-processed later.[/QUOTE]
It occurred to me too that the process flow, if I were to design it today from scratch, would be:
1) User submits factor-found results manually or via API
2) Basic "sanity" check that the factor actually divides 2^p-1 (how long does that take anyway?)
3) Factor is stored "as is"
4) User can see their work credit
5) On some regular basis, those factors are checked for compositeness, with any resulting smallest prime divisor stored in a separate DB column. That part takes longer and should in no way delay the user, since it's not essential to their experience.

One thing I noticed when benchmarking the various factoring apps at hand (msieve, yafu, pari) was that they all tend to work better when given a batch of numbers instead of being launched once per number. The OS overhead of launching the program itself took *more* time than the program actually spent on a number that happened to be prime anyway. I think that's why YAFU was slower when checking prime factors: the executable takes a while to get going, but once it's running it's very fast. That was my take on it, anyway.

I also see some value in storing the factor as actually reported. There's a history of the user's transactions where the reported factor can be found, but it's not easily cross-referenced with anything (it's basically the text string you'd see in the results file, so it would take some parsing to extract the factor).

At any rate, good web design dictates that user experience is everything, so if something can be done later while still giving the user a warm fuzzy screen in response to their input, do that... that's what you want and that's what they want.
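The step-2 sanity check is very cheap: f divides 2^p - 1 exactly when 2^p ≡ 1 (mod f), which is a single modular exponentiation. A minimal Python sketch (the helper name is mine, not PrimeNet's actual code):

```python
# Sanity-check a submitted factor f of the Mersenne number 2^p - 1.
# f | 2^p - 1 iff 2^p mod f == 1, so one modular exponentiation answers
# the question in microseconds even for very large p.
def divides_mersenne(p: int, f: int) -> bool:
    return f > 1 and pow(2, p, f) == 1

# 2^11 - 1 = 2047 = 23 * 89
assert divides_mersenne(11, 23)
assert divides_mersenne(11, 89)
assert not divides_mersenne(11, 29)
```

Note the check accepts composite "factors" too (23*89 also divides 2^11 - 1), which is exactly why the compositeness pass in step 5 is still needed.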
:smile: If, for some reason, we discovered that users expect to see the prime factors of a factor they submitted, then a case could be made for doing it at submission time, but a) I don't think they expect that, and b) it doesn't show that anyway, does it? I don't use the manual reporting page so I don't know. LOL

One other thing struck me when looking at the API: it could be improved on the client and server side alike with a more standard format like XML or JSON. As it is, results are reported by the clients via URL parameters and a simple HTTP GET. If a client is requesting work from the server, again, it does a GET with some parameters indicating such, and the server responds in its own custom lingo. Of course, all of that was designed ages ago, before XML became a common way of organizing data and before JSON was on the scene or that popular. POSTing data to the server seems like a more efficient way to do this type of thing, even if there's actually something kind of simple and elegant about having all client activity in a GET with nothing more than a long obfuscated URI. :)

But the client communication method is neither here nor there... I point it out only because the way manual results in particular are handled makes it a little interesting to extract the data: the user id, exponent, any factors, curve settings, whatever else. There's some fancy regex that all works, but it seems like there should be a better way, and that's a structured format that could still be human-readable XML/JSON but with clearly delineated data. Call that a suggestion for API v6. :smile: |
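To illustrate the structured-format suggestion, here is a hypothetical JSON payload for a factor-found report. All field names here are invented for illustration; they are not part of any real PrimeNet API version:

```python
import json

# Hypothetical factor-found report -- field names are invented for
# illustration and do not match any real PrimeNet API.
report = {
    "user_id": "example_user",
    "exponent": 11,           # toy exponent: 2^11 - 1 = 2047
    "worktype": "TF",
    "factors": ["23", "89"],  # strings, so arbitrarily large factors survive transport
    "bit_lo": 1,
    "bit_hi": 12,
}

payload = json.dumps(report)  # what the client would POST

# Server side: structured parsing replaces the "fancy regex".
parsed = json.loads(payload)
factors = [int(f) for f in parsed["factors"]]
assert parsed["worktype"] == "TF"
assert all((2 ** parsed["exponent"] - 1) % f == 0 for f in factors)
```

The design point is simply that each field arrives clearly delineated, so the server-side extraction is a `json.loads()` plus key lookups instead of pattern-matching free-form result strings.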
[QUOTE=Madpoo;381036]3) Factor is stored "as is"
4) User can see their work credit 5) ... since it's not essential to their user experience.[/QUOTE]The only caveat to processing factors as-is, is that if a user submits a composite factor (benignly or maliciously) then their PrimeNet credit is based on the composite factor. It would have to be said that displayed user credit is taking submitted data at face value, and that credit (and indeed whether a factor is in fact new at all) may be adjusted after the fact once the submitted factors have been further authenticated. |
You could use one program to check if the submitted factor is prime and another to factor it if it's composite.
YAFU is probably the fastest way to factor numbers in this size range. I don't know what the fastest way to check if it's prime is. If you award credit based on the size of a factor then composite factors get *far* too much credit. See the "Did *GOD* join GIMPS" thread for an example. Chris |
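For the "check if it's prime" half, factors in the TF range (at most ~80 bits) are small enough that Miller-Rabin with the first 12 prime bases is a fully deterministic test (it is known to be exact for all n below about 3.3 x 10^24). A self-contained sketch, not tied to any particular tool:

```python
# Deterministic Miller-Rabin for n < ~3.3e24 (about 81 bits), which
# covers the whole TF factor range. Bases 2..37 are known to suffice.
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    bases = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    if n in bases:
        return True
    if any(n % b == 0 for b in bases):
        return False
    # write n - 1 as d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses compositeness
    return True

assert is_prime(23) and is_prime(89)
assert not is_prime(23 * 89)  # composite "factor" of 2^11 - 1
```

This answers the prime/composite question in microseconds per factor; only the factors flagged composite would then need to be handed to a factoring program.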
[QUOTE=James Heinrich;381037]The only caveat to processing factors as-is, is that if a user submits a composite factor (benignly or maliciously) then their PrimeNet credit is based on the composite factor. It would have to be said that displayed user credit is taking submitted data at face value, and that credit (and indeed whether a factor is in fact new at all) may be adjusted after the fact once the submitted factors have been further authenticated.[/QUOTE]
I don't understand much about how credits are assigned for things like that, so I'll take your word for it. I guess there could be some trade-off where the user gets their warm fuzzy up front, but if it turns out the factoring credit needs adjusting, that could simply be done later, with notes added saying adjustments were made. I would imagine adjustments are made in other cases anyway, like if a double/triple check finds that an original result had a problem? Or does the first-time LL check still get credit even though it was wrong? |
[QUOTE=chris2be8;381042]You could use one program to check if the submitted factor is prime and another to factor it if it's composite.
YAFU is probably the fastest way to factor numbers in this size range. I don't know what the fastest way to check if it's prime is. If you award credit based on the size of a factor then composite factors get *far* too much credit. See the "Did *GOD* join GIMPS" thread for an example. [/QUOTE] YAFU definitely seems like the best all-around choice... fastest at factoring composites. George is using the SIQS(number) method... I don't know how that's different from just FACTOR(number), except that it uses one specific method, whereas FACTOR() seems to try different things based on the size?

Just require clients to ensure any factors they find are actually prime. It's something they should be doing anyway, I would have thought. Still double-check that reported factors are prime, but if they're not, don't give 'em credit; punish those lazy factoring apps that can't do that one little step. :smile: (I think I'm kidding... but it kind of makes sense) |
[QUOTE=Madpoo;381043]I don't understand so much about how credits are assigned...
Or does the first time LL check still get credit even though it was wrong?[/QUOTE]Credit for factors depends on the effort spent to find the factor. So if a user reports a 100-bit TF factor (that's a composite of smaller factors) they'd get a bazillion GHz-days credit. To be later rescinded when it's found that all those factors were already known. LL credit is the same whether it's LL/DC/TC, and whether it was "right" or not upon later verification. Only blatant forgeries are purged (people spamming forged results to try and boost their credit). |
[QUOTE=Xyzzy;381023]Why does it matter if the factor is a prime factor or a composite factor?[/QUOTE]
Because of the credit, first of all. See the case of Axon, or whatever that user was called, who used to take known factors, multiply them together, and send the product as a new (huge) P-1 factor. If the server only checks "is it a factor", such a user gets a lot of "genuine" P-1 credit for free. The server has to make sure the factors are "new". It is not strictly necessary to check whether they are prime, and trial division by the previously known factors (or vice versa if the new factor is smaller) will do it. But the fact is that these factors are never so big; they are all of the form 2kp+1, so checking their primality is a piece of cake (milliseconds with pfgw), and factoring them is also easy. Who the heck is sending 140-digit factors? P-1 records are around 150 bits, which is close to 50 digits, and in the TF range there are no factors with more than 80 bits, which either yafu or pari can factor in the blink of an eye. So I suggest we keep checking and factoring "at submission time". This keeps the user happy and discourages the fakers, who see immediately that the submission is refuted. [edit: you make the "queued check" and I promise I will feed your queues with fake factors :razz:, or if not me, some cmd :cmd: (remember him??) will take care of it...] |
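The two cheap server-side checks described above (the 2kp+1 form and "is it actually new?") can be sketched in a few lines. Helper names are mine; also note a prime factor q of 2^p - 1 additionally satisfies q ≡ ±1 (mod 8):

```python
# Any factor q of 2^p - 1 (p an odd prime) has the form q = 2kp + 1,
# and prime factors also satisfy q ≡ ±1 (mod 8), so garbage submissions
# can be rejected instantly without any factoring at all.
def plausible_mersenne_factor(p: int, q: int) -> bool:
    return q % (2 * p) == 1 and q % 8 in (1, 7)

# "Is it new?" -- trial-divide out the already-known prime factors.
# If nothing is left, the submission is just a product of known factors.
def novel_part(q: int, known_factors: list[int]) -> int:
    for f in known_factors:
        while q % f == 0:
            q //= f
    return q  # 1 means no new information (an Axon-style fake)

assert plausible_mersenne_factor(11, 23)   # 23 = 2*1*11 + 1, 23 ≡ 7 (mod 8)
assert not plausible_mersenne_factor(11, 25)
assert novel_part(23 * 89, [23, 89]) == 1  # fake composite submission
```

Both checks are a handful of modular operations per known factor, so doing them at submission time costs essentially nothing.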
[QUOTE=James Heinrich;381046]Credit for factors depends on the effort spent to find the factor. So if a user reports a 100-bit TF factor (that's a composite of smaller factors) they'd get a bazillion GHz-days credit. To be later rescinded when it's found that all those factors were already known.[/QUOTE]
OK, but I would argue that this should be a considered balance between system efficiency and "user experience". If the "Is this a real factor of MCp" (MCp being "Mersenne [Prime] Candidate p") run is inexpensive, then do it at submission time. If the "Is this factor composite" run is more expensive then batch it and adjust credit if needed. (Perhaps at the same time fix the "GPU TF result assumed to be via P-1" bug.) Conversely, perhaps with the new server hardware this will no longer be an issue. |
[QUOTE=chalsall;381055]If the "Is this a real factor of MCp" (MCp being "Mersenne [Prime] Candidate p") run is inexpensive, then do it at submission time. If the "Is this factor composite" run is more expensive then batch it and adjust credit if needed.
(Perhaps at the same time fix the "GPU TF result assumed to be via P-1" bug.)[/QUOTE]The new manual results page (coming sometime soon to a PrimeNet near you!) will fix the false-result-type issues. I will consider the issue of how practical it is to defer prime-testing the submitted factors and later adjusting credit; it's desirable but adds complexity. |
[QUOTE=James Heinrich;381046]Credit for factors depends on the effort spent to find the factor. So if a user reports a 100-bit TF factor (that's a composite of smaller factors) they'd get a bazillion GHz-days credit. To be later rescinded when it's found that all those factors were already known.[/QUOTE]
It makes sense, I guess. I'm guessing that normal factoring is likely to find the smallest prime factor, and it's the ECM and P-1 methods that might find a factor that could be composite... would that be generally true? Maybe that's a dumb question... I haven't actually thought about the workings of all of this in years and years.

I looked at the data for all of the P-1 factors, 9,787 of them, and ran them all through YAFU. Some are prime factors, some are composite. The actual table where factors are stored seems to have only the prime factors in there. I still think it might be useful to store the reported factor in that table too, because some inferences could be made, like whether certain users (or certain methods/CPUs/etc.) consistently report composites for some reason. That kind of info would help determine the best way to check certain types of factor-found results, to help with system resource usage. That's a DB design decision for George though, so I'm just tossing the idea out there. Maybe now, with some elbow room, there's more potential for doing some DB redesigns without worrying about running out of disk space. But I'll probably PM George and mention some other DB ideas if he's interested at all.

[QUOTE=James Heinrich;381046]LL credit is the same whether it's LL/DC/TC, and whether it was "right" or not upon later verification. Only blatant forgeries are purged (people spamming forged results to try and boost their credit).[/QUOTE]

That makes sense too. If someone repeatedly submitted LL results that ended up being wrong, then you'd know something was up.

FYI, factoring those 9,787 P-1 factors with YAFU took 31 minutes, 31.5 seconds. :) Quite a few were composite. |
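On the batching point raised earlier in the thread: the win comes from classifying all ~10k factors in one process instead of paying executable start-up cost per number. A toy sketch of the one-pass structure, using naive trial division purely as a stand-in for a real tool like YAFU:

```python
# One process, one pass over the whole batch -- the structure that avoids
# per-number launch overhead. Trial division up to a small bound is only a
# placeholder here; a real run would call YAFU or a proper primality test.
def has_small_factor(n: int, bound: int = 10_000) -> bool:
    d = 2
    while d * d <= n and d <= bound:
        if n % d == 0:
            return True
        d += 1
    return False

# In production this list would be the 9,787 stored P-1 factors.
batch = [23, 89, 23 * 89, 2047 * 89]
composite_flags = [has_small_factor(n) for n in batch]
assert composite_flags == [False, False, True, True]
```

The per-number work here is trivial; the point is only that the loop lives inside a single long-lived process, which is the same reason feeding a batch file to one YAFU instance beats launching it 9,787 times.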