#1
Aug 2002
2·3²·13·37 Posts
We expect the answer to this question is beyond our comprehension, but maybe we'll get lucky…

Given two strings of numbers, is there a way to determine which is more random? Thanks!

PS - This question arose from reading this: http://www.sciencenews.org/articles/20041204/bob9.asp
#2
Dec 2003
Hopefully Near M48
2·3·293 Posts
Could you be more precise with your question please? What do you mean by the "randomness" of a number?
Do you mean: "How do you distinguish between a number generated by a random process and one generated by a pseudorandom process?"

Last fiddled with by jinydu on 2004-12-04 at 01:30
#3
Aug 2002
2×3²×13×37 Posts
We mean:

1, 2, 3, 4, 5
1, 4, 3, 5, 2
1, 1, 1, 1, 1

Which sequence is more random? In that article they mention that some kinds of random generators are not as random as others. How do they determine which is more random? Maybe we're asking the wrong question!
#4
Dec 2003
Hopefully Near M48
2·3·293 Posts
I think you need a more precise definition of "random", such as "passes statistical test X". That is apparently the definition the article uses.
#5
Jul 2004
Potsdam, Germany
3×277 Posts
I think a common criterion for randomness could be P(x_i = n) = P(x_i = n | x_{i-1} = m), with x_i being the i-th digit and n, m digit values - i.e. knowing the previous digit tells you nothing about the current one.

Note: A more sophisticated (though harder to understand) version would be P(x_i = n) = P(x_i = n | x_j = m_j for any subset of the other positions j, with 1 <= j <= k, j != i, and k = length(numberString)), where m_j is the value of the j-th digit - so that the probability of the i-th digit being n is completely independent of the value assignments to all the other digits. I'm not completely sure this is all that's needed to be completely random, though...

When this holds, one could run a statistical test (I currently don't know which one - a t-test, a chi-square test, ...) to estimate how likely it is that this independence is really the case. The more likely it is, the more random the string of numbers is... (A rough sketch of one such check follows below.)

Last fiddled with by Mystwalker on 2004-12-04 at 03:00
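A rough sketch of how one might check the simpler criterion above (the current digit being independent of the previous one). This assumes Python with numpy/scipy; the function name and example strings are purely illustrative, not anything from the thread:

```python
# Sketch: test whether digit i looks independent of digit i-1 by running a
# chi-square contingency test on the counts of consecutive-digit pairs.
import random
from collections import Counter

import numpy as np
from scipy.stats import chi2_contingency

def adjacent_digit_independence(digits: str) -> float:
    """Return the chi-square p-value for independence of adjacent digits."""
    pair_counts = Counter(zip(digits, digits[1:]))          # (previous, current) pairs
    symbols = sorted(set(digits))
    # Contingency table: rows = previous digit, columns = current digit.
    table = np.array([[pair_counts[(a, b)] for b in symbols] for a in symbols])
    _, p_value, _, _ = chi2_contingency(table)
    return p_value  # a very small p-value suggests adjacent digits are dependent

rng = random.Random(1)
pseudo = "".join(str(rng.randrange(10)) for _ in range(2000))  # pseudorandom digits
periodic = "1435214352" * 200                                  # obviously patterned

print(adjacent_digit_independence(periodic))  # ~0: strongly dependent
print(adjacent_digit_independence(pseudo))    # typically not small
```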
#6
Aug 2002
2⁶×5 Posts
There are many, many randomness tests. I've got a bit of experience with them (implemented a test suite). If you want some specific tests, ask.
(Most of them can't say anything useful unless you have lots and lots of input.)
#7
Aug 2002
Portland, OR USA
2·137 Posts
As ColdFury said, lots of input is important. If you are given a relatively small subset of a very large range of possible values, it is difficult to measure the randomness.

Let's start at the opposite extreme. Consider the set of three-digit numbers (100-999). Given ten different ordered lists of that set, it is fairly easy to compare their relative randomness against a host of distribution criteria. Which criteria are important for random lists? I don't know - anybody? What kinds of things do your tests look for, ColdFury? Is your answer likely to resemble the glossary of a statistics textbook?
#8
Aug 2002
2⁶·5 Posts
Here are a number of common tests for randomness. Some of these go by different names and, even worse, different tests sometimes go by the same name. First of all, we treat the random sequence as an undifferentiated string of bits, which we can partition any way we see fit.

1) Frequency test - Break the bit string into N-bit blocks and interpret them as numbers from 0 to (2^N)-1. Count the occurrences of each number and perform a chi-square test to compare them with an ideal uniform distribution. (A rough sketch of this one appears after the list.)

2) Overlapping frequency test - Same as above, but overlap the bits. The distribution calculations are significantly more complicated here, since the probabilities are no longer independent. Usually only done for a fixed block size of 2 bits.

3) Auto-correlation test - Compute the auto-correlation of the bit stream with a copy of itself shifted by some number of bits. An ideal random sequence should be uncorrelated with itself no matter the shift amount.

4) Runs test - Count the runs of 0s and 1s in the sequence. A run is a block of consecutive identical characters; 1111 would be a run of four 1s. Perform a chi-square test based on the expected distribution of runs. (Exponential, I think? I don't remember off the top of my head.)

5) Universal statistical test - A bit hard to explain; it measures the entropy of the sequence. In practice not too useful, as it requires massive amounts of bits to detect anything.

6) Poker test - Make a "deck" of 2^N different types of cards. Break the bit sequence into N-bit blocks and treat each block as a card. Deal out a "hand" of R cards, count the number of pairs in the hand, and record it. Repeat this many times and compare the distribution of counts using chi-square. (The probabilities are a bit nasty and use Stirling numbers of the second kind.)

7) Coupon collector's test - Break the bit string into N-bit blocks and define a set of 2^N "coupons". Read the blocks one by one and mark a coupon every time you see one you haven't seen before. Count how many blocks it takes to collect the full set. Repeat this process and record the counts in a table; they should follow a known distribution you can compare against with chi-square.

8) Permutations test - Break the bit string into N-bit blocks and read them in order into a 2^N-tuple. If a block repeats within the tuple, discard the tuple and start again. A tuple with no repeats is a permutation; record it. Repeat, recording the occurrences of each permutation - there are (2^N)! of them - and use chi-square to compare against the known distribution.

9) Maximum-of-t test - Break the bit string into N-bit blocks and read them into R-tuples. Record the maximum number in each tuple. These maxima should follow a known distribution.

10) Birthday spacings test - Make an abstract "year" of 2^N days. Break the bit string into N-bit blocks and interpret each block as a day in this year. Read R such "birthdays" and compute the spacings between them. Repeat this many times; the spacings should follow a known distribution.

There are even more tests (which I haven't implemented personally), like the monkey and gorilla tests, the matrix rank test, the linearly independent sequence test, and the linear shift register test. In addition there are many ad-hoc tests, such as running any number of compression schemes on the bit string: an ideal random sequence should barely compress at all.
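As a rough illustration of test 1 above, here is a minimal sketch of the frequency test. This is not ColdFury's code; it assumes Python with scipy available, and the function name and block size are just illustrative choices:

```python
# Minimal sketch of the frequency test: split the bit string into N-bit blocks,
# count how often each block value occurs, and compare against a uniform
# distribution with a chi-square goodness-of-fit test.
import random
from collections import Counter

from scipy.stats import chisquare

def frequency_test(bits: str, block_size: int = 4) -> float:
    """Return the chi-square p-value; `bits` is a string of '0'/'1' characters."""
    usable = len(bits) - len(bits) % block_size
    blocks = [int(bits[i:i + block_size], 2) for i in range(0, usable, block_size)]
    counts = Counter(blocks)
    observed = [counts[value] for value in range(2 ** block_size)]
    # Under the null hypothesis every block value is equally likely.
    _, p_value = chisquare(observed)
    return p_value

# Example (illustration only, using Python's built-in generator):
bits = "".join(random.choice("01") for _ in range(40000))
print(frequency_test(bits))          # usually well away from 0
print(frequency_test("01" * 20000))  # alternating bits give a single block value, p ~ 0
```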
The probability that two random integers are relatively prime is 6/π² (about 0.61), so take two numbers out of the sequence and compute their GCD, incrementing a counter if they're relatively prime. After a large number of trials the ratio of successes to attempts should approach 6/π². There are many, many others. Look at the DIEHARD tests in the literature for more ideas.
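A quick sketch of that GCD check, again just illustrative and assuming Python; the target constant is 6/π² ≈ 0.6079, the probability that two uniformly chosen integers are coprime:

```python
# Sketch of the GCD test: pair up numbers from the sequence and measure how
# often each pair is relatively prime; for truly random integers this
# fraction should approach 6/pi^2 ~= 0.6079.
import math
import random

def coprime_fraction(numbers) -> float:
    """Fraction of consecutive pairs in `numbers` that are relatively prime."""
    pairs = list(zip(numbers[0::2], numbers[1::2]))
    coprime = sum(1 for a, b in pairs if math.gcd(a, b) == 1)
    return coprime / len(pairs)

sample = [random.randrange(1, 10**9) for _ in range(100_000)]
print(coprime_fraction(sample), 6 / math.pi**2)  # both should be close to 0.6079
```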
#9
"Richard B. Woods"
Aug 2002
Wisconsin USA
2²×3×641 Posts
It may be of interest here to note that Eric Weisstein's MathWorld entry for "random" at http://mathworld.wolfram.com/Random.html is still essentially blank. :)
#10
Aug 2002
Portland, OR USA
2·137 Posts
It can't be a good sign that my response to reading ColdFury's list of tests was "Cool, these are impressive tests."

Or that I recognized several of the test names - and even for the new ones, the name gave me a pretty good idea of what the test was before I read its description.
Similar Threads

| Thread | Thread Starter | Forum | Replies | Last Post |
| Randomness | ShiningArcanine | Math | 12 | 2008-05-22 21:52 |
| More trouble in randomness | Orgasmic Troll | Math | 11 | 2005-04-15 15:26 |
| What is Randomness and does relative randomness make sense? | sakreha | Data | 1 | 2005-02-04 23:29 |
| Measuring the total power consumed by a computer | eepiccolo | Hardware | 13 | 2003-08-23 23:55 |
| Temp measuring question. | Deamiter | Hardware | 5 | 2002-10-04 02:57 |