mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Math

Reply
 
Thread Tools
Old 2004-12-03, 23:17   #1
Xyzzy
 
Xyzzy's Avatar
 
Aug 2002

2·32·13·37 Posts
Default Measuring randomness?

We expect the answer to this is beyond our comprehension, but maybe we'll get lucky…

Given two strings of numbers, is there a way to determine which is more random?

Thanks!

PS - This question arose from reading this:

http://www.sciencenews.org/articles/20041204/bob9.asp
Xyzzy is offline   Reply With Quote
Old 2004-12-04, 01:27   #2
jinydu
 
jinydu's Avatar
 
Dec 2003
Hopefully Near M48

2·3·293 Posts
Default

Could you be more precise with your question please? What do you mean by the "randomness" of a number?

Do you mean: "How do you distinguish between a number generated by a random or pseudorandom event?"

Last fiddled with by jinydu on 2004-12-04 at 01:30
jinydu is offline   Reply With Quote
Old 2004-12-04, 02:08   #3
Xyzzy
 
Xyzzy's Avatar
 
Aug 2002

2×32×13×37 Posts
Default

We mean:

1, 2, 3, 4, 5
1, 4, 3, 5, 2
1, 1, 1, 1, 1

Which sequence is more random?

In that article they mention that some kinds of random generators are not as random as others. How do they determine which is more random?

Maybe we're asking the wrong question!

Xyzzy is offline   Reply With Quote
Old 2004-12-04, 02:24   #4
jinydu
 
jinydu's Avatar
 
Dec 2003
Hopefully Near M48

2·3·293 Posts
Default

I think you need a more precise definition of random, such as (passes statistical test x). That is apparently the definition that the article uses.
jinydu is offline   Reply With Quote
Old 2004-12-04, 02:55   #5
Mystwalker
 
Mystwalker's Avatar
 
Jul 2004
Potsdam, Germany

3×277 Posts
Default

I think a common criteria for randomness could be P(i=n) = P(i=n|i-1=m) with i being the i-th digit and n,m numbers.

note:
A more sophisticated (though harder to understand) version would be P(i=n) = P(i=n|PS([i1=m_1, i2=m_2, ..., ij=m_j, ik=m_k])) with PS as the power set, ij as the j-th digit, 0 < j =< k, j != i, k = length(numberString) and m_j as the value of the j-th digit, so that the percentage of i being n is completely independent from all number assigments to the other digits. I'm not completely sure this is all what's needed to be completely random, though...

When this is the case, one could make a test (I currently don't know which one - T-test, p-test, ...) to get the likeliness of this randomness being the case. The more likely it gets, the more random the string of numbers is...

Last fiddled with by Mystwalker on 2004-12-04 at 03:00
Mystwalker is offline   Reply With Quote
Old 2004-12-04, 04:50   #6
ColdFury
 
ColdFury's Avatar
 
Aug 2002

26×5 Posts
Default

There are many, many randomness tests. I've got a bit of experience with them (implemented a test suite). If you want some specific tests, ask.

(Most of them can't say anything useful unless you have lots and lots of input.)
ColdFury is offline   Reply With Quote
Old 2004-12-04, 14:23   #7
Maybeso
 
Maybeso's Avatar
 
Aug 2002
Portland, OR USA

2·137 Posts
Default

As Coldfury said, lots of input is important. If you are given a relatively small subset of a very large range of possible values, it is difficult to measure the randomness.

Let's start at the opposite extreme. Consider the set of three digit numbers (100-999). Given ten different ordered lists of the set, it is fairly easy to measure the relative randomness of a host of distribution criteria.

Which criteria are important for random lists? I don't know, anybody? What kind of things do your tests look for Coldfury? Is your answer likely to resemble the glossary of a statistics textbook?
Maybeso is offline   Reply With Quote
Old 2004-12-05, 06:41   #8
ColdFury
 
ColdFury's Avatar
 
Aug 2002

26·5 Posts
Default

Here's a number of common tests for randomness:

Some of these go by different names, and even worse, different tests sometimes go by the same name.

First of all, we treat the random sequence as an undifferentiated string of bits, which we can partition anyway we see fit.

1) Frequency test - Break the bit string into N-bit blocks and interpret them as numbers from 0 to (2^N)-1. Count the occurances of each number and perform a chi-square test to compare them with an ideal uniform distribution.

2) Overlapping frequency test - Same as above, but overlap the bits. The distribution calculations are signifigantly more complicated with this, since the probabilities are no longer independent. Usually only done for a fixed block size of 2 bits.

3) Auto-correlation test - Perform the auto-correlation of the bit stream with itself shifted a number of bits. An ideal random sequence should be perfectly uncorrelated no matter the shift amount.

4) Runs test - Count the number of runs of 0s and 1s in the sequence. A run is a sequence of a block of the same character. 1111 would be a run of 4 1s. Perform a chi-square based on the expected distribution of runs. (Exponential I think? I don't remember off the top of my head)

5) Universal statistical test - A bit hard to explain. Measures the entropy of the sequence. In practice not too useful, it requires massive and massive amounts of bits to detect anything.

6) Poker test - Make a "deck" of 2^N different types of cards. Break the bit sequence into N-bit blocks. Treat each block like a card. Deal out a "hand" of R cards. Count the number of pairs in your hand and record it. Repeat this many many times and compare the distribution using chi-square. (The probabilities are a bit nasty and use stirling's second number).

7) Coupon Collector's test - Break the bit string into N-bit blocks. Define a set of 2^N "coupons". Read each block one by one and mark a coupon everytime you find one you haven't seen before. Count how many times it takes before you collect the full set. Repeat this process and record the number of attempts it takes into a table. These attempt numbers should form a known distribution you can use the chi-square to compute.

8) Permutations test - Break the bit string into N-bit blocks. Define a 2^N-tuple of N-bit blocks. Read the blocks in order into a tuple. If there's a repeat of a block in the tuple, discard the tuple, and repeat. If you have a tuple of no repeats, then the tuple if a permutation. Record it. Repeat and record the occurances of each permutation. There are (2^N)! such permutations. Use chi-square to compare to known distribution.

9) Maximum T-test - Break the bit string into N-bit blocks and read into R-tuples. Record the maximum number in each tuple. The distribution of maximum numbers should form a known distribution.

10) Birthday spacings test - Make an abstract "year" of 2^N days. Break the bit string into N-bit blocks and interpret each block as a day in this year. Read R such "birthdays". Then compute the distance between each birthday. Repeat over and form a distribution of distances. They should form a known distribution.

There are even more tests (which I haven't implemented personally), like the Monkey and Gorilla tests. Also things like the matrix rank test or the linear independent sequence test, or the linear shift register test.

In addition there are many more ad-hoc tests, such as running any number of compression schemes on the bit string. An ideal random sequence should barely compress at all. The probability of two random numbers being relatively prime is something like (Pi/6) (maybe squared(?)), so take two random numbers out of the sequence and compute their GCD. Increment a counter if they're relatively prime. After a long number of trials the ratio of successes to attempts should approach Pi/6. There are many, many others.

Look at the DIEHARD tests in the literature for more ideas.
ColdFury is offline   Reply With Quote
Old 2004-12-06, 19:26   #9
cheesehead
 
cheesehead's Avatar
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

22×3×641 Posts
Default

It may be of interest here to note that Eric Weisstein's MathWorld entry for "random" at http://mathworld.wolfram.com/Random.html is still essentially blank. :)
cheesehead is offline   Reply With Quote
Old 2004-12-06, 20:05   #10
Maybeso
 
Maybeso's Avatar
 
Aug 2002
Portland, OR USA

2·137 Posts
Default

Quote:
Originally Posted by cheesehead
It may be of interest here to note that Eric Weisstein's MathWorld entry for "random" at http://mathworld.wolfram.com/Random.html is still essentially blank. :)
Hmm, shouldn't the entry be different each time you look at it?

It can't be a good sign that my response to reading ColdFury's list of test was "Cool, these are impressive tests."

Or that I recognized several of the test names. And even the new ones, the name gave me a pretty good idea what it was before I read it.
Maybeso is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
Randomness ShiningArcanine Math 12 2008-05-22 21:52
More trouble in randomness Orgasmic Troll Math 11 2005-04-15 15:26
What is Randomness and does relative randomness make sense? sakreha Data 1 2005-02-04 23:29
Measuring the total power consumed by a computer eepiccolo Hardware 13 2003-08-23 23:55
Temp measuring question. Deamiter Hardware 5 2002-10-04 02:57

All times are UTC. The time now is 16:12.


Fri Jul 7 16:12:07 UTC 2023 up 323 days, 13:40, 0 users, load averages: 1.91, 1.48, 1.27

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔