LinuxQuestions.org
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Old 11-22-2017, 09:13 AM   #1
Linux_Kidd
Member
 
Registered: Jan 2006
Location: USA
Posts: 737

Rep: Reputation: 78
/dev/urandom Question


centOS 6.8 (32bit running in virtual box)

so, i needed some random numbers 10 digits wide

i ran this
Code:
cat /dev/urandom | tr -dc '0-9' | fold -w 10 | head -n 1000000 >> rand.out
a million rand numbers 10 digits wide.

the rand pool is 10 billion values (0 up to all 9's), and out of a million rand numbers i get 51 dupes.

1 million out of 10 billion seems like a small amount, and 51 dupes in 1 million seems like a lot.

is this a urandom issue, or am i seeing dupes in my final output due to tr ?
 
Old 11-22-2017, 12:29 PM   #2
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,137
Blog Entries: 6

Rep: Reputation: 1826
Examples:

1 million random numbers
Code:
for i in {1..1000000}; do echo "$(((RANDOM % 10) * 1234567891))"; done >> rand.txt
Not so great
Code:
uniq -d rand.txt | wc -l
89820
Code:
for i in {1..1000000}; do echo "$(((RANDOM % 123) * 123456789))"; done >> rand.txt
A little better
Code:
uniq -d rand.txt | wc -l
7915
2.5 million random numbers
Code:
for i in $(od -An -tu4 -N10000000 /dev/urandom); do echo $i >> rand.txt; done
Much better
Code:
uniq -d rand.txt | wc -l
706
You could make them all unique with

Code:
sort -u rand.txt > rand2.txt

uniq -d rand2.txt | wc -l
0
that leaves 2499283 unique numbers

You could then randomize that list with
Code:
sort -R rand2.txt > rand3.txt
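
One caveat on the counts above: uniq only compares adjacent lines, so running `uniq -d` on an unsorted file reports back-to-back repeats rather than every repeated value. A minimal sketch of a position-independent count, assuming standard sort, uniq and wc; the function name `count_dupes` is made up for illustration:

```shell
# count_dupes FILE
# Print how many distinct values occur more than once in FILE.
# uniq only compares adjacent lines, so the input is sorted first.
count_dupes() {
    sort "$1" | uniq -d | wc -l
}

# Small demonstration on a throwaway file with no adjacent repeats.
tmp=$(mktemp)
printf '%s\n' 7 3 7 5 3 7 > "$tmp"
uniq -d "$tmp" | wc -l     # adjacent-only check: finds nothing here
count_dupes "$tmp"         # sorted check: finds both 3 and 7
rm -f "$tmp"
```

On a real run this would be called as `count_dupes rand.txt`.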
 
Old 11-22-2017, 01:58 PM   #3
Linux_Kidd
Member
 
Registered: Jan 2006
Location: USA
Posts: 737

Original Poster
Rep: Reputation: 78
thanks teckk for the reply.

i did remove all numbers from my list that were dupes (51x2 as each dupe had a twin, etc), so i removed 102 numbers, now my list is completely unique.

what system did you run that on?


but my question still remains, do i see dupes from urandom (doesn't seem likely), or do they get created when using fold (i meant tr or fold in 1st post)?
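
One way to rule out tr and fold is to feed them a known input and look at what comes out. A minimal sketch, assuming the usual tr and fold from coreutils; the function name `digits_in_chunks` is hypothetical:

```shell
# digits_in_chunks WIDTH
# Keep only the digits from stdin and cut them into WIDTH-digit lines,
# the same filtering and chunking the pipeline in the first post performs.
digits_in_chunks() {
    tr -dc '0-9' | fold -w "$1"
}

# Known input: the digits 1..6 mixed with letters.
# Expect the two chunks 123 and 456, each digit appearing exactly once.
printf 'a1b2c3d4e5f6' | digits_in_chunks 3
```

tr only deletes the non-digit bytes and fold only inserts line breaks, so neither step can repeat data that was not already repeated in the stream.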
 
Old 11-22-2017, 04:00 PM   #4
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,137
Blog Entries: 6

Rep: Reputation: 1826
You'll get duplicate numbers whenever you generate a set of random numbers.
Try a few ways.

Examples:

Code:
while :; do echo "${RANDOM: -1}"; sleep .5; done

while :; do r=$RANDOM; echo "$r" | cut -c "${#r}"; sleep .5; done

while :; do shuf -i 1-10 -n 1; sleep .5; done

while :; do tr -cd 0-9 < /dev/urandom | head -c 1; sleep .5; done

while :; do grep -m1 -ao '[0-9]' /dev/urandom | head -n1; sleep .5; done

while :; do echo $RANDOM | awk '{print substr($0,length,1)}'; sleep .5; done

while :; do awk -v min=1 -v max=10 'BEGIN{srand(); print int(min+rand()*(max-min+1))}'; sleep .5; done
 
Old 11-22-2017, 04:56 PM   #5
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,699

Rep: Reputation: 5895
I don't claim to be an expert and only have a fundamental understanding of how /dev/random works.

In a nutshell, pseudo random number generators are not totally random. If you generate enough numbers the sequence will eventually repeat, and they typically use a seed so that the sequence is not the same each time. I would say that is why the first method has the most duplicates, in addition to its small range.

As to why /dev/urandom generates duplicates, it may be due to running out of entropy caused by the large amount of numbers being generated.

https://en.wikipedia.org/wiki/Pseudo...mber_generator
https://stackoverflow.com/questions/...ropy-pool-work
https://lwn.net/Articles/261804/
 
Old 11-22-2017, 08:54 PM   #6
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3940
It is actually normal to wind up with a few duplicate numbers if you generate enough of them, and especially if the numeric range of the numbers is small. "51 dupes in a million" is perfectly fine.

What you don't want – and will not get, unless you reset the seed – is a duplicated sequence.
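
The seed point can be seen with bash's own $RANDOM, where assigning a value to RANDOM seeds the generator. This is a sketch for bash only; it does not apply to /dev/urandom, and the seed value 42 is arbitrary:

```shell
# In bash, assigning to RANDOM seeds the generator.
# The same seed gives the same sequence within one bash version.
RANDOM=42
first="$RANDOM $RANDOM $RANDOM"

RANDOM=42
second="$RANDOM $RANDOM $RANDOM"

# Both runs started from the same seed, so the two strings match.
[ "$first" = "$second" ] && echo "same sequence after reseeding"
```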
 
Old 11-23-2017, 11:23 PM   #7
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 529

Rep: Reputation: 319
By this test, your numbers are random. With perfectly random numbers, we expect about 50 duplicates. This is an example of the "Birthday Problem."

http://mathworld.wolfram.com/BirthdayProblem.html

Here's how it works: The probability of any two numbers matching is 1 in 10 billion. That sounds pretty low, but the set of one million numbers defines roughly 500 billion pairs.

(Pair the first number with any of the next 999999 numbers, then pair the second number with any of the remaining 999998 numbers, and so on, until the second-to-last number is paired with the last number. This gives a total of 1000000*999999/2=499999500000 pairs.)

We can estimate the number of duplicates by multiplying the number of pairs sampled by the probability of one pair matching: (1000000*999999/2)*(1/10000000000)=49.99995.

Because it is random, the exact count will vary -- it is possible, though not likely, to have no matches, and likewise it is possible but extremely unlikely that all 1 million numbers will be the same -- but the average number of duplicates will be about 50.
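
The estimate above can be reproduced with a one-line calculation. A minimal sketch, assuming awk is available; the function name `expected_dupes` is made up for illustration:

```shell
# expected_dupes N RANGE
# Expected number of matching pairs among N values drawn uniformly
# from RANGE possible values: (N*(N-1)/2) * (1/RANGE).
expected_dupes() {
    awk -v n="$1" -v r="$2" 'BEGIN { printf "%.5f\n", (n * (n - 1) / 2) / r }'
}

# One million 10-digit numbers out of 10 billion possible values.
expected_dupes 1000000 10000000000
# prints 49.99995
```

Changing the arguments shows how quickly the expected count grows with the sample size, since the number of pairs grows roughly with the square of N.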

Last edited by Beryllos; 11-23-2017 at 11:47 PM. Reason: corrected a small inaccuracy in the explanation
 
  

