i have a problem, actually, it is research that i am conducting. i want to use python (because i am trying to learn it) to count numbers, from 0xffffffffffff to 0xffffffffffffffffffff and print them in a list, in order, in hex only (1A4D instead of 0x1A4D) with capital letters to a text file, separated by commas. it does not matter how big the file is going to be, as long as they are separated by commas (or anything really, they are going to be used as input for another program).
i have some code that finally counts, then prints the numbers to a file, but it prints them in decimal, not hex, and i can't figure out how to get the separators in. some have told me that it cannot work due to being too much data, something like millions of years to calculate or using 800 8 terabyte hard drives for everyone on earth to store the file. now, i am not a mathematician, but when i take 12 f's from 20 f's, that leaves 8 f's which convert to just under 4.3 billion numbers. i don't know how many it could write to the file per second, but it would have to be at least 5 to be reasonable. that would be (by my math) about 27 years or so. not millions, but by using more than one computer, it should be feasible. anyway, here is the code i have so far:
Code:
#!/usr/bin/python
#count in hex from 12 f's to 20 f's and write to comma
#delimited file to create a dictionary
def count_hex():
    x = 0xffffffffffff
    while x <= 0xffffffffffffffffffff:
        x += 0x1
        s = str(x)
        s.upper()
        with open("dictionary.txt", "a") as diction:
            diction.write(s)
count_hex()
any help is appreciated. i just need to get the code to work. i may be able to get access to a cluster for the actual work, as long as the file is not too big, meaning no more than a few terabytes.
Last edited by sfzombie13; 01-09-2015 at 11:34 PM.
Reason: typo
Code:
def count_hex():
    x = 0xffffffffffff
    while x <= 0xffffffffffffffffffff:
        x += 0x1
        with open("dictionary.txt", "a") as diction:
            diction.write(hex(x))
            diction.write(',')
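For the formatting part of the question (uppercase hex, no 0x prefix, comma separators), here is a minimal sketch. It assumes Python 3, and the helper name `write_hex_range` and the tiny demonstration range are made up for illustration; it says nothing about whether the full range is feasible.

```python
import io


def write_hex_range(start, stop, out):
    """Write start..stop (inclusive) to `out` as uppercase hex, comma separated."""
    first = True
    for x in range(start, stop + 1):
        if not first:
            out.write(",")
        # The "X" format code gives uppercase hex digits without the 0x prefix.
        out.write(format(x, "X"))
        first = False


# Deliberately tiny range, written to an in-memory buffer instead of a file.
buf = io.StringIO()
write_hex_range(0xFFE, 0x1002, buf)
print(buf.getvalue())  # FFE,FFF,1000,1001,1002
```

Unlike `hex(x)`, which returns a lowercase string starting with `0x`, `format(x, "X")` produces the bare uppercase digits directly.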
astrogeek: i was counting the actual decimal number difference. i took 20 f's and put them into a hex converter and came up with the 4.3 odd billion in decimal. when i took a number of that size and put it into a gedit file, it was 32 bytes. i then added another number of equal length and came up with 64 bytes. then i deduced that it takes around 32 bytes to store a number of the largest value, and this is where i stopped. i need to sit down and finish the equation, but ran out of time. however, simply the fact that i can put the digits into a document and save them, shows me that it is possible for the computer to handle the size, and possibly save them. i know there is a disconnect between theory and practice, i just don't know where it lies.
just a comment, and probably I'm wrong, but this structure:
Code:
with open("dictionary.txt", "a") as diction:
    diction.write(hex(x))
    diction.write(',')
will open and close the file handle for each and every number you write, which is a huge overhead and will make execution much longer. But it is not really important, because 10^15 or 10^16 years is all the same to me (not to mention the lifetime of the hardware you use).
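A rough sketch of the alternative structure, with the file opened once outside the loop so all writes share one buffered handle. It assumes Python 3; the function name `count_hex_to_file`, the temporary path and the four-number range are placeholders for illustration, and this version includes the starting value, where the posted loop increments first.

```python
import os
import tempfile


def count_hex_to_file(path, start, stop):
    # Open the file once, outside the loop, so every write reuses the same
    # handle instead of paying for an open/close per number.
    with open(path, "w") as diction:
        for x in range(start, stop + 1):
            diction.write(format(x, "X"))
            diction.write(",")


# Demonstration on four numbers only, in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "dictionary.txt")
count_hex_to_file(demo_path, 0xFFFFFFFFFFFF, 0xFFFFFFFFFFFF + 3)
with open(demo_path) as f:
    demo_text = f.read()
print(demo_text)
```

This only removes the per-number open/close cost; it does not change the size of the overall job.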
Quote:
i took 20 f's and put them into a hex converter and came up with the 4.3 odd billion in decimal.
Maybe your hex converter uses 32 bit numbers internally so it only goes up to 2^32 - 1 = 4 294 967 295 = 0xFFFF FFFF.
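That guess is easy to check in Python, whose integers are not limited to 32 bits. The snippet below is only an illustration; the mask simulates what a converter would keep if it held just the low 32 bits, which is an assumption about how that particular tool behaves.

```python
# Twenty hex F's parsed with arbitrary-precision integers: 2**80 - 1.
twenty_fs = int("F" * 20, 16)

# Keeping only the low 32 bits simulates a converter limited to 32-bit values.
truncated = twenty_fs & 0xFFFFFFFF

print(twenty_fs)  # 1208925819614629174706175
print(truncated)  # 4294967295
```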
Quote:
when i took a number of that size and put it into a gedit file, it was 32 bytes.
That counts the size of the text representation of the number in decimal (hex would be smaller, and raw binary smaller still); it also includes any whitespace you added, e.g. by pressing <enter>.
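To make the three sizes concrete for the largest value in the range, here is a small sketch (Python 3 assumed, since it uses `int.to_bytes`; the variable names are just for illustration):

```python
n = int("F" * 20, 16)           # largest value in the range, 2**80 - 1

decimal_text = str(n)           # text, decimal digits
hex_text = format(n, "X")       # text, uppercase hex digits
raw = n.to_bytes(10, "big")     # raw binary: 80 bits fit in 10 bytes

print(len(decimal_text), len(hex_text), len(raw))  # 25 20 10
```

Any separator or newline adds one more byte per number on top of the text sizes.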
My calculations put you at over 300 TB of data (ASCII) tops. The limiting factor will be hard drive speed. I think a fast hard drive these days likely manages 200 MB/sec, or 0.0002 TB/sec. On a system that could manage that size of data, and with those numbers, I come up with under 18 days.
Quote:
astrogeek: i was counting the actual decimal number difference. i took 20 f's and put them into a hex converter and came up with the 4.3 odd billion in decimal. when i took a number of that size and put it into a gedit file, it was 32 bytes. i then added another number of equal length and came up with 64 bytes. then i deduced that it takes around 32 bytes to store a number of the largest value, and this is where i stopped. i need to sit down and finish the equation, but ran out of time. however, simply the fact that i can put the digits into a document and save them, shows me that it is possible for the computer to handle the size, and possibly save them. i know there is a disconnect between theory and practice, i just don't know where it lies.
The main disconnect here is that your first calculation is not correct! Your hex converter lied to you! Everything that followed was wrong!
20 f's is NOT 4.3 odd billion in decimal! 4.3 odd billion is the limit of 32-bit unsigned integers, so your hex converter simply truncated your 20 f's down to ffff ffff and did not tell you it was doing so! It was probably written by someone who also ignored the importance of the math!
But in fairness to your hex converter, did your math teacher not teach you to check your results?! This is an obvious error on the order of 10^15 overflow! If you aspire to be a researcher then these things should be obvious to you!
Quote:
Originally Posted by SoftSprocket
My calculations put you over 300 TB of data (ascii) tops...
Your number may be correct, but your "tera" is in the wrong place!
30,223,145,483,328,854,949,888,000
..looks like
30,223,145,483,328 TB to me!
That is 30+ Tera-Tera-Bytes!
This is not rocket science, this is basic math, and it is basic computer math to boot (pun accidental)!
I re-read the original problem conditions to be sure I had understood it correctly the first time, and I think I did...
Quote:
Originally Posted by sfzombie13
i have a problem, actually, it is research that i am conducting. i want to use python (because i am trying to learn it) to count numbers, from 0xffffffffffff to 0xffffffffffffffffffff and print them in a list, in order, in hex only (1A4D instead of 0x1A4D) with capital letters to a text file, separated by commas. it does not matter how big the file is going to be, as long as they are separated by commas (or anything really, they are going to be used as input for another program).
So you want to count from ffff ffff ffff to ffff ffff ffff ffff ffff, and store those numbers as comma separated text representation of hexadecimal values, to a text file.
So the number of numbers you want to count is the difference between those two, which is:
0xffffffffffffffffffff - 0xffffffffffff = 2^80 - 2^48 = 1,208,925,819,333,154,197,995,520
You can store those more compactly, but let's use your original requirement to store them as comma separated ASCII representations of hexadecimal values and use SoftSprocket's value of 25 bytes each:
1,208,925,819,333,154,197,995,520 x 25 = 30,223,145,483,328,854,949,888,000 bytes
We can see from this calculation that it requires 30+ Tera-Tera-Bytes of storage.
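The count and the storage figure can be reproduced with a few lines of Python. This is just a sketch of the arithmetic: the 25 bytes per number is the value quoted in the thread, not a measurement, and a terabyte is taken as 10^12 bytes.

```python
# Number of values between 12 F's and 20 F's.
count = 0xFFFFFFFFFFFFFFFFFFFF - 0xFFFFFFFFFFFF

BYTES_PER_NUMBER = 25           # assumption: the per-number size used in the thread
total_bytes = count * BYTES_PER_NUMBER
total_tb = total_bytes // 10**12

print(f"{count:,}")        # 1,208,925,819,333,154,197,995,520
print(f"{total_bytes:,}")  # 30,223,145,483,328,854,949,888,000
print(f"{total_tb:,} TB")  # 30,223,145,483,328 TB
```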
Now, how long does it take?
Let's stay with your original guess of 5 numbers per second, which is...
Code:
1,208,925,819,333,154,197,995,520/5 = 241,785,163,866,000,000,000,000 seconds (with some rounding error on the low end)
31,557,600 seconds in a year, gives...
7,661,709,504,720,000 years
The universe is generally accepted to be 13,700,000,000 years old.
So with 30+ Tera-Tera-Bytes of storage and roughly half a million times the age of the universe, you should be good to go!
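The same estimate as a short Python sketch, for anyone who wants to try other write rates. The rate of 5 numbers per second, the 31,557,600 second year and the 13.7 billion year age of the universe are the figures used in the thread; the variable names are just for illustration.

```python
SECONDS_PER_YEAR = 31_557_600        # 365.25 days, as used in the thread
UNIVERSE_AGE_YEARS = 13_700_000_000  # figure quoted in the thread

count = 2**80 - 2**48                # how many numbers there are to write
rate = 5                             # numbers written per second (the original guess)

seconds = count // rate
years = seconds // SECONDS_PER_YEAR
ratio = years // UNIVERSE_AGE_YEARS

print(f"{years:.3e} years")               # about 7.66e15 years
print(f"{ratio:,} ages of the universe")  # on the order of half a million
```

Raising `rate` by a factor of a million still leaves billions of years, which is the point made in the replies.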
See my first post...
Quite so, I checked my numbers again:
30223145483328 TB
or ~16,000,000 years of disk writing at current technology.
5 per second might have been correct for punch cards but a modern hard drive writes at a considerably faster rate ... not that it will help.