LinuxQuestions.org

Programming (https://www.linuxquestions.org/questions/programming-9/)
Read large text files (~10GB), parse for columns, output (https://www.linuxquestions.org/questions/programming-9/read-large-text-files-%7E10gb-parse-for-columns-output-717217/)

int0x80 04-07-2009 09:20 AM

Quote:

Originally Posted by jglands (Post 3501078)
He should use windows, because you get what you pay for. If it's free it must be junk.

I have heard of this thing called the "10% Rule". Basically, you have to be smarter than 10% of all 4-year-olds to be able to use Linux.

jglands 04-07-2009 09:24 AM

That comes from a pimple-faced teenager who has no life. Use Windows and you don't have to spend your nights home alone.

int0x80 04-07-2009 09:28 AM

This coming from an MCSE who gets down on his knees and prays to his gods: Ballmer and Gates. Use Linux and you don't have to spend your weekends re-installing your relatives' computers. Antivirus 2009 LOLOLOLOL.

jglands 04-07-2009 09:31 AM

I got my MCSE in six months, and at least I can pronounce my guy's name. What kind of guy is named after the thumbsucker from Peanuts?

int0x80 04-07-2009 09:34 AM

Oh I know let's re-use the same horrible kernel over and over and just put a different UI over it. Leave the real computer science to the computer scientists and enjoy your sheltered existence at the help desk.

jglands 04-07-2009 09:37 AM

If it works, why create a new kernel? At least I have a job. Most companies don't use Linux, and if they do, they use it because they have no real budget. So how well does McDonald's pay?

int0x80 04-07-2009 09:39 AM

McDonald's is a multinational corporation with more locations than whatever lame .NET fail company you work for. Which business will survive the recession?

Telemachos 04-07-2009 09:41 AM

@ int0x80: jglands has posted only to this thread and only to troll. Please stop feeding him.

jglands 04-07-2009 09:42 AM

Microsoft has billions in the bank and sells its products for a good profit. How much profit do you get for a Linux download/Big Mac? More Linux companies have gone under than Microsoft has sold copies of Windows.

.NET is setting the standard out there. If the original poster were smart, he would use that over C or Perl. It's so much better!

ghostdog74 04-07-2009 09:42 AM

where's the moderator?

int0x80 04-07-2009 09:44 AM

If the OP paid for a Microsoft OS/compiler/rip-off, would they solve his query for free? Or would they try to nickel and dime even more money out of him? More like Windows Genuine FAILAGE, imo.

sundialsvcs 04-07-2009 09:44 AM

:rolleyes: Stick to the subject, please... "Cheap beer and forums do not mix."

No, it probably won't be "better than awk."

"awk" is a very well-written program that is specialized for doing what you are doing.

All of the delays associated with this task will be mechanical ones: disk I/O times and network time. But "awk" reads the file sequentially, and the operating system recognizes that access pattern, so it will read ahead into its file buffers and use other tricks to streamline the operation as much as the hardware will allow.

If the time required to do this task is problematic to the business, then there are various things that you can do:
  1. Invest in fast storage-hardware... SATA, FireWire.
  2. Instead of using the disk controllers built into the motherboard, buy a controller card. An inexpensive unit can make a dramatic difference.
  3. Put the input file and the output file on different disk volumes.
  4. Do not follow the siren that says, "put it all in memory..." Abandon all hope, ye who enter there!
Face it: when you're dealing with 10 gigabytes of data, "some things take time." If you're doing the task in "awk," and doing it well, then you are using a robust tool that was specifically designed for the task. You have not erred in the approach that you are using right now. "Diddling with it" will not improve it.
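
For anyone who wants to see the "stream it, don't load it" idea outside of awk, here is a minimal Python sketch. It is only an illustration under assumptions: the function name extract_columns, its parameters, and the 1 MiB buffer size are made up for this example, and it will probably not beat awk. It reads one line at a time, so memory use stays flat no matter how large the input is, and the source and destination paths can sit on different disk volumes as suggested in point 3.

```python
import os


def extract_columns(src_path, dst_path, columns, sep=None, buffer_size=1 << 20):
    """Stream src_path line by line and write the chosen 1-based columns.

    sep=None splits on runs of whitespace, similar to awk's default.
    Returns the number of lines written. (Hypothetical helper for illustration.)
    """
    out_sep = " " if sep is None else sep
    written = 0
    with open(src_path, "r", buffering=buffer_size) as src, \
         open(dst_path, "w", buffering=buffer_size) as dst:
        # Optional hint that the file will be read sequentially.
        # os.posix_fadvise only exists on some Unix platforms, so guard it.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(src.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # the hint is advisory; carry on without it
        for line in src:  # one line in memory at a time
            fields = line.rstrip("\n").split(sep)
            # A column missing from a short line becomes an empty string.
            picked = [fields[i - 1] if i <= len(fields) else "" for i in columns]
            dst.write(out_sep.join(picked) + "\n")
            written += 1
    return written
```

Calling extract_columns("in.txt", "out.txt", [3, 1]) is roughly comparable in effect to awk '{print $3, $1}' in.txt > out.txt; the file names here are placeholders.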

Telemachos 04-07-2009 09:45 AM

For the record, it would be unfortunate to lock the whole thread. The question (How do I deal with a mega-sized file and the associated I/O problems?) is a serious one and deserves some discussion.

jglands 04-07-2009 09:50 AM

Well, he would at least have support. What does he have from Linux now? Some pimple-faced kids telling him he is wrong instead of helping him.

int0x80 04-07-2009 09:50 AM

Quote:

Originally Posted by sundialsvcs (Post 3501222)
:rolleyes: Stick to the subject, please... "Cheap beer and forums do not mix."

Sorry, I just get frustrated when people reply with stupid responses that are irrelevant to the original issue ("use an interpreted language", "perl can do regex", "windows > linux", etc). The last one strikes a nerve as you can imagine ;]


All times are GMT -5.