LinuxQuestions.org
03-11-2004, 12:50 AM   #1
eph
LQ Newbie
 
Registered: Jan 2004
Location: Muntinlupa, Philippines
Distribution: Mandrake
Posts: 8

Rep: Reputation: 0
Need help on processing large data files


How do I process large text files (in the gigabyte range)?

I need to do it efficiently. Should I use parallel processing? Can anyone help with this? Thanks.
 
03-11-2004, 01:04 AM   #2
Qzukk
Member
 
Registered: Jun 2003
Posts: 132

Rep: Reputation: 15
It depends on what you are doing. If you can say with certainty that no piece of data in the file depends on any other data in the file, then you can split the file into chunks and distribute the chunks to be processed in parallel. If you end up handling it as a single giant file, look into using mmap() to map the file into memory and access it that way, rather than using the regular read()/write() calls.
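
The two ideas above (line-aligned chunks and memory-mapped access) could be sketched roughly as follows. This is only an illustration in Python, whose `mmap` module wraps the same system call; it is not code from this thread. The function names `chunk_ranges` and `count_lines` are made up for the example, and counting lines stands in for whatever per-record work is actually needed. It also assumes records are newline-terminated and independent of each other.

```python
import mmap
import os


def chunk_ranges(path, n_chunks):
    """Return at most n_chunks (start, end) byte ranges covering the file.

    Every range except possibly the last ends just after a newline,
    so no line is split between two chunks.
    """
    size = os.path.getsize(path)
    if size == 0:
        return []  # an empty file cannot be memory-mapped
    step = -(-size // n_chunks)  # ceiling division: nominal chunk size
    ranges = []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        start = 0
        while start < size:
            end = min(start + step, size)
            if end < size:
                # Push the boundary forward to the end of the current line.
                nl = mm.find(b"\n", end - 1)
                end = size if nl == -1 else nl + 1
            ranges.append((start, end))
            start = end
    return ranges


def count_lines(path, start, end):
    """Count newline-terminated lines in bytes [start, end) of the file."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        count = 0
        pos = start
        while True:
            nl = mm.find(b"\n", pos, end)
            if nl == -1:
                break
            count += 1
            pos = nl + 1
        return count
```

Each `(start, end)` pair can then be handed to a separate worker process or machine, since a worker only needs the file path and its own byte range. Whether that beats a single sequential pass depends on the workload; for a disk-bound job the extra workers may not help much.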
 
03-11-2004, 01:24 AM   #3
eph
LQ Newbie
 
Registered: Jan 2004
Location: Muntinlupa, Philippines
Distribution: Mandrake
Posts: 8

Original Poster
Rep: Reputation: 0
I'm going to process tcpdump log files. I'm thinking of using Perl, with a Beowulf cluster for the parallel part. Would that make any difference? And would Perl work for this?

I'm just a newbie at this.
 
03-11-2004, 05:56 AM   #4
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,314

Rep: Reputation: 175
I'd try Perl first. If it's too slow, then think about C.

I once built a hash in Perl with millions of code => price
pairs. It loaded quite slowly but worked very fast.

(But I ended up using DBM and C!)

billy
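
For readers wondering what the DBM approach mentioned above looks like, here is a minimal sketch. It is written in Python with the standard `dbm` module, not the Perl or C from the post, and the function names and sample codes are hypothetical. The idea is that the code => price pairs live in an on-disk key/value file, so they do not have to be reloaded into an in-memory hash on every run.

```python
import dbm


def build_price_db(path, pairs):
    """Write (code, price) pairs into an on-disk DBM database at path."""
    # Flag "n" always creates a new, empty database.
    with dbm.open(path, "n") as db:
        for code, price in pairs:
            # DBM keys and values are stored as bytes.
            db[code.encode()] = str(price).encode()


def lookup_price(path, code):
    """Return the price stored for code, or None if the code is absent."""
    with dbm.open(path, "r") as db:
        key = code.encode()
        if key in db:
            return float(db[key].decode())
        return None
```

Usage would be along the lines of `build_price_db("prices", [("A100", 9.99)])` once, followed by `lookup_price("prices", "A100")` in later runs. Which DBM backend gets used, and the exact file names it creates on disk, depends on what is installed on the system.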
 
  


All times are GMT -5. The time now is 05:47 PM.
