LinuxQuestions.org
03-10-2004, 11:50 PM   #1
eph
LQ Newbie
 
Registered: Jan 2004
Location: Muntinlupa, Philippines
Distribution: Mandrake
Posts: 8

Rep: Reputation: 0
need help on processing large data files


How do I process large text files (in the gigabyte range)?

I need to do it efficiently. Should I use parallel processing? Can anyone help with this? Thanks.
 
03-11-2004, 12:04 AM   #2
Qzukk
Member
 
Registered: Jun 2003
Posts: 132

Rep: Reputation: 15
That depends on what you are doing. If you can say with certainty that no piece of data in the file depends on any other data in the file, then you can split the file into chunks and process them in parallel, distributed across several machines. If you end up processing it as a single giant file, look into using mmap() to map the file into memory instead of using the regular read()/write() calls.
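As a rough illustration of the chunking idea, here is a minimal Python sketch (not code from this thread). It assumes a line-oriented text file whose lines are independent of each other. The helper names `chunk_bounds` and `count_lines` are made up for this example, and line counting stands in for whatever the real per-chunk job would be.

```python
import mmap
import os

def chunk_bounds(path, n_chunks):
    """Return (start, end) byte ranges that each end on a line boundary."""
    size = os.path.getsize(path)
    if size == 0:
        return []  # an empty file cannot be mapped
    bounds = []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        step = max(1, size // n_chunks)
        start = 0
        while start < size:
            end = min(start + step, size)
            if end < size:
                # Extend the chunk to the end of the line it lands in.
                nl = mm.find(b"\n", end)
                end = size if nl == -1 else nl + 1
            bounds.append((start, end))
            start = end
    return bounds

def count_lines(path, start, end):
    """Example per-chunk job: count the lines in bytes [start, end)."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        count = 0
        pos = start
        while pos < end:
            nl = mm.find(b"\n", pos, end)
            count += 1
            if nl == -1:
                break  # last line has no trailing newline
            pos = nl + 1
        return count
```

Each (start, end) pair could be handed to a separate worker process or cluster node, since no chunk splits a line. Because a chunk is extended to the next newline, the result may contain fewer chunks than requested.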
 
03-11-2004, 12:24 AM   #3
eph
LQ Newbie
 
Registered: Jan 2004
Location: Muntinlupa, Philippines
Distribution: Mandrake
Posts: 8

Original Poster
Rep: Reputation: 0
I'm going to process tcpdump log files. I'm thinking of using Perl and a Beowulf cluster. Would that make any difference? And would Perl work for this?

I'm just a newbie at this.
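For reference, a single-pass scan over tcpdump text output might look something like the Python sketch below. It is only an illustration: the line layout in the comment is an assumption based on typical `tcpdump -n` IPv4 output, and `LINE_RE` and `packets_per_source` are hypothetical names. Real logs may need a different pattern.

```python
import re
from collections import Counter

# Assumed line shape (typical of `tcpdump -n` IPv4 text output):
#   12:00:00.000001 IP 10.0.0.1.1234 > 10.0.0.2.80: Flags [S], length 0
# The trailing ".port" is optional so that port-less lines (e.g. ICMP) match too.
LINE_RE = re.compile(
    r"^\S+ IP (\d+\.\d+\.\d+\.\d+)(?:\.\d+)? > (\d+\.\d+\.\d+\.\d+)(?:\.\d+)?:"
)

def packets_per_source(lines):
    """Count matching lines per source address, reading one line at a time."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Passing an open file object (for example `packets_per_source(open(path))`) reads the log one line at a time, so memory use stays small even for gigabyte-sized files.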
 
03-11-2004, 04:56 AM   #4
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515

Rep: Reputation: 239
I'd try Perl first. If it's too slow, then think about C.

I once built a hash in Perl with millions of code => price pairs. It loaded quite slowly but worked very fast once loaded.

(But I ended up using DBM and C!)

billy
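To make the trade-off between an in-memory hash and a DBM file concrete, here is a small Python sketch (not the Perl or C code described above). It assumes input lines of the form `code,price`, which is an invented format for illustration, and the function names are hypothetical. Python's standard `dbm` module plays the role of DBM.

```python
import dbm

def load_prices(lines):
    """Build an in-memory code -> price table from 'code,price' lines."""
    prices = {}
    for line in lines:
        code, _, price = line.strip().partition(",")
        if code and price:
            prices[code] = float(price)
    return prices

def build_price_db(lines, db_path):
    """Store the same table on disk; dbm keeps keys and values as bytes."""
    with dbm.open(db_path, "n") as db:  # "n" always creates a new database
        for line in lines:
            code, _, price = line.strip().partition(",")
            if code and price:
                db[code] = price

def lookup_price(db_path, code):
    """Look up one code without loading the whole table into memory."""
    with dbm.open(db_path, "r") as db:
        value = db.get(code)
        return None if value is None else float(value.decode())
```

The in-memory table pays the full load cost on every run, while the on-disk database is built once and can then be queried by later runs without reloading everything.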
 
  

