LinuxQuestions.org
Old 09-28-2011, 08:45 PM   #1
vamsiv
LQ Newbie
 
Registered: Sep 2011
Posts: 5

Rep: Reputation: Disabled
Need scripting help configuring raw data into required format (perl or shell)


Hi All

Thanks in advance for your replies.

My issue is as follows.

I have to do a mass insert of almost 2 GB of similar data into my database, and I need to reshape the raw data so that I can do a block insert into my db.

The file in its original format is as follows:

1




1992-09-01
10:59:32
02002









1992-09-01
10:59:32


1
NA
ENT
55
N
S/Holder Details Change in Substantial (S.43)

------------------------------

I simply need to make each record look like this:

ENT,1992-09-01,10:59:32,55, S/Holder Details Change in Substantial (S.43)


Any help would be greatly appreciated. I have tried various scripts and other approaches and failed miserably; the regex operations are beyond me. The blank space in the original format really is in the file and needs to be taken into account.
 
Old 09-28-2011, 09:15 PM   #2
snooly
Member
 
Registered: Sep 2011
Posts: 124

Rep: Reputation: Disabled
Perl can do that for sure. What program have you written so far?

It should be fairly easy to write a perl program which:

* reads the file

* parses what it gets from the file

* figures out the start and end of each record

* outputs records in the format you want to a new file.
 
Old 09-28-2011, 09:21 PM   #3
vamsiv
LQ Newbie
 
Registered: Sep 2011
Posts: 5

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by snooly View Post
Perl can do that for sure. What program have you written so far?

It should be fairly easy to write a perl program which:

* reads the file

* parses what it gets from the file

* figures out the start and end of each record

* outputs records in the format you want to a new file.
Hey, thanks for the reply. I haven't got anything so far. I can open the file and that's it; I'm not sure how to manipulate the data at all.
 
Old 09-28-2011, 09:34 PM   #4
snooly
Member
 
Registered: Sep 2011
Posts: 124

Rep: Reputation: Disabled
Have you got perl installed at least? If so, you could read "man perlfunc" and "man perlopentut" to get some idea. The perl functions that you would probably use are open, close, print, and some sort of regex matching.

If the problem is urgent, you should probably hire a programmer to write something for you. If it isn't urgent, you can probably figure out how to write a program to do it in a few days.
 
Old 09-28-2011, 09:40 PM   #5
vamsiv
LQ Newbie
 
Registered: Sep 2011
Posts: 5

Original Poster
Rep: Reputation: Disabled
I do have perl installed and I will give it a go. It's not urgent, but it is something I want to get done today so I can move on to the more important thing, which is writing a web service.
 
Old 09-28-2011, 09:51 PM   #6
snooly
Member
 
Registered: Sep 2011
Posts: 124

Rep: Reputation: Disabled
Here's a basic outline of how you might approach solving the problem:

* make a small input file with say 10 to 20 records for testing purposes

* write a perl program which:
  * opens the input file for reading
  * opens the output file for writing
  * reads the input file line by line, analysing the data to determine where the records start and stop
  * saves the data into variables ready for output
  * once it has read a whole record, creates a line of output using the data in the variables
  * prints the line of output to the output file
  * when the input file has been completely read, closes it and closes the output file too
* test importing the data into a test database
* fix any problems
* have a beer, you earned it
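
The outline above could be sketched in Perl roughly as follows. This is only a sketch under assumptions that are not confirmed anywhere in the thread: each record ends with a line of at least five dashes, the date is the first YYYY-MM-DD line in the record, the time is the first HH:MM:SS line, and the last four non-blank lines are the code (ENT), the number (55), a flag (N, dropped) and the description. The names record_to_csv and convert are made up for illustration. The demo reads the sample record from a string instead of a real file so it can be run as-is.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn the lines of one record into a CSV line.
# ASSUMPTION: the last four non-blank lines are code, number, flag, description.
sub record_to_csv {
    my @lines = @_;
    my @fields;
    for my $line (@lines) {
        (my $f = $line) =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @fields, $f if length $f;        # keep non-blank lines only
    }
    # ASSUMPTION: the first date-shaped and time-shaped lines are the ones wanted.
    my ($date) = grep { /^\d{4}-\d{2}-\d{2}$/ } @fields;
    my ($time) = grep { /^\d{2}:\d{2}:\d{2}$/ } @fields;
    return if @fields < 4 || !defined $date || !defined $time;
    my ($code, $num, undef, $desc) = @fields[-4 .. -1];
    return join ',', $code, $date, $time, $num, $desc;
}

# Read records from $in, write one CSV line per record to $out,
# and return the number of records written.
sub convert {
    my ($in, $out) = @_;
    my @record;
    my $count = 0;
    my $flush = sub {
        my $csv = record_to_csv(@record);
        if (defined $csv) {
            print {$out} "$csv\n";
            $count++;
        }
        @record = ();
    };
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;                 # strip the line ending (LF or CRLF)
        if ($line =~ /^-{5,}\s*$/) {          # ASSUMPTION: dashes end a record
            $flush->();
        }
        else {
            push @record, $line;
        }
    }
    $flush->();                               # last record may lack a separator
    return $count;
}

# Demo on the sample record from post #1, read from an in-memory string.
my $sample = <<'END';
1

1992-09-01
10:59:32
02002

1992-09-01
10:59:32

1
NA
ENT
55
N
S/Holder Details Change in Substantial (S.43)

------------------------------
END

open my $in, '<', \$sample or die "Cannot open sample data: $!";
my $n = convert($in, \*STDOUT);
close $in;
print STDERR "$n record(s) converted\n";
```

For the real 2 GB file, the same convert call should work with handles opened on the actual input and output files, since it only holds one record in memory at a time. Note that this sketch trims whitespace and joins with plain commas, so the description has no leading space, unlike the sample output in post #1. It also assumes no field contains a comma or a quote; if that can happen, a CSV module such as Text::CSV would be safer than a plain join.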
 
Old 09-28-2011, 09:57 PM   #7
vamsiv
LQ Newbie
 
Registered: Sep 2011
Posts: 5

Original Poster
Rep: Reputation: Disabled
I understand the following part:

Quote:
Originally Posted by snooly View Post
Here's a basic outline of how you might approach solving the problem:

* make a small input file with say 10 to 20 records for testing purposes

* write a perl program which:
* opens the input file for reading
* opens the output file for writing
but when you say
Quote:
Originally Posted by snooly View Post
Here's a basic outline of how you might approach solving the problem:
* reads the input file line by line, analysing the data to determine where the records start and stop
* save the data into variables ready for output
* once you have read a whole record, create a line of output using the data in the variables
* print the line of output to the output file
* when the input file has been completely read, close it and close the output file too
I am totally lost.

Not really a perl programmer, worked with java and c++ only unfortunately
 
Old 09-28-2011, 09:58 PM   #8
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
Good Perl tutorials
http://perldoc.perl.org/
http://www.perlmonks.org/?node=Tutorials
 
Old 09-28-2011, 10:02 PM   #9
snooly
Member
 
Registered: Sep 2011
Posts: 124

Rep: Reputation: Disabled
Quote:
Originally Posted by vamsiv View Post
Not really a perl programmer, worked with java and c++ only unfortunately
You could write it in java or c++, but it would be harder, less likely to work, and would have more bugs. But since you're not a perl programmer, maybe you should use java or c++.

On the other hand, perl isn't very hard to learn. If you're already a programmer, you should be able to figure it out. You could start with simple stuff like writing a program which simply reads the input file line by line, and copies it to the output file with no changes. Then try modifying that program to make it detect the start and end of records. Then modify it again so that it produces the output you want.
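
As a starting point, the "copy the file with no changes" step might look something like the sketch below. The sub name copy_file is made up for illustration, and the two file names are assumed to come from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy a file line by line, unchanged, and return the number of lines copied.
sub copy_file {
    my ($in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    my $lines = 0;
    while (my $line = <$in>) {    # <> reads one line per loop iteration
        print {$out} $line;       # the line still carries its newline
        $lines++;
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $lines;
}

# Only do the copy when both file names were given on the command line.
if (@ARGV == 2) {
    my $copied = copy_file(@ARGV);
    print "Copied $copied line(s)\n";
}
else {
    warn "Usage: $0 INPUT OUTPUT\n";
}
```

Once this works on a small test file, the body of the while loop is the place to start looking at each line instead of just printing it.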

The perl functions you are likely to use may include: open, close, <>, m//, die, and print. That probably looks like gibberish, but once you understand it, you'll be well on the way to solving the problem.
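
To make m// a bit less cryptic, here is a small sketch of how it could be used to tell the different kinds of line apart. The sub name classify_line and the patterns are assumptions based on the sample record in post #1 (a separator line of at least five dashes, dates as YYYY-MM-DD, times as HH:MM:SS), not something confirmed for the whole file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a label describing what kind of line this looks like.
sub classify_line {
    my ($line) = @_;
    return 'separator' if $line =~ m/^-{5,}\s*$/;                 # only dashes
    return 'blank'     if $line =~ m/^\s*$/;                      # empty or whitespace
    return 'date'      if $line =~ m/^\s*\d{4}-\d{2}-\d{2}\s*$/;  # e.g. 1992-09-01
    return 'time'      if $line =~ m/^\s*\d{2}:\d{2}:\d{2}\s*$/;  # e.g. 10:59:32
    return 'other';
}

# Try it on a few lines taken from the sample record.
for my $line ('1992-09-01', '10:59:32', '', 'ENT', '------------------------------') {
    printf "%-10s <%s>\n", classify_line($line), $line;
}
```

The separator test is what marks the end of a record; the date and time tests pick out those two fields wherever they fall among the blank lines.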
 
  


All times are GMT -5. The time now is 01:08 AM.