LinuxQuestions.org
03-23-2007, 10:31 AM   #1
stephencassidy
LQ Newbie
 
Registered: Jan 2006
Posts: 4

Rep: Reputation: 0
File splitting script


Hi,

I have got a text file which contains lots of information about PDF files that are stored in a directory on our server.

The format of the file is like this:

Filename:
ABC123.PDF
InfoKey: Author
InfoValue: JOAN
InfoKey: Series
InfoValue: 1A07
InfoKey: Spec
InfoValue: 8650
InfoKey: Producer
InfoValue: Acrobat Distiller 6.0.1 (Windows)
InfoKey: ModDate
InfoValue: D:20070307085652Z
InfoKey: Keywords
InfoValue: This is a Sample file, testing keywords.
InfoKey: Comp
InfoValue: TT05

I need to read the file, extract the filename and the keywords, and write them to another file with the values separated by a comma. This other file will then be used to populate a database. How can I achieve this?

Thanks

Stephen
 
03-23-2007, 11:28 AM   #2
cfaj
Member
 
Registered: Dec 2003
Location: Toronto, Canada
Distribution: Mint, Mandriva
Posts: 221

Rep: Reputation: 31
Quote:
Originally Posted by stephencassidy
I need to read the file, extract the filename and the keywords, and write them to another file with the values separated by a comma. This other file will then be used to populate a database. How can I achieve this?

Code:
awk '
/^Filename:/ { if ( record ) print record
               record = ""
               new = 1
               next   ## the file name is on the following line
             }
new == 1 { record = $0; new = 0 }

sub( /^InfoValue: */, "" ) {
               ## Escape commas in field values
               gsub( ",", "\\,")
               record = record "," $0
              }

END { if ( record ) print record }' "$FILE"
Or, quoting field values:

Code:
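awk '
/^Filename:/ { if ( record ) print record
               record = ""
               new = 1
               next   ## the file name is on the following line
             }

new == 1 { record = $0; new = 0 }

sub( /^InfoValue: */, "" ) {
               ## Double any embedded quotes, CSV style
               gsub( /"/, "\"\"" )
               record = record "," "\"" $0 "\""
              }

END { if ( record ) print record }' "$FILE"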
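
For comparison, Python's csv module applies the same quoting idea: it wraps a field in double quotes only when the field contains a comma, a quote, or a newline, and it doubles any embedded quote characters. A small sketch of that behaviour, where the helper name to_csv_line is made up for illustration and is not part of the thread:

```python
import csv
import io


def to_csv_line(fields):
    """Return one CSV line, quoting only the fields that need it."""
    buf = io.StringIO()
    # lineterminator="\n" avoids the default "\r\n" line ending
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue()


# The keywords value contains a comma, so it gets wrapped in quotes.
row = ["ABC123.PDF", "This is a Sample file, testing keywords."]
print(to_csv_line(row), end="")
```

Most database import tools expect this quoted form rather than backslash-escaped commas, but check what your loader accepts.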
 
03-23-2007, 11:49 AM   #3
stephencassidy
LQ Newbie
 
Registered: Jan 2006
Posts: 4

Original Poster
Rep: Reputation: 0
Chris, thank you so much. I will give it a try.

Stephen
 
03-23-2007, 12:09 PM   #4
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
If you have Python, here's an alternative:
Code:
#!/usr/bin/env python3

val = []
key = []
with open("file") as f:
    f.readline()                      # skip the "Filename:" label line
    filename = f.readline().strip()   # the name itself is on the next line
    for line in f:
        line = line.strip()
        # Split on the first colon only, so a value such as
        # "D:20070307085652Z" is kept whole.
        if line.startswith("InfoKey:"):
            key.append(line.split(":", 1)[1].strip())
        elif line.startswith("InfoValue:"):
            val.append(line.split(":", 1)[1].strip())

print(filename, ' '.join(val))
Output:
Code:
# ./test.py 
ABC123.PDF JOAN 1A07 8650 Acrobat Distiller 6.0.1 (Windows) D:20070307085652Z This is a Sample file, testing keywords. TT05
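
The question asked only for the filename and the keywords, and the real dump presumably holds many records. A rough sketch of that narrower extraction follows, assuming every record starts with a "Filename:" line with the name on the next line, and that each InfoValue line follows its InfoKey line. The function name parse_records and the inline sample are illustrative only:

```python
import csv
import sys


def parse_records(lines):
    """Yield (filename, info_dict) for each record in the dump."""
    filename, info, key, want_name = None, {}, None, False
    for raw in lines:
        line = raw.strip()
        if line == "Filename:":
            # A new record begins: emit the one collected so far.
            if filename is not None:
                yield filename, info
            filename, info, key, want_name = None, {}, None, True
        elif want_name and line:
            filename, want_name = line, False
        elif line.startswith("InfoKey:"):
            key = line.split(":", 1)[1].strip()
        elif line.startswith("InfoValue:") and key is not None:
            # Split on the first colon only so values keep their colons.
            info[key] = line.split(":", 1)[1].strip()
            key = None
    if filename is not None:
        yield filename, info


sample = """\
Filename:
ABC123.PDF
InfoKey: Author
InfoValue: JOAN
InfoKey: ModDate
InfoValue: D:20070307085652Z
InfoKey: Keywords
InfoValue: This is a Sample file, testing keywords.
""".splitlines()

# Write "filename,keywords" rows; csv.writer quotes the comma in the keywords.
writer = csv.writer(sys.stdout, lineterminator="\n")
for name, info in parse_records(sample):
    writer.writerow([name, info.get("Keywords", "")])
```

To run it on a real file, pass an open file object to parse_records in place of the sample list. Records without a Keywords entry come out with an empty second column.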

Last edited by ghostdog74; 03-23-2007 at 12:12 PM.
 
  

