LinuxQuestions.org
09-15-2010, 02:48 PM   #1
user_28
LQ Newbie
 
Registered: Aug 2010
Posts: 27

Rep: Reputation: 0
String match in perl


Hi,
Can someone help me with the requirement below?

One of my applications generates a text file with XML output in it. I need to read those files, and if the output does not match a string in a couple of tags, a log file should be created with the file name and the tag name.

The tags where the string should match are:

<identity format="JPEG File Interchange Format" mimetype="image/jpeg">
<well-formed toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</well-formed>
<valid toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</valid>


The identity format attribute should always be JPEG, and the well-formed and valid status tags should be true.

sample output file:

<?xml version="1.0" encoding="UTF-8" ?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="0.4.2" timestamp="9/13/10 6:04 PM">
<identification>
<identity format="JPEG File Interchange Format" mimetype="image/jpeg">
<tool toolname="Jhove" toolversion="1.5" />
<tool toolname="file utility" toolversion="5.03" />
<tool toolname="Exiftool" toolversion="7.74" />
<tool toolname="Droid" toolversion="3.0" />
<tool toolname="NLNZ Metadata Extractor" toolversion="3.4GA" />
<version toolname="Jhove" toolversion="1.5">1.02</version>
<externalIdentifier toolname="Droid" toolversion="3.0" type="puid">fmt/44</externalIdentifier>
</identity>
</identification>
<fileinfo>
<size toolname="Jhove" toolversion="1.5">807537</size>
<creatingApplicationName toolname="Jhove" toolversion="1.5" status="CONFLICT">desc</creatingApplicationName>
<creatingApplicationName toolname="Exiftool" toolversion="7.74" status="CONFLICT">desciwtpt.[.bkpt.o.kTRC..textCopyright 1999 Adobe Systems Incorporateddesc.Gray Gamma 2.2XYZ T...XYZ .!.?z.</creatingApplicationName>
<lastmodified toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">2010:09:09 16:40:25-04:00</lastmodified>
<filename toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">C:\Documents and Settings\user\Desktop\Test Data\IMDDBATCH_001\IMD025350802\IMAGES\IMD025350802_000002.jpg</filename>
<md5checksum toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">e8d10b18be1865513f4e175d48d2a902</md5checksum>
<fslastmodified toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">1284064825671</fslastmodified>
</fileinfo>
<filestatus>
<well-formed toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</well-formed>
<valid toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</valid>
</filestatus>

Thanks,
 
09-15-2010, 07:16 PM   #2
user_28
LQ Newbie
 
Registered: Aug 2010
Posts: 27

Original Poster
Rep: Reputation: 0
Hi,
Can anyone help me with the above problem? I am running up against a deadline.

Thanks,
 
09-15-2010, 07:50 PM   #3
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,693

Rep: Reputation: 1988
So as this is so urgent, I am assuming that one of the 1.7 million hits that Google returns did not work for you?

Please let us know what you have tried and where you are running into problems.

Also, it may be a deadline for you, but this is a free site, so no one here will go any faster for your deadline.
 
09-15-2010, 09:25 PM   #4
kurumi
Member
 
Registered: Apr 2010
Posts: 223

Rep: Reputation: 45
Code:
ruby -ne 'BEGIN{c=0;a=[]};$_=~/identity.*format.*JPEG|well-formed.*true|valid toolname.*true/&&c+=1;a<<$&;END{ print a.join() if c!=3}'  file
 
09-15-2010, 09:54 PM   #5
user_28
LQ Newbie
 
Registered: Aug 2010
Posts: 27

Original Poster
Rep: Reputation: 0
Hi,

Thank you so much for the reply, but I am looking for something in Perl. I tried matching a string, and if there is a match, the file is copied to a new file.

Right now my code matches only one string, but I need to match one more string, 'JPEG', and when there is no match, print the filename to the log. Here is the code I have tried so far:


#!/usr/local/bin/perl
use strict;
use warnings;

my $myfile     = 'C:\Documents and Settings\user\Desktop\output\ IMDB025350802\IMDB025350802_000001.xml';
my $string     = "true";
my $new_myfile = "newfile.txt";

# Open the input for reading and the output for writing; stop on failure.
open(my $in,  '<', $myfile)     or die "Cannot open file $myfile: $!";
open(my $out, '>', $new_myfile) or die "Cannot open file $new_myfile: $!";

my $found = 0;
my @lines = <$in>;
if (@lines == 0) {
    print "File empty! Nothing to fetch from file.\n";
}
else {
    foreach (@lines) {
        if (/$string/) {
            $found++;          # count the lines that contain the string
        }
        else {
            print $out $_;     # copy every non-matching line to the new file
        }
    }

    if ($found == 0) {
        print "Did not find string in the file\n";
    }
    else {
        print "Found the string in the file\n";
    }
}
close($out);
close($in);
 
09-15-2010, 10:28 PM   #6
estabroo
Senior Member
 
Registered: Jun 2008
Distribution: debian, ubuntu, sidux
Posts: 1,095
Blog Entries: 2

Rep: Reputation: 111
Based on what you have and what you seem to be going for, you could just change that if to:
$found++ if /true|JPEG/;
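
One caveat with the alternation: /true|JPEG/ counts a line when either word appears anywhere on it, so the counter alone cannot tell which of the checks passed. A minimal line-oriented sketch that tracks each tag separately is below. It assumes each tag sits on its own line, as in the sample output, and the check_lines helper and the message text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names of the tags that fail the checks.
# Assumes each tag sits on its own line, as in the sample output.
sub check_lines {
    my @lines = @_;
    my %ok = ('identity' => 0, 'well-formed' => 0, 'valid' => 0);

    foreach my $line (@lines) {
        # Each pattern is tied to its tag, so a stray "true" or "false"
        # elsewhere in the file does not affect the result.
        $ok{'identity'}    = 1 if $line =~ /<identity\s[^>]*format="JPEG[^"]*"/;
        $ok{'well-formed'} = 1 if $line =~ m{<well-formed\b[^>]*>true</well-formed>};
        $ok{'valid'}       = 1 if $line =~ m{<valid\b[^>]*>true</valid>};
    }
    return sort grep { !$ok{$_} } keys %ok;
}

# Three lines shaped like the sample, with "valid" deliberately set to false.
my @sample = (
    qq{<identity format="JPEG File Interchange Format" mimetype="image/jpeg">\n},
    qq{<well-formed toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</well-formed>\n},
    qq{<valid toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">false</valid>\n},
);

my @failed = check_lines(@sample);
print "failed tags: @failed\n" if @failed;
```

With real data you would pass the lines read from the file instead of @sample and write the file name plus @failed to the log.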
 
09-16-2010, 12:03 PM   #7
user_28
LQ Newbie
 
Registered: Aug 2010
Posts: 27

Original Poster
Rep: Reputation: 0
Hi,
While matching a string I am running into problems. It is creating a new file for both true and false, because somewhere in my file I have the word false. So I tried another approach: reading the XML and comparing the values, printing "match" if they match and "not match" otherwise.

But for some reason it prints "not match" even when there is a match. If possible, let me know what I am doing wrong.

#!/usr/local/bin/perl -w

use XML::Simple;

my $xml = XML::Simple->new;
my $file = $xml->XMLin('IMD025350802_000001.xml');

if ($file->{well-formed} eq 'true')
{
    print "match"
}
else
{
    print "not match"
}


Part of the XML code I am reading:

<filestatus>
<well-formed toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</well-formed>
<valid toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</valid>
</filestatus>

Thanks,
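
Two things in the snippet above are worth checking. First, an unquoted well-formed inside the braces is parsed as an expression (well minus formed) rather than as a string key, because of the hyphen, so the key needs quotes. Second, assuming XML::Simple's default options, the parsed structure drops the root element, nests well-formed under filestatus, and, because the element carries attributes, stores its text under the content key. A sketch under those assumptions, using an inline cut-down copy of the sample instead of the real file:

```perl
use strict;
use warnings;
use XML::Simple;

# Cut-down copy of the sample output, inlined so the sketch is self-contained.
my $xml_text = <<'END_XML';
<fits>
  <identification>
    <identity format="JPEG File Interchange Format" mimetype="image/jpeg">
      <tool toolname="Jhove" toolversion="1.5" />
    </identity>
  </identification>
  <filestatus>
    <well-formed toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</well-formed>
    <valid toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</valid>
  </filestatus>
</fits>
END_XML

# XMLin accepts a file name or, as here, a string of XML.
my $file = XMLin($xml_text);

# Quote the hyphenated key, and read the element text from {content}.
my $well_formed = $file->{filestatus}{'well-formed'}{content};
my $valid       = $file->{filestatus}{valid}{content};
my $format      = $file->{identification}{identity}{format};

if ($well_formed eq 'true' && $valid eq 'true' && $format =~ /JPEG/) {
    print "match\n";
}
else {
    print "not match\n";
}
```

If the keys turn out different on the real file, dumping the structure with Data::Dumper (print Dumper($file)) shows what XMLin actually built.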
 
09-16-2010, 01:37 PM   #8
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
Just don't (re)invent XML parsers. Use http://search.cpan.org/search?query=XML+parser&mode=all instead.
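
As one possible sketch of that advice, XML::LibXML (one CPAN parser, picked here as an assumption, not a module named in this thread) can check the three values with XPath. The sample declares a default namespace, so the XPath needs a prefix registered for it. The prefix f, the failed_tags helper, the log file name mismatch.log, and the log line format are all illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Default namespace declared on the <fits> root element in the sample.
my $FITS_NS = 'http://hul.harvard.edu/ois/xml/ns/fits/fits_output';

# Hypothetical helper: returns the names of tags whose value is not as required.
sub failed_tags {
    my ($doc) = @_;
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(f => $FITS_NS);

    my @failed;
    push @failed, 'identity'
        unless $xpc->findvalue('//f:identity/@format') =~ /JPEG/;
    push @failed, 'well-formed'
        unless $xpc->findvalue('//f:filestatus/f:well-formed') eq 'true';
    push @failed, 'valid'
        unless $xpc->findvalue('//f:filestatus/f:valid') eq 'true';
    return @failed;
}

# Files to check are taken from the command line; nothing happens without them.
if (@ARGV) {
    open(my $log, '>>', 'mismatch.log') or die "Cannot open mismatch.log: $!";
    foreach my $path (@ARGV) {
        # load_xml dies on malformed input, so trap that and log it instead.
        my $doc = eval { XML::LibXML->load_xml(location => $path) };
        if (!$doc) {
            print {$log} "$path: could not parse\n";
            next;
        }
        my @failed = failed_tags($doc);
        print {$log} "$path: @failed\n" if @failed;
    }
    close($log);
}
```

Saved under any name, the script would be run with the XML files as arguments; only files with at least one failing tag get a line in the log.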
 
  


All times are GMT -5.