LinuxQuestions.org
Old 11-26-2010, 07:16 AM   #1
ernieball
LQ Newbie
 
Registered: Apr 2010
Posts: 5

Rep: Reputation: 0
advanced text-sorting


Hi.

I have a text file containing HTML code, which looks like this:

<!***"Item A" >
<Item A html code ...
.../>

<More Item A html code ...
.../>

<!***"Item G" >
<Item G html code ...
.../>

<More Item G html code ...
.../>

...and so on. I wish to sort the file alphabetically by the commented "items". Each comment is 7 lines apart (the three asterisks are actually part of the comment). How can I do this and also keep the code lines that follow each comment where they belong?

Thanks for any help
 
Old 11-26-2010, 08:40 AM   #2
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
bash & grep to alphabetically rearrange HTML file

You *really* should be using a proper tool for parsing HTML for this, such as Perl. I don't know Perl myself, so here's some bash shell code that does what you're asking. As implied, this is not the right tool for the job, and I would not expect high performance, especially if the input file is very large, because the whole file is read again for every comment line. But anyway, it is something to toy with until someone suggests a better method:
Code:
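#!/bin/bash

# Outer loop: visit the comment lines in sorted order.
# IFS= and -r keep leading whitespace and backslashes intact.
grep '^<!' htmlfile | sort | while IFS= read -r ITEM; do
    printing="off"
    # Inner loop: rescan the file and print the block that starts at ITEM.
    while IFS= read -r LINE; do
        if [ "$printing" = "off" ]; then
            if [ "$LINE" = "$ITEM" ]; then
                printing="on"; printf '%s\n' "$LINE"
            fi
        else
            # The next comment line marks the end of the current block.
            case "$LINE" in
                '<!'*) break ;;
                *) printf '%s\n' "$LINE" ;;
            esac
        fi
    done < htmlfile
done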
This does not depend on the 7 line spacing, so that doesn't matter. It does depend on the formatting of the input: every comment line must begin with <! at the very start of the line, and it won't tolerate any variation in that.
Note that htmlfile appears in two places in the script; replace both with the filename of your actual input file.
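
Since the script relies on the comment lines being formatted consistently, it may be worth checking them before sorting. The sketch below is only an illustration: the function names list_headers and duplicate_headers are hypothetical, and it assumes a block start is any line beginning with <!, as in the script. An indented comment line would not be picked up as a block start, and a comment line that occurs twice would make the script print the first matching block twice and never print the second.

```shell
#!/bin/bash

# list_headers FILE (hypothetical helper):
# show, with line numbers, every line that counts as a block start.
list_headers() {
    grep -n '^<!' "$1"
}

# duplicate_headers FILE (hypothetical helper):
# print each comment line that occurs more than once in FILE.
duplicate_headers() {
    grep '^<!' "$1" | sort | uniq -d
}
```

For example, duplicate_headers htmlfile should print nothing if every comment line is unique.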

Good luck! I'll look forward to hopefully seeing some better solutions than this.

Last edited by GrapefruiTgirl; 11-26-2010 at 09:24 AM. Reason: minor code improvement
 
Old 11-26-2010, 09:15 AM   #3
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,396
Blog Entries: 2

Rep: Reputation: 908
You haven't said what programming language you want to use to implement this, so only a general strategy can be provided. You have a file containing blocks of data. Each block is delimited by a regular pattern. You want to sort the blocks, so you need a method to compare them, which you can supply to a sort() function; most popular programming languages provide one. Many programming languages that are well suited to this kind of task provide a way to define the delimiters used to read data on a record-at-a-time basis (Perl, AWK, Bash). So, define the delimiter (probably the string '<!***"Item '), read the file as an array of records, and pass the array to the sort function. Create a record comparison function that returns -1, 0, or 1 based on the comparison of a specified pair of records, and pass that function as an argument to the sort() function. Finally, print the sorted array back to a file or to standard output.
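
One way to try this block-sorting idea with standard tools is sketched below. It is a swapped-in technique, not the comparison-function approach described above: each line is tagged with the comment line of its block and its original line number (decorate), the tagged lines are sorted, and the tags are removed again (undecorate). The function name sort_blocks is hypothetical, and the sketch assumes that every block starts with a line beginning with <! and that comment lines contain no tab characters.

```shell
#!/bin/bash

# sort_blocks FILE (hypothetical helper):
# print FILE with its blocks reordered by their comment line.
#   awk  : remember the most recent comment line as the block key and
#          prefix every line with "key <TAB> line number <TAB>"
#   sort : order by key, then by original line number within a block
#   cut  : drop the two tag fields, leaving the original line
sort_blocks() {
    awk '/^<!/ { key = $0 } { print key "\t" NR "\t" $0 }' "$1" |
        sort -t $'\t' -k1,1 -k2,2n |
        cut -f3-
}
```

For example, sort_blocks htmlfile > sorted.html would write the reordered file, with htmlfile standing in for the real filename. Any lines before the first comment line get an empty key and so stay at the top.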

When you have some code to test, come back here for help with the details.

--- rod.
 
Old 11-26-2010, 11:53 AM   #4
ernieball
LQ Newbie
 
Registered: Apr 2010
Posts: 5

Original Poster
Rep: Reputation: 0
Hello, guys!

@theNbomr
It makes sense, I guess, but I'm still far too inexperienced with scripting to come up with something myself on this. Anyway, thanks for your attention.

@GrapefruiTgirl
Luckily, my file is pretty simple. It just follows the pattern I described. I copied and pasted your script, and it did exactly what I needed.

Thank you very much to both of you! This was great help!
 
  


All times are GMT -5.
