03-17-2011, 01:27 PM
|
#1
|
LQ Newbie
Registered: Oct 2009
Posts: 27
|
Any shell scripts for cutting and pasting part of data?
Hi,
I have a tab-delimited txt file as below. It is part of the original file.
Quote:
##Hello
##Welcome
#C1 C2 C3
1 1 1
2 2 2
3 3 3
3 3 3
|
I want to cut the lines starting with "3" in column 1 and paste them before the lines starting with "1" in column 1, so that I get:
Quote:
##Hello
##Welcome
#C1 C2 C3
3 3 3
3 3 3
1 1 1
2 2 2
|
Does anyone know a simple shell script to do that? The original file is too big, so I want to use just a shell script to process the data.
Thanks
-C
|
|
|
03-17-2011, 02:46 PM
|
#2
|
Senior Member
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278
|
|
|
|
03-17-2011, 08:42 PM
|
#3
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,039
|
Depending on how large the file is, you may be able to do it with something like awk. Be warned that it holds the lines between the first "1" line and the first "3" line in memory, so if the file is very large you may need an alternative solution:
Code:
awk '$1 == 3{a=0;b=1}$1 == 1{a=1}a{store=(store)?store"\n"$0:$0;next}b && $1 != 3{b=0;print store}1;END{if(b)print store}' file
This should also work for the following scenario:
Code:
##Hello
##Welcome
#C1 C2 C3
1 1 1
2 2 2
3 3 3
3 3 3
4 4 4
And supply output of:
Code:
##Hello
##Welcome
#C1 C2 C3
3 3 3
3 3 3
1 1 1
2 2 2
4 4 4
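If memory really is the limit, one alternative is to read the file twice rather than buffer anything. The following is only a sketch under assumptions, not the one-liner above: `move_threes_two_pass` is a made-up name, fields are split on whitespace, and the input must be a regular file (not a pipe) because a second stream is opened on it with `getline`.

```shell
#!/bin/bash
# Sketch only: move_threes_two_pass is a hypothetical helper name.
# Prints the reordered file to stdout without holding any block in memory.
move_threes_two_pass() {
    awk '
        # Re-read the file through a second stream and print each "3" row.
        function emit_threes(file,    row, f) {
            while ((getline row < file) > 0) {
                split(row, f)
                if (f[1] == 3) print row
            }
            close(file)
        }
        # First row with 1 in column 1: emit all "3" rows ahead of it.
        $1 == 1 && !done { emit_threes(FILENAME); done = 1 }
        # Every row except the "3" rows passes through in original order.
        $1 != 3
        # No "1" row was ever seen: put the "3" rows at the end.
        END { if (!done) emit_threes(FILENAME) }
    ' "$1"
}
```

Usage would be something like `move_threes_two_pass file > file.new`. The cost is a second read of the file, which is usually cheaper than running out of memory.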
|
|
|
03-18-2011, 11:05 AM
|
#4
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
I concur with grail; this is a job better suited to a more complete programming language such as Awk or Perl. Is that an option?
--- rod.
|
|
|
03-18-2011, 11:31 AM
|
#5
|
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852
|
Quote:
Originally Posted by szboardstretcher
|
Unfortunately, the sed expression I gave in the above thread isn't really suitable for this problem. In that case we only had to concatenate a couple of lines, while here we have to shift whole blocks of lines around. sed just isn't really designed for major multi-line editing, so I agree that awk or perl would be best here.
It might be fun to try writing a bash-only script that can do the same thing, but it would probably end up being too complex to be worth the effort. 
|
|
|
03-18-2011, 11:45 AM
|
#6
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Quote:
Originally Posted by David the H.
I agree that awk or perl would be best here.
|
In particular, the Perl 'splice' function seems to be ideally suited to most parts of the task.
--- rod.
|
|
|
03-18-2011, 07:13 PM
|
#7
|
Member
Registered: Apr 2010
Posts: 228
|
Quote:
Originally Posted by theNbomr
In particular, the Perl 'splice' function seems to be ideally suited to most parts of the task.
--- rod.
|
How, may I ask? Do you read the whole file into memory?
|
|
|
03-18-2011, 08:34 PM
|
#8
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Quote:
How, may I ask? Do you read the whole file into memory?
|
Untested...
Code:
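#!/usr/bin/perl
use strict;
use warnings;

# Open data file and swallow whole
#
open( my $in, '<', "/your/data/file.name" ) or die "Cannot open input: $!";
my @datafile = <$in>;
close $in;

my $record1;     # index at which to insert the '3' records
my @records3;    # records with 3 in column 1
my @rest;        # all other records, in original order

for my $line (@datafile) {
    if ( $line =~ m/^3\s/ ) {
        # Collect the '3' records instead of splicing inside the loop,
        # which would skip the record following each removal.
        push @records3, $line;
        next;
    }
    # Assuming only one of these...
    if ( !defined $record1 && $line =~ m/^1\s/ ) {
        # remember where to insert the '3' records.
        $record1 = scalar @rest;
    }
    push @rest, $line;
}

# Insert before the '1' record, or append if no such record was found.
splice @rest, ( defined $record1 ? $record1 : scalar @rest ), 0, @records3;

open( my $out, '>', "/your/data/file.newname" ) or die "Cannot open output: $!";
print $out @rest;
close $out;
exit 0;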
I wasn't going to write the whole thing, but what the heck....
--- rod.
|
|
|
03-18-2011, 09:12 PM
|
#9
|
Member
Registered: Apr 2010
Posts: 228
|
Quote:
Originally Posted by theNbomr
Untested...
Code:
#! /usr/bin/perl -w
use strict;
# Open data file and swallow whole
#
open( DATAFILE, "/your/data/file.name" );
my @datafile = <DATAFILE>;
close DATAFILE;
|
Well, I do not know whether he meant the file is too big to post here, or whether it is in fact a very large file, but
Quote:
Originally Posted by cliffyao
The original file is too big.......
|
Last edited by kurumi; 03-18-2011 at 09:21 PM.
|
|
|
03-19-2011, 02:55 AM
|
#10
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,039
|
Quote:
It might be fun to try writing a bash-only script that can do the same thing
|
I will take the challenge
Code:
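#!/bin/bash
testing=true     # still holding back data lines that precede the first "3" line
found3=false     # currently inside the block of "3" lines
insert=""        # held-back lines, each terminated by a newline

while IFS= read -r line
do
    if [[ $line =~ ^[0-9] ]] && $testing
    then
        if [[ $line =~ ^3 ]]
        then
            found3=true
            testing=false
        else
            insert+="$line"$'\n'
            continue
        fi
    fi
    if $found3 && [[ ! $line =~ ^3 ]]
    then
        # First line after the "3" block: release the held-back lines first.
        printf '%s' "$insert"
        found3=false
    fi
    # printf, unlike echo -e, does not interpret backslashes in the data.
    printf '%s\n' "$line"
done < in_file > out_file

# The "3" block ran to the end of the file, or there was no "3" line at all.
if $found3 || $testing
then
    printf '%s' "$insert" >> out_file
fi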
You could also make 'insert' a temp file instead of a variable, so that if the original file is really large the held-back lines are not stored in memory.
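A rough sketch of that temp-file idea, under assumptions: the function name `reorder_with_tempfile`, its two arguments and the state names are made up for illustration, `mktemp` is assumed to be available, and the same `^3` / `^[0-9]` patterns as the script above are used.

```shell
#!/bin/bash
# Sketch only: reorder_with_tempfile and its arguments are hypothetical.
# Held-back lines are written to a temp file instead of a shell variable.
reorder_with_tempfile() {
    local in_file=$1 out_file=$2
    local held line state=collecting
    held=$(mktemp) || return 1

    while IFS= read -r line
    do
        case $state in
            collecting)
                if [[ $line =~ ^3 ]]; then
                    state=in_block              # first "3" line reached
                elif [[ $line =~ ^[0-9] ]]; then
                    printf '%s\n' "$line" >> "$held"
                    continue                    # hold this data line back
                fi ;;
            in_block)
                if [[ ! $line =~ ^3 ]]; then
                    cat "$held"                 # block ended: release held lines
                    state=flushed
                fi ;;
        esac
        printf '%s\n' "$line"
    done < "$in_file" > "$out_file"

    # Block ran to end of file, or no "3" line existed: release held lines now.
    [[ $state == flushed ]] || cat "$held" >> "$out_file"
    rm -f "$held"
}
```

Called as `reorder_with_tempfile in_file out_file`. Appending to the temp file line by line is slow in bash, so for a genuinely huge file awk or perl is probably still the better tool.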
|
|
|
03-19-2011, 09:58 AM
|
#11
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Quote:
Originally Posted by kurumi
Well, I do not know whether he meant the file is too big to post here, or whether it is in fact a very large file, but
|
I took it to mean 'too big to do this manually'. That seems to be the usual case in these forums.
--- rod.
|
|
|
03-19-2011, 01:59 PM
|
#12
|
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852
|
grail, that's superb. I didn't really expect anyone to take up the challenge. I'd just thought about it for a few moments before deciding that I didn't have the time or energy to take it on myself.
I'm still trying to figure out the whole process of what you wrote, but it's already taught me something new:
Code:
found=true
if $found ; then echo "true" ; else echo "false" ; fi
# evaluates as "true"
found=false
if $found ; then echo "true" ; else echo "false" ; fi
# evaluates as "false"
It took me a minute to realize that the "true" and "false" contained in the variable are being evaluated as the commands, rather than strings. Interesting usage.
However, I think I'd still just use a regular string test myself, and since the double brackets can handle multiple conditions, I'd change your first if statement, for example, to:
Code:
if [[ $line =~ ^[0-9] && $testing == true ]]
Besides personal preference, I think it makes what's being tested clearer.
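To make the difference between the two styles concrete, here is a small sketch (the function names are made up for illustration). One practical point in favour of the string test: if the variable is empty, running it as a command expands to nothing, and an empty command returns status 0, so the command style treats an empty flag as true.

```shell
#!/bin/bash
# Sketch only: both function names are hypothetical.

# Style 1: the value of the variable is executed as a command (true or false).
check_as_command() {
    local flag=$1
    if $flag; then echo "yes"; else echo "no"; fi
}

# Style 2: the value is compared as a string inside [[ ]].
check_as_string() {
    local flag=$1
    if [[ $flag == true ]]; then echo "yes"; else echo "no"; fi
}
```

With `true` or `false` as the argument both functions agree. With an empty argument, `check_as_command ""` prints "yes" while `check_as_string ""` prints "no".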
|
|
|
03-20-2011, 12:37 AM
|
#13
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,039
|
Quote:
However, I think I'd still just use a regular string test myself, and since the double brackets can handle multiple conditions
|
I follow where you are coming from, and in a simple case like this I generally agree. But in some code I may have a test (using [[) along with a boolean and also an arithmetic expression, so I have gotten into the habit of using the appropriate test for each.
Like so:
Code:
if [[ $string && -d $is_dir ]] || (( max > MAX || min < MIN)) || $start
These can all be handled within the confines of '[[', but (to me) it is clearer which test I am performing based on the nomenclature used.
Thanks for the feedback, though, as I do also need to work well with others and sometimes forget.
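For anyone who wants to try that mixed condition, here is a minimal runnable sketch. The function name `should_run`, its argument order and the limits `MAX=100` / `MIN=0` are assumptions for illustration only; `start` must hold the word `true` or `false`.

```shell
#!/bin/bash
# Sketch only: should_run, its arguments and the MAX/MIN limits are hypothetical.
should_run() {
    local string=$1 is_dir=$2 max=$3 min=$4 start=$5
    local MAX=100 MIN=0

    # [[ ]] for string and file tests, (( )) for arithmetic,
    # and a bare variable holding true/false run as a command.
    if [[ $string && -d $is_dir ]] || (( max > MAX || min < MIN )) || $start
    then
        echo "run"
    else
        echo "skip"
    fi
}
```

For example, `should_run "" /nonexistent 500 5 false` prints "run" because only the arithmetic test succeeds, while `should_run "" /nonexistent 5 5 false` prints "skip".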
|
|
|
03-20-2011, 01:42 AM
|
#14
|
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852
|
Yes, that's reasonable, when the types of evaluation are clearly different. On the other hand, it could be argued that combining multiple conditions is one of the main purposes of [[..]], so it seems clear enough to me either way.
I think that it's more the subtler tricks, where it may not be obvious just what the code is doing, that should be avoided as much as possible--at least when other people are going to see it. I was expecting to see a string test, so it was a bit of a surprise to discover that you were actually using true/false as commands.
|
|
|