LinuxQuestions.org
Old 06-09-2022, 12:38 AM   #46
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753

If(!) I understand the question (or close-ish):
Code:
# test file t.t contains 4 recs
ne4 tow4 three4 last4
ne2 tow2 three2 last2
ne1 tow1 three1 last1
ne3 tow3 three3 last3

# my code
#!/usr/bin/perl
use strict;
use warnings;

my $file = "t.t";
my %txt_pairs;    # last column => list of full records with that last column

open( my $txt_fh, '<', $file ) or
            die "Can't open txt file: $file: $!\n";
while ( defined( my $rec = <$txt_fh> ) )
{
   # Remove unwanted chars
   chomp $rec;                 # newline
   $rec =~ s/^\s+//;           # leading whitespace
   $rec =~ s/\s+$//;           # trailing whitespace

   next unless length($rec);   # anything left?

   # Reverse the whitespace-separated fields; the first one is now the last column
   my ($last) = reverse( split( /\s+/, $rec ) );

   # Keep every record, so lines that share a last column are not overwritten
   push @{ $txt_pairs{$last} }, $rec;
}
close($txt_fh) or
            die "Can't close txt file: $file: $!\n";

# Print the key followed by the whole record, ordered by key
for my $key ( sort keys %txt_pairs )
{
    print "$key $_\n" for @{ $txt_pairs{$key} };
}

#Results
last1 ne1 tow1 three1 last1
last2 ne2 tow2 three2 last2
last3 ne3 tow3 three3 last3
last4 ne4 tow4 three4 last4
HTH
 
Old 06-09-2022, 01:05 AM   #47
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176
does column one have "ne4 tow4 three4" or "ne4" in the first line of your test file?

it's a misleading example because it looks like white space is the delimiter and it's unclear how to parse just the first column. the difficulty is that splitting the columns could be using a delimiter that is a part of the first column. there could be multiple such "delimiters".
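To make the two readings concrete, here is a minimal Perl sketch. The helper name `split_last` is hypothetical, and it assumes the last column itself never contains whitespace, so that everything before the final run of whitespace can be treated as column one:

```perl
use strict;
use warnings;

# Hypothetical helper: return (first, last), where "last" is the final
# whitespace-free token and "first" is everything before it.
sub split_last {
    my ($rec) = @_;
    $rec =~ s/^\s+|\s+$//g;    # trim both ends
    # Greedy (.*) pushes the split point to the final whitespace run
    return ( $1, $2 ) if $rec =~ /^(.*\S)\s+(\S+)$/;
    return ( '', $rec );       # single token: no first column
}

my $rec = "ne4 tow4 three4 last4";

# Plain whitespace split: column one is just the first token
my @fields = split /\s+/, $rec;
print "split:      first=[$fields[0]]\n";          # first=[ne4]

# Anchored at the end: column one keeps its embedded delimiters
my ( $first, $last ) = split_last($rec);
print "split_last: first=[$first] last=[$last]\n"; # first=[ne4 tow4 three4] last=[last4]
```

The anchored regex only works because the key is the last column; if the key could contain the delimiter too, the record would be ambiguous.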
 
Old 06-09-2022, 03:58 AM   #48
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,039

Rep: Reputation: 7347
Quote:
Originally Posted by Skaperen View Post
does column one have "ne4 tow4 three4" or "ne4" in the first line of your test file?

it's a misleading example because it looks like white space is the delimiter and it's unclear how to parse just the first column. the difficulty is that splitting the columns could be using a delimiter that is a part of the first column. there could be multiple such "delimiters".
And again: without details we cannot help you. You have to specify a way to parse that input file; telling us "it is misleading" or "it won't work" is just useless.

The packages handled by apt are already stored in a database, and there is a Perl API to manipulate it.
And I still think you’re overcomplicating something that can be solved a lot more easily; you just refuse to tell us the real details and your real goal.
 
Old 06-09-2022, 08:51 AM   #49
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556
Quote:
Originally Posted by Skaperen View Post
i have no interest in why they are in that form. knowing why won't improve anything.

i did check out the link. i did not send them anything.

the analysis involves some things i am already doing (how i select and upgrade packages) so i know it will be of no interest to others. this will be an ongoing thing i'll be doing each release. it is to look at how packages get renamed or split with upgrades.
I think it's pretty rare that there's only one person in the entire world interested in something, but if what you're doing is truly of no interest to others, I guess the same is true for this thread.


Last edited by boughtonp; 06-09-2022 at 08:57 AM.
 
Old 06-09-2022, 12:51 PM   #50
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176
Quote:
Originally Posted by pan64 View Post
You have to specify a way to parse that input file
that's mostly what this thread is about ... how to parse that input file ... in or for sort.

Quote:
Originally Posted by pan64 View Post
The packages handled by apt are already stored in a database, and there is a Perl API to manipulate it.
And I still think you’re overcomplicating something that can be solved a lot more easily; you just refuse to tell us the real details and your real goal.
i have minimized the problem and narrowed it down. it is to sort the (uncompressed) files i find in the "apt-file" package. i have described the format in the widest possible scope ... to consider the most difficult cases where some package file path has one or more delimiter characters in it. there are only 2 delimited columns, first and last. last is the sort key.

so the goal is to sort the specified files as described. i already figured out one way to parse this but wanted to do it in sort to specify the last column as key. but a solution was given: flip first and last, sort, flip back (i can just use it flipped and skip flip back). and another solution involved a regex i have not yet tested (i don't know regex enough to visualize if it should work).
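The flip, sort, flip back idea can be sketched in Perl roughly as follows. This is only an illustration under stated assumptions: the function name `sort_by_last_column` is hypothetical, the sample paths and package names are invented, records are assumed to be already chomped and trimmed, the last column is assumed to contain no whitespace, and the delimiter is rewritten as a single space when flipping back.

```perl
use strict;
use warnings;

# Hypothetical helper: sort records by their last whitespace-free column
# by moving that column to the front, sorting, and moving it back.
sub sort_by_last_column {
    my @records = @_;

    # Flip: "<first> <last>" becomes "<last>\t<first>"
    my @flipped = map {
        my $rec = $_;
        $rec =~ /^(.*\S)\s+(\S+)$/ ? "$2\t$1" : "$rec\t";
    } @records;

    # Plain string sort on the flipped lines, then flip back
    return map {
        my ( $last, $first ) = split /\t/, $_, 2;
        ( defined $first && length $first ) ? "$first $last" : $last;
    } sort @flipped;
}

# Invented sample data: the first column may contain the delimiter
print "$_\n" for sort_by_last_column(
    "usr/share/doc/a b/readme pkg2",
    "usr/bin/tool pkg1",
    "usr/bin/other tool pkg1",
);
```

Records that share a key come out ordered by their first column, because the whole flipped line is compared. Skipping the second `map` gives the "use it flipped" variant.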

do you think i should use different files, instead, that i don't know the format for (yet)?

i minimized to this narrow problem and asked it. i'm staying on topic with the problem i asked and not going to a wider one of the whole project. i have no interest in (expanding the topic) asking about the wider project.
 
Old 06-10-2022, 12:18 AM   #51
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753
My input test file is as stated.

You seemed to be implying that for your data, the 'last column' is separated by <some space> from preceding data, which may or may not have spaces.

My prog temporarily reverses the fields of (a copy of) the rec from the input file and takes the now-first (was last) col of the orig data, then uses that as a key to a hash where the 'data' in the hash is the entire rec (unchanged), which seems to be what you wanted.

It then sorts the hash on the 'key' ie what was orig the last col (as reqd) and then prints the key (optional - just remove from print if you want) followed by the entire associated rec.

HTH

PS If that is NOT your requirements, please specify in detail, clearly - thank you.
 
Old 06-13-2022, 06:10 PM   #52
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176
Quote:
Originally Posted by chrism01 View Post
PS If that is NOT your requirements, please specify in detail, clearly - thank you.
i see it as a workaround. as a solution, it works, and looks like it should work well.
 
Old 06-16-2022, 01:00 AM   #53
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753
I did have a slight tweak that might run a bit faster, but the new filtering on LQ won't let me post it.
However the above should work just fine.
 
Old 06-19-2022, 06:31 PM   #54
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,688

Original Poster
Blog Entries: 31

Rep: Reputation: 176
if there is something i can't post in a technical sense, i put it in a web file and post the URL. i would not even do that for general rule violations. this is a family-rated web site. my youngest niece was reading this site when she was 4 y/o.
 
  
All times are GMT -5. The time now is 03:38 PM.