LinuxQuestions.org


RavenLX 12-18-2008 10:39 AM

How do I do filtering in Perl (keep sort order and sort again by another means)?
 
Ok, I have a list of items and associated values that were pre-sorted according to value. Some values are the same so I want to sort those by name as well. Here's what I mean:

Desired Output:

Code:

pink|3.3
salmon|3.3

brown|3.0
orange|3.0
peach|3.0
red|3.0
tan|3.0

mustard|2.5
yellow|2.2
gold|1.8
green|1.8
hunter|1.8
lime|1.8

aqua|1.5
blue|1.5
cornflower|1.5
cyan|1.5
indigo|1.5
ivory|1.5
lavender|1.5
magenta|1.5
periwinkle|1.5
purple|1.5
sky|1.5
teal|1.5
violet|1.5
white|1.5

ecru|1.3
charcoal|1.1
eggshell|1.1
gray|1.1

black|1.0

Note that values that are the same are also sorted alphabetically. Now, here's my code, and the actual output:

Code:


#!/usr/bin/perl
use strict;
use Data::Dumper;

my @arr_data = (
                                "pink|3.3",
                                "salmon|3.3",
                                "red|3.0",
                                "orange|3.0",
                                "peach|3.0",
                                "tan|3.0",
                                "brown|3.0",
                                "mustard|2.5",
                                "yellow|2.2",
                                "gold|1.8",
                                "lime|1.8",
                                "green|1.8",
                                "hunter|1.8",
                                "blue|1.5",
                                "sky|1.5",
                                "periwinkle|1.5",
                                "aqua|1.5",
                                "cyan|1.5",
                                "teal|1.5",
                                "indigo|1.5",
                                "violet|1.5",
                                "lavender|1.5",
                                "purple|1.5",
                                "magenta|1.5",
                                "cornflower|1.5",
                                "white|1.5",
                                "ivory|1.5",
                                "ecru|1.3",
                                "eggshell|1.1",
                                "gray|1.1",
                                "charcoal|1.1",
                                "black|1.0"
                                );

my (@temp, @index, @sorter, @sorted);
my ($item, $i);
foreach $item (@arr_data) {
        @temp = split("[\|]", $item);

        # Find index of others with same value
        @index = grep { $arr_data[$_] =~ /$temp[1]/ } 0..$#arr_data;

        # get the data
        undef @sorter;
        foreach $i (@index) {

                # add to sorter array
                $sorter[$#sorter + 1] = $arr_data[$i];

                # remove from old array
                splice @arr_data, $i, 1;

        }

        # sort alphabetically
        @sorter = sort(@sorter);

        # add to new array
        @sorted = (@sorted, @sorter);
}

# *** DEBUG ***
print Dumper(@sorted);
# *** END DEBUG ***

Output:

Code:

$VAR1 = 'pink|3.3';
$VAR2 = 'red|3.0';
$VAR3 = 'gold|1.8';
$VAR4 = 'mustard|2.5';
$VAR5 = 'orange|3.0';
$VAR6 = 'tan|3.0';
$VAR7 = 'peach|3.0';
$VAR8 = 'yellow|2.2';
$VAR9 = 'hunter|1.8';
$VAR10 = 'lime|1.8';
$VAR11 = 'sky|1.5';
$VAR12 = undef;
$VAR13 = undef;
$VAR14 = undef;
$VAR15 = undef;
$VAR16 = 'aqua|1.5';
$VAR17 = 'blue|1.5';
$VAR18 = 'charcoal|1.1';
$VAR19 = 'cornflower|1.5';
$VAR20 = 'eggshell|1.1';
$VAR21 = 'ivory|1.5';
$VAR22 = 'purple|1.5';
$VAR23 = 'teal|1.5';
$VAR24 = 'violet|1.5';
$VAR25 = undef;
$VAR26 = 'black|1.0';
$VAR27 = 'ecru|1.3';
$VAR28 = 'indigo|1.5';
$VAR29 = 'magenta|1.5';
$VAR30 = 'periwinkle|1.5';
$VAR31 = 'gray|1.1';

I'm just not sure how to approach this. Any help would be appreciated. Thanks!

bigearsbilly 12-18-2008 10:48 AM

can you not use sort?
it would be much easier.
i.e. not perl.

Telemachos 12-18-2008 11:23 AM

You can do this type of thing by writing a custom sort subroutine. Here is a quick stab. It works, but there are probably more efficient ways to do it:
Code:

#!/usr/bin/perl
use warnings;
use strict;

my @arr_data = (
        "pink|3.3",
        "salmon|3.3",
        "red|3.0",
        "orange|3.0",
        "peach|3.0",
        "tan|3.0",
        "brown|3.0",
        "mustard|2.5",
        "yellow|2.2",
        "gold|1.8",
        "lime|1.8",
        "green|1.8",
        "hunter|1.8",
        "blue|1.5",
        "sky|1.5",
        "periwinkle|1.5",
        "aqua|1.5",
        "cyan|1.5",
        "teal|1.5",
        "indigo|1.5",
        "violet|1.5",
        "lavender|1.5",
        "purple|1.5",
        "magenta|1.5",
        "cornflower|1.5",
        "white|1.5",
        "ivory|1.5",
        "ecru|1.3",
        "eggshell|1.1",
        "gray|1.1",
        "charcoal|1.1",
        "black|1.0"
);

my @sorted = sort my_way @arr_data;

sub my_way {
  my($name_a, $value_a) = split /\|/, $a;
  my($name_b, $value_b) = split /\|/, $b;

  $value_b <=> $value_a or $name_a cmp $name_b;
}

foreach my $line (@sorted) {
  print "\t$line\n";
}

The output is what you want. I'll leave printing it out for folks to do on their own (rather than take up space here).

Edit: If you want the original array itself sorted just do
Code:

@arr_data = sort my_way @arr_data;
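
For anyone wondering why the one-line comparator works: `<=>` returns 0 when two values are numerically equal, and 0 is false in Perl, so the expression falls through to the `cmp` on the names. Here is a minimal sketch with a few hand-picked pairs. The `compare` wrapper and the sample pairs are only for illustration and are not part of the script above; it uses `||` rather than `or` because it has an explicit `return`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: same logic as my_way, but it takes its two
# records as arguments so it can be called outside of sort.
sub compare {
    my ($x, $y) = @_;
    my ($name_x, $value_x) = split /\|/, $x;
    my ($name_y, $value_y) = split /\|/, $y;

    # Descending by value; on a tie (<=> gives 0) fall back to
    # ascending by name.
    return $value_y <=> $value_x || $name_x cmp $name_y;
}

print compare("red|3.0",   "pink|3.3"),  "\n";  # 1: pink has the higher value, so red sorts after it
print compare("red|3.0",   "brown|3.0"), "\n";  # 1: values tie, and "red" sorts after "brown"
print compare("brown|3.0", "red|3.0"),   "\n";  # -1: values tie, and "brown" sorts before "red"
print compare("red|3.0",   "red|3.0"),   "\n";  # 0: identical records
```

A positive result means the first record sorts after the second, a negative result means it sorts before, and 0 means sort treats them as equal.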

Telemachos 12-18-2008 05:30 PM

A second way which is a lot more complicated to look at, but perhaps more efficient:
Code:

#!/usr/bin/perl
use warnings;
use strict;

my @arr_data = (
        "pink|3.3",
        "salmon|3.3",
        "red|3.0",
        "orange|3.0",
        "peach|3.0",
        "tan|3.0",
        "brown|3.0",
        "mustard|2.5",
        "yellow|2.2",
        "gold|1.8",
        "lime|1.8",
        "green|1.8",
        "hunter|1.8",
        "blue|1.5",
        "sky|1.5",
        "periwinkle|1.5",
        "aqua|1.5",
        "cyan|1.5",
        "teal|1.5",
        "indigo|1.5",
        "violet|1.5",
        "lavender|1.5",
        "purple|1.5",
        "magenta|1.5",
        "cornflower|1.5",
        "white|1.5",
        "ivory|1.5",
        "ecru|1.3",
        "eggshell|1.1",
        "gray|1.1",
        "charcoal|1.1",
        "black|1.0"
);

my @sorted =  map  { $_->[0] }
              sort  { $b->[2] <=> $a->[2] || $a->[1] cmp $b->[1] }
              map  { [ $_,  split /\|/ ] } @arr_data;
 
foreach my $line (@sorted) {
  print "\t$line\n";
}

This uses the Schwartzian transform.
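
In case the map/sort/map chain is hard to follow, it is read from the bottom up: the last `map` runs first, then the `sort`, then the first `map`. Below is a rough sketch of the same thing unrolled into three separate steps on a shortened list. The intermediate names `@decorated` and `@ordered` are mine, purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @arr_data = ("red|3.0", "pink|3.3", "brown|3.0", "black|1.0");

# Step 1 (decorate): pair each original string with its pre-split
# fields, giving [ original, name, value ].
my @decorated = map { [ $_, split /\|/ ] } @arr_data;

# Step 2 (sort): compare the cached fields, so split is not called
# again inside the comparator. Value descending, then name ascending.
my @ordered = sort { $b->[2] <=> $a->[2] || $a->[1] cmp $b->[1] } @decorated;

# Step 3 (undecorate): keep only the original strings.
my @sorted = map { $_->[0] } @ordered;

print "$_\n" for @sorted;
# prints:
# pink|3.3
# brown|3.0
# red|3.0
# black|1.0
```

The possible efficiency gain is that each string is split once up front, whereas the custom sort subroutine splits both operands every time two elements are compared.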

bigearsbilly 12-19-2008 02:57 AM

the true way of unix would be:

/bin/sort -t\| -k2,3n -k1,2 < list

syg00 12-19-2008 03:13 AM

Whoa - a perl thread that bigearsbilly refuses to answer.
What's the world coming to ???.

bigearsbilly 12-19-2008 03:25 AM

ho ho!

as a true perl (zeal|big)ot and aficionado
even I would go for the path of least typing on this one.

;)

unless of course maybe the individual is
unfortunate enough to be using perl on windows.

Telemachos 12-19-2008 03:57 AM

Quote:

Originally Posted by bigearsbilly (Post 3380897)
the true way of unix would be:

/bin/sort -t\| -k2,3n -k1,2 < list

Two things: first, the OP wanted the numeric sort to be reversed (this is easily fixed); second, how exactly do we know that this isn't part of a larger project that is in Perl for a good reason?

In any case, here is a fixed sort incantation:
Code:

sort -t\| -k2,3nr -k1,2 < unsorted

bigearsbilly 12-19-2008 04:03 AM

1. doh!
sort -t\| -k2,3rn -k1,2

2. I don't. Just opining :) Still it is best to use the power at your fingertips if you can.

RavenLX 12-19-2008 10:12 AM

bigearsbilly - I can't use sort (i.e. not Perl) because this is part of a larger Perl script that should be self-contained: the data I'm sorting is already loaded into variables. I was trying to write a sort function to include in that script.

Telemachos - Your Schwartzian transform code example is what I'll go with. Thank you. :)
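
For dropping it into a larger script, one option is to wrap the transform in a subroutine. This is only a sketch: the name `sort_by_value_then_name` and the sample data are made up for illustration, and it assumes every record is a "name|value" string with a numeric value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper: takes a list of "name|value" strings and
# returns a new list, highest value first, ties broken by name.
# The caller's array is left untouched.
sub sort_by_value_then_name {
    return map  { $_->[0] }
           sort { $b->[2] <=> $a->[2] || $a->[1] cmp $b->[1] }
           map  { [ $_, split /\|/ ] } @_;
}

my @colors = ("gold|1.8", "yellow|2.2", "green|1.8", "mustard|2.5");
my @sorted = sort_by_value_then_name(@colors);

print join(", ", @sorted), "\n";
# prints: mustard|2.5, yellow|2.2, gold|1.8, green|1.8
```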


All times are GMT -5.