LinuxQuestions.org

BerzinTehvs 07-30-2010 01:05 AM

awk's asort sort order according to locale
 
Hi!

How can I make awk's asort function locale-aware?

The system locale is set to lv_LV, but asort ignores it and sorts as if LC_COLLATE were en_US. The text to be sorted is ISO8859-13 encoded.

Output of locale:

LANG=lv_LV
LC_CTYPE="lv_LV"
LC_NUMERIC="lv_LV"
LC_TIME="lv_LV"
LC_COLLATE="lv_LV"
LC_MONETARY="lv_LV"
LC_MESSAGES=en_GB
LC_PAPER="lv_LV"
LC_NAME="lv_LV"
LC_ADDRESS="lv_LV"
LC_TELEPHONE="lv_LV"
LC_MEASUREMENT="lv_LV"
LC_IDENTIFICATION="lv_LV"
LC_ALL=

The malfunctioning part of the script:
Code:

TMPO=$(echo "$F4" | awk -F ", " '{RS=ORS=", "; n=split($0,arr); asort(arr); for (i = 1; i <= n; i++) print arr[i] }')

Tinkster 07-30-2010 01:38 AM

How about sample data and sample output?

BerzinTehvs 07-30-2010 02:49 AM

2 Attachment(s)
I'd better attach the script and the input file.

It is run like this: sinon.sh -s tabula.txt

The last line of the current output looks like this:
Code:

uzvinnēt:1:(darb.v.):uzvarēt, āpmakt
but it should look like this:
Code:

uzvinnēt:1:(darb.v.):āpmakt, uzvarēt
because "ā" is the second letter of the Latvian alphabet. Sorting within plain a-z works fine.

(Sorry, the comments in the script are in Latvian.)
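
The wrong order above is what a plain byte-by-byte comparison would produce, since "ā" is encoded with a byte value above every ASCII letter. A minimal sketch to illustrate this, assuming the snippet is saved as UTF-8 (the original data is ISO8859-13, where "ā" is also a byte above 0x7F); byte_order_sort is a made-up helper name, not part of the attached script:

```shell
#!/bin/sh
# Sketch: sort words by raw byte value (C locale), ignoring any alphabet rules.
# Assumption: this file is saved as UTF-8, where "ā" is the bytes 0xC4 0x81.
byte_order_sort() {
    # LC_ALL=C forces sort(1) to compare bytes instead of collation weights.
    printf '%s\n' "$@" | LC_ALL=C sort
}

# 'u' is byte 0x75, which is lower than 0xC4, so "uzvarēt" comes out first.
byte_order_sort "uzvarēt" "āpmakt"
```

This reproduces the "uzvarēt, āpmakt" order from the current output, which suggests the comparison is done on bytes rather than through the locale's collation table.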

BerzinTehvs 08-03-2010 02:12 AM

As far as I can tell from studying the source, gawk's asort() function is not locale-aware. At least, I did not notice any code comparable to that of sort from coreutils.
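
A possible workaround, given that sort from coreutils does use the locale's collation rules, would be to let sort do the ordering instead of asort(). The following is only a sketch under that assumption: sort_list is a hypothetical helper name, and it has not been tried against the attached script or input file.

```shell
#!/bin/sh
# Sketch of a workaround: let sort(1) do the ordering instead of asort().
# sort_list is a hypothetical helper name, not part of the attached script.
sort_list() {
    # 1. the first awk splits the ", "-separated list into one item per line
    # 2. sort orders the lines using the collation rules of the current locale
    # 3. the second awk joins the lines back with ", " (no trailing separator)
    printf '%s\n' "$1" |
        awk -F ', ' '{ for (i = 1; i <= NF; i++) print $i }' |
        sort |
        awk 'NR > 1 { printf ", " } { printf "%s", $0 } END { print "" }'
}

# Example call with plain ASCII items:
sort_list "uzvaret, apmakt"
```

With LC_COLLATE (or LC_ALL) set to lv_LV and that locale actually installed, sort should place "ā" according to the Latvian alphabet, provided the locale's character set matches the encoding of the data (ISO8859-13 here). In the script, the call would then look something like TMPO=$(sort_list "$F4").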

Tinkster 08-03-2010 02:01 PM

Quote:

Originally Posted by BerzinTehvs (Post 4053509)
As far as I can tell from studying the source, gawk's asort() function is not locale-aware. At least, I did not notice any code comparable to that of sort from coreutils.

That's a real shame. Maybe you should submit a bug report or a feature request?


Cheers,
Tink


All times are GMT -5.