shell - Sort ignores an apostrophe - sometimes (except when it is the only column used); WHY?


This happens to me both on Linux and on Cygwin, so I suspect it is not a bug. Still, I don't understand it. Can anyone explain?

Consider the following file (tab-delimited, and yes, that's a regular apostrophe). I created it with cat to ensure that non-printing characters weren't the source of the problem.

$ cat > temp
cat     1389
cat'    1747
ca't    3175
cat     46848484
ca't    720

$ sort temp
<gives the exact same output as cat temp>

$ sort -k1,1 temp
cat     1389
cat     46848484
cat'    1747
ca't    3175
ca't    720

Why do I have to tell sort to ignore the second column in order for it to sort correctly?

I pulled up the manual for sort and noticed the following:

*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.

As it turns out, locales specify how lexicographic ordering works for a given locale. That makes a lot of sense, but for some reason it trips up on multi-field files...

(See also:)
Unusual behaviour of linux's sort command
Why does the sort command sort differently if there are trailing fields?

There are a couple of things you can do:

You can sort naively by byte value using

LC_ALL="C" sort temp

This will give a more logical result, but it might not be the one you want.
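As a rough sketch of what that looks like on the sample file from the question (the scratch file from mktemp and the variable names are my own, not part of the original post): in the C locale the tab (0x09) sorts before the apostrophe (0x27), which sorts before digits and letters, so the ca't lines come first.

```shell
# Recreate the tab-delimited sample file from the question in a scratch file.
tmp=$(mktemp)
printf '%s\t%s\n' cat 1389 "cat'" 1747 "ca't" 3175 cat 46848484 "ca't" 720 > "$tmp"

# Sort by raw byte value, ignoring the locale from the environment.
sorted=$(LC_ALL=C sort "$tmp")
printf '%s\n' "$sorted"
# Prints (columns are tab-separated):
#   ca't    3175
#   ca't    720
#   cat     1389
#   cat     46848484
#   cat'    1747

rm -f "$tmp"
```

Note that the second column is still compared as text here, which is why 3175 lands before 720.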

You could try to sort by a more basic lexicographical ordering by setting the locale to C and telling sort that you want dictionary ordering:

LC_ALL="C" sort -d temp

To have sort output the locale information and highlight the sort key, you can use

sort --debug temp
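A quick way to see what dictionary ordering does to this particular file (again a sketch, with a scratch file and variable names of my own choosing): -d makes sort consider only blanks and alphanumeric characters, so the apostrophes drop out of the comparison. Every line then starts with "cat" plus a tab, and the order is decided by the second column compared as text.

```shell
# Recreate the tab-delimited sample file from the question in a scratch file.
tmp=$(mktemp)
printf '%s\t%s\n' cat 1389 "cat'" 1747 "ca't" 3175 cat 46848484 "ca't" 720 > "$tmp"

original=$(cat "$tmp")

# Dictionary order in the C locale: only blanks and alphanumerics are compared.
dict_sorted=$(LC_ALL=C sort -d "$tmp")
printf '%s\n' "$dict_sorted"

# With the apostrophes ignored, the file is already in order.
if [ "$dict_sorted" = "$original" ]; then
    echo "sort -d leaves the file in its original order"
fi

rm -f "$tmp"
```

The result is the same order as the input, which is also what plain sort temp printed in the question. That is consistent with the environment's locale giving the apostrophe little or no weight when comparing whole lines, though this sketch alone does not prove that is what the locale does.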




I'm curious to know what rule is being specified that makes sort behave unintuitively across multiple fields.

Locales are supposed to specify the correct lexicographic order in a given language and dialect. Do the locales' functions simply not handle the multiple-field case at all, or are they taking some kind of different interpretation of the "meaning" of the line?

