awk - Keep only the line that is latest in the file and is a duplicate based on two fields
These are related questions:
- awk - remove line if field duplicate
- sed/awk + regex delete duplicate lines first field matches (ip address)
I have a file like this:

foo,bar,100,200,300
baz,taz,500,600,800
foo,bar,900,1000,1000
here,there,1000,200,100
foo,bar,100,10000,200
baz,taz,100,40,500

Duplicates are determined by the first two fields. In addition, the more "recent" record (lower in the file, i.e. with the higher line number) is the one that should be retained.

What awk script would output the following?

baz,taz,100,40,500
foo,bar,100,10000,200
here,there,1000,200,100

The output order is not important.

An explanation of the awk syntax would be great.
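This might work (it uses tac and GNU sort rather than awk):

tac file | sort -sut, -k1,2

tac reverses the file so that the most recent record for each key comes first. In the sort options, -t, sets the field separator to a comma, -k1,2 restricts the sort key to fields 1 through 2, -s makes the sort stable so lines with equal keys keep their incoming order, and -u outputs only the first line of each run of equal keys.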
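
Since the question asks for awk specifically, here is a minimal sketch of an awk-only approach, assuming the fields are always separated by plain commas. The function name dedupe_last is only an illustrative wrapper, not something from the original post. It stores each line in an array keyed by the first two fields, so a later line with the same key overwrites the earlier one, and prints whatever is left once the whole file has been read.

```shell
# dedupe_last is a hypothetical helper name used for illustration.
# It keeps the last line seen for each (field 1, field 2) pair.
dedupe_last() {
    awk -F, '
        # -F, splits each line on commas; $1 and $2 are the first two fields.
        # FS (the separator) is placed between them to build the array key.
        # A later line with the same key overwrites the earlier entry.
        { last[$1 FS $2] = $0 }

        # After all input is read, print the surviving line for each key.
        # The iteration order of "for (k in last)" is unspecified.
        END { for (k in last) print last[k] }
    ' "$@"
}
```

Calling dedupe_last file prints one line per distinct pair of leading fields. Because awk does not guarantee the order of array traversal, pipe the result through sort if a stable order is wanted.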