Thursday, November 8, 2012

Remove duplicate lines from a text file in UNIX

# save unique lines to a new file (the two commands are equivalent)
sort file | uniq > newfile
sort -u file > newfile
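
Why the sort is needed: uniq only collapses duplicate lines that are adjacent, so on unsorted input it can leave duplicates behind. A small sketch of the difference, assuming a throwaway sample file created with mktemp (the file contents and variable names are made up for illustration):

```shell
# hypothetical sample data, not from the original post
tmp=$(mktemp)
printf 'b\na\nb\na\n' > "$tmp"

# uniq alone: no two equal lines are adjacent, so nothing is removed
unsorted=$(uniq "$tmp")

# sort first so equal lines sit next to each other, then uniq
sorted=$(sort "$tmp" | uniq)

# sort -u does the sort and the de-duplication in one step
short=$(sort -u "$tmp")

printf 'uniq only:\n%s\n\n' "$unsorted"
printf 'sort | uniq:\n%s\n\n' "$sorted"
printf 'sort -u:\n%s\n' "$short"

rm -f "$tmp"
```

On this sample, uniq alone prints all four lines back, while the other two commands print just a and b.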

# find repeated lines (each one is printed once)
sort file | uniq -d

# count how many times each line occurs
sort file | uniq -c

# find unique lines (lines that occur only once)
sort file | uniq -u
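
To see how the uniq options differ, here is a rough sketch that runs -c, -d and -u over the same sorted input. The sample file and the variable names are assumptions made up for this example. The count printed by uniq -c is padded with leading spaces and the padding width can differ between implementations, so the sketch normalizes it with awk before printing.

```shell
# hypothetical sample data: apple x3, banana x2, cherry x1
tmp=$(mktemp)
printf 'apple\nbanana\napple\ncherry\nbanana\napple\n' > "$tmp"

# -c: prefix every distinct line with its number of occurrences
# (awk strips the leading padding so the output is "count line")
counts=$(sort "$tmp" | uniq -c | awk '{print $1, $2}')

# -d: only lines that occur more than once, one copy each
repeated=$(sort "$tmp" | uniq -d)

# -u: only lines that occur exactly once
once=$(sort "$tmp" | uniq -u)

printf 'counts:\n%s\n\n' "$counts"
printf 'repeated:\n%s\n\n' "$repeated"
printf 'occur once:\n%s\n' "$once"

rm -f "$tmp"
```

Adding `| sort -rn` after uniq -c is a common way to list the most frequent lines first.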

# if you don't want to sort the file (keeps the first occurrence of each line, in the original order)
awk '!x[$0]++' file
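
How the awk one-liner works: x is an associative array keyed by the whole line ($0). The first time a line is seen, x[$0] is 0, so !x[$0] is true and awk prints the line; the ++ then bumps the counter so later copies are skipped. A minimal sketch comparing it with a spelled-out version (the sample file and the name seen are made up for illustration):

```shell
# hypothetical sample data with duplicates in unsorted order
tmp=$(mktemp)
printf 'b\na\nb\nc\na\n' > "$tmp"

# the one-liner from above
short=$(awk '!x[$0]++' "$tmp")

# the same logic written out: print a line only if its count is still 0,
# then increment the count for that line
long=$(awk '{ if (seen[$0] == 0) print; seen[$0]++ }' "$tmp")

printf 'one-liner:\n%s\n\n' "$short"
printf 'expanded:\n%s\n' "$long"

rm -f "$tmp"
```

Both forms print b, a, c for this sample. Note that awk keeps every distinct line in memory, so for very large files with mostly unique lines sort -u may be the safer choice.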

reference:
http://stackoverflow.com/questions/6447473/linux-command-or-script-counting-duplicated-lines-in-a-text-file
http://www.liamdelahunty.com/tips/linux_remove_duplicate_lines_with_uniq.php
http://www.cyberciti.biz/faq/unix-linux-shell-removing-duplicate-lines/
http://unstableme.blogspot.com/2008/03/remove-duplicates-without-sorting-file.html

