benjamindba

Thursday, November 8, 2012

Remove duplicate lines from a text file in UNIX

# save unique lines to a new file (the two commands below are equivalent)
sort file | uniq > newfile
sort -u file > newfile
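
The sort step matters because uniq only collapses repeated lines that sit next to each other. A small sketch of the difference, using a made-up temporary file as sample data (the file contents and variable names are assumptions for the demo, not part of the original commands):

```shell
# Hypothetical sample data written to a temporary file
tmp=$(mktemp)
printf 'b\na\nb\na\n' > "$tmp"

# uniq alone: no two identical lines are adjacent, so nothing is removed
unsorted=$(uniq "$tmp")

# sort first: identical lines become adjacent and uniq can collapse them
sorted=$(sort "$tmp" | uniq)

# sort -u does the same thing in one step
sorted_u=$(sort -u "$tmp")

printf 'uniq only:\n%s\n' "$unsorted"
printf 'sort | uniq:\n%s\n' "$sorted"
printf 'sort -u:\n%s\n' "$sorted_u"

rm -f "$tmp"
```

With this input, uniq on its own still prints all four lines, while both sorted variants print each distinct line once.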

# find repeated lines (prints one copy of each line that occurs more than once)
sort file | uniq -d

# count how many times each line occurs
sort file | uniq -c

# find unique lines (lines that occur exactly once)
sort file | uniq -u
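
The uniq flags are easy to mix up, so here is a small side-by-side sketch on made-up sample data (the fruit names and variable names are assumptions for illustration). Note that the padding in front of the counts printed by `uniq -c` differs between implementations, so the sketch normalizes it with awk:

```shell
# Hypothetical sample data written to a temporary file
tmp=$(mktemp)
printf 'apple\npear\napple\nfig\napple\npear\n' > "$tmp"

# -c: every distinct line, prefixed with the number of times it occurs
# (awk squeezes the leading padding down to "count line")
counts=$(sort "$tmp" | uniq -c | awk '{print $1, $2}')

# -d: one copy of each line that occurs more than once
repeated=$(sort "$tmp" | uniq -d)

# -u: only the lines that occur exactly once
once=$(sort "$tmp" | uniq -u)

printf 'counts:\n%s\n' "$counts"
printf 'repeated:\n%s\n' "$repeated"
printf 'once:\n%s\n' "$once"

rm -f "$tmp"
```

For this input the counts are 3 for apple, 1 for fig and 2 for pear; `-d` reports apple and pear, and `-u` reports only fig.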

# remove duplicates without sorting (keeps the first occurrence of each line, in the original order)
awk '!x[$0]++' file
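
How the awk one-liner works: `x` is an associative array keyed by the whole line (`$0`). `x[$0]++` returns the count seen so far and then increments it, so it is 0 (false) the first time a line appears; the `!` turns that into true, and awk's default action prints the line. Later occurrences return a non-zero count and are skipped. A rough sketch on made-up sample data, with a longer spelling of the same idea for comparison (the array name `seen` and the sample lines are assumptions for the demo):

```shell
# Hypothetical sample data written to a temporary file
tmp=$(mktemp)
printf 'pear\napple\npear\nfig\napple\n' > "$tmp"

# Compact form: print a line only the first time it is seen
deduped=$(awk '!seen[$0]++' "$tmp")

# Expanded form of the same logic, spelled out step by step
deduped_long=$(awk '{
    if (!($0 in seen)) {   # first time this exact line appears
        print              # emit it
        seen[$0] = 1       # remember it for later occurrences
    }
}' "$tmp")

printf '%s\n' "$deduped"

rm -f "$tmp"
```

Both forms print pear, apple, fig, in that order. Since awk remembers every distinct line it has seen, memory use grows with the number of distinct lines in the file.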

References:
http://stackoverflow.com/questions/6447473/linux-command-or-script-counting-duplicated-lines-in-a-text-file
http://www.liamdelahunty.com/tips/linux_remove_duplicate_lines_with_uniq.php
http://www.cyberciti.biz/faq/unix-linux-shell-removing-duplicate-lines/
http://unstableme.blogspot.com/2008/03/remove-duplicates-without-sorting-file.html

Posted by Benjamin Li at 10:28 AM
