Monday, November 29, 2010

sort unique duplicated column awk cut uniq

SkyHi @ Monday, November 29, 2010
Step 1:
#end with bb or cc
grep -E '(bb|cc)$' virtusertable

Step 2:
#print column 1 and column 2, sort by column 2, and keep only the first line for each unique column-2 value
awk '{print $1,$2}' virtusertable | sort -k2 -u > sortcolum2u.txt

Step 3:
#delete lines containing crap0 through crap3 (reads from stdin)
grep -Ev 'crap[0-3]'

Step 4:
#keep first column
awk '{print $1}' virtusertable 

Step 5:
#split on @, print column 1 and column 2, sort by column 2 (the domain)
awk -F@ '{print $1,$2}' virtusertableColumn1.txt | sort -k2 > virtusertableFsortdomain.txt

Step 6:
##replace each space with @ (vim command, run inside the editor)
:%s/\s/@/g
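The vim substitution above can also be done non-interactively with sed, which fits the rest of this pipeline better (a sketch; the output filename is assumed):

```shell
# Replace every whitespace character with @ (sed equivalent of the vim command).
# Input is the file from Step 5; the output name virtusertableAt.txt is just an example.
sed 's/[[:space:]]/@/g' virtusertableFsortdomain.txt > virtusertableAt.txt
```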



REFERENCES
http://www.linuxquestions.org/questions/linux-general-1/sed-or-grep-delete-lines-containing-matching-text-446640/

http://efreedom.com/Question/1-1915636/Way-Uniq-Column
http://stackoverflow.com/questions/2978361/uniq-in-awk-removing-duplicate-values-in-a-column-using-awk
http://www.softpanorama.org/Tools/sort.shtml



Counting unique values in a column with a shell script
$ cut -f2 file.txt | sort | uniq | wc -l
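For large files the same count can be done in a single awk pass, with no sort step (a sketch, assuming the same tab-delimited file.txt):

```shell
# Count distinct values in column 2 without sorting:
# seen[] remembers each value; only the first occurrence increments n.
awk -F'\t' '!seen[$2]++ { n++ } END { print n }' file.txt
```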


#Finding unique records in a file
01224624005
01224626366
01224627408
01224626366
01224627408
##print one copy of each duplicated line: 01224626366 01224627408 (input must already be sorted)
uniq -d sorted_sme.txt > dup_sme.txt
##print lines that occur only once: 01224624005
uniq -u sorted_sme.txt > unq_sme.txt
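Since uniq only compares adjacent lines, the raw list has to be sorted first; a minimal end-to-end sketch (the unsorted input name sme.txt is assumed):

```shell
# uniq compares only adjacent lines, so sort first
sort sme.txt > sorted_sme.txt
# one copy of each line that appears more than once
uniq -d sorted_sme.txt > dup_sme.txt
# lines that appear exactly once
uniq -u sorted_sme.txt > unq_sme.txt
```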

REFERENCES
http://www.computing.net/answers/unix/finding-unique-records-from-file/4591.html

Processing delimited files using cut

The cut command prints selected parts of lines from each FILE (or variable), i.e. it removes sections from each line of a file.

For example, the fields of the /etc/passwd file are separated by the : character.

To print list of all users, type the following command at shell prompt:
$ cut -d: -f1 /etc/passwd
Output:

root
you
me
vivek
httpd

Where,

* -d: : use the : character as the field delimiter
* -f1 : print the first field; to print the second field use -f2, and so on
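cut can also print several fields at once by giving -f a comma-separated list; a sketch printing login name and shell (fields 1 and 7 of /etc/passwd):

```shell
# Print fields 1 and 7 (login name and shell), still using : as the delimiter
cut -d: -f1,7 /etc/passwd
```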

Now consider a shell variable named service. Let us print the word mail using cut:
$ service="http mail ssh"
$ echo $service | cut -d' ' -f2

mail

Note that a blank space is used as the delimiter.

Processing delimited files using awk

You can also use the awk command for the same purpose:
$ awk -F':' '{ print $1 }' /etc/passwd
Output:

root
you
me
vivek
httpd

Where,

* -F':' : use : as the input field separator (awk's FS variable)
* print $1 : print the first field; to print the second field use $2, and so on
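Unlike cut, awk can also filter rows while selecting fields; a sketch that prints only the login names whose shell (field 7) is /bin/bash:

```shell
# Print field 1 (login name) only for lines where field 7 is /bin/bash
awk -F':' '$7 == "/bin/bash" { print $1 }' /etc/passwd
```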


REFERENCES
http://www.cyberciti.biz/tips/processing-the-delimited-files-using-cut-and-awk.html