grep, cut, wc, sort, uniq, diff, aspell, sed command in Linux

grep is a command-line tool that searches a file or stream for lines containing a given string or pattern.

How To Use grep

In the simplest case, grep is invoked like this:

  grep 'STRING' filename

This works, but it searches only one file and does not show the real power of grep. A more useful example is finding every file in a directory that contains a person's name, which grep handles easily:

grep 'Nicolas Kassis' *

Notice the single quotes. Quotes are not always needed, but here they are required because the name contains a space; without them the shell would pass Nicolas as the pattern and Kassis as a file name. Double quotes would work equally well in this example.
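The multi-file search above can be tried safely in a scratch directory. This is only a sketch: the file names and their contents are invented for illustration.

```shell
# Hypothetical sandbox: file names and contents are made up for this example.
dir=$(mktemp -d)
printf 'Author: Nicolas Kassis\n'      > "$dir/a.txt"
printf 'Author: someone else\n'        > "$dir/b.txt"
printf 'Reviewed by Nicolas Kassis\n'  > "$dir/c.txt"

# With more than one file, grep prefixes each matching line with the file name.
matches=$(cd "$dir" && grep 'Nicolas Kassis' *)
printf '%s\n' "$matches"

# Without the quotes, grep would look for "Nicolas" in a file called "Kassis"
# (and in the files matched by *), which is not what was intended.
rm -r "$dir"
```

Only a.txt and c.txt are reported, each match shown as file name, colon, matching line.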

A few more examples of the grep command:

grep -v abc file1
(show every line except those that contain abc)

grep -n tiger file1

(return the lines that contain tiger, each with its line number)

grep -c abc file1
(display only the count of matching lines)

grep -l wali *
[return only the names of the files that have at least one line containing the pattern (wali)]

<root@wtuto: ~> # grep -r abc *
(perform a recursive search through every file and directory matched by *)

(Note: there are three files, f1, f2 and f3, in the directory /root/wali, and f2 doesn't contain the word abc.)

<root@wtuto: wali> # grep -r rose *
[perform a recursive, case-sensitive search in the current directory; grep is case sensitive by default, add -i to ignore case]
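The options above can be compared side by side on a small test tree. A minimal sketch, assuming a layout like the one in the note (a wali directory holding f1, f2 and f3); the file contents are invented.

```shell
# Scratch tree modelled on the note above; contents are illustrative only.
dir=$(mktemp -d)
mkdir "$dir/wali"
printf 'abc one\nxyz two\nabc three\n' > "$dir/wali/f1"
printf 'no match here\n'               > "$dir/wali/f2"
printf 'ABC upper\nabc lower\n'        > "$dir/wali/f3"

inverted=$(grep -v abc "$dir/wali/f1")   # lines that do NOT contain abc
numbered=$(grep -n abc "$dir/wali/f1")   # matching lines with line numbers
count=$(grep -c abc "$dir/wali/f1")      # number of matching lines
# -r walks the directory, -l prints only file names; sort makes the order stable.
names=$(cd "$dir" && grep -rl abc wali | sort)
nocase=$(grep -ci abc "$dir/wali/f3")    # -i also counts the ABC line

printf '%s\n' "$inverted" "$numbered" "$count" "$names" "$nocase"
rm -r "$dir"
```

f2 never shows up in the -l output because it has no matching line, and -i is what lets the upper-case ABC line count as a match.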

Extracting text by column:
cut -f3 -d: /etc/passwd
(display the third colon-delimited field of each line)

cut -c5 /etc/passwd
(display the 5th character of each line)

cut -c1-5 /etc/passwd
(display the first 5 characters of each line)

-d   specify the field delimiter (default is TAB)
-f   specify the field(s) to print
-c   cut by character position
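To see what each cut option returns without depending on the local /etc/passwd, the same commands can be run on one sample line. The user in this sketch is invented.

```shell
# One line in /etc/passwd format; the account is hypothetical.
line='wali:x:1001:1001:Wali Example:/home/wali:/bin/bash'

uid=$(printf '%s\n' "$line" | cut -d: -f3)          # third field
fifth=$(printf '%s\n' "$line" | cut -c5)            # 5th character
first5=$(printf '%s\n' "$line" | cut -c1-5)         # characters 1 to 5
name_shell=$(printf '%s\n' "$line" | cut -d: -f1,7) # fields 1 and 7 together

printf '%s\n' "$uid" "$fifth" "$first5" "$name_shell"
```

When several fields are selected with -f, cut joins them with the same delimiter, so the last command prints the login name and shell separated by a colon.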

Gathering text statistics:

wc file1
(displays the number of lines, words and bytes in file1)

wc *
(displays the number of lines, words and bytes for every file in the current directory, followed by a total)

-l	only for line count
-w	only for word count
-c	only for byte count
-m	only for character count (differs from the byte count when the text contains multibyte characters)
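The individual counters can be checked on a tiny file whose contents are known in advance. A sketch with an invented two-line file; the last two commands illustrate bytes versus characters, and the -m result depends on the locale in use.

```shell
f=$(mktemp)
printf 'one two\nthree\n' > "$f"

# Reading from stdin keeps the file name out of the output;
# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$f" | tr -d ' ')
words=$(wc -w < "$f" | tr -d ' ')
bytes=$(wc -c < "$f" | tr -d ' ')
echo "$lines lines, $words words, $bytes bytes"

# "café" plus a newline: the é is two bytes in UTF-8, so -c reports 6 bytes.
# -m counts characters instead, and its answer depends on the current locale.
utf8_bytes=$(printf 'caf\303\251\n' | wc -c | tr -d ' ')
utf8_chars=$(printf 'caf\303\251\n' | wc -m | tr -d ' ')
echo "$utf8_bytes bytes, $utf8_chars characters"
rm "$f"
```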

Sorting text:
grep bash /etc/passwd | sort

sort -t : -k3 -n /etc/passwd
(sort the lines by UID, the third field, in ascending numeric order)

sort -t : -k3 -n /etc/passwd | cut -f3 -d:
(shows only UIDs in ascending order)

-r	performs a reverse (descending) sort
-n	performs a numeric sort
-f	ignores (folds) case of characters in strings
-u	(unique) removes duplicate lines in output
-t :	uses : as the field separator
-k 3	sorts on the third field (here, the third colon-delimited field)
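The effect of -n is easiest to see on a few made-up records. This sketch uses invented names and UIDs in a shortened passwd-like format, and writes the key as -k3,3 so that sorting stops at the end of the third field.

```shell
# Illustrative records only: name:password:UID
data='carol:x:1002
alice:x:1000
bob:x:33
dave:x:1001'

# Numeric sort on field 3, then keep just name and UID.
by_uid=$(printf '%s\n' "$data" | sort -t : -k3,3 -n | cut -d: -f1,3)

# Same key without -n: compared as text, so 33 lands after 1002.
as_text=$(printf '%s\n' "$data" | sort -t : -k3,3 | cut -d: -f3)

# -r reverses the numeric order (highest UID first).
top=$(printf '%s\n' "$data" | sort -t : -k3,3 -n -r | head -n 1 | cut -d: -f1)

printf '%s\n' "$by_uid" "$as_text" "$top"
```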

Eliminating duplicate lines:
cat>file
wali
salman
obama
wali
wali
wali
ajay
sameer

uniq file
(uniq without argument removes duplicate adjacent lines)
wali
salman
obama
wali
ajay
sameer

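-u outputs only the lines that are not immediately repeated. uniq compares adjacent lines only, so the first wali still appears even though wali occurs again later; sort the input first to find lines that occur exactly once overall.
uniq -u file
(displays the following)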
wali
salman
obama
ajay
sameer

-d outputs one copy of each line that is repeated on adjacent lines in the input.

uniq -d file
(shows the following)
wali

-c
Each output line is prefixed with a number indicating how many times it occurred consecutively in the input.

uniq -c file
(displays the following)
1 wali
1 salman
1 obama
3 wali
1 ajay
1 sameer
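Because uniq only compares neighbouring lines, it is commonly paired with sort. The sketch below reuses the sample names from above to contrast the two behaviours.

```shell
f=$(mktemp)
printf '%s\n' wali salman obama wali wali wali ajay sameer > "$f"

# On unsorted input, -d only sees the run of three adjacent wali lines.
adjacent=$(uniq -d "$f")

# After sorting, all four wali lines are adjacent, so -u now means
# "occurs exactly once in the whole file".
once=$(sort "$f" | uniq -u)

# Most frequent line: count, sort by count descending, take the first.
# awk normalizes the leading padding that uniq -c adds.
top=$(sort "$f" | uniq -c | sort -rn | head -n 1 | awk '{print $1, $2}')

printf '%s\n' "$adjacent" "$once" "$top"
rm "$f"
```

With the input sorted, wali is counted four times instead of appearing as two separate groups of 1 and 3.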

Comparing files:
diff and sdiff compare two files and report the differences between them.
cat>file

wali
salman
obama
wali
wali
wali
ajay
sameer

cp file file1

vim file1
wali
salman
president
wali
wali
wali
ajay
sameer

diff file file1

diff -u file file1
(lines that begin with - exist in file but not in file1;
lines that begin with + exist in file1 but not in file)
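The direction of the - and + markers can be confirmed with two short files. A sketch using a trimmed version of the sample data; the grep filter that hides the header lines is just a convenience for this example.

```shell
dir=$(mktemp -d)
printf '%s\n' wali salman obama ajay     > "$dir/file"
printf '%s\n' wali salman president ajay > "$dir/file1"

# Keep only the changed lines; the ---/+++ header lines are filtered out.
changes=$(diff -u "$dir/file" "$dir/file1" | grep '^[-+][^-+]')
printf '%s\n' "$changes"

# diff exits with 0 when the files are identical and 1 when they differ.
status=0
diff "$dir/file" "$dir/file1" > /dev/null || status=$?
echo "exit status: $status"
rm -r "$dir"
```

The line removed from the first file (obama) is marked with -, and the line added in the second file (president) is marked with +.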

Spell checking with aspell:
aspell check f1

Tools for manipulating text:
cat f1 | tr 'a-z' 'A-Z'
(convert lower case to upper case)

sed 's/she/he/' f1 > nf1
(replace she with he in f1 and write the result to nf1; only the first match on each line is changed, and the match is case sensitive)

sed 's/she/he/g' f1 > nf1

(g stands for global: every she on each line is replaced with he and the result is written to nf1; the match is still case sensitive)

sed 's/she/he/gi' f1 > nf1
(the i flag, a GNU sed extension, makes the match case insensitive; the rest is the same)

sed '2,5s/she/he/gi' f1 > nf1

(the replacement occurs only on lines 2 through 5)

sed '/abc/,/xyz/s/she/he/gi' f1 > nf1
(the replacement occurs only from the first line that contains abc through the next line that contains xyz)
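The difference between these variants shows up clearly on a few lines of sample text. A minimal sketch with invented input; the case-insensitive i flag is left out because it is specific to GNU sed.

```shell
# Illustrative input: three lines with she in different cases.
text='she said she left
She said SHE stayed
then she returned'

upper=$(printf '%s\n' "$text" | tr 'a-z' 'A-Z')             # whole text upper-cased
first=$(printf '%s\n' "$text" | sed 's/she/he/')            # first match per line
all=$(printf '%s\n' "$text" | sed 's/she/he/g')             # every match per line
ranged=$(printf '%s\n' "$text" | sed '2,3s/she/he/g')       # lines 2 to 3 only
# Address range by pattern: from the line matching "stayed" to the one matching "returned".
by_pattern=$(printf '%s\n' "$text" | sed '/stayed/,/returned/s/she/he/g')

printf '%s\n\n' "$first" "$all" "$ranged"
```

Line 2 is never changed by these commands, because She and SHE do not match the lower-case pattern she.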
