Searching with grep: The Essentials
If I had to name the most important and useful tools for the Linux command line, then grep would definitely be in the top five.
The tool “grep” is the powerful workhorse every time you need to search for content - whether in a file or in a datastream.
- You are searching for a phrase within a file? use grep!
- You are searching for all the files containing a phrase? use grep!
- You want to filter the looong output of a command for relevant information? use grep!
- You want to extract strings from a file based on a given pattern? again: use grep!
For the last one I already created an article here: How to extract strings by a given search-pattern
In this post I wanna focus on the essentials of using grep and how it can become an invaluable part of your daily toolkit.
The basics of using grep
There are typically two ways to use grep at the command line: with or without a filename as a parameter. When you provide a filename, grep searches within that file. When you don’t, grep searches within its input data stream, which will typically be the output of another command.
example 1: search within a file
robert@demo:~$ grep robert /etc/passwd
robert:x:1003:1003::/home/robert:/bin/bash
In this example, I search for the phrase “robert” in the file “/etc/passwd”.
example 2: search within a data stream
robert@demo:~$ ps aux | grep "^postfix"
postfix 2268 0.0 0.5 90516 5284 ? S 2022 0:32 qmgr -l -t unix -u
postfix 22675 0.0 0.6 90440 6764 ? S 07:50 0:00 pickup -l -t unix -u
Here I take the output of the command “ps aux” and filter only for lines starting with the phrase “postfix”.
As you can see, the typical behavior of grep is to search the data line by line and to print out the entire line if it is matched by the given search pattern.
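If you want to try both variants side by side without touching real system files, here is a minimal sketch. The sample file and its entries are made up for illustration; only the two grep calls mirror the examples above.

```shell
# Create a small throwaway file with made-up entries (hypothetical data).
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'robert:x:1003:1003::/home/robert:/bin/bash' \
  'postfix:x:89:89::/var/spool/postfix:/sbin/nologin' > "$sample"

# Way 1: give the filename as a parameter.
from_file=$(grep robert "$sample")

# Way 2: push the same data through a pipe, so grep reads its input stream.
from_pipe=$(cat "$sample" | grep robert)

echo "$from_file"
echo "$from_pipe"

rm -f "$sample"
```

Both calls print the same complete line, because grep only cares about the lines it receives, not about where they come from.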
The two ways of using grep
So if we want a simple formula, the typical usage of grep is one of these two:
first: To search within a file, give the filename as a parameter:
grep [<options>] <searchpattern> <filename>
or second: To search in the output of a command (aka a “datastream”), push this datastream via the pipe sign to the grep command:
<command> | grep [<options>] <searchpattern>
Where the thing I wrote within square brackets (“[<options>]”) is - uhm - optional. So - besides the search pattern and perhaps the filename - there is no need to give any additional parameters to grep.
But a few parameters are really useful to know, so I will cover them here:
The common everyday parameters for grep
These are the five parameters I will seldom go a day without (see examples below):
-i … ignore the case of the search pattern
-c … only count the lines containing the search pattern
-v … “negate” the search
-r … search all files within a directory recursively
-l … only print out the filenames with matches
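Before going through the individual examples, here is a compact sketch of the first three switches on a tiny sample file. The file content is made up, so the numbers only hold for this sample.

```shell
# Throwaway file with three made-up lines (hypothetical data).
sample=$(mktemp)
printf '%s\n' 'Root entry' 'root entry' 'other entry' > "$sample"

# -c counts matching lines; without -i only the lowercase spelling matches.
count_exact=$(grep -c root "$sample")

# -i added: both "Root" and "root" are counted.
count_nocase=$(grep -ic root "$sample")

# -v prints the lines that do NOT match (here combined with -i).
not_matching=$(grep -iv root "$sample")

echo "case-sensitive: $count_exact, case-insensitive: $count_nocase"
echo "not matching: $not_matching"

rm -f "$sample"
```

For this sample, the case-sensitive count is 1, the case-insensitive count is 2, and the only non-matching line is “other entry”.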
Example: search for “ROOT” in the file “/etc/passwd”
robert@demo:~$ grep ROOT /etc/passwd
robert@demo:~$ grep -i ROOT /etc/passwd
root:x:0:0:root:/root:/bin/bash
As you see, the first search didn’t print a match, because grep searches case-sensitively by default. On the second try, I called grep with -i to ignore the case and got a result.
If you are not sure whether a phrase is written in upper- or lowercase, add -i to your parameters to make the search case-insensitive.
Example: count the processes matching the pattern “apache2”
robert@demo:~$ ps ax | grep -c apache2
16
There are 16 processes running, matching the pattern “apache2”.
A short heads-up for this use case: If you use grep to search the process list, be aware that at the time of your search, the grep command itself exists as a process too. So no matter what nonsense you are searching for, you will usually see a match:
robert@demo:~$ ps ax | grep "boohooo"
27310 pts/0 S+ 0:00 grep --color=auto boohooo
To handle this, we can leverage the next parameter from above: Use -v to negate the search.
Example: negate the search with “-v”
As a first example, let’s have a look at those lines in “/etc/passwd” not containing the phrase “nologin”:
robert@demo:~$ grep -v nologin /etc/passwd
root:x:0:0:root:/root:/bin/bash
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
robert:x:1003:1003::/home/robert:/bin/bash
With the “-v” command-line switch, grep prints out all lines it sees except those that match the given pattern. We simply “negate” the search.
To demonstrate this behavior in a more practical scenario, let’s search for all processes containing the phrase “master”:
robert@demo:~$ ps ax | grep master
2263 ? Ss 3:02 /usr/libexec/postfix/master -w
27545 pts/0 S+ 0:00 grep --color=auto master
In the output above, you see a running “master” process alongside a second process describing the freshly started grep command line.
Now - to avoid false positives - let’s filter the output of the first grep to suppress the grep process itself:
robert@demo:~$ ps ax | grep master | grep -v grep
2263 ? Ss 3:02 /usr/libexec/postfix/master -w
Voila! We see only the process we are interested in.
To take up the other example from above again: If I want to know how many “apache2” processes are running on a system, I need to filter out the grep process before counting the lines:
robert@demo:~$ ps ax | grep -v grep | grep -c apache2
15
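To see the effect of the extra “grep -v grep” step in a reproducible way, here is a small sketch that works on a made-up process list instead of the live output of “ps ax”. The PIDs and command lines are invented for illustration.

```shell
# Made-up process list, shaped like the output of "ps ax" (hypothetical data).
ps_output='  812 ?        Ss     0:04 /usr/sbin/apache2 -k start
  815 ?        S      0:00 /usr/sbin/apache2 -k start
 2263 ?        Ss     3:02 /usr/libexec/postfix/master -w
27310 pts/0    S+     0:00 grep --color=auto apache2'

# Counting directly also counts the line of the grep process itself.
naive=$(printf '%s\n' "$ps_output" | grep -c apache2)

# Dropping every line containing "grep" first gives the real number.
filtered=$(printf '%s\n' "$ps_output" | grep -v grep | grep -c apache2)

echo "naive: $naive, filtered: $filtered"
```

With this sample the naive count is 3 and the filtered count is 2. Keep in mind that “grep -v grep” removes every line containing “grep”, so a real process with “grep” in its name would be dropped as well.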
Search multiple files at once
With grep, searching through multiple files at once is straightforward: Simply add all the filenames you wanna search within as a list to the command line:
Example: search in multiple files at once
For this example, I want to know how the user “robert” is configured on my demo system. For this I simply search the two files “/etc/passwd” and “/etc/group” for the username:
robert@demo:~$ grep robert /etc/passwd /etc/group
/etc/passwd:robert:x:1003:1003::/home/robert:/bin/bash
/etc/group:robert:x:1003:
/etc/group:project:x:1017:max,robert
As you can see, this time grep prepends each single output line with the name of the file where the line was found.
If you need to, you can even use patterns to specify multiple files at once.
The following command line would - for instance - search through all “*.conf” files in the current directory for the phrase “DocumentRoot”. Additionally, I added -i to make the search case-insensitive:
grep -i DocumentRoot *.conf
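Here is a minimal sketch of that call in a throwaway directory, so you can see which files take part in the search. The file names and their content are made up for illustration.

```shell
# Throwaway directory with made-up files (hypothetical data).
dir=$(mktemp -d)
printf '%s\n' 'DocumentRoot /var/www/html' > "$dir/site.conf"
printf '%s\n' 'documentroot /srv/www' 'Listen 80' > "$dir/other.conf"
printf '%s\n' 'DocumentRoot is mentioned here too' > "$dir/notes.txt"

# The shell expands *.conf into the matching filenames before grep starts,
# so grep simply receives a list of files. The subshell keeps the "cd" local.
matches=$(cd "$dir" && grep -i DocumentRoot *.conf)

echo "$matches"

rm -rf "$dir"
```

Only the two “.conf” files are searched. The line in “notes.txt” never shows up, because that file is not part of the list the shell hands over to grep.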
Sometimes you don’t even know the names of the files you want to search within, but you know the directory where these files are located. Then the -r switch asks grep to search through all available files under a given directory tree:
Example: search for the hostname through the entire /etc directory
Let’s pretend the hostname of my system is “demo”, and I want to change this. For this, it would be very helpful to know all the files containing the current hostname. And because I know that the vast majority of configuration files are placed somewhere within “/etc”, I could do the search in the following way (this time I spare you the load of output lines):
sudo grep -r demo /etc
This time I used sudo together with grep to avoid permission errors while walking through the entire directory tree under /etc.
If you try something similar on a system of your own, you will see that the output is not very clear at first glance. In my example, I see multiple lines referencing the file “/etc/services”:
robert@demo:~$ sudo grep -r demo /etc
...
/etc/services:meter 570/tcp # demon
/etc/services:meter 570/udp # demon
/etc/services:#meter 571/tcp # udemon
/etc/services:#meter 571/udp # udemon
...
(Not a very smartly chosen hostname for this example 😉)
To regain a better overview in such use cases, we can ask grep to suppress the output of the matching lines and to print out only the names of the files containing matches instead:
Example: only show filenames with matches, without the matching lines themselves
robert@demo:~$ sudo grep -rl demo /etc
/etc/sysconfig/network
/etc/pki/tls/misc/CA
/etc/pki/tls/openssl.cnf
/etc/hostname
/etc/services
Now it’s very clear to me where I need to make the modifications to change the hostname to a better one.
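If you want to compare -r and -rl without sudo and without a real /etc, here is a minimal sketch on a small made-up directory tree. The file names only imitate the ones from my output above.

```shell
# Small made-up directory tree (hypothetical data).
tree=$(mktemp -d)
mkdir -p "$tree/sysconfig"
printf '%s\n' 'demo' > "$tree/hostname"
printf '%s\n' 'HOSTNAME=demo' > "$tree/sysconfig/network"
printf '%s\n' 'meter 570/tcp # demon' 'meter 570/udp # demon' > "$tree/services"
printf '%s\n' 'nameserver 192.0.2.1' > "$tree/resolv.conf"

# -r alone prints one output line per matching line.
line_count=$(grep -r demo "$tree" | grep -c .)

# -rl prints every file with matches exactly once.
# The result is sorted here because the traversal order is not guaranteed.
file_list=$(grep -rl demo "$tree" | sort)
file_count=$(printf '%s\n' "$file_list" | grep -c .)

echo "$line_count matching lines in $file_count files"

rm -rf "$tree"
```

For this tree you get 4 matching lines but only 3 filenames, because the two “demon” lines in “services” collapse into a single entry.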
side-note: Did you notice how I combined the command-line switches -r and -l to -rl in the example above? This combination of “short command-line switches” (those consisting of only one single character) into a single parameter is commonly used and supported by most commands. Additionally, most of the time the order of command-line switches doesn’t matter. So the command line sudo grep demo /etc -lr would give you the exact same result.
Most of the time:
1) Short command-line options can be combined.
2) The order of these options usually doesn’t matter.
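You can check both rules yourself with a small sketch like the following. The directory and files are made up, and the last call with the switches at the very end assumes GNU grep, which is what most Linux systems ship. Other grep implementations may insist on options before the filenames.

```shell
# Throwaway directory with two made-up files (hypothetical data).
dir=$(mktemp -d)
printf '%s\n' 'hostname demo' > "$dir/one.txt"
printf '%s\n' 'nothing to see' > "$dir/two.txt"

combined=$(grep -rl demo "$dir")     # switches combined
swapped=$(grep -lr demo "$dir")      # same switches, other order
separate=$(grep -r -l demo "$dir")   # switches given one by one
trailing=$(grep demo "$dir" -lr)     # switches at the end (assumes GNU grep)

echo "$combined"
[ "$combined" = "$swapped" ] && [ "$combined" = "$separate" ] \
  && [ "$combined" = "$trailing" ] && echo "all four calls agree"

rm -rf "$dir"
```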
Color the output of grep
Did you notice in the example from above, where I searched for a process, that the process list shows an entry containing grep --color=auto ...?
For reference, here is one of those examples again:
robert@demo:~$ ps ax | grep "boohooo"
27310 pts/0 S+ 0:00 grep --color=auto boohooo
Although I called only grep "boohooo", the process list shows the added --color... command-line parameter.
This is because of an alias that’s defined in many environments (I bet in yours too). This alias simply turns every call of grep into a call of grep --color=auto:
robert@demo:~$ alias | grep grep
alias grep='grep --color=auto'
The purpose of this added parameter is to color the output of grep within your terminal, to give you a better understanding of where the pattern matches the line.
Have a look at the screenshot I took on my demo system:
You can clearly see where the pattern “root” has matched the lines. This can be extremely useful when you later start searching with regular expressions - but that’s the plot for a follow-up article.
For now …
If grep doesn’t give you colored output on your system, add the parameter --color=auto (or --color for short) to your command line.
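The “auto” in --color=auto means that grep only colors its output when it writes to a terminal. The following sketch makes this visible by capturing the output instead of printing it directly. It assumes GNU grep and uses --color=always, a value not covered in this article, purely to force the color codes for comparison.

```shell
line='root:x:0:0:root:/root:/bin/bash'
esc=$(printf '\033')   # the escape character that starts every color sequence

# --color=auto: the output is captured here, so it is not a terminal
# and grep leaves the line untouched.
auto=$(printf '%s\n' "$line" | grep --color=auto root)

# --color=always (assumes GNU grep): color codes are written in any case.
always=$(printf '%s\n' "$line" | grep --color=always root)

[ "$auto" = "$line" ] && echo "auto: plain text when not writing to a terminal"

case "$always" in
  *"$esc"*) echo "always: output contains color codes" ;;
  *)        echo "always: output is plain text" ;;
esac
```

This is also the reason why the alias with --color=auto doesn’t get in your way when you pipe the output of grep into another command.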
Here is what to do next
If you followed me through this article, you have certainly realized that knowing some internals about how things work at the Linux command line can save you a lot of time and frustration.
And sometimes it’s just fun to leverage these powerful mechanics.
If you wanna know more about such “internal mechanisms” of the Linux command line - written especially for Linux beginners - have a look at “The Linux Beginners Framework”.
In this framework I guide you through 5 simple steps to feel comfortable at the Linux command line.
This framework comes as a free pdf and you can get it here.
Wanna take an unfair advantage?
When it comes to working on the Linux command line - at the end of the day it is always about knowing the right tool for the right task.
And it is about knowing the tools that are most certainly available on the Linux system you are currently on.
To give you all the tools for your day-to-day work at the Linux command line, I have created “The ShellToolbox”.
This book gives you everything
- from the very basic commands, through
- everything you need for working with files and filesystems,
- managing processes,
- managing users and permissions, through
- software management,
- hardware analyses and
- simple shell-scripting to the tools you need for
- doing simple “networking stuff”.
Everything in one single, easy to read book. With explanations and example calls for illustration.
If you are interested, go to shelltoolbox.com and have a look (as long as it is available).