
Linux Redirection and File Processing

This article covers basic input and output redirection and file processing on Linux, with specific reference to the information needed for the RHCSA EX200 and RHCE EX300 certification exams.

Remember, the exams are hands-on, so it doesn't matter which method you use to achieve the result, so long as the end product is correct.

Redirection
cat
less and more
head and tail
grep
sed
awk
wc

Redirection

Many Linux commands produce output that is pushed to the screen. They are said to write to standard output (stdout). Standard output can be redirected to a file, another command or even a device.

The ">" operator redirects standard output to a new file. If the specified file already exists it is overwritten.

# ls -al /tmp > /root/output.log

The ">>" operator is similar, creating a new file if one doesn't exist, or appending to the existing file if it already exists.

# ls -al /usr >> /root/output.log
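As a quick illustration of the difference between the two operators, here is a minimal sketch. It writes to a scratch file created with mktemp (a path chosen for this example, not one used above), so it can be run safely as any user. It is written as a script, so the commands are shown without the prompt.

```shell
# Scratch file for the demonstration (hypothetical path from mktemp).
log=$(mktemp)

echo "first"  > "$log"    # creates or overwrites the file
echo "second" > "$log"    # overwrites again, so "first" is lost
echo "third" >> "$log"    # appends to the existing contents

cat "$log"
```

The file should end up holding only the "second" and "third" lines.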

In addition to standard output, standard error (stderr) can also be redirected. The following table shows variations of redirection for standard out and standard error.

Redirection	Action
> filename	Redirect standard out to a new file.
>> filename	Append standard out to an existing file.
1> filename	Redirect standard out to a new file.
1>> filename	Append standard out to an existing file.
2> filename	Redirect standard error to a new file.
2>> filename	Append standard error to an existing file.
&> filename	Redirect standard out and standard error to a new file. (Requires bash 3+)
&>> filename	Append standard out and standard error to an existing file. (Requires bash 4+)
> filename 2>&1	Redirect standard out and standard error to a new file.
>> filename 2>&1	Append standard out and standard error to an existing file.

To throw away all standard output and standard error, redirect it to "/dev/null".
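The following sketch tries out several of the variations from the table, including "/dev/null". The both function and the scratch directory are made up for this example. The function simply writes one line to each stream so the effect of each redirection is easy to see.

```shell
# Hypothetical helper that writes one line to each stream.
both() {
    echo "normal output"
    echo "error output" >&2
}

dir=$(mktemp -d)   # scratch directory for the demonstration

both > "$dir/out.log" 2> "$dir/err.log"   # one file per stream
both > "$dir/all.log" 2>&1                # both streams in one file

# Order matters. Here stderr is pointed at wherever stdout currently
# goes (typically the terminal) before stdout is redirected, so only
# "normal output" reaches the file.
both 2>&1 > "$dir/wrong.log"

both > /dev/null 2>&1                     # discard everything
kept=$(both 2> /dev/null)                 # discard stderr only

echo "$kept"
```

Redirections are processed from left to right, which is why "> filename 2>&1" and "2>&1 > filename" behave differently.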

The "<" operator is used to redirect standard input. This is typically used to pass the contents of a file as standard input to a command. In practice, it is more common to see the "|" operator used to feed input to a command.

command < filename
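To make that concrete, the sketch below feeds the same scratch file (a made-up path from mktemp) to grep in both ways. The results should match. The redirect version just avoids starting an extra cat process.

```shell
file=$(mktemp)                          # scratch file (hypothetical path)
printf 'one\ntwo\nthree\n' > "$file"

# Count lines containing "t", reading the file on standard input.
from_redirect=$(grep -c t < "$file")

# Same result, with cat piping the file into grep.
from_pipe=$(cat "$file" | grep -c t)

echo "$from_redirect $from_pipe"
```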

The "|" operator is used to link (pipe) commands together by taking the standard output (and optionally standard error) from one command and passing it as the standard input to the next command. You will see examples of this throughout this and many other Linux articles.

# ps -ef | grep root
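The "optionally standard error" part is easy to miss, so here is a small sketch of it. The both function is a made-up helper that writes one line to each stream. By default only standard output travels down the pipe.

```shell
# Hypothetical helper that writes one line to each stream.
both() {
    echo "normal output"
    echo "error output" >&2
}

# Only stdout goes down the pipe. The stderr line bypasses grep and
# goes straight to the terminal.
stdout_only=$(both | grep -c output)

# Adding 2>&1 before the pipe sends stderr down it as well.
with_stderr=$(both 2>&1 | grep -c output)

echo "$stdout_only $with_stderr"
```

In bash 4 and above, "|&" can be used as shorthand for "2>&1 |".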

cat

The cat command pushes the contents of a file out to standard output.

# cat /etc/group
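The name comes from "concatenate", and combined with redirection the command can join files together. A minimal sketch, using made-up files in a scratch directory:

```shell
dir=$(mktemp -d)                       # scratch directory (hypothetical)
printf 'alpha\n' > "$dir/a.txt"
printf 'beta\n'  > "$dir/b.txt"

# Concatenate two files into a third using redirection.
cat "$dir/a.txt" "$dir/b.txt" > "$dir/combined.txt"

# The -n option numbers the output lines.
cat -n "$dir/combined.txt"
```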

Man page: cat

less and more

The less and more commands are a way to deal with displaying large amounts of text on a command line screen. The less command displays one screen's worth of text and allows you to scroll up and down using the arrow keys. The "q" key quits.

# ps -ef | less
# cat /var/log/messages | less

The more command shows one page of text at a time, then waits for the user to prompt for more. The return key scrolls down one line, while the space bar scrolls down one page.

# ps -ef | more
# cat /var/log/messages | more

Man pages: less, more

head and tail

The head and tail commands allow you to focus on the start and end of large amounts of text by specifying the number of lines to display. The following commands display the first 5 lines of the text.

# head -5 /etc/passwd
# ps -ef | head -5

The following commands display the last 5 lines of the text.

# tail -5 /etc/passwd
# ps -ef | tail -5
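The "-5" style is the older shorthand. The "-n" option is the more portable way to give the line count, and with tail a leading "+" means "start at this line". The sketch below shows these forms against a made-up scratch file of ten numbered lines, including chaining head and tail to pull lines from the middle.

```shell
file=$(mktemp)            # scratch file (hypothetical path)
seq 1 10 > "$file"        # ten lines containing the numbers 1 to 10

first=$(head -n 3 "$file")                # lines 1 to 3
last=$(tail -n 3 "$file")                 # lines 8 to 10
rest=$(tail -n +9 "$file")                # everything from line 9 onwards
middle=$(head -n 6 "$file" | tail -n 2)   # lines 5 and 6

echo $middle
```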

The "tail -f" command is extremely useful for watching real-time writes to a file. When you run it, the shell will wait on writes to the specified file and display them as they happen. This is great for watching processes write to log files. Use "CTRL+C" to release control back to the shell.

# tail -f /var/log/messages

Man pages: head, tail

grep

The grep command allows you to search through a file or stream for a specific pattern. Any line of data containing that pattern is returned. The following commands use ps to return the current processes, then filter them using grep to return only those lines containing the word "root". The "-i" option makes the filter case insensitive.

# ps -ef | grep root
# ps -ef | grep -i ROOT

The grep command can also use regular expressions to search for more complex patterns. The following example searches for any process lines that contain 'root' followed later on the same line by 'sbin', with any number of characters in between.

# ps -ef | grep 'root.*sbin'

Both the man and info pages of the grep command contain explanations of regular expressions.

Check out the fgrep (grep -F) and egrep (grep -E) variants also.
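Here is a small sketch of a few of these options side by side. The sample lines are made up for the example rather than taken from real ps output, and the "-c" option is used so each command reports a count of matching lines.

```shell
file=$(mktemp)            # scratch file with made-up sample lines
cat > "$file" <<'EOF'
root      1  /sbin/init
root    412  /usr/sbin/sshd
oracle  900  ora_pmon_cdb1
a.b     901  literal dot
EOF

basic=$(grep -c 'root.*sbin' "$file")      # basic regular expression
inverted=$(grep -vc root "$file")          # -v keeps non-matching lines
extended=$(grep -Ec 'sshd|pmon' "$file")   # -E allows alternation with |
fixed=$(grep -Fc 'a.b' "$file")            # -F treats "." as a literal dot

echo "$basic $inverted $extended $fixed"
```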

Man page: grep

sed

The sed command allows you to process the contents of a file or stream to produce amended output. Let's start with a simple search and replace.

# echo "This is my data." | sed 's/my/your/'
This is your data.
#

Use the "i" flag to make it case insensitive. This flag is a GNU sed extension, so it may not be available in other versions of sed.

# echo "This is My data." | sed 's/my/your/'
This is My data.
# echo "This is My data." | sed 's/my/your/i'
This is your data.
#

Use the "g" flag to make the change global, for every occurrence of the word, not just the first.

# echo "This is My data. This is My data." | sed 's/my/your/i'
This is your data. This is My data.
# echo "This is My data. This is My data." | sed 's/my/your/gi'
This is your data. This is your data.
#

You can also use sed to omit lines. Some basic examples are shown below.

# # Remove the first line
# ps -ef | sed '1d'

# # Remove the first 5 lines (1-5)
# ps -ef | sed '1,5d'

# # Remove lines containing a string
# ps -ef | sed '/root/d'

# # Only output lines containing the modified string.
# ps -ef | sed -n 's/root/banana/gp'
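The same commands work against a file as well as a stream. The sketch below uses a scratch file of made-up lines and shows a few more variations: printing a range of lines, chaining commands with "-e", and inverting a match with "!". In each case sed writes to standard output and the file itself is left unchanged.

```shell
file=$(mktemp)            # scratch file with made-up sample lines
printf 'header\nroot 1\noracle 2\nroot 3\n' > "$file"

# Print only lines 2 to 3.
range=$(sed -n '2,3p' "$file")

# Chain two commands: drop the first line, then replace root.
chained=$(sed -e '1d' -e 's/root/admin/' "$file")

# Delete every line that does NOT start with root.
only_root=$(sed '/^root/!d' "$file")

echo "$only_root"
```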

Whole books have been written about the sed command, so this is very much the tip of the iceberg, but these are the types of things I find myself using regularly.

Man page: sed

awk

Like sed, awk can do incredibly complicated and powerful things, so a full explanation of it is well beyond the scope of this article. Instead I will focus on the one task I use it for all the time, which is selectively pulling columns from a file or stream.

The output from many commands is displayed in the form of columns. For example, the ps command.

# ps -ef | head -1
UID        PID  PPID  C STIME TTY          TIME CMD
#

Using awk we can selectively pull out columns from a file or stream and use them to build up a new string.

# ps -ef | head -1 | awk '{print $1 "," $2 "," $8}'
UID,PID,CMD
# ps -ef | head -1 | awk '{print "The user " $1 " has the process ID of " $2 " and is running command: " $8}'
The user UID has the process ID of PID and is running command: CMD
#

In the above example, I was limiting the input to awk to a single row, but removing the "| head -1" would produce output for all rows returned by the ps command.

For a better example, let's imagine we want to create a new CSV file containing the output from the ps command, minus the first row.

# ps -ef | sed '1d' | awk '{print $1 "," $2 "," $3 "," $4 "," $5 "," $6 "," $7 "," $8 }' > /tmp/output.txt

The "/tmp/output.txt" file now contains the required information in CSV format.
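Not all input is separated by whitespace. The following sketch runs awk against a scratch file of made-up passwd-style lines, using "-F" to set the input field separator and the OFS variable to set the output separator, which saves concatenating the commas by hand. It also shows a condition placed before the action to filter rows.

```shell
file=$(mktemp)            # scratch file with made-up passwd-style lines
cat > "$file" <<'EOF'
root:x:0:0:root:/root:/bin/bash
oracle:x:54321:54321::/home/oracle:/bin/bash
nobody:x:99:99:Nobody:/:/sbin/nologin
EOF

# -F sets the input field separator, OFS sets the output separator.
csv=$(awk -F: -v OFS=, '{print $1, $3, $7}' "$file")

# A condition before the action restricts which lines are processed.
high_uid=$(awk -F: '$3 >= 1000 {print $1}' "$file")

echo "$csv"
echo "$high_uid"
```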

Man page: awk

wc

The wc command outputs the newline, word and byte count for the specified file or files. The output can be limited using the appropriate flag.

# wc /var/log/messages
  1227  14163 103672 /var/log/messages

# wc -l /var/log/messages
1227 /var/log/messages

# wc -w /var/log/messages
14163 /var/log/messages

# wc -c /var/log/messages
103672 /var/log/messages
#
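When the count is needed in a script, the file name in the output gets in the way. One way round this, sketched below with a made-up scratch file, is to pass the file on standard input, in which case wc prints only the numbers.

```shell
file=$(mktemp)            # scratch file (hypothetical path)
printf 'one two\nthree\n' > "$file"

lines=$(wc -l < "$file")      # reading stdin suppresses the file name
words=$(wc -w < "$file")
bytes=$(wc -c < "$file")

echo "$lines lines, $words words, $bytes bytes"
```

The same approach works at the end of a pipe, for example to count the lines returned by grep.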

Man page: wc

Hope this helps. Regards Tim...
