
Goal: Introduce how to pipe the output of one command into the input of another, and how to redirect the output of a command into a file.



Output Redirection and Pipes

  • Redirection of output: > vs >>

    • Redirection sends a channel of output to a file.

    • A channel refers to standard input/output as well as standard error (not covered here).

    • You can redirect a file as input to a command using <, and supply inline input (a here-document) using << (neither is covered here).

  • Using pipe “|”

    • A pipe passes standard output as the standard input to another command

  • Examples of the form: 

    • View a text file and pipe to the grep command to filter lines by looking for text.

    • Cat a list and sort by line.

    • Sort and then find unique items.

    • View folder contents and look for a specific file or folder name.


Redirection of output: > vs >>

Remember we can use grep to search a file for some text.

By default, this output is written to the terminal (standard output).

[intro_to_linux]$ grep -i bayes software.csv

Redirect this output to a file:

[intro_to_linux]$ grep -i bayes software.csv > apps.txt
[intro_to_linux]$ ls 
apps.txt  clusters  data  software.csv
[intro_to_linux]$ cat apps.txt

Using a single > will overwrite any existing file of the same name.

[intro_to_linux]$ grep -i IPA software.csv > apps.txt
[intro_to_linux]$ cat apps.txt

The apps.txt file now contains the results from the second call.

Using a double >> will append to any existing file of the same name.

[intro_to_linux]$ grep -i bayes software.csv > apps.txt
# Appends to the existing file.
[intro_to_linux]$ grep -i IPA software.csv >> apps.txt
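
The difference between > and >> can be checked by counting lines after each step. The following is a minimal sketch that runs in a temporary directory; the three-line software.csv it creates is a hypothetical stand-in for the course file, and the variable names are illustrative only.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical sample data standing in for the course's software.csv.
printf 'MrBayes,3.2.7\nIPA,1.0\nBLAST,2.13\n' > "$workdir/software.csv"

# First >: creates apps.txt containing the single matching line.
grep -i bayes "$workdir/software.csv" > "$workdir/apps.txt"

# Second >: replaces the contents, so the file still holds one line.
grep -i ipa "$workdir/software.csv" > "$workdir/apps.txt"
lines_after_overwrite=$(cat "$workdir/apps.txt" | wc -l)

# >>: adds to the end of the file, so it now holds two lines.
grep -i bayes "$workdir/software.csv" >> "$workdir/apps.txt"
lines_after_append=$(cat "$workdir/apps.txt" | wc -l)

echo "after >: $lines_after_overwrite line(s), after >>: $lines_after_append line(s)"
```

If the line count after > were 2 rather than 1, the first result would have survived, which is exactly what > prevents.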

Example: Using pipe “|” from a file

Over the next set of examples we will pipe together (using the “|” character) a series of commands that will “generate a file that contains a sorted list of unique terms that include the substring berry”.

[intro_to_linux]$ cat fruits.txt
Gooseberry
Apple
Apricot
Avocado
Strawberry
...
[intro_to_linux]$ cat fruits.txt | wc -l
97

We could also run the wc command directly on the file, which is how it is normally used, and confirm that the file contains 97 lines.

[intro_to_linux]$ wc -l fruits.txt
97 fruits.txt
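
The two forms above differ only in what wc reports. A small sketch, assuming a hypothetical three-line fruits.txt created in a temporary directory:

```shell
workdir=$(mktemp -d)

# Hypothetical three-line sample standing in for fruits.txt.
printf 'Gooseberry\nApple\nApricot\n' > "$workdir/fruits.txt"

# Piped: wc reads standard input, so it has no filename to report.
piped=$(cat "$workdir/fruits.txt" | wc -l)

# Direct: wc opens the file itself and prints the name after the count.
direct=$(wc -l "$workdir/fruits.txt")

echo "piped:  $piped"
echo "direct: $direct"
```

The piped form is convenient when only the number is wanted, for example at the end of a longer pipeline.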

Example continued

First step, find all the terms that contain the substring berry.

# The order of items is the same as listed within the fruits.txt file.
[intro_to_linux]$ cat fruits.txt | grep berry 
Gooseberry
Strawberry
Bilberry
Blackberry
Marionberry
Blueberry
Boysenberry
Gooseberry
Cloudberry
Elderberry
Goji berry
Honeyberry
Juniper berry
Cranberry
Cranberry
Marionberry
Gooseberry
Mulberry
Salmonberry
Huckleberry
Raspberry
Salal berry

Example continued

Second step, sort this list of berry terms into alphabetical order.

# Notice the duplicates.
[intro_to_linux]$ cat fruits.txt | grep berry | sort
Bilberry
Blackberry
Blueberry
Boysenberry
Cloudberry
Cranberry
Cranberry
Elderberry
Goji berry
Gooseberry
Gooseberry
Gooseberry
Honeyberry
Huckleberry
Juniper berry
Marionberry
Marionberry
Mulberry
Raspberry
Salal berry
Salmonberry
Strawberry

Example continued

Third step, take the sorted list and find unique terms i.e. remove any duplicates.

The uniq command only removes duplicate lines that are adjacent to each other, so you must provide it with a sorted list.

# Duplicates have been removed leaving only the unique names.
[intro_to_linux]$ cat fruits.txt | grep berry | sort | uniq
Bilberry
Blackberry
Blueberry
Boysenberry
Cloudberry
Cranberry
Elderberry
Goji berry
Gooseberry
Honeyberry
Huckleberry
Juniper berry
Marionberry
Mulberry
Raspberry
Salal berry
Salmonberry
Strawberry

[intro_to_linux]$ cat fruits.txt | grep berry | sort | uniq | wc -l
18

Piping the unique list into the wc command counts how many unique terms we have.


Example continued

Fourth step, redirect this output into a file called berries.txt

[intro_to_linux]$ cat fruits.txt | grep berry | sort | uniq > berries.txt
[intro_to_linux]$ cat berries.txt
Bilberry
Blackberry
Blueberry
Boysenberry
Cloudberry
Cranberry
Elderberry
Goji berry
Gooseberry
Honeyberry
Huckleberry
Juniper berry
Marionberry
Mulberry
Raspberry
Salal berry
Salmonberry
Strawberry

[intro_to_linux]$ cat berries.txt | wc -l
18
  • Although a simple example, it demonstrates the principle of piping four commands into a single call.

  • Without pipes you would have had to create an intermediate file to store the results after each command.

  • As you get more confident, you can create more elaborate pipelines of commands.

You might also be presented with a long pipeline of commands separated by the | character.

You now know how to take such a pipeline apart: split it at each | character and run the steps one at a time, adding one command per run.
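
To make the point about intermediate files concrete, the sketch below runs the berry steps both ways and checks that the results match. It assumes a hypothetical six-line fruits.txt in a temporary directory; the step1.txt, step2.txt and step3.txt names are made up for illustration.

```shell
workdir=$(mktemp -d)

# Hypothetical sample standing in for fruits.txt.
printf 'Gooseberry\nApple\nStrawberry\nCranberry\nGooseberry\nCranberry\n' > "$workdir/fruits.txt"

# Without pipes: each step writes a file that the next step reads.
grep berry "$workdir/fruits.txt" > "$workdir/step1.txt"
sort "$workdir/step1.txt" > "$workdir/step2.txt"
uniq "$workdir/step2.txt" > "$workdir/step3.txt"

# With pipes: the same steps in one call, with no intermediate files.
cat "$workdir/fruits.txt" | grep berry | sort | uniq > "$workdir/berries.txt"

# cmp -s prints nothing and succeeds only if the two files are identical.
if cmp -s "$workdir/step3.txt" "$workdir/berries.txt"; then
    echo "both approaches give the same list"
fi
```

Running the steps separately like this is also a practical way to inspect what each stage of an unfamiliar pipeline produces.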


Example: Pipe from ls command

This example demonstrates piping the output of a recursive ls call into grep to find all the folders/files whose names contain the string Feb.

[intro_to_linux]$ ls -R
[intro_to_linux]$ ls -R | grep "Feb"
February
./data/2022/February:
Feb
./data/2023/Feb:

# Ignore case.
[intro_to_linux]$ ls -R | grep -i "Feb"
feb
./data/2021/feb:
february_01_2021.tx
February
./data/2022/February:
Feb
./data/2023/Feb:
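
The listing above shows each folder twice: once as an entry in its parent folder and once as a "./path:" header. A sketch that reproduces this, assuming a hypothetical folder layout modelled on the output above and created in a temporary directory:

```shell
workdir=$(mktemp -d)

# Hypothetical layout: three empty folders whose names differ only in case/length.
mkdir -p "$workdir/data/2021/feb" "$workdir/data/2022/February" "$workdir/data/2023/Feb"

# Case-sensitive: matches February and Feb, each as an entry and as a header.
case_sensitive=$(cd "$workdir" && ls -R | grep "Feb" | wc -l)

# Ignoring case: additionally matches the lowercase feb folder.
ignore_case=$(cd "$workdir" && ls -R | grep -i "Feb" | wc -l)

echo "grep Feb:    $case_sensitive lines"
echo "grep -i Feb: $ignore_case lines"
```

With this layout the case-sensitive search reports 4 lines and the case-insensitive search reports 6.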

Exercises: Pipeline

Questions:

  1. How does the wc command work? What are its options?

  2. How does the sort command work? What are its options?

  3. How does the uniq command work? What are its options?

  4. How many unique varieties of beans are there in the vegetables.txt file?


Answers

4. How many unique varieties of beans are there in the vegetables.txt file?

  • How do you deal with “soy beans” vs “Soy Beans”?

[intro_to_linux]$ cat vegatables.txt | grep -i beans | sort | uniq -i | wc -l
12

Make sure you use the commands and options as you intend, that you understand them, and that you are able to describe and justify your choices.

Notice the difference between:

[intro_to_linux]$ cat vegatables.txt | grep beans | sort
...
kidney beans
...
soy beans

and

[intro_to_linux]$ cat vegatables.txt | grep -i beans | sort
...
kidney beans
kidney BEANS
...
soy beans
Soy Beans
  • Should “soy beans” and “Soy Beans” be treated as the same or as different? You need to decide based on the context and the use case you are working with.

  • What happens if you use bean instead of beans?

  • What does the uniq -i option do?
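
One detail worth exploring here is that uniq -i can only merge case variants that sort has placed next to each other, and whether plain sort does so depends on the locale settings of the system. The sketch below is an illustration under stated assumptions, not the course's answer: it uses a hypothetical five-line beans.txt, forces byte-order sorting with LC_ALL=C to show the problem, and then uses sort -f (fold case while sorting) as one possible way to make the result independent of locale.

```shell
workdir=$(mktemp -d)

# Hypothetical sample standing in for the vegetables file.
printf 'soy beans\nkidney beans\nSoy Beans\ngreen beans\nkidney BEANS\n' > "$workdir/beans.txt"

# Byte-order sort: "Soy Beans" sorts before every lowercase line, so it is
# not adjacent to "soy beans" and uniq -i cannot merge the two.
byte_order=$(cat "$workdir/beans.txt" | grep -i beans | LC_ALL=C sort | uniq -i | wc -l)

# sort -f ignores case while ordering, so case variants end up adjacent.
folded=$(cat "$workdir/beans.txt" | grep -i beans | LC_ALL=C sort -f | uniq -i | wc -l)

echo "sort    | uniq -i: $byte_order"
echo "sort -f | uniq -i: $folded"
```

Here the byte-order pipeline counts 4 varieties while the case-folded one counts 3, so it is worth checking the sorted output by eye before trusting the final number.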

Again: explore and understand how commands work, and the order in which they are run. For example, in your own time, work out whether there is a difference between sort | uniq and uniq | sort.

Brief Explanation:

uniq isn’t able to detect the duplicate lines unless they are adjacent to each other.

That is why we sort first.

Try it and see…
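
One way to try it, sketched with a hypothetical three-line list.txt in which the duplicate lines are deliberately not adjacent:

```shell
workdir=$(mktemp -d)

# Hypothetical sample: the two Cranberry lines are separated by Apple.
printf 'Cranberry\nApple\nCranberry\n' > "$workdir/list.txt"

# sort first: the duplicates become adjacent, so uniq removes one of them.
sorted_first=$(cat "$workdir/list.txt" | sort | uniq | wc -l)

# uniq first: no adjacent duplicates exist yet, so nothing is removed.
uniq_first=$(cat "$workdir/list.txt" | uniq | sort | wc -l)

echo "sort | uniq: $sorted_first lines"
echo "uniq | sort: $uniq_first lines"
```

The first pipeline leaves 2 lines and the second leaves 3, because the duplicate only becomes detectable once sorting has brought the two copies together.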

