7.3 Pipes

You’ll often find yourself needing to use multiple functions in a row to organize some data that you’re working on. This can sometimes lead to dense code that is difficult to read.

# for example
sort(round(sqrt(cats$age), 3))

In the code above, I have multiple steps to get my result, but you have to read what’s going on from the inside out. This can be cumbersome, especially if you need to understand how one function’s output influences the next operation.

7.3.1 Using Pipes

dplyr provides a special operator (re-exported from the magrittr package) that lets you write code in the order the steps happen, which makes it easier to read.

It’s written as %>%, and you can call it the “pipe” operator.

Our example above can be re-written as:

cats$age %>% 
  sqrt() %>%
  round(3) %>%
  sort()

Instead of being nested inside a series of calls, the code can now be read as a sequence of steps:

1. With the ages of all the cats,
2. Take the square root of these values, then
3. Round the result to 3 decimal places, then
4. Sort the values in ascending order

I encourage you to think of %>% as shorthand for “then” when reading code that uses it!
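If you want to convince yourself that the piped form does the same work as nesting the same three calls, a small check like the one below can help. This is only a sketch: the cats data frame here is a made-up stand-in with a single age column, and it loads magrittr directly (loading dplyr makes the same operator available).

```r
library(magrittr)  # provides %>%; dplyr re-exports the same operator

# Hypothetical stand-in for the cats data; only the age column matters here
cats <- data.frame(age = c(9, 2, 5, 1))

# Nested form: read from the inside out
nested <- sort(round(sqrt(cats$age), 3))

# Piped form: read from top to bottom
piped <- cats$age %>%
  sqrt() %>%
  round(3) %>%
  sort()

identical(nested, piped)  # TRUE: both forms run the same calls in the same order
```

Both versions call sqrt(), then round(), then sort(); the pipe only changes how the calls are written, not what they compute.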

“Pipe” operators are found in other languages; they get their name from the idea that your code can be thought of as a “pipeline”.

Let’s look at another example.

round(1.23456789, 3)

We can use a pipe operator to achieve the same thing.

1.23456789 %>% round(3)

The pipe takes the output of the expression on the left-hand side (a single number, in this case) and inserts it as the first argument of the expression on the right-hand side. We can also pipe into other argument positions by using a period as a placeholder.

3 %>% round(1.23456789, .)
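One way to see both behaviours side by side is to compare each piped call with its ordinary equivalent. The sketch below is illustrative only; the variable names are my own, and it loads magrittr directly rather than dplyr.

```r
library(magrittr)  # provides %>%; dplyr re-exports the same operator

x <- 1.23456789

# Default behaviour: the left-hand side becomes the FIRST argument
first_arg <- x %>% round(3)               # same as round(x, 3)

# Placeholder: the period marks where the left-hand side should go
by_position <- 3 %>% round(x, .)          # same as round(x, 3)
by_name     <- 3 %>% round(x, digits = .) # same as round(x, digits = 3)

first_arg                                     # 1.235
identical(first_arg, by_position)             # TRUE
identical(first_arg, by_name)                 # TRUE
```

Naming the argument (digits = .) is optional, but it can make it clearer which argument the piped value is filling.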

These are contrived examples, and I don’t suggest using pipes for simple operations like rounding. Pipes become really useful when chaining together multiple operations in sequence, as we’ll do with our dplyr functions.
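As a rough preview of what such a chain might look like, here is a minimal sketch using three dplyr verbs. The cats data frame and its name and age columns are assumptions made up for this illustration, not the real data set.

```r
library(dplyr)

# Hypothetical cats data; the rows and column names are assumptions
cats <- data.frame(
  name = c("Milo", "Luna", "Tiger", "Oreo"),
  age  = c(9, 2, 5, 1)
)

oldest_first <- cats %>%
  filter(age >= 2) %>%                    # keep cats aged 2 or older, then
  arrange(desc(age)) %>%                  # sort from oldest to youngest, then
  mutate(sqrt_age = round(sqrt(age), 3))  # add a rounded square-root column

oldest_first
```

Each line takes the data frame produced by the previous line as its first argument, so the chain reads as “filter, then arrange, then mutate”.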