Lesson 7: dplyr
It is often said that 80% of the work in data analysis is cleaning and shaping the data until it’s in the state you need. Bracket subsetting is handy, but it can be cumbersome and difficult to read, especially for complicated operations. Enter dplyr!
dplyr is a package for making data manipulation easier. (It does a lot more too, but this is what we’ll focus on.)
Unlike the subsetting commands we’ve already worked on, dplyr is designed to be highly expressive and highly readable. It’s structured around a set of verbs, or a grammar of data manipulation. The core functions we’ll talk about are below:
select
arrange
filter
group_by
mutate
summarise/summarize
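
To give a feel for how these verbs read, here is a minimal sketch that uses R’s built-in mtcars data set. The data set, the column choices, and the variable names (base_result, lighter, by_cyl, and so on) are assumptions made for illustration and are not part of the lesson’s own data. It first repeats one bracket-subsetting operation with filter and select, then touches each remaining verb in turn.

```r
library(dplyr)

# Bracket subsetting: cars with more than 25 mpg, keeping two columns
base_result <- mtcars[mtcars$mpg > 25, c("mpg", "cyl")]

# The same operation with dplyr verbs: filter picks rows, select picks columns
high_mpg <- filter(mtcars, mpg > 25)
dplyr_result <- select(high_mpg, mpg, cyl)

# A slightly longer sequence using the other verbs, one step at a time
lighter   <- filter(mtcars, wt < 4)                  # keep cars under 4000 lbs
with_kpl  <- mutate(lighter, kpl = mpg * 0.4251)     # add a km-per-litre column
grouped   <- group_by(with_kpl, cyl)                 # one group per cylinder count
summary   <- summarise(grouped,
                       mean_kpl = mean(kpl),         # average within each group
                       n_cars   = n())               # number of cars in each group
by_cyl    <- arrange(summary, desc(mean_kpl))        # most efficient group first

print(by_cyl)
```

Each verb takes a data frame as its first argument and returns a new data frame, which is why the steps can be read from top to bottom like a recipe.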