8.1 Filter/subset data
It is often necessary to limit your analysis to some subset of cases. Use the filter()
command to specify the criteria by which to select cases.
film =
  film %>%
  filter(SubjectSex == 'Female') # criteria for keeping cases
head(film)
## # A tibble: 6 x 10
## Title Release NumSubjects SubjectName SubjectType
## <chr> <dbl> <dbl> <chr> <chr>
## 1 Big ~ 2014 1 Margaret K~ Artist
## 2 Test~ 2014 1 Vera Britt~ Other
## 3 The ~ 2014 1 Brittany M~ Actress
## 4 Wild 2014 1 Cheryl Str~ Other
## 5 Diana 2013 1 Princess D~ Other
## 6 Love~ 2013 1 Linda Love~ Actress
## # ... with 5 more variables: SubjectRace <chr>,
## # PersonOfColor <dbl>, SubjectSex <chr>,
## # LeadActor <chr>, Period <chr>
The conditions inside filter() identify the cases, or rows, to keep (i.e., you're selecting only those rows that satisfy the given conditions). The selection can be based on any number of conditions. Note that a double equals sign, ==, tests whether two values are equal, whereas a single equals sign is used for assignment.
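To illustrate filtering on more than one condition, here is a minimal sketch. The toy data frame and its values are made up for illustration; only the column names mirror the film data. Conditions separated by commas must all hold (logical AND), while | expresses OR and %in% matches any of several values.

```r
library(dplyr)

# Hypothetical toy data (values invented for illustration only)
toy = tibble(
  Title       = c('A', 'B', 'C', 'D'),
  Release     = c(2014, 2013, 2010, 2014),
  SubjectSex  = c('Female', 'Male', 'Female', 'Female'),
  SubjectType = c('Artist', 'Other', 'Actress', 'Other')
)

# Comma-separated conditions are combined with AND:
# keep female subjects in films released in 2013 or later
recent_women =
  toy %>%
  filter(SubjectSex == 'Female', Release >= 2013)

# Use | for OR and %in% to match any value in a set:
# keep artists or actresses, or any film released before 2012
artists_or_early =
  toy %>%
  filter(SubjectType %in% c('Artist', 'Actress') | Release < 2012)
```

Saving each result under a new name, rather than overwriting the original data, keeps the full data set available for later steps.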