8.4 Appending and merging data
It is often useful to combine data from different sources. This may take the form of appending (adding additional cases with information on the same variables) or merging (adding additional variables that describe the same cases).
8.4.1 Appending new cases
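To append new cases to your data frame, use bind_rows(OldData, NewData). bind_rows() matches columns by name, so the variable names need to match exactly across data frames; a variable that is named differently in the two data frames ends up as two separate columns padded with NA. The variable order does not matter.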
# tribble() and bind_rows() are provided by the tidyverse
library(tidyverse)

# old data
myData = tribble(
  ~District, ~Students,
  115, 985,
  116, 1132
)
# new data to add
new = tribble(
  ~District, ~Students,
  117, 419,
  118, 633
)
# Append new to old
myData = bind_rows(myData, new)
myData
## # A tibble: 4 x 2
##   District Students
##      <dbl>    <dbl>
## 1      115      985
## 2      116     1132
## 3      117      419
## 4      118      633
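To see why the variable names matter, here is a minimal sketch of what happens when they do not match. The data frames old and misnamed and the column name Pupils are hypothetical, invented only for this illustration.

```r
library(tibble)
library(dplyr)

old = tribble(
  ~District, ~Students,
  115, 985
)

# Hypothetical new data: the count is stored as "Pupils", not "Students"
misnamed = tribble(
  ~District, ~Pupils,
  117, 419
)

# The mismatched names produce two separate columns padded with NA
combined = bind_rows(old, misnamed)
combined

# Renaming before appending lines the columns up again
fixed = bind_rows(old, rename(misnamed, Students = Pupils))
fixed
```

Here combined has three columns (District, Students, Pupils), while fixed has the intended two. Comparing names() of the two data frames before appending is a quick way to catch this.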
8.4.2 Merging
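To merge data frames (add new variables for existing cases), use left_join(OldData, NewData). In order to link rows in one data frame to rows in another, it is critical that the data sets contain a common identifier, with the same variable name and the same values. left_join() keeps every row of the first data frame; cases with no match in the second data frame get NA for the new variables (District 117 in the output below). Building on the example above: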
# new variables
newvars = tribble(
  ~District, ~Teachers,
  115, 43,
  116, 71,
  118, 55
)
# join new to old
myData = left_join(myData, newvars)
## Joining, by = "District"
myData
## # A tibble: 4 x 3
##   District Students Teachers
##      <dbl>    <dbl>    <dbl>
## 1      115      985       43
## 2      116     1132       71
## 3      117      419       NA
## 4      118      633       55
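The join above lets left_join() guess the identifier, which is why it prints the "Joining, by" message. As a hedged sketch, the identifier can instead be named explicitly with the by argument, and anti_join() can list the cases that found no match. The object names districts, teachers, and teachers2 and the column name DistrictID are assumptions made for this illustration; the values repeat the example above.

```r
library(tibble)
library(dplyr)

districts = tribble(
  ~District, ~Students,
  115, 985,
  116, 1132,
  117, 419,
  118, 633
)
teachers = tribble(
  ~District, ~Teachers,
  115, 43,
  116, 71,
  118, 55
)

# Name the identifier explicitly; no "Joining, by" message is printed
merged = left_join(districts, teachers, by = "District")

# Rows of districts that have no match in teachers
unmatched = anti_join(districts, teachers, by = "District")
unmatched

# If the identifier has a different name in the second data frame
# (hypothetical name DistrictID), map the two names in by
teachers2 = rename(teachers, DistrictID = District)
merged2 = left_join(districts, teachers2, by = c("District" = "DistrictID"))
```

Checking unmatched before relying on the merged data makes it clear whether an NA comes from a missing case in the second data frame or from a missing value in the data itself.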