data
├── Workshop_01.csv
├── Workshop_02.csv
├── Workshop_03.csv
├── Workshop_04.csv
├── Workshop_05.csv
├── Workshop_06.csv
├── Workshop_07.csv
├── Workshop_08.csv
├── Workshop_09.csv
└── Workshop_10.csv
Week 1
Scenario: we want to combine and analyze several spreadsheets containing workshop registration data. Each spreadsheet has the same structure: name of attendee, school affiliation, and status (faculty, student, staff).
Loops are used in most programming languages when you want to repeat some set of code for multiple inputs.
for (thing in list_of_things) {
  do_some_function(thing)
}
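As a concrete instance of the pattern above, here is a minimal sketch in base R. The `statuses` vector and the choice of `toupper()` are hypothetical, picked only to make the loop runnable.

```r
# Hypothetical attendee statuses, used only for illustration
statuses <- c("faculty", "student", "staff")

# Pre-allocate a character vector to hold one result per input
labels <- character(length(statuses))

for (i in seq_along(statuses)) {
  # Repeat the same operation for each element of the input
  labels[i] <- toupper(statuses[i])
}

print(labels)
```

Pre-allocating `labels` and filling it by index avoids growing the vector on every pass through the loop.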
In R, there are several functions that accomplish the same thing as loops, in particular:
- apply() family of functions in base R
- map() family of functions from the purrr package in the tidyverse

map(.x, .f)
map(function-args, function)
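The sketch below shows the `map(.x, .f)` pattern next to its base R counterpart. It assumes the purrr package is installed; the `counts` vector and the doubling function are hypothetical examples.

```r
library(purrr)

# Hypothetical input: registration counts for three workshops
counts <- c(12, 30, 7)

# .x is the input, .f is the function applied to each element of .x
doubled <- map(counts, function(n) n * 2)

# map() always returns a list; map_dbl() returns a numeric vector instead
doubled_vec <- map_dbl(counts, function(n) n * 2)

# A base R equivalent from the apply() family
doubled_base <- lapply(counts, function(n) n * 2)
```

`map()` and `lapply()` both return a list with one element per input, so the choice between them here is mostly a matter of which family of functions the rest of the code uses.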
We have our function, read_csv(), and now we want to run it on every file name in our directory. How do we accomplish this?
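One way this could look is sketched below, assuming the purrr, readr, and dplyr packages are installed. To keep the example runnable anywhere, it first writes two tiny stand-in files to a temporary folder; the column names (`name`, `school`, `status`) and the rows are hypothetical, and in practice `data_dir` would point at the real data folder.

```r
library(purrr)
library(readr)
library(dplyr)

# Stand-in for the real data folder: two small files in a temporary directory
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
write_csv(
  data.frame(name = "A. Example", school = "Law", status = "student"),
  file.path(data_dir, "Workshop_01.csv")
)
write_csv(
  data.frame(name = "B. Example", school = "Medicine", status = "faculty"),
  file.path(data_dir, "Workshop_02.csv")
)

# Collect the full path of every CSV file in the directory
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Run read_csv() once per file name; the result is a list of data frames
workshops <- map(files, read_csv)

# Stack the individual data frames into a single one for analysis
all_registrations <- bind_rows(workshops)
```

Using `full.names = TRUE` matters here: without it, `list.files()` returns bare file names, and `read_csv()` would look for them in the working directory rather than in `data_dir`.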
Use map() to run read_csv() on each file name.

| | homogeneous | heterogeneous |
|---|---|---|
| 1d | vector | list |
| 2d | matrix | data frame |
| nd | array | |
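The structures in the table can each be built in base R, as in the short sketch below. All values are hypothetical and chosen only to show how each structure treats element types.

```r
# 1d homogeneous: an atomic vector, every element has the same type
v <- c(1, 2, 3)

# 1d heterogeneous: a list, elements may differ in type
l <- list(1, "student", TRUE)

# 2d homogeneous: a matrix
m <- matrix(1:6, nrow = 2)

# 2d heterogeneous: a data frame, each column may have its own type
df <- data.frame(
  name = c("A. Example", "B. Example"),
  attended = c(TRUE, FALSE)
)

# nd homogeneous: an array with three dimensions
a <- array(1:24, dim = c(2, 3, 4))

# Mixing types in an atomic vector coerces everything to a common type
mixed <- c(1, "student")
class(mixed)  # "character"
```

This is why map() returns a list: a list is the one-dimensional structure that can hold a whole data frame in each element.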