NNLM + CDABS R Community of Practice

Learning Goals

Scenario: We want to visualize a small dataset of shelving statistics.

We’ll complete the following tasks:

month	shelver	stacks_books	reference_books	bound_journals	unbound_journals
1	A	0	0	337	0
1	B	81	12	0	0
2	A	0	0	325	2
2	B	62	13	0	0
3	A	0	8	258	0
3	B	138	8	5	0
4	A	0	0	72	0
4	B	70	12	0	0

shelving_wide %>% 
  ggplot(mapping=aes(x=???, y=???)) +
  geom_bar(stat="identity")

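long data: data represented with the minimum number of columns necessary; also called tidy data
wide data: variables may be spread across multiple columns, and column names often represent variable values (here, stacks_books, reference_books, bound_journals, and unbound_journals are all values of one variable, the material type)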

month	shelver	material_type	number_shelved
1	A	stacks_books	0
1	A	reference_books	0
1	A	bound_journals	337
1	A	unbound_journals	0
1	B	stacks_books	81
1	B	reference_books	12
1	B	bound_journals	0
1	B	unbound_journals	0
2	A	stacks_books	0
2	A	reference_books	0
2	A	bound_journals	325
2	A	unbound_journals	2
2	B	stacks_books	62
2	B	reference_books	13
2	B	bound_journals	0
2	B	unbound_journals	0
3	A	stacks_books	0
3	A	reference_books	8
3	A	bound_journals	258
3	A	unbound_journals	0
3	B	stacks_books	138
3	B	reference_books	8
3	B	bound_journals	5
3	B	unbound_journals	0
4	A	stacks_books	0
4	A	reference_books	0
4	A	bound_journals	72
4	A	unbound_journals	0
4	B	stacks_books	70
4	B	reference_books	12
4	B	bound_journals	0
4	B	unbound_journals	0
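
With the data in this long shape, the x and y placeholders in the earlier plot call have natural candidates. The following is a minimal sketch, assuming the tibble and ggplot2 packages are installed. The names shelving_long and shelving_plot and the choice to map material_type to the fill colour are assumptions for illustration, and the table is typed in by hand from the rows above.

```r
library(tibble)
library(ggplot2)

# The four material types, in the order they appear in the long table
materials <- c("stacks_books", "reference_books", "bound_journals", "unbound_journals")

# Long-format shelving data, entered by hand from the table above
# (shelving_long is a hypothetical name)
shelving_long <- tibble(
  month = rep(1:4, each = 8),
  shelver = rep(rep(c("A", "B"), each = 4), times = 4),
  material_type = rep(materials, times = 8),
  number_shelved = c(
    0, 0, 337, 0,    # month 1, shelver A
    81, 12, 0, 0,    # month 1, shelver B
    0, 0, 325, 2,    # month 2, shelver A
    62, 13, 0, 0,    # month 2, shelver B
    0, 8, 258, 0,    # month 3, shelver A
    138, 8, 5, 0,    # month 3, shelver B
    0, 0, 72, 0,     # month 4, shelver A
    70, 12, 0, 0     # month 4, shelver B
  )
)

# One column now holds every count, so it can serve as y;
# material_type distinguishes the stacked bar segments
shelving_plot <- ggplot(
  shelving_long,
  mapping = aes(x = month, y = number_shelved, fill = material_type)
) +
  geom_bar(stat = "identity")

print(shelving_plot)
```

Other mappings are just as reasonable, for example putting material_type on the x axis or splitting the plot by shelver.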

To lengthen our data, we’ll use the pivot_longer() function from the tidyr package. There are four arguments we need to provide:

data - the data frame to lengthen
cols - the columns we want to pivot (here, the four material-type columns)
names_to - the name of a new column which will have our old column names as values
values_to - the name of a new column which will hold the cell values of the pivoted columns
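
The four arguments above can be put together as in the following sketch, which assumes the tibble and tidyr packages are installed. The wide table is typed in by hand with tribble() so the example is self-contained. The name shelving_long is an assumption, while material_type and number_shelved follow the column names of the long table shown earlier.

```r
library(tibble)
library(tidyr)

# Wide-format shelving data, entered by hand from the table in this lesson
shelving_wide <- tribble(
  ~month, ~shelver, ~stacks_books, ~reference_books, ~bound_journals, ~unbound_journals,
  1, "A",   0,  0, 337, 0,
  1, "B",  81, 12,   0, 0,
  2, "A",   0,  0, 325, 2,
  2, "B",  62, 13,   0, 0,
  3, "A",   0,  8, 258, 0,
  3, "B", 138,  8,   5, 0,
  4, "A",   0,  0,  72, 0,
  4, "B",  70, 12,   0, 0
)

# Pivot the four material-type columns into two new columns:
# one for the old column names, one for the cell values
shelving_long <- pivot_longer(
  data = shelving_wide,
  cols = c(stacks_books, reference_books, bound_journals, unbound_journals),
  names_to = "material_type",
  values_to = "number_shelved"
)

print(shelving_long)
```

The month and shelver columns are not listed in cols, so they are kept as they are and repeated once for each pivoted column. That turns the 8 wide rows into 32 long rows.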

month	shelver	stacks_books	reference_books	bound_journals	unbound_journals
1	A	0	0	337	0
1	B	81	12	0	0
2	A	0	0	325	2
2	B	62	13	0	0
3	A	0	8	258	0
3	B	138	8	5	0
4	A	0	0	72	0
4	B	70	12	0	0
