if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
library(BiocManager)
Introduction and Set Up
Learning Objectives
- Understand the difference between CRAN and Bioconductor
- Be able to install Bioinformatics packages
Getting Set Up
RStudio provides a useful feature called Projects, which act as containers for your work. As you use R more, you will find it helpful to keep the files and environment for one real-world project together and separate from other projects.
Let’s create a new project now.
- Go to File > New Project
- In the Create project from menu, choose New Directory
- For Project Type, choose New Project
- Make sure Create project as subdirectory of: is pointing to Desktop (or whatever your preferred location is)
- Call your new directory r_bioinformatics_lesson
- Select the check box that says Open in New Session
- Inside your new project, create folders called data and figures
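The last step can also be done from the R console instead of the Files pane. The following is a minimal sketch that assumes your working directory is the root of the new project (which is the case when the project has just been opened in RStudio); the folder names are the ones from the list above.

```r
# Folder names from the step above; paths are relative to the working directory
folders <- c("data", "figures")

for (folder in folders) {
  # dir.create() warns if the folder already exists, so check first
  if (!dir.exists(folder)) {
    dir.create(folder)
  }
}

# Confirm that both folders now exist (one TRUE/FALSE value per folder)
dir.exists(folders)
```

You can check where the folders were created by running getwd().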
What is Bioconductor?
In this lesson, we’ll be working with a number of bioinformatics packages along with the tidyverse family of packages. Many R packages come from CRAN (Comprehensive R Archive Network). Packages from CRAN can be installed either by using the Install Packages widget in RStudio (lower-right pane) or with the function install.packages().
Bioconductor is an open-source project that provides tools for the analysis and comprehension of high-throughput biological data, built on the R programming language. It includes a large ecosystem of packages for tasks such as sequence analysis, genomic data visualization, and statistical modeling in bioinformatics. Bioconductor emphasizes reproducibility, interoperability, and the integration of biological metadata, making it especially well-suited for omics research.
To use packages from Bioconductor, we must first install and load the BiocManager package from CRAN. BiocManager provides an interface to the Bioconductor repository.
Next, let’s install the packages we will be using in this session. From CRAN, we’ll be installing
| Package Name | Purpose |
|---|---|
| tidyverse | Wrangling and visualizing data |
| rentrez | Accessing data from NCBI databases |
| ape | Phylogenetic analysis |
install.packages(c("tidyverse", "rentrez", "ape"))
Then we can install our Bioconductor packages with BiocManager:
| Package Name | Purpose |
|---|---|
| Biostrings | Manipulating biological sequences |
| pwalign | Pairwise alignment |
| DECIPHER | Multiple sequence alignment |
BiocManager::install(c("Biostrings", "pwalign", "DECIPHER"))
NOTE: You only need to install a package once on your system (and again after updates), but you must load the packages into each new R session with the library() function.
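Because installation only needs to happen once, it can be handy to check which of the lesson's packages are already present before installing anything. The sketch below uses only base R; missing_packages() is a hypothetical helper written for this illustration, not a function from any package. It only reports what is missing and does not install anything itself.

```r
# Packages used in this lesson, grouped by where they are installed from
cran_pkgs <- c("tidyverse", "rentrez", "ape")
bioc_pkgs <- c("Biostrings", "pwalign", "DECIPHER")

# Hypothetical helper: return the names in `pkgs` that are not yet installed
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

cran_missing <- missing_packages(cran_pkgs)
bioc_missing <- missing_packages(bioc_pkgs)

# Report the result; pass these vectors to install.packages() or
# BiocManager::install() respectively if anything is listed
cat("Missing from CRAN:", if (length(cran_missing)) cran_missing else "none", "\n")
cat("Missing from Bioconductor:", if (length(bioc_missing)) bioc_missing else "none", "\n")
```

Once everything is installed, the packages are loaded with library():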
library(ape)
library(Biostrings)
library(DECIPHER)
library(pwalign)
library(rentrez)
library(tidyverse)
Don’t be alarmed if you see a lot of text and messages printing out in the console as you go through this process. You may be asked to update other packages that the ones we are installing depend on. You will also be warned if some of the packages have functions with the same name.
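When two loaded packages define a function with the same name, the one loaded most recently is used by default and R reports the other as "masked". A well-known case is filter(), which exists in both the stats package that ships with R and in dplyr (part of the tidyverse). The sketch below uses only base R to show how to list such name clashes and how to call a specific version explicitly; the exact output of conflicts() depends on which packages are loaded in your session.

```r
# List object names that are defined in more than one attached package
# (the result depends on which packages are currently loaded)
masked <- conflicts(detail = TRUE)
str(masked)

# Writing package::function removes any ambiguity about which version runs.
# stats::filter() is the base R time-series filter, not dplyr::filter();
# here it computes a centred 3-point moving average of the numbers 1 to 5.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
smoothed
```

The same package::function form works for any package, for example dplyr::filter() when you want the data-frame version.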