I: A first look at R

Author

Rens Holmer & Mark Sterken

Published

February 11, 2026

The goals for week 1

In this course you will learn how to conduct data analysis in R. The assignments in this book take you through the steps of a data analysis, and introduce concepts and ways of working with R along the way.

In the first week, the goals are:

  1. Acquire basic knowledge of using R and RStudio
  2. Recognize and load common data formats
  3. Apply common statistical tools for inspecting and analysing data
  4. Document your code in a clear and concise way
  5. Apply the data-science cycle (load, inspect, clean, analyse, present)

Tip 1: What is the ‘data-science cycle’?

You are familiar with collecting data, and even with some analysis and interpretation. For instance, in Plant Science in Practice you gathered a dataset on wild plants via fieldwork and in Reproduction of Plants you carried out a small experiment in a laboratory setting. During lab work you follow a particular protocol, and data analysis is no different. We use an iterative cycle of loading, inspecting, cleaning, analysing, and presenting data. It is normal to go back and forth in the cycle. The main assumption is that any dataset is imperfect and has issues. In the first week, we will not deal with imperfections in the data you use, but we will discuss them.
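As a rough illustration of the cycle, here is a minimal sketch in base R. It is not part of the course material: it assumes the `airquality` dataset that ships with R as a stand-in for real data, and the object names (`air`, `air_clean`, `ozone_by_month`) are made up for this example.

```r
# Load: use a dataset that ships with R (an assumed stand-in for course data)
air <- datasets::airquality

# Inspect: size, column types, and number of missing values per column
dim(air)
str(air)
colSums(is.na(air))

# Clean: keep only the rows where ozone was actually measured
air_clean <- air[!is.na(air$Ozone), ]

# Analyse: mean ozone concentration per month
ozone_by_month <- tapply(air_clean$Ozone, air_clean$Month, mean)
ozone_by_month

# Present: a simple boxplot of ozone per month
boxplot(Ozone ~ Month, data = air_clean,
        xlab = "Month", ylab = "Ozone (ppb)")
```

In practice, the plot at the end often reveals something that sends you back to the inspect or clean step, which is why the cycle is iterative.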

How week 1 is organized

Each day you work through one chapter of the book. During the scheduled tutorials, background and context will be given, and help will be available if you get stuck with the coding.

You will work through the chapters over three days (Monday to Wednesday). The exercises in this book require you to use data. In the first week, this data is available via a link on Brightspace, with a separate dataset for each day of the course.

  1. For Chapter 1 we all work on the same dataset [1].
  2. For Chapter 2 you have your own, personal dataset based on [2]. You can find this dataset in a folder under your own name.
  3. For Chapter 3, we start with a shared dataset [3]. For the assignment, you will also have your own, personal dataset [3]. You can find this dataset in a folder under your own name.

The answers to the assignments will be posted on Brightspace the morning after (so the answers for Chapter 1 will appear on Tuesday).

In addition to the chapters, there is some (light) reading to support your view on data analysis. After Chapter 2 you are expected to read a paper by Itai Yanai and Martin Lercher [4]. After Chapter 3 you are expected to read a critique of this paper by Teppo Felin et al. [5]. These two papers should give you a firm grip on hypothesis testing and data analysis.

The assignment

At the end of Wednesday (23:59), a coding peer-feedback assignment is due, which you submit via FeedbackFruits on Brightspace. Your assignment will be reviewed by two students, and you will review the assignments of two other students. You can find the instructions for this assignment in Chapter 3. Completing the assignment and participating in the code review are mandatory.

You need to hand in a .html file (generated from a .qmd file). Practice generating these files before the assignment is due, ideally during the tutorials.

The code review (peer feedback via Brightspace) is due Thursday at 13:00. Instructions for giving feedback, and the items we expect you to give feedback on, can be found in Chapter 3.

The exam

The week 1 exam does not count toward your final grade for the course. It does, however, show what the exams in this course are like, and you will receive feedback and a grade (again, it will not count) on what you hand in. The assignment on Wednesday will prepare you for the exam on Friday.

At the end of week 1 we expect you to be able to:

  • Complete a given .qmd file with answers and completed code-blocks;
  • Start a data-analysis project in R (set a working directory, install and activate packages);
  • Load the data formats covered in week 1 (.tsv, .csv, .xlsx, .Rdata);
  • Perform and interpret the outcome of basic checks on loaded data using functions (e.g. dim(), ncol(), summary(), …);
  • Combine objects that in essence fit together (cbind(), rbind());
  • Test data for normality (e.g. by making a qqplot or by shapiro.test());
  • Conduct a t-test (t.test());
  • Conduct a Wilcoxon rank sum test (wilcox.test());
  • Be familiar with correlation and clustering;
  • Translate a p-value to a biological interpretation;
  • Complete {ggplot2} code to make a histogram, a qqplot, a boxplot, and a scatterplot;
  • Be able to read a boxplot, a qqplot, a scatterplot, and a histogram, and translate these figures into a biological interpretation.
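
To give a feel for how several of these skills fit together, here is a minimal sketch. It is not taken from the chapters: the plant heights are simulated, and all object names (`control`, `treated`, `plants`, `csv_file`, and so on) are hypothetical. The {ggplot2} part only runs if that package is installed.

```r
set.seed(1)  # make the simulated numbers reproducible

# Hypothetical data: plant heights (cm) for two treatments
control <- data.frame(treatment = "control",
                      height = rnorm(20, mean = 10, sd = 2))
treated <- data.frame(treatment = "treated",
                      height = rnorm(20, mean = 13, sd = 2))

# Combine objects that fit together:
# rbind() stacks rows, cbind() adds columns
plants <- rbind(control, treated)
plants <- cbind(plant_id = seq_len(nrow(plants)), plants)

# Save and load a .csv file (read.delim() does the same for .tsv files)
csv_file <- tempfile(fileext = ".csv")
write.csv(plants, csv_file, row.names = FALSE)
plants_from_csv <- read.csv(csv_file)

# Basic checks on the loaded data
dim(plants_from_csv)
ncol(plants_from_csv)
summary(plants_from_csv)

# Test one group for normality, numerically and with a qqplot
control_height <- plants$height[plants$treatment == "control"]
normality <- shapiro.test(control_height)
qqnorm(control_height)
qqline(control_height)

# Compare the two groups with a t-test and a Wilcoxon rank sum test
tt <- t.test(height ~ treatment, data = plants)
wt <- wilcox.test(height ~ treatment, data = plants)
tt$p.value
wt$p.value

# Boxplot with {ggplot2}, only if the package is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plants, aes(x = treatment, y = height)) +
    geom_boxplot()
  print(p)
}
```

A small p-value here would suggest that the mean height differs between the two treatments. Whether that difference matters biologically is a separate question that the test cannot answer for you.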