Data Science and Informatics in Biology

BIOL 480 (undergraduate), 510 (genomics diploma), 630 (graduate), 2020




This is the Fall 2020 version of the course, taught by Prof. MT Hallett in the Dept. of Biology at Concordia.

More about the goals of the course can be found in the overview. We have optimized the course for online delivery in the Fall 2020 term.

The syllabus is broadly divided into three modules. The first module provides the concepts, skills and tools necessary to conduct data science with biological data. For this we will use the R language. Data science is the process of getting your data into a computer in an appropriate way and applying descriptive statistics, visualization and statistical tests to explore it. The second module is bioinformatics. Here we talk about resources, databases and tools for molecular data. The third module covers computational biology: the development of new techniques, typically expressed as computer programs, to explore data. There is a strong focus on machine learning, particularly the basics of deep learning.

Lectures consist of a set of short pre-recorded videos with my voice-over. Between videos, there are small exercises to help you understand the material. During class time, I will be there to work through the material with you. Quizzes (4 at 5% each) will be held in class and last 10 minutes each. The midterm (20%) will be a one-hour take-home exam. The final (20%) will be a two-hour take-home exam. Here are the evaluation criteria.

For graduate students, the course includes a small project, equivalent to a typical assignment, in which they can explore bioinformatics in more depth within their own area of life science interest.