🌲🧬 Landscape Genetics

Learning Objectives

To ecologists, a gene is a gene and we just need to find the one that does {insert random phenotypic trait here}. To population geneticists, the landscape has no heterogeneity, so we can consider all populations as equally reachable. While these assumptions may wonderfully simplify our underlying mathematical models, they really have no more basis in reality than the assumption in our undergraduate physics class that the cow is a perfect sphere…


And while we all like to simplify our models, we are actually scientists and must embrace the complexity of the ecologic-spatio-genetic systems we work with. Get over it, pull up your big person pants, and let's get to work.

In this course you are going to examine the Fundamental Landscape Genetic equation.

$$G \approx f(S,H,E)$$

where the left side of this equation ($G$) represents the spatial distribution and covariance in population genetic structure, which is defined by a complex function of $S$patial, $H$istorical, and $E$cological predictor variables. To do this, you must learn how to:

  • Harvest and work with geospatial data in the form of rasters, points, lines, and polygons.
  • Understand that all of the spatial and ecological variables may be represented by their own unique scale, intensity, and granularity on the landscape.
  • Be able to harvest both distance and covariance structures from genetic data: markers, SNPs, RFLPs, allozymes, … (whatever really, it makes no difference to this course; you’ll use the markers you have access to and that are best suited to your own system and question).
  • Decide which set of statistical approaches is most appropriate to examine at-site, around-site, and between-site influences on genetic structure and connectivity.
  • Be able to present results in a way that does not suck.
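
To make the equation above a little more concrete, here is a minimal sketch in base R. Everything in it is an assumption made for illustration: the data are simulated, the variable names (`coords`, `elevation`, `s`, `e`, `g`) are hypothetical, the historical term $H$ is left out entirely, and a plain linear model is just one of many possible choices for $f()$.

```r
# Simulated, purely illustrative data: none of these numbers come from a real system.
set.seed(42)
n_pop <- 8

# S: pairwise geographic distance among hypothetical populations
coords <- cbind(x = runif(n_pop, 0, 100), y = runif(n_pop, 0, 100))
S <- as.matrix(dist(coords))

# E: pairwise difference in a hypothetical ecological variable (e.g., elevation)
elevation <- rnorm(n_pop, mean = 500, sd = 100)
E <- as.matrix(dist(elevation))

# Use the lower triangle so each pair of populations appears exactly once
idx <- lower.tri(S)
s <- S[idx]
e <- E[idx]

# G: pairwise genetic distance, simulated here as a noisy linear function of S and E
g <- 0.002 * s + 0.001 * e + rnorm(length(s), sd = 0.02)

# One very simple choice of f(): an ordinary linear model
fit <- lm(g ~ s + e)
coef(fit)
```

Because every population contributes to several pairs, the pairwise observations are not independent, so the standard errors and p-values reported by `lm()` should not be taken at face value here. Choosing an approach that handles this properly is part of what the learning objectives above refer to.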

Course Overview

This course was previously taught in person (a great experience for those of us who really love spending time in Scotland), but alas we are not so fortunate due to COVID. Below are the topics covered in this course.

Each section is self-contained and consists of a lecture, slides, a more in-depth narrative R notebook on the topic, and an activity using real data. It is intended that the delivery of the lecture content and the activity should take roughly 3 hours.

  • Introduction

    This lecture provides a brief introduction to the field of landscape genetics, what it is, and what it isn’t. I provide an overview of the field and some of the main conceptual ideas we will be going over during this course, as well as the logistics of the course. The activity for this topic will mostly focus on making sure your computer is set up properly, the correct set of packages is installed, and you have downloaded the data for the entire course.

  • Tidyverse

    Being able to work with streams of data is critical for any data scientist. One of the major innovations in data manipulation is the tidyverse, a collection of packages that the rest of this workshop will rely upon exclusively. This topic gives a brief introduction to the tidyverse constellation of libraries and helps you gain foundational skills in using it.

  • Joins

    For all but the most trivial data sets, we must rely upon distributed content containers such as databases and online sources. Working with, and joining, data from distributed sources forms the foundation for most database activities. It is also central to the analysis of spatial data. In this topic we’ll use some complicated airline data to learn joins.

  • Spatial Data

    What is spatial data and how do we work with it in R? This topic explores some of the aspects of why we need to be careful when working with georeferenced data.

  • Vector Data

    Points, lines, and polygons are all examples of vector data. In this topic, you will learn how to create, manipulate, and analyze vector data.

  • Rasters

    Rasters are data containers that allow the quantification of continuously distributed data with a well-defined geospatial context.

  • Population Genetics

    Population genetics is the analysis of allele and genotype frequencies. This topic provides a basic overview of population genetic foundations as well as examples of how microevolutionary processes change those frequencies.

  • Genetic Diversity

    Genetic diversity is a measure of local variation, either at the allele or the genotype level. In this topic, we explore some of the more common measures of genetic diversity.

  • Genetic Structure

    Genetic structure is a property of subdivided populations. The amount of standing structure and its spatial arrangement provide valuable insights into both micro- and macroevolutionary processes. In landscape genetics, the primary focus is on pairwise structure more than overall differentiation. In this topic, we learn about the parameters commonly used and explore how to extract them from a suitable data set.

  • Habitat Analysis

    Core to landscape genetics is the ability to derive data representing the landscape. In this topic, we explore how to leverage vector and raster data to extract localized data that will ultimately be used as the set of predictor variables in our landscape genetic models.

  • Spatial Genetics

    Integrating spatial structure into the analysis of genetic variation is the next step towards applied landscape genetics. In this section, we will examine several different ways to look at spatial pattern in the distribution of genetic variance.

  • Network Models

    Populations may be conceptualized as a diffuse (Markov) network of interacting parts: individual-based, neighborhood-based, or even as a set of populations embedded within a continuous matrix. As such, we may gain some valuable insights into the processes that have created genetic structure by allowing the data to reveal their own topological structure. In this topic, we explore network approaches for understanding the spatial distribution of genetic structure and covariance.

  • Resistance Modeling

    One of the most blatant phenotypes associated with Landscape Genetic approaches is the use of resistance surfaces. Resistance modeling is based upon the fundamental notion that landscapes contain heterogeneous collections of habitat types and those types may have differential permeability to the movement of organisms. The difficult part here is that we are entirely ignorant of (1) which features the organisms are responding to on the landscape, (2) how much each feature influences the relative ability of the organism to move/disperse, and (3) what the proper granularity of the landscape feature is that the organism both recognizes and responds to.
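
As a small taste of how the topics above chain together, the sketch below computes a per-site diversity estimate and then joins it to a table of site coordinates. It is only an illustration under stated assumptions: the genotypes, site names, and coordinates are made up, `expected_het()` is a hypothetical helper (not a function from any package), and base R `merge()` is used so the example runs without extra packages, even though the course itself relies on the tidyverse (where `dplyr::left_join(sites, diversity, by = "site")` would play the same role).

```r
# Hypothetical data: one row per gene copy at a single locus
gene_copies <- data.frame(
  site   = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"),
  allele = c("a1", "a1", "a2", "a2", "a1", "a1", "a1", "a2", "a1", "a2", "a3", "a3")
)

# Expected heterozygosity: He = 1 - sum(p_i^2), where p_i are allele frequencies
# (hypothetical helper written for this sketch)
expected_het <- function(alleles) {
  p <- table(alleles) / length(alleles)
  1 - sum(p^2)
}

# Genetic Diversity: one He estimate per site
he_by_site <- tapply(gene_copies$allele, gene_copies$site, expected_het)
diversity <- data.frame(site = names(he_by_site), He = as.numeric(he_by_site))

# Hypothetical site table with made-up coordinates; site D has no genetic data
sites <- data.frame(
  site      = c("A", "B", "C", "D"),
  longitude = c(-3.2, -3.1, -3.0, -2.9),
  latitude  = c(55.9, 56.0, 56.1, 56.2)
)

# Joins: a left join keeps every site and fills in NA where diversity is missing
site_data <- merge(sites, diversity, by = "site", all.x = TRUE)
site_data
```

A left join is used here so that sites without genetic data stay visible as `NA` rather than silently disappearing, which makes gaps in sampling easier to spot before any modeling starts.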

Meet your instructor



Here are some common questions I get before this class.

What do I need to know a priori?

The following requirements pertain to students in this course:

  • You must have your own computer that is capable of running R and RStudio.
  • You must already know some R. It is not going to be in your best interests to come to this class thinking, "well, I’m a smart academic, I can just learn R and sample( c("Population Genetics","GIS","Landscape Ecology"), size=2) at the same time" (n.b., if you don’t understand this sentence, perhaps you need not be in this class).

How often do the courses run?

This course is usually offered at least once a year.

Is there time to work on/ask questions about my own data?

Absolutely! I am more than happy to provide personal time to you and your research project/questions after the course is completed. In addition to online interactions, in the past I’ve also hosted individuals from this course in my own laboratory as visiting scholars (back when we could travel) and would be happy to do so in the future if this helps out with your own research program/project.

What packages do I need to install?

The week before the class starts, I will send you detailed instructions on how to configure your R and RStudio setup to handle all the components in this class. The exercises and activities in this workshop are all based upon the most recent versions of the packages because all lectures and activities are actively knit right before the class starts.

Do you use the <insert random package name> package?

See previous FAQ about packages.