🌲🧬 Landscape Genetics


Learning Objectives

To ecologists, a gene is a gene and we just need to find the one that does {insert random phenotypic trait here}. To population geneticists, the landscape has no heterogeneity, so we can consider all populations as equally reachable. While each of these views may lead to some wonderful simplifying assumptions in our underlying mathematical models, they really have no more basis in reality than the assumption in our undergraduate physics class that the cow is a perfect sphere…


And while we all like to simplify our models, we are actually scientists and must embrace the complexity of the ecologic-spatio-genetic systems we work with. Get over it, pull up your big person pants, and let’s get to work.


In this course you are going to examine the Fundamental Landscape Genetic equation.

$$G \approx f(S,H,E)$$

where the left side of this equation ($G$) represents the spatial distribution and covariance in population genetic structure, which is defined by a complex function of $S$patial, $H$istorical, and $E$cological predictor variables. To do this, you must learn how to:

  • Harvest and work with geospatial data in the form of rasters, points, lines, and polygons.
  • Understand that each of the spatial and ecological variables may be represented by its own unique scale, intensity, and granularity on the landscape.
  • Be able to harvest both distance and covariance structures from genetic data: markers, SNPs, RFLPs, Allozymes, … (whatever really, it makes no difference to this course, you’ll use the markers you have access to and that are best suited to your own system and question).
  • Decide which set of statistical approaches is most appropriate to examine at-site, around-site, and between-site influences on genetic structure and connectivity.
  • Be able to present results in a way that does not suck.
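As a toy illustration of what the equation is getting at, the sketch below relates a genetic distance matrix (a stand-in for $G$) to a geographic distance matrix (a stand-in for the $S$ term only). It uses base R alone, and every number in it is invented for illustration: the allele frequencies, the coordinates, and the choice of Euclidean distance on frequencies are assumptions, not methods or data from the course.

```r
# Hypothetical allele frequencies at one locus
# (rows = populations, columns = alleles); invented values.
freqs <- rbind(
  A = c(0.70, 0.20, 0.10),
  B = c(0.60, 0.30, 0.10),
  C = c(0.20, 0.50, 0.30),
  D = c(0.10, 0.40, 0.50)
)

# Hypothetical planar coordinates for the same four populations.
coords <- cbind(x = c(0, 1, 5, 6), y = c(0, 1, 4, 6))
rownames(coords) <- rownames(freqs)

# Pairwise distance matrices: genetic (G) and geographic (S).
gen_dist <- as.matrix(dist(freqs))
geo_dist <- as.matrix(dist(coords))

# Compare the unique pairs only (lower triangle of each matrix).
pairs_gen <- gen_dist[lower.tri(gen_dist)]
pairs_geo <- geo_dist[lower.tri(geo_dist)]
r <- cor(pairs_gen, pairs_geo)
print(round(r, 2))
```

Pairwise distances are not independent observations (each population appears in several pairs), so the correlation here is descriptive only and should not be read as a significance test.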

Course Overview

This course had previously been taught in person (a great experience for those of us who really love spending time in Scotland) but alas we are not so fortunate due to COVID. Below are the topics covered in this course.

Each section is self-contained and consists of a lecture, slides, a more in-depth narrative R notebook on the topic, and an activity using real data. It is intended that the delivery of the lecture content and the activity should take roughly 3 hours.

  • Introduction

    This lecture provides a brief introduction to the field of landscape genetics, what it is, and what it isn’t. I provide an overview of the field and some of the main conceptual ideas we will be going over during this course, as well as the logistics of the course. The activity for this topic will mostly focus on making sure your computer is set up properly, the correct set of packages is installed, and you have downloaded the data for the entire course.

  • Raster/Vector

    What is spatial data and how do we work with it in R? This topic covers points, lines, and polygons as well as raster data. I provide a brief overview of geospatial data and then we dive into how we can construct each of the types as well as extract data from rasters based upon vector components. We also touch briefly upon elements of cartographic approaches.

  • Genetic Data

    Locked inside each living creature is an immensely complex set of genetic information. In general terms, we can consider this genetic content as falling within two functional groups: the part that does something (the part that has something to do with the function of the organism and as such has the influence of selection upon it) and the part that does not do anything (the part we use to study neutral evolutionary processes). In this topic, we explore how to import various kinds of genetic data into our normal R workflow. This activity will focus on some marker information we have from a Sonoran Desert Bark Beetle collected in Baja California, Mexico.

  • Distance/Structure

    The analysis of population (and individual) genetic structure is a fairly straightforward endeavour. In fact, one of the key characteristics of what we refer to as Landscape Genetics is the use of distance as a primary metric of our analyses. However, the salient point here is to know what the proper set of partitions is for the questions you are asking and how to extract the correct types of data to be subsequently used in our models.

  • Resistance Modeling

    One of the most blatant phenotypes associated with Landscape Genetic approaches is the use of resistance surfaces. Resistance modeling is based upon the fundamental notion that landscapes contain heterogeneous collections of habitat types and those types may have differential permeability to the movement of organisms. The difficult part here is that we are entirely ignorant of (1) which features the organisms are responding to on the landscape, (2) how much each feature influences the relative ability of the organism to move/disperse, and (3) what the proper granularity of the landscape feature is that the organism both recognizes and responds to.

  • Distance Inferences

    OK, so we have some kind of genetic information we’ve estimated from the organisms we are working with and some combination of spatial, ecological, and historical features that we are entertaining as potential sets of predictor variables… Now what? In this topic we explore how to deal with these kinds of data, some of the challenges associated with working on distance metrics, and how to proceed.

  • Network Models

    Populations may be conceptualized as a diffuse (Markov) network of interacting parts: individual based, neighborhood based, or even as a set of populations embedded within a continuous matrix. As such, we may gain some valuable insights into the processes that have created genetic structure by allowing the genetic structure itself to reveal its own topology. In this topic, we explore network approaches for understanding the spatial distribution of genetic structure and covariance.

  • Spatial Processes

    In this topic, we explore some additional approaches for understanding spatial processes and structure. Not everything should be distilled into a single distance/resistance approach and here are some strategies we can use for understanding more detailed aspects of spatial genetic structure.
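To make the resistance idea from the Resistance Modeling topic concrete, here is a minimal base R sketch of a least-cost distance across a small resistance surface. Everything in it is an assumption for illustration: the 5 x 5 grid, the resistance values, the rook (four-neighbor) movement rule, the convention that a step costs the mean resistance of the two cells involved, and the helper name `least_cost` are all hypothetical and are not taken from the course materials.

```r
# Hypothetical resistance surface: 1 = easy habitat, 10 = barrier.
# The barrier fills column 3 except for a gap in row 5.
res <- matrix(1, nrow = 5, ncol = 5)
res[1:4, 3] <- 10

# Dijkstra-style accumulated cost between two cells (row, column).
# Assumption: moving between adjacent cells costs the mean of
# their two resistance values.
least_cost <- function(res, from, to) {
  nr <- nrow(res)
  nc <- ncol(res)
  cost <- matrix(Inf, nr, nc)    # best known cost to reach each cell
  done <- matrix(FALSE, nr, nc)  # TRUE once a cell's cost is final
  cost[from[1], from[2]] <- 0
  steps <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))  # rook moves

  repeat {
    open <- ifelse(done, Inf, cost)
    if (all(is.infinite(open))) break            # nothing left to reach
    cur <- arrayInd(which.min(open), dim(res))   # cheapest unfinished cell
    i <- cur[1]
    j <- cur[2]
    if (i == to[1] && j == to[2]) break          # target cost is final
    done[i, j] <- TRUE
    for (k in seq_len(nrow(steps))) {
      ii <- i + steps[k, 1]
      jj <- j + steps[k, 2]
      if (ii < 1 || ii > nr || jj < 1 || jj > nc || done[ii, jj]) next
      step_cost <- (res[i, j] + res[ii, jj]) / 2
      if (cost[i, j] + step_cost < cost[ii, jj]) {
        cost[ii, jj] <- cost[i, j] + step_cost
      }
    }
  }
  cost[to[1], to[2]]
}

# Same two sites, with and without the barrier.
with_barrier <- least_cost(res, from = c(1, 1), to = c(1, 5))
uniform      <- least_cost(matrix(1, 5, 5), from = c(1, 1), to = c(1, 5))
print(c(with_barrier = with_barrier, uniform = uniform))
```

On this surface the cheapest route detours through the gap in row 5 (cost 12) rather than crossing the barrier directly (cost 13), while the same pair of sites on a uniform landscape is only 4 apart. The three unknowns listed under Resistance Modeling correspond here to which cells get a high value, how high that value is, and how coarse the grid is.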

Meet your instructor



Here are some common questions I get before this class.

What do I need to know a priori?

The following requirements pertain to students in this course:

  • You must have your own computer that is capable of running R and RStudio.
  • You must already know some R. It is not going to be in your best interests to come to this class thinking, “well, I’m a smart academic, I can just learn R and sample( c("Population Genetics","GIS","Landscape Ecology"), size=2) at the same time” (n.b., if you don’t understand this sentence, perhaps you need not be in this class).
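As a quick self-check, the expression in the second requirement is ordinary base R: `sample()` draws `size` elements from a vector at random, without replacement by default. The snippet below runs it; the variable names and the seed are arbitrary choices added here so the draw is repeatable.

```r
# The three topics named in the requirement above.
topics <- c("Population Genetics", "GIS", "Landscape Ecology")

set.seed(42)                        # arbitrary seed, only for repeatability
picked <- sample(topics, size = 2)  # two distinct topics, chosen at random
print(picked)
```

If reading this takes more than a moment, some R practice before the course starts is probably time well spent.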

How often do the courses run?

This course is usually offered at least once a year.

Is there time to work on/ask questions about my own data?

Absolutely! I am more than happy to provide personal time to you and your research project/questions after the course is completed. In addition to online interactions, in the past I’ve also hosted individuals from this course in my own laboratory as visiting scholars (back when we could travel) and would be happy to do so in the future if this helps out with your own research program/project.

What packages do I need to install?

The week before the class starts, I will send out detailed instructions to you on how to configure your R and RStudio setup to be able to handle all the components in this class. The exercises and activities in this workshop are all based upon the most recent versions of the packages because all lectures and activities are actively knit right before the class starts.

Do you use the <insert random package name> package?

See previous FAQ about packages.