Using GitHub to Teach Data Literacy

Octocat from pngegg.com

GitHub significantly improves classroom collaboration and helps knowledge persist over both the short and the long term. This is the first in a series of short posts about how I’m implementing this in my own teaching for graduate-level courses in data analysis.

Here is a short overview of how I’m using GitHub to teach an introductory course in Data Literacy for incoming Environmental Studies students. I’m going to post a series of short entries on this.

This class is a required component of our MS (thesis) and MEnvs (non-thesis) academic programs. It is one of the foundational quantitative offerings in our program (along with CORE GIS training), and students are intended to take it in their first semester of classes. We follow up in the spring semester with a second, more applied course offering.

Last year, I started exposing students to GitHub directly and used GitHub Classroom to distribute the homework and other issues. It got a bit unwieldy having so many different repositories, so I thought that this year I’d try submodules as an approach.

Main course structure

The class is broken up into individual, self-contained “Learning Modules”, each of which lives in its own GitHub repository. This modularity lets me mix and match topics that I’ve already created in a standard format, and it facilitates reproducibility and reusability when I’m spinning up new courses: I just grab the material that I’ve already created and add the set of new topics to GitHub. It has worked really well, as most of my teaching has been on basic data literacy and analysis, genetic analysis, and spatial/landscape analyses.

The content of each topic is partitioned into the following components. I am a huge fan of Backward Design in curriculum development and generally try to flip my courses.

Pre-Meeting content:

  • Background papers and examples using the topic.
  • Foundational literature on the mechanics of applying the topic.
  • A pre-recorded video of a lecture on the topic.
  • Slides & data sets used in the topic.
  • A longer Narrative document on the topic that goes into examples and demonstrations of everything intended to be discussed in this topic. This is the “core” of literature that I develop on the topic itself.

In-Person Content:

  • Specialization and extension of the topic as very short lecture components.
  • Making sure all questions on the topic are answered.
  • Group or individual work on applying the topic to promote “active learning”.

Post-Meeting Assessment on Mastery:

  • Some form of assessment where the individuals can demonstrate their mastery of the topic through direct and/or indirect assessment methods.
  • Closing the loop on the topic so that, when participants revisit it later, all open items are described with working examples, etc.

The main class repository

The main repository mostly holds logistics for the course. The main README.md is generally the syllabus. It covers the description of and basic need for the class, some background and motivation, the general Course Level Learning Objectives, the list of Learning Modules, grading/late/attendance policies, and the other syllabus materials that my university requires me to add.

Here is an example for this particular course.

The Learning Modules

Within the course, each Learning Module is a self-contained object, as outlined above, and exists in its own GitHub repository. So, for example, the first-day-of-class materials are contained within a Welcome repository. It is a pretty shallow one, with just a set of learning objectives and a single slide deck. This content is basically just getting everyone’s computers set up with the proper software installed, etc. I find that the main syllabus and welcome modules are sufficient for the first “in-class” session to get everyone on the same level (and add/drop extends through the first week, so I go a bit slow here).

The way I structure it is that I have the main repository for the class and, with the addition of each Learning Module, I add that module’s repository as a submodule of the main one. I do this because it allows the students to “pull” new versions as they become available, all while keeping the content localized in a single main folder.

To get the submodules, you can add them by going to your main directory and issuing the following command (replacing `username`, `UserName`, and `Repository` with the appropriate content):

git submodule add https://username@github.com/UserName/Repository.git

Afterwards, having added the `Welcome.git` repository as a submodule, I see the following changes in my main folder.

~/Desktop/ENVS543-Fall_2023 ‹main*› » git status
On branch main
Your branch is up to date with 'origin/main'.
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
  new file: .gitmodules
  new file: Welcome

And if you look inside the new `.gitmodules` file, you’ll see that it records the submodule’s path and URL, so if you move that folder around much, you may break things.

~/Desktop/ENVS543-Fall_2023 ‹main*› » cat .gitmodules
[submodule "Welcome"]
   path = Welcome
   url = https://dyerlab@github.com/DyerlabTeaching/Welcome.git
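If a submodule folder does need to move, one option is to let git do the move so the `path` entry in `.gitmodules` is rewritten at the same time. The following is a minimal sketch, not a record of my own setup: it builds throwaway local repositories in a temporary directory as stand-ins for the GitHub ones, and the `Modules` folder, the `Welcome-src` name, and the instructor identity are all hypothetical.

```shell
#!/bin/sh
# Sketch only: throwaway local repositories stand in for GitHub.
set -e
export GIT_AUTHOR_NAME="Example Instructor" GIT_AUTHOR_EMAIL="instructor@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)

# A stand-in Learning Module repository with a single commit.
git init -q "$work/Welcome-src"
echo "# Welcome" > "$work/Welcome-src/README.md"
git -C "$work/Welcome-src" add README.md
git -C "$work/Welcome-src" commit -q -m "Welcome module"

# A stand-in main course repository with the module added as a submodule.
# protocol.file.allow is only needed because the "remote" here is a local path.
git init -q "$work/course"
cd "$work/course"
git -c protocol.file.allow=always submodule --quiet add "$work/Welcome-src" Welcome
git commit -q -m "Add Welcome module"

# Move the submodule with git mv (not plain mv) so .gitmodules is updated too.
mkdir Modules                      # hypothetical destination folder
git mv Welcome Modules/Welcome
git commit -q -m "Move Welcome into Modules/"

# Print the path that .gitmodules now records for the submodule.
new_path=$(git config -f .gitmodules submodule.Welcome.path)
echo "$new_path"
```

The submodule keeps its name (`Welcome`) and only its recorded path changes, so anyone who already cloned the course would still need to pull and update their submodules afterwards.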

After I add, commit, and push the main folder back up to GitHub, my repository shows the submodule as a link:

Submodule repository link in main module folder.
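The add, commit, and push step can be sketched roughly as follows. This is an illustration under assumptions rather than my exact session: a bare repository in a temporary directory plays the role of the GitHub remote, and `Welcome-src`, the commit messages, and the instructor identity are made up. The final `git ls-tree` shows why the submodule appears as a link: the main repository stores it as a pointer to a commit (mode `160000`), not as a copy of the files.

```shell
#!/bin/sh
# Sketch only: a local bare repository stands in for the GitHub remote.
set -e
export GIT_AUTHOR_NAME="Example Instructor" GIT_AUTHOR_EMAIL="instructor@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)

# Stand-ins for the remote course repository and one Learning Module.
git init -q --bare "$work/course.git"
git init -q "$work/Welcome-src"
echo "# Welcome" > "$work/Welcome-src/README.md"
git -C "$work/Welcome-src" add README.md
git -C "$work/Welcome-src" commit -q -m "Welcome module"

# The local main course folder, wired up to the stand-in remote.
git init -q "$work/course"
cd "$work/course"
git remote add origin "$work/course.git"

# Add the module; protocol.file.allow is only needed for local-path remotes.
git -c protocol.file.allow=always submodule --quiet add "$work/Welcome-src" Welcome

# Add, commit, and push the main folder.
git add .gitmodules Welcome        # already staged by "submodule add"; harmless
git commit -q -m "Add Welcome learning module as a submodule"
git push -q origin HEAD

# The submodule is recorded as a commit pointer rather than as a directory tree.
git ls-tree HEAD
```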

Longer Maintenance

As we go through the semester, we accumulate multiple repositories, each of which has these components. At the end of the semester, the students will then collate the individual Narrative components (from the Pre-Meeting Content) into a stand-alone textbook that covers the entirety of their learning experience. Since the repositories are directly linked into this textbook, as approaches and libraries evolve through time, they can easily keep up to date by pulling the latest version and remaking the textbook.
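From the student side, pulling the latest version of a module might look something like the sketch below. It is self-contained and makes several assumptions: local throwaway repositories stand in for GitHub, the `module`, `course`, and `student` folder names and the `narrative.md` file are hypothetical, and `git submodule update --remote` is just one of a few ways to fetch newer module commits.

```shell
#!/bin/sh
# Sketch only: throwaway local repositories stand in for GitHub.
set -e
export GIT_AUTHOR_NAME="Example Instructor" GIT_AUTHOR_EMAIL="instructor@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)

# Instructor side: a Learning Module and a course repository that includes it.
git init -q "$work/module"
echo "# Welcome" > "$work/module/README.md"
git -C "$work/module" add README.md
git -C "$work/module" commit -q -m "Welcome module"

git init -q "$work/course"
cd "$work/course"
# protocol.file.allow is only needed because the "remote" is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/module" Welcome
git commit -q -m "Add Welcome module"

# Student side: clone the course together with all of its submodules.
git -c protocol.file.allow=always clone -q --recurse-submodules \
    "$work/course" "$work/student"

# Instructor side, later in the semester: the module gains new content.
echo "Narrative for this topic" > "$work/module/narrative.md"
git -C "$work/module" add narrative.md
git -C "$work/module" commit -q -m "Add narrative"

# Student side: fetch the newest commit of each submodule and check it out.
git -C "$work/student" -c protocol.file.allow=always \
    submodule --quiet update --remote

ls "$work/student/Welcome"
```

After this update, `git status` in the student’s main folder reports the submodule as modified, because the checked-out module commit is newer than the one recorded in the course repository.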

So through the semester, I can add each additional learning module to the repository, adding content and context. At the end of the course, I drop in a few scripts that allow the students to make the textbook with a simple build.

I still haven’t decided whether I’ll be using GitHub Classroom this year. I’ll have to play around with it a bit.