Jonathan C. Johnson

E-mail: jonathan.johnson10@okstate.edu
Office: MSCS 524
Department of Mathematics
Oklahoma State University

MATH 4910/5010: Topological Data Analysis - Fall 2023

Data analysis is a booming discipline with applications in nearly every field and industry: finance, medicine, politics, biology, and the list goes on. Topological data analysis (TDA) is a suite of mathematical tools that allows us to analyze more complex structures in the shape of data than we can with traditional methods. In this course, you will be introduced to some of these methods, learn how they have been applied in practice, and get hands-on experience manipulating and analyzing data. The course syllabus can be found on Canvas.

Course Texts

Main Texts

  • Computational Topology for Data Analysis, by T. K. Dey and Y. Wang (available through Tamal K. Dey's website)
  • Beginning Data Science in R 4: Data Analysis, Visualization, and Modelling for the Data Scientist by T. Mailund

Both of these texts are available online through Edmon Low Library.

Recommended Text
  • Computational Topology: An Introduction, by H. Edelsbrunner and J. Harer

R Labs


Downloading and installing R and RStudio

R is a free, open-source statistical programming language useful for data cleaning, analysis, and visualization. RStudio is an integrated development environment (IDE) where you can enter commands in a console, write scripts, see results, inspect the variables you’ve defined, and so on, all in one screen. To clarify, R is the name of both the programming language and the interpreter that executes R code, while RStudio is a convenient interface for writing and running R code. In this class we will use R/RStudio to view, manipulate, visualize, and analyze data.

You can download R from here: Get R
You can download RStudio from here: Get RStudio
Here's a detailed guide to installing R and RStudio.
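
Once both are installed, a quick check in the RStudio console can confirm that everything is in place. The snippet below is only a sketch; it assumes the labs rely on the rmarkdown package for knitting, which this page does not state explicitly.

```r
# Print the installed R version to confirm that R itself runs.
print(R.version.string)

# Check whether the rmarkdown package is available.
# (Assumption: knitting the lab files uses rmarkdown.)
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)
if (!has_rmarkdown) {
  message('rmarkdown is not installed; run install.packages("rmarkdown") to add it.')
}
```

If the message appears, installing the package once from the console should be enough for the rest of the semester.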

R Markdown Files

Each R lab will be completed using an R Markdown file and involves the following steps:

  1. Log in to Canvas and download the lab files, including the lab's R Markdown file, into a folder on your hard drive.
  2. Go to the assignment's webpage via its link on this page.
  3. Follow the instructions on the lab's webpage. (Be sure the tasks in green boxes are completed in the lab's R Markdown file.)
  4. Knit an HTML file from the .Rmd file in RStudio.
  5. Upload the completed .Rmd file and the knitted .html file to the Canvas assignment associated with the lab.
For a detailed guide to getting started with R markdown files, work through the practice lab.
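
The knitting step is normally done with the Knit button in RStudio, but it can also be run from the console. The following is a minimal sketch, assuming the rmarkdown package is installed. The helper `knit_lab` and the file name `lab1.Rmd` are hypothetical names used only for illustration.

```r
# Minimal sketch: knit a lab's .Rmd file to HTML from the R console.
# Assumes the rmarkdown package is installed: install.packages("rmarkdown")
# knit_lab() is a hypothetical helper, not part of the course materials.

knit_lab <- function(rmd_path) {
  # Stop early with a readable message if the file is not where we expect.
  if (!file.exists(rmd_path)) {
    stop("Cannot find ", rmd_path, "; check your working directory with getwd().")
  }
  # render() writes the HTML file next to the .Rmd file and
  # returns the path of the knitted file (invisibly).
  rmarkdown::render(rmd_path, output_format = "html_document")
}

# Example usage (uncomment after setting the working directory to the lab folder;
# "lab1.Rmd" is a hypothetical file name):
# html_path <- knit_lab("lab1.Rmd")
# file.exists(html_path)
```

Checking that the file exists first avoids a less readable error when the working directory is not the lab folder, which is a common stumbling block.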

Labs

Exercises

I've added exercises covering material through chapter 2. Exercises

Presentations

Presentation 1: For the first presentation, each of you will be assigned an exercise from this list. You will present the solution to this exercise in class. This should take about 10-20 minutes. You are free to discuss your exercises with each other. You are also encouraged to schedule a meeting with me, outside of normal office hours, to discuss your exercise.

Presentation 2: For the second presentation, each of you will choose a paper from this list and give a talk about it in class. Send me a message indicating which topic you would like to present on. You may also choose a topic not on this list, but it must be approved by me first. This should take about 20-30 minutes. Your talk should do the following:

  1. Explain the context of the paper.
  2. Briefly give an overview of the subject's background including any necessary definitions.
  3. Describe the conclusions of the paper.
  4. Discuss which topological data analysis methods were used and how.

Schedule/Slides

  1. Course Introduction (slides)
  2. Introduction to Topology
  3. Simplicial (and related) complexes
  4. (Simplicial) homology
  5. Persistent homology
  6. Point cloud data analysis
  7. TDA in action