Applied Text Analysis from Close Reading to Machine Learning

Course Description: 

This course focuses on turning raw materials – both analog and digital – into usable datasets for text analysis. It supplements the introductory course in text analysis (see UWC 5008) with a practical, skills-based component. For each subject treated in the introductory course, we will work through hands-on approaches, introduce relevant software, and experiment with ways of working with texts at a medium to large scale.

Prerequisite: Students in UWC 5009 must also be enrolled in UWC 5008. Basic familiarity with a programming language such as R or Python is strongly recommended, although advanced skills are not required.

Learning Outcomes:

By the end of the course, students will be able to:

• create and clean a full-text corpus relevant to their research area
• extract relevant metadata from their corpus
• use basic tools for textual analysis (Voyant, Juxta, TAPoR, MALLET)
• grasp and describe the role of programming languages in more advanced “under the hood” techniques
• design research questions appropriate to textual analysis in their respective disciplines