2 October 2020
Text Mining with R
Online course
To realize complex designs in empirical social research, scientists need a basic knowledge of computational algorithms so that they can select those appropriate for their needs. Specific projects may also require adaptations to standard procedures, language resources or analysis workflows. Instead of relying on off-the-shelf analysis software, using a scripting language is a powerful way to meet such requirements. The course gives an overview of text mining in connection with data acquisition, preprocessing and methodological integration, using the statistical programming language R.
Course leader
Dr. Andreas Niekler (Leipzig University), Dr. Gregor Wiedemann (Hamburg University)
Target group
Participants will learn about opportunities and limits of text mining methods to analyze qualitative and quantitative aspects of large text collections.
Course aim
With example scripts provided in the programming language R, participants will learn how to carry out the individual steps of such an analysis on a specific corpus. We cover a range of text mining methods, from simple lexicometric measures such as word frequencies, key term extraction and co-occurrence analysis to more complex machine learning approaches such as topic models and supervised text classification. The goal is to provide a broad overview of several technologies already established in the social sciences. Participants will be able to identify their own priorities and lay the foundations for further independent study tailored to their individual needs. The last workshop day is reserved for discussing participants' project ideas and study designs.
Fee info
EUR 400: Student rate
EUR 600: Academic rate