Cologne, Germany

Text Mining with R

online course
when 28 September 2020 - 2 October 2020
language English
duration 1 week
credits 2 EC
fee EUR 400

To realize complex designs in empirical social research, scientists need basic knowledge of computational algorithms so that they can select those appropriate for their needs. Specific projects may also require adaptations to standard procedures, language resources or analysis workflows. Instead of relying on off-the-shelf analysis software, using a scripting language is a powerful way to meet such requirements. The course provides an overview of text mining in connection with data acquisition, preprocessing and methodological integration using the statistical programming language R.

Course leaders

Dr. Andreas Niekler (Leipzig University), Dr. Gregor Wiedemann (Hamburg University)

Target group

Participants will learn about opportunities and limits of text mining methods to analyze qualitative and quantitative aspects of large text collections.

Course aim

Using example scripts provided in the programming language R, participants will learn how to carry out the individual steps of such an analysis on a specific corpus. We cover a range of text mining methods, from simple lexicometric measures such as word frequencies, key term extraction and co-occurrence analysis to more complex machine learning approaches such as topic models and supervised text classification. The goal is to provide a broad overview of several technologies already established in the social sciences. Participants will be able to identify their own priorities and lay the foundations for further independent study tailored to their individual needs. The last workshop day is reserved for discussing participants' project ideas and study designs.

Fee info

EUR 400: Student rate
EUR 600: Academic rate