Denmark, Aarhus

Text Mining the Great Unread – Data-intensive methods and digital tools for analysis of texts in the humanities and social sciences

when 25 July 2016 - 5 August 2016
language English
duration 2 weeks
credits 10 ECTS
fee EUR 552

Texts have always been essential to research and education in the humanities and social sciences. Close reading and detailed interpretation have traditionally constituted the standard approach to texts: we apply qualitative methods and theoretically motivated arguments to a small textual corpus in order to understand its meaning. However, the rapid expansion of digital full-text databases, ever faster computers, and advances in language technology are starting to change the standard approach by offering a new digital and data-intensive paradigm for the study of text. Humanities and social science researchers are beginning to ask new types of questions and propose novel solutions to old problems by using faster and more efficient methods to collect, analyze, and visualize texts.

Many students (as well as researchers) lack the digital competences needed for text mining, that is, the application of tools and methods to analyze large sets of digitized texts. This is unfortunate because text mining 1) enables students to extract high-quality information and acquire new knowledge quickly and efficiently; and 2) enhances students' qualifications for a data-driven job market that relies on the very same tools and methods. In addition, many tools and methods in text mining are in need of a thorough revision by academics who understand the importance of text meaning and context. Academia and industry alike are therefore in great need of students with text mining skills.

“Text Mining the Great Unread” is an introductory-level course on text mining tools and methods in the humanities and social sciences. It gives participants sufficient knowledge and experience to develop and implement their own text mining projects. The core of the course is a series of hands-on workshops supplemented by lectures and tutorials by international researchers and industry experts. Through the course, participants will become familiar with text mining methods and software for analyzing and visualizing texts, and will learn how to write their own text mining applications in R and Python. The workshops also present a range of paradigmatic studies and go through research design, best practice, and reporting standards. It is possible to work with one’s own corpus, but historical and contemporary corpora (works of fiction, historical documents, and websites) are also available in class. Participants are not expected to have prior experience with text mining (i.e., programming, statistics, or visualization).

Course leaders

Hilke Reckman,
Kristoffer Laigaard Nielbo,
Mads Rosendahl Thomsen.

Target group

Humanities students

Course aim

Bachelor's/Master's level

Fee info

EUR 552: Danish and EU/EEA students (tomplads)
EUR 1550: Non-EU/EEA students (free-mover)