11 August 2018
on course website
Big Data Management and Analysis in UNIX
The growing availability of extremely large datasets requires scientists and analysts to use powerful supercomputers or computer clusters to store, manage, and analyze these data. These clusters typically run on Linux, which requires some programming skills and insights into suitable software packages. Our course will introduce you to programming in a Linux environment, teach you how to efficiently manage very large datasets (e.g. using sed, awk, and grep commands) and create simple shell scripts to analyze your data (e.g. using a Linux version of the freely available statistics program R). You will also learn how to visualize your data and results in customized plots and figures. These skills are extremely valuable for scientists from all disciplines as well as for business practitioners (e.g. consultants or financial analysts) who are planning to work with big data.
The course consists of three-hour lectures in the morning, followed by two hours of supervised computer tutorials in the afternoon. Both the lectures and tutorials will be held in a computer room. The lectures will be interactive, with short examples that allow students to apply the introduced concepts. In the tutorials, students will get more hands-on training in a supervised environment with exercises covering the day’s topics, and they will have the opportunity to work on the assignments. The computer room will stay open to students for self-study after the tutorials.
Dr. Aysu Okbay, Richard K. Linnér
Scientists and data analysts from all disciplines, as well as business practitioners (e.g. consultants or financial analysts) who are planning to work with big data. Our courses are multi-disciplinary and therefore are open to students with a wide variety of backgrounds.
By the end of this course, the student should understand and feel comfortable with:
• Basic Linux programming
• The Unix philosophy and environment: files, processes, pipes, filters, and basic utilities
• Login and logout procedures
• File transfer between systems
• Text file manipulation with sed, awk, cut, paste, cat, etc.
• Basic text editing using the vim editor
• Automation through functions, control structures, and shell scripts
• Version control with Git
• Working with R through the UNIX command line
• Plotting in R
Contact Hours: 45
If you want to earn more credits you can take courses in our other sessions to create a 4 or 6 week programme.
The tuition fee is EUR 1000 and includes:
• Airport pick-up service
• Welcome goodie bag
• Orientation programme
• Course excursions
• On-site support
• Emergency assistance
• Transcript of records after completion of the course
An early bird discount of €150 is available for students who apply and pay before 15 March. Students from VU Amsterdam and from exchange partner universities will receive a €250 discount. To apply for this discount, simply indicate in the online application that you are currently a student at VU Amsterdam or at a partner university.
There are also discounts for students who attend multiple sessions: combine 2 courses to receive a €200 discount, or combine 3 courses to receive a €300 discount. All courses include excursions. We will also organize trips and excursions as part of our social programme, which is a great way to get to know your fellow students and learn more about Amsterdam and the Netherlands. The social programme is not included in the tuition fee.
Furnished accommodation is available. Various housing options will be offered.
The VU Amsterdam Summer School offers ten scholarships that cover the full tuition and housing fees of one course. Information about how to apply for the scholarship will be posted on the VU Amsterdam Summer School website.