Data Analysis in Python
Course Description

Please note that this course will be offered in a condensed format, during the first 6 weeks of the term.

This course will provide a comprehensive, fast-paced introduction to Scientific Python. The overarching goal is to equip students with enough programming experience to start working in any area of computation and data-intensive research. This course will lay a foundation from which new tools and techniques can be explored. 

IMPORTANT: In most courses, new material is covered during in-person classes, and you are asked to practice and do homework afterwards. This term, Scientific Python will follow the opposite scheme:

  • We will meet for the first class normally. 

  • Starting from the second week, you must work through the new material before the class by completing a Jupyter notebook that contains detailed explanations, videos, examples, and exercises. You can find an example notebook here (you need to install Jupyter to open the notebook). 

  • After a short recap of the new material, the in-person classes are devoted to practice exercises and answering questions. By the end of each in-person session, you must solve and submit a final problem that contributes to your final grade.

Past experience shows that students taking the course have varying levels of programming skills. The design of the course aims to allow students to process new material at their own pace and to ensure that everyone can participate in the hands-on exercises.


  1. Introduction, basic data types 
  2. Memory usage, regex, lists 
  3. Sorting, dictionaries, iterables 
  4. NumPy
  5. Matplotlib 
  6. Pandas I 
  7. Pandas II 
  8. JSON, using a simple web API 
  9. XML, HTML, web scraping
  10. Networks I 
  11. Networks II – community detection
  12. Stats and machine learning with scikit-learn 
Learning Outcomes

By the end of the course, students will have experience with techniques which are vital to effective scientific research, including:

  • The basic syntax and use of Python as a scientific tool, including writing and executing scripts to automate common tasks, using the IPython interpreter for interactive exploration of data and code, and using the Jupyter notebook to share and collaborate.
  • Loading data from a variety of common formats
  • Manipulating data efficiently with NumPy
  • Basic web scraping
  • Use of web APIs
  • Use of specialized Python packages, such as graph-tool
  • Performing basic data mining and machine learning analysis with SciPy and scikit-learn

The final grade will have two components:  

  • Class problems (50%): Small programming tasks, one of which you complete at the end of each in-person session.
  • Final project (50%): You will create a Jupyter notebook to analyze a data set of your choice using tools that you learned during the course. 

Extra credit can be earned by solving advanced problems. 


Students with no programming experience should not take the course. Students from outside DNDS must meet one of the following requirements:

  1. You have completed the Coding for Economists course or a similar one (when in doubt, contact the instructor), OR
  2. You have contacted the instructor ahead of time and shown a Python project or program that you have written.

Priority is given to DNDS students and then to students who are taking the course for a grade. Any remaining places are filled on a first-come, first-served basis.

Course Code
DNDS 6288