## Graduate Program (& Advanced Certificate) Status

Mandatory

Mandatory-Elective

The increasing volume and the nature of big network data sets in the social and natural sciences call for more sophisticated mathematical and statistical tools. In this course we present the fundamentals of probability theory and statistics and apply them to networks. The statistical validity of the observed results will be analyzed and, when possible, quantitatively evaluated. Besides the mathematical theory, the course takes a practical approach, with home assignments and hands-on classes. All examples and sample code used in class will be provided in Python and Jupyter notebooks.

**Learning activities and teaching methods**

Lectures: 12 classes of 100 min. Around 80% of the classes will be theory only. The other 20% will include programming exercises or evaluation of data sets. Therefore, use of a computer will be required during some lectures. Students can form groups and use their own laptops. Instructions on the required software will be provided during the first class.

By successfully completing the course, students will be able to:

- Apply the fundamentals of data analysis and statistical methods to the investigation of networks;
- Evaluate the statistical reliability of empirical estimations against an appropriate null hypothesis;
- Make use of large data sets to investigate networks observed in the social sciences;
- Perform empirical analyses and statistical validation of large data sets obtained from the Internet or from other business and scientific sources.
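As an illustration of what evaluating an empirical estimate against a null hypothesis can look like in practice, here is a minimal sketch in plain Python. It is not taken from the course material. The toy edge list, the choice of triangle count as the statistic, and the null model (random graphs with the same number of nodes and edges) are all assumptions made for this example; an appropriate null model depends on the question being asked.

```python
import itertools
import random


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as a list of unique edges."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    # Every triangle is found once per edge, i.e. three times in total.
    shared = sum(len(neighbors[u] & neighbors[v]) for u, v in edges)
    return shared // 3


def random_graph_same_size(n_nodes, n_edges, rng):
    """Null model (assumed here): n_edges distinct node pairs drawn uniformly at random."""
    all_pairs = list(itertools.combinations(range(n_nodes), 2))
    return rng.sample(all_pairs, n_edges)


def empirical_p_value(edges, n_nodes, n_samples=1000, seed=0):
    """Estimate how often the null model yields at least as many triangles as observed."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    at_least = sum(
        count_triangles(random_graph_same_size(n_nodes, len(edges), rng)) >= observed
        for _ in range(n_samples)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least + 1) / (n_samples + 1)


# Toy graph, made up for illustration: two triangles sharing node 2, plus one extra edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5)]
observed, p_value = empirical_p_value(toy_edges, n_nodes=6)
print(f"observed triangles = {observed}, empirical p-value = {p_value:.3f}")
```

A small p-value would suggest that the observed number of triangles is unlikely under this particular null model. With a graph this small the result is only illustrative.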

**What you will NOT learn in this course**:

Fundamentals of coding and data visualization. This course assumes that you already have basic proficiency with Python and can develop and apply your skills towards data analysis and statistical visualization. To learn the basics of coding (for-loops, lists, functions, reading and writing data from/to files, etc.), consider attending DNDS 6013 Scientific Python.

Similarly, basic concepts in network science are not covered in this course. To learn the basics of network science (degree distributions, clustering, centrality measures, small-world networks, scale-free networks, etc.), consider attending DNDS 6000 Fundamental Ideas in Network Science.

(1) **Assessment type 1 (50% of the final grade)**. Attendance in at least 80% of classes, active cooperation, and homework. Students will receive home assignments consisting of statistical analyses, simple problems, or data processing, which they must complete individually and submit electronically.

(2) **Assessment type 2 (50% of the final grade)**. The final test will be held during lecture 12 and will consist of questions related to the course that can be answered or solved by hand. The use of materials, including calculators or computers, is not permitted during the final test.

**Requirements for audit**:

Attendance in at least 80% of classes, active cooperation, and completing the home assignments.

**Prerequisites**:

- Proven proficiency with Python;
- Knowledge of fundamental network concepts;
- Basic skills in statistics and linear algebra.

Part of this course focuses on applying scientific programming with Python for research. We make no use of programs with a graphical user interface, such as spreadsheet applications. Since we need to pick one programming language for the course, we require students to prove proficiency with Python before the course starts, in one of the following ways:

- Having taken the course DNDS 6013 Scientific Python, for grade or audit.
- Having completed a MOOC on programming with Python and showing the certificate.
- Showing and discussing with the instructor a project you developed in Python. Projects written by someone else (web, friend, previous students) are not accepted.

Moreover, familiarity with the fundamentals of network science is required. Students must demonstrate knowledge of the basic concepts of network science by having taken in the past, or by taking this semester, the course DNDS 6000 Fundamental Ideas in Network Science, for grade or audit.

The instructor takes no responsibility if you do not satisfy the prerequisites and need to drop the course.

If you are already familiar with the fundamental ideas of network science, consider attending the more advanced course Structure and Dynamics of Complex Networks in the winter term. If you are already familiar with simple statistical tools and their application to networks, and in particular if you want to learn about network embedding, inference, and the use of Bayesian statistics in network science, the recommended statistics course is Statistical Methods in Network Science II, taught in the winter term.