Network data have an inherently high-dimensional and relational character that is not well served by the textbook theory and methods of statistical data analysis. As there is no natural way to “look” at a large network and extract its most salient structural features, this must be achieved by developing principled generative models of network structure and deriving robust methods of statistical inference to extract their parameters from data. Furthermore, network data are often noisy, incomplete, or accessible only indirectly via functional behavior. In such situations, the network structure needs to be reconstructed prior to analysis, which also requires an inferential framework.

The objective of this course is to introduce students to the state of the art in the still-evolving research field of statistical inference of networks, covering large-scale modular identification with stochastic block models, separation of structure from noise and prevention of overfitting, network measurement error and uncertainty quantification, and network reconstruction from noisy, incomplete and indirect information such as time series. Besides fundamental theory and methodology, the course will cover phenomenology such as phase transitions in the detectability of structures in networks, and explore fundamental limitations of the inference problem.

In addition to the underlying theory, the course will also cover existing software implementations of some of the methods, mostly using Python.

*Learning activities and teaching methods*

Lectures: 12 classes of 100 min. Around 80% of the classes will be theory only. The other 20% will include programming exercises or evaluation of data sets. Therefore, use of a computer will be required during some lectures. Students can form groups and use their own laptops. Instructions on the required software will be provided in class.

By successfully completing the course, students will be able to:

Understand the fundamental problems and challenges in statistical network analysis.

Navigate through the modern field of network inference.

Be aware of common pitfalls in network analysis, such as overfitting, and learn how to overcome them.

Move beyond statistical tests and rejection of null models, towards Bayesian modeling and inference of the generative mechanisms of networks.

Incorporate network measurement error into analyses and reconstruct networks from indirect data.

Use the theory and methods learned to perform their own research in network science.

What you will NOT learn in this course:

Fundamentals of statistics and probability applied to network science are covered in DNDS 6291 Statistical Methods in Network Science, which is a prerequisite for this course.

Basic concepts in network science are not covered in this course. To acquire this background, consider attending DNDS 6000 Fundamental Ideas in Network Science.

(1) Assessment type 1 (50% of the final grade). Attendance in at least 80% of classes, active involvement and homework: Students will receive assignments consisting of statistical analyses, simple problems and data processing, which they will have to complete individually and submit electronically.

(2) Assessment type 2 (50% of the final grade). Final project consisting of a written report and oral presentation, where the student performs a guided partial replication of a research paper, or a data-driven or theoretical analysis of some of the methods covered in the course. Students will be given a list of potential projects, but will also be encouraged to suggest their own topics.

Requirements for audit:

Attendance in at least 80% of classes, active cooperation, and completing the home assignments.

Students should have taken DNDS 6291 Statistical Methods in Network Science;

Knowledge of fundamental network concepts;

Basic skills in statistics and probability.

Students are required to have taken DNDS 6291 Statistical Methods in Network Science, and are expected to have basic knowledge of statistics and probability.

Moreover, familiarity with the fundamentals of network science is required. Students must demonstrate knowledge of the basic concepts of network science by having previously taken, for grade or audit, the course DNDS 6000 Fundamental Ideas in Network Science.

For this course, basic proficiency with Python, and the ability to use it for data analysis and visualization, is also recommended. To acquire this background, consider attending DNDS 6288 or 6013 Scientific Python.

The instructor bears no responsibility if you do not satisfy the prerequisites and need to drop the course.