Brief introduction to the course
Network science is fundamentally a computational science. This course will provide algorithms and tools to generate network data and analyze real-world networks. The course will build on concepts of the courses “Fundamental ideas in network science” and “Social networks”. Many of the concepts will be revisited and put in a computational and numerical context, and issues like complexity or scalability of algorithms will be addressed. You are not required to have strong programming skills coming into the course: you will be expected to build these skills during the course. However, you do need to be familiar with coding and solve simple computational tasks. Students are free to work in any computer language/network software they feel most comfortable. However, during the class all examples and sample code will be provided in Python and Jupyter notebooks, and use of Python is strongly encouraged.
The course will have a practical approach, with many hands-on classes and with the development of a final project.
The goals of the course
The overarching goal is to learn how to properly generate and handle network data, aimed at students with a theoretical background in network science and a little programming experience. This course will strengthen the computational foundation from which students can perform research in network science.
Lectures: theory classes and hands-on sessions. Use of a computer will be required during most of the lectures. Students can use their own laptops or the facilities provided by CEU. Instructions for installation of Python and the required packages will be provided during the first class.
- Data structures for networks (adjacency, edge list, compressed row format, mention of graph databases). Parsing methods starting from different data formats. Node and link attributes. Computational techniques to describe static, directed, weighted, and temporal networks
- Linear and logarithmic binning, normalization and fitting. Special focus on power-law data, with coverage of the maximum likelihood approach and goodness of fit. Hands-on session with application of the concepts to data
- Shortest paths (Dijkstra's algorithm) and calculation of connected components (breadth first search). Scalability of the algorithms. Algorithms for closeness and betweenness centrality.
- Algorithms to compute eigenvector and PageRank centrality. Hands-on session on data.
- Generation of Erdős-Rényi networks, study of their components in the different regimes. Numerical challenges in reproducing the phase transition diagram. Implementation of Barabási-Albert model and of the Watts and Strogatz model. Numerical caveats.
- The importance of a null model: generate the proper null models for your data. Study of: citation networks, temporal networks, spatial networks.
- Community detection: Modularity maximization methods (greedy algorithm and Louvain method). Random walks based methods (Infomap). Generation of benchmarks (Girvan-Newman and Lancichinetti-Fortunato-Radicchi), measures to compare partitions and their implementation (Rand Index, Jaccard Index, Normalized Mutual Information, Variation of Information). Hands-on session on data.
- Random Walks and their simulation
- Temporal Networks
- Networks: An Introduction. M.E.J. Newman. Oxford University Press.
- Mining of Massive Datasets. Leskovec, Rajaraman, Ullman, Cambridge University Press
- Introduction to Algorithms -3rd edition. Cormen et al. MIT Press.
- Google Python course (https://developers.google.com/edu/python/)
- A list of papers and online resources will be provided during classes
Further information, such as the course website, assessment deadlines, office hours, contact details etc. will be given during the course.The instructor reserves the right to modify this syllabus as deemed necessary any time during the term
. Any modifications to the syllabus will be discussed with students during a class period. Students are responsible for information given in class.