LAB 10 - NETWORKS AND EIGENVECTOR CENTRALITY
Last week we learned how to use iterative techniques to find the dominant eigenvector for a
system, and used this to analyze a Markov process. This week we are going to further explore
these techniques and see how they can be used to analyze networks.
Simply put, a network consists of a collection of nodes and edges, where each edge can be
thought of as a connection which joins two nodes. We often represent networks graphically, by
drawing a dot for each node, and a line for each edge as shown below in Figure 1. Networks
are sometimes referred to as graphs in mathematics, and the nodes are sometimes referred to
Networks are incredibly useful in real world applications, as they allow us to represent relationships between objects in a wide variety of different systems. For example, a data scientist
might use networks to represent connections on social media, by displaying each Facebook
user as a node of the network, with an edge joining two nodes when the corresponding users
are Facebook friends (see Figure 2). Urban planners may use networks to design important
infrastructure, by representing water pump station as nodes, and water mains as edges. Law
enforcement officials may use networks to understand criminal organizations, with gang members represented by nodes, and known affiliations between members represented by edges.
Finally, we could use networks to represent Hollywood actors and actresses, connecting two
actors/actresses if they’ve starred in a movie together. This would allow us to answer definitively whether all of Hollywood really is 6 degrees from Kevin Bacon.
By introducing a slight variation on the idea of a network as defined above, we can capture
other types of relationships. For example, we could add a direction to each of our edges, and
think of them as providing a connection from one node, to another node. We indicate the
direction of each edge by adding an arrow, and call the resulting network a directed network
(see Figure 5).
Directed networks allow us to model relationships that are not symmetric. For example, the
food chain could be represented as a directed network, with each node representing a different
species, and a directed edge added which travels from each predator to the prey it is known
to eat. In this network there would be an edge from the node representing lions to the node
representing gazelles, because lions are known to eat gazelles, but not one going the other way.
Perhaps one of the most ubiquitous directed networks in our everyday world is the internet,
or the world wide web. Understanding this network is crucial to a number of tasks that we do
everyday, often without thinking twice about them. It is this network that we will focus on in
There are several different ways in which the internet can be modeled as a network, each
of which captures a different aspect of the web. For example, we could model the physical
infrastructure of the internet (wires, hubs, and internet service providers) as a network, to
model the way in which data is sent from one computer to another over the internet. Another
way to represent the internet is by assigning a node to each of the estimated 1.8 billion webpages
on the internet, and then adding a directed edge from webpage A to webpage B when there is
a link on webpage A which redirects the user to webpage B. It is this second viewpoint which
we will be using in this lab.
To understand the importance of being able to study this network, consider the problem of
designing an internet search engine, like Google, Yahoo, Bing, or AltaVista. A user comes to
LAB 10 - NETWORKS AND EIGENVECTOR CENTRALITY 3
the search engine website and types in a search term, say “Spider-Puppy”, hoping to find a
website with an image like the one below: