{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 6: Graph construction and basic management in Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NetworkX** is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It is usually imported with the *nx* abbreviation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import networkx as nx" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NetworkX supports 4 type of graphs:\n", "* undirected, simple graphs: `Graph`\n", "* directed simple graphs: `DiGraph`\n", "* undirected graph with parallel edges: `MultiGraph`\n", "* directed graph with parallel edges: `MultiDiGraph`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creation of new, empty graph is straightforward:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "graph = nx.Graph() # undirected, simple graph" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building a graph from scratch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can add nodes and edges to a graph:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "graph.add_node(1)\n", "graph.add_node(2)\n", "graph.add_node(3)\n", "graph.add_node(4)\n", "graph.add_node(5)\n", "graph.add_node(6)\n", "graph.add_node(7)\n", "graph.add_node(8)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "graph.add_edge(1, 2)\n", "graph.add_edge(1, 3)\n", "graph.add_edge(1, 4)\n", "graph.add_edge(2, 3)\n", "graph.add_edge(2, 5)\n", "graph.add_edge(2, 6)\n", "graph.add_edge(3, 6)\n", "graph.add_edge(4, 5)\n", "graph.add_edge(4, 7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Adding an edge to a non-existing node will also create that particular node:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "graph.add_edge(1, 9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graph visualization with Matplotlib" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "\n", "# Special Jupyter Notebook command, so the plots by matplotlib will be display inside the Jupyter Notebook\n", "%matplotlib inline\n", "\n", "nx.draw_networkx(graph)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building a graph from a CSV file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Process a CSV file line-by-line with the **csv** Python package:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import csv\n", "\n", "flight_graph = nx.Graph()\n", "\n", "csv_file = open('06_flights.csv')\n", "csv_reader = csv.reader(csv_file, delimiter=';')\n", "next(csv_reader, None) # skip header line\n", "for row in csv_reader:\n", " print('Reading flight %s <=> %s' % (row[0], row[1]))\n", " flight_graph.add_edge(row[0], row[1])\n", "csv_file.close()\n", "\n", "nx.draw_networkx(flight_graph, node_color = 'lightgreen')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Closing an opened file is easy to forget and a common programmer mistake. Use the `with` statement, which will automatically close the file (if it was successfully opened):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flight_graph = nx.Graph()\n", "\n", "with open('06_flights.csv') as csv_file:\n", " csv_reader = csv.reader(csv_file, delimiter=';')\n", " next(csv_reader, None) # skip header line\n", " for row in csv_reader:\n", " #print('Reading flight %s <=> %s' % (row[0], row[1]))\n", " flight_graph.add_edge(row[0], row[1])\n", "\n", "nx.draw_networkx(flight_graph, node_color = 'lightgreen')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building a graph from a *pandas* DataFrame" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Parse the CSV file into a *pandas* DataFrame:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "flight_table = pd.read_csv('06_flights.csv', delimiter = ';')\n", "display(flight_table)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*NetworkX* has a **from** and **to** conversion for *pandas* DataFrames:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flight_graph = nx.from_pandas_edgelist(flight_table, 'From city', 'To city')\n", "nx.draw_networkx(flight_graph, node_color = 'lightgreen')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can define the type of the graph with the optional `create_using` parameter. Its default value is `Graph`.\n", "```python\n", "nx.from_pandas_edgelist(flight_table, 'From city', 'To city', create_using = nx.DiGraph)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analyzing the graph" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Quering the size and degree information" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Number of nodes: %d' % flight_graph.order())\n", "print('Number of edges: %d' % flight_graph.size())\n", "print('Degrees of the nodes: %s' % flight_graph.degree())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For *directed graphs*, there is also `in_degree` and `out_degree` defined." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Iterate through the nodes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for node in flight_graph.nodes:\n", " print(node)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Note: iterating through the graph itself (`flight_graph`) is the same.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Iterate through the edges" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for from_node, to_node in flight_graph.edges:\n", " print(\"%s <=> %s\" % (from_node, to_node))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query the neighbors of a node" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for neighbor in flight_graph.neighbors('Budapest'):\n", " print(neighbor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pay attention that it is written as `neighbors` (American English) and *NOT* `neighbours` (British English)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query the edge metadata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The metadata is now empty, but e.g. the *weight* of an edge will be a metadata later." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Metadata for the Budapest <=> Paris edge: %s' % flight_graph['Budapest']['Paris'])\n", "print('Metadata for all edges from Budapest: %s' % flight_graph['Budapest'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Check node and edge existence" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if flight_graph.has_node('Budapest'):\n", " print('The Budapest node exists.')\n", "if flight_graph.has_edge('Budapest', 'Paris'):\n", " print('The Budapest <=> Paris edge exists.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further readings\n", "\n", "* Check out the official [NetworkX tutorial](https://networkx.github.io/documentation/stable/tutorial.html).\n", "* Browse the official [NetworkX reference](https://networkx.github.io/documentation/stable/reference/index.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary exercise on graphs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task:** request a starting city from the user and a number of maximum transfers. Calculate which cities can be reached! \n", "Handle the case of a not existing starting city." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task:** solve it by using the [Traversal algorithms](https://networkx.github.io/documentation/stable/reference/algorithms/traversal.html) of *NetworkX*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }