{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 4: Shapefile handling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Install the pyshp Python package!** \n", "\n", "If you have Anaconda installed, open the *Anaconda Prompt* and type in:\n", "```\n", "pip install pyshp\n", "```\n", "\n", "If you have a standalone Python 3 and Jupyter Notebook installation, open a command prompt / terminal and type in:\n", "```\n", "pip3 install pyshp\n", "```\n", "\n", "*If you already have the pyshp package installed, make sure its version is >= 2.0.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Opening a shapefile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Open a shapefile (.shp). \n", "The dBase attribute file (.dbf) is detected automatically by naming convention." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import shapefile\n", "\n", "# 'latin2' is Python's alias for the ISO-8859-2 encoding\n", "sf = shapefile.Reader('04_megye_region.shp', encoding='latin2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The attributes in the dBase (.dbf) file of this shapefile use the ISO-8859-2 (`latin2`) Central European character encoding. Since pyshp decodes attributes as UTF-8 by default, we have to override this setting with the `encoding` argument." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading a shapefile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check whether the file contains polygons:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Geometry type: %d\" % sf.shapeType)\n", "if sf.shapeType == shapefile.POLYGON:\n", "    print(\"This file contains polygons\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The library defines the following geometry types:\n", " - NULL = 0\n", " - POINT = 1\n", " - POLYLINE = 3\n", " - POLYGON = 5\n", " - MULTIPOINT = 8\n", " - POINTZ = 11\n", " - POLYLINEZ = 13\n", " - POLYGONZ = 15\n", " - MULTIPOINTZ = 18\n", " - POINTM = 21\n", " - POLYLINEM = 23\n", " - POLYGONM = 25\n", " - MULTIPOINTM = 28\n", " - MULTIPATCH = 31" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print the number of shapes (geometries) in the file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Number of counties: %d\" % len(sf.shapes()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print the available attributes, their types and their order:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Attributes: %s\" % sf.fields)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Ignore the \"DeletionFlag\" field for now.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read all shapes (geometries) and records (attribute table):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shapes = sf.shapes()\n", "records = sf.records()\n", "print(\"Number of geometries: %d\" % len(shapes))\n", "print(\"Number of records: %d\" % len(records))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterate through each shape-record pair 
and print each county's name and the number of points in its geometry:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(len(shapes)):\n", "    # Get the name of the county, which is the first attribute (index 0)\n", "    name = records[i][0]\n", "\n", "    # The shape is a closed polygon, so the first and the last points are the same\n", "    point_count = len(shapes[i].points) - 1\n", "\n", "    # Print the name of the county and the number of points in its polygon\n", "    print(\"{0}: {1} points\".format(name.title(), point_count))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An alternative way to read all shapes and records at the same time:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shape_records = sf.shapeRecords()\n", "print(\"First county: %s\" % shape_records[0].record[0])\n", "print(\"Number of points: %d\" % (len(shape_records[0].shape.points) - 1))\n", "print()\n", "\n", "for sr in shape_records:\n", "    name = sr.record[0]\n", "    point_count = len(sr.shape.points) - 1\n", "    print(\"{0}: {1} points\".format(name.title(), point_count))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Closing an open file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sf.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Naturally, you cannot read from a file after you have closed it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary exercise on shapefile reading" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task:** calculate the perimeter of each county." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What can you observe? Are all values approximately correct? 
\n", "*Hint: pay special attention to Pest county!*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Polygon parts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Pest county* is a polygon with a hole, and the points of both the external ring and the internal hole ring are stored in the `points` list.\n", "\n", "We can check how many parts a shape has through its `parts` list, which stores the index in `points` where each part starts. The external ring is always the first part, followed by the inner holes, if any." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 'latin2' is Python's alias for the ISO-8859-2 encoding\n", "sf = shapefile.Reader('04_megye_region.shp', encoding='latin2')\n", "\n", "shape_records = sf.shapeRecords()\n", "for sr in shape_records:\n", "    name = sr.record[0]\n", "    print(\"%s: %d parts (%s)\" % (name, len(sr.shape.parts), sr.shape.parts))\n", "\n", "sf.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can observe, *Pest county* has two parts, and the second part starts at index 1126 of the `points` list. So the external ring consists only of the points at indices 0 to 1125." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task:** fix the previous perimeter computation by only taking the external ring into consideration!" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the previously introduced *Matplotlib* library to draw the polygons as line plots:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import shapefile\n", "\n", "# Special Jupyter Notebook command, so the plots by matplotlib are displayed inside the Jupyter Notebook\n", "%matplotlib inline\n", "\n", "# 'latin2' is Python's alias for the ISO-8859-2 encoding\n", "sf = shapefile.Reader('04_megye_region.shp', encoding='latin2')\n", "\n", "# Start new plot figure\n", "plt.figure()\n", "# Iterate through all the shapes\n", "for shape in sf.shapes():\n", "    # Only consider the external ring: if multiple parts are defined,\n", "    # stop at the index where the second part starts (slice ends are exclusive)\n", "    end = len(shape.points) if len(shape.parts) == 1 else shape.parts[1]\n", "\n", "    # Get the X and Y positions into separate lists\n", "    xs = [coord[0] for coord in shape.points[:end]]\n", "    ys = [coord[1] for coord in shape.points[:end]]\n", "\n", "    # Add polygon to plot\n", "    plt.plot(xs, ys)\n", "\n", "# Display plot ...\n", "plt.show()\n", "# ... or save plot\n", "#plt.savefig('04_map.png')\n", "\n", "sf.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }