{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tutorial 1.2 - Spatial analysis with Python\n",
"\n",
"```{attention}\n",
"Finnish university students are encouraged to use the CSC Notebooks platform.
\n",
"\n",
"\n",
"Others can follow the lesson interactively using Binder. Check the rocket icon on the top of this page.\n",
"```\n",
"\n",
"In the first week, we will take a quick tour to Python's (spatial) data science ecosystem and see how we can use some of the fundamental open source Python packages, such as:\n",
"\n",
" - pandas / geopandas\n",
" - shapely\n",
" - pysal\n",
" - pyproj\n",
" - osmnx / pyrosm\n",
" - matplotlib (visualization)\n",
" \n",
"As you can see, we won't use any GIS software for doing the programming (such as ArcGIS/arcpy or QGIS), but focus on learning the open source packages that are independent from any specific software. These libraries form nowadays not only the core for modern spatial data science, but they are also fundamental parts of commercial applications used and developed by many companies around the world. \n",
"\n",
"```{note} \n",
"\n",
"If you have experience working with the Python's spatial data science stack, this tutorial probably does not bring much new to you, but to get everyone on the same page, we will all go through this introductory tutorial.\n",
"\n",
"```\n",
"\n",
"**Contents:**\n",
"\n",
" - Reading / writing spatial data\n",
" - Retrieving OpenStreetMap data\n",
" - Reprojections\n",
" - Spatial join\n",
" - Plotting data with matplotlib"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fundamental library: Geopandas\n",
"\n",
"In this course, the most often used Python package that you will learn is [geopandas](https://geopandas.org/). Geopandas makes it possible to work with geospatial data in Python in a relatively easy way. Geopandas combines the capabilities of the data analysis library [pandas](https://pandas.pydata.org/pandas-docs/stable/) with other packages like [shapely](https://shapely.readthedocs.io/en/stable/manual.html) and [fiona](https://fiona.readthedocs.io/en/latest/manual.html) for managing spatial data. The main data structures in geopandas are `GeoSeries` and `GeoDataFrame` which extend the capabilities of `Series` and `DataFrames` from pandas. In case you wish to have additional help getting started with pandas, we recommend you to take a look lessons 5 and 6 from the openly available [Geo-Python -course](geo-python.github.io). The main difference between GeoDataFrames and pandas DataFrames is that a [GeoDataFrame](http://geopandas.org/data_structures.html#geodataframe) should contain (at least) one column for geometries. By default, the name of this column is `'geometry'`. The geometry column is a [GeoSeries](http://geopandas.org/data_structures.html#geoseries) which contains the geometries (points, lines, polygons, multipolygons etc.) as shapely objects. \n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading and writing spatial data\n",
"\n",
"Next we will learn some of the basic functionalities of geopandas. We have a couple of GeoJSON files stored in the `data` folder that we will use.\n",
"\n",
"We can read the data easily with `read_file()` -function:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
| | addr:city | addr:country | addr:housenumber | addr:housename | addr:postcode | addr:street | email | name | opening_hours | operator | ... | start_date | wikipedia | id | timestamp | version | tags | osm_type | internet_access | changeset | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Helsinki | None | 29 | None | 00170 | Unioninkatu | None | None | None | None | ... | None | None | 4253124 | 1542041335 | 4 | None | way | None | NaN | POLYGON ((24.95121 60.16999, 24.95122 60.16988... |
| 1 | Helsinki | None | 2 | None | 00100 | Kaivokatu | ainfo@ateneum.fi | Ateneum | Tu, Fr 10:00-18:00; We-Th 10:00-20:00; Sa-Su 1... | None | ... | 1887 | fi:Ateneumin taidemuseo | 8033120 | 1544822447 | 27 | {'architect': 'Theodor Höijer', 'contact:websi... | way | None | NaN | POLYGON ((24.94477 60.16982, 24.94450 60.16981... |
| 2 | Helsinki | FI | 22-24 | None | None | Mannerheimintie | None | Lasipalatsi | None | None | ... | 1936 | fi:Lasipalatsi | 8035238 | 1533831167 | 23 | {'name:fi': 'Lasipalatsi', 'name:sv': 'Glaspal... | way | None | NaN | POLYGON ((24.93561 60.17045, 24.93555 60.17054... |
| 3 | Helsinki | None | 2 | None | 00100 | Mannerheiminaukio | None | Kiasma | Tu 10:00-17:00; We-Fr 10:00-20:30; Sa 10:00-18... | None | ... | 1998 | fi:Kiasma (rakennus) | 8042215 | 1553963033 | 30 | {'name:en': 'Museum of Modern Art Kiasma', 'na... | way | None | NaN | POLYGON ((24.93682 60.17152, 24.93662 60.17150... |
| 4 | None | FI | None | None | None | None | None | None | None | None | ... | None | None | 15243643 | 1546289715 | 7 | None | way | None | NaN | POLYGON ((24.93675 60.16779, 24.93660 60.16789... |

5 rows × 34 columns
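
The table above has one row per feature, with the attributes as columns and the shape in a `geometry` column. The sketch below illustrates how the features in a GeoJSON file map onto that layout, using only the Python standard library. It is not how geopandas is implemented: `read_file()` additionally converts each geometry into a shapely object and keeps track of the coordinate reference system. The two features, their names and coordinates, and the helper `features_to_rows` are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical two-feature layer; names and coordinates are invented.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Building A", "osm_type": "way"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [[24.0, 60.0], [24.1, 60.0], [24.1, 60.1], [24.0, 60.0]]
                ],
            },
        },
        {
            "type": "Feature",
            "properties": {"name": "Cafe B", "osm_type": "node"},
            "geometry": {"type": "Point", "coordinates": [24.05, 60.02]},
        },
    ],
}


def features_to_rows(collection):
    """Flatten GeoJSON features into row dicts that carry a 'geometry' key."""
    rows = []
    for feature in collection["features"]:
        row = dict(feature["properties"])      # attributes become columns
        row["geometry"] = feature["geometry"]  # the shape gets its own column
        rows.append(row)
    return rows


# Write the layer to a temporary file and read it back,
# as a stand-in for a GeoJSON file in the `data` folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "example.geojson"
    path.write_text(json.dumps(feature_collection), encoding="utf-8")
    loaded = json.loads(path.read_text(encoding="utf-8"))

rows = features_to_rows(loaded)
for row in rows:
    print(row["name"], row["osm_type"], row["geometry"]["type"])
```

Each dictionary in `rows` corresponds to one row of a table like the one above. Missing attributes, shown as `None` in the table, would simply be absent from a feature's `properties`.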

| | id | timestamp | version | changeset |
|---|---|---|---|---|
| count | 4.860000e+02 | 4.860000e+02 | 486.000000 | 66.0 |
| mean | 1.400780e+08 | 1.455829e+09 | 4.849794 | 0.0 |
| std | 1.633527e+08 | 9.247528e+07 | 4.561162 | 0.0 |
| min | 8.253000e+03 | 1.197929e+09 | 1.000000 | 0.0 |
| 25% | 2.294267e+07 | 1.374229e+09 | 2.000000 | 0.0 |
| 50% | 1.228699e+08 | 1.493288e+09 | 3.000000 | 0.0 |
| 75% | 1.359805e+08 | 1.530222e+09 | 7.000000 | 0.0 |
| max | 1.042029e+09 | 1.555840e+09 | 31.000000 | 0.0 |

| | addr:city | addr:country | addr:full | addr:housenumber | addr:housename | addr:postcode | addr:place | addr:street | email | name | ... | source | start_date | wikipedia | id | timestamp | version | geometry | tags | osm_type | changeset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Espoo | FI | None | 2 | None | 02150 | None | Konemiehentie | None | Aalto Tietotekniikka | ... | None | 1998 | None | 4217650 | 0 | -1 | POLYGON ((24.82129 60.18718, 24.82164 60.18712... | {"alt_name":"T-talo","loc_name":"Tikkitalo","n... | way | NaN |
| 1 | None | None | None | None | None | None | None | None | None | None | ... | None | None | None | 4217760 | 0 | -1 | POLYGON ((24.83776 60.18905, 24.83796 60.18938... | {"access":"private","parking":"multi-storey","... | way | NaN |
| 2 | None | None | None | None | None | None | None | None | None | None | ... | None | None | None | 4220761 | 0 | -1 | POLYGON ((24.85599 60.20719, 24.85590 60.20719... | None | way | NaN |
| 3 | Helsinki | FI | None | 5 | Uimastadion | 00250 | None | Hammarskjöldintie | None | Uimastadion | ... | None | None | None | 4252923 | 0 | -1 | POLYGON ((24.93076 60.18914, 24.93067 60.18898... | {"leisure":"stadium","name:en":"Swimming Stadi... | way | NaN |
| 4 | Espoo | None | None | 9 | None | 02150 | None | Otaniementie | None | Aalto-yliopisto Harald Herlin -oppimiskeskus | ... | Bing | None | None | 4252948 | 0 | -1 | POLYGON ((24.82740 60.18514, 24.82806 60.18480... | {"name:en":"Aalto University Harald Herlin Lea... | way | NaN |

5 rows × 40 columns

| | changeset_left | tags_left | lon | id_left | timestamp_left | version_left | lat | addr:city_left | addr:country_left | addr:housenumber_left | ... | shop | source_right | start_date_right | wikipedia_right | id_right | timestamp_right | version_right | tags_right | osm_type_right | changeset_right |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | {"contact:website":"http://www.pikkuranska.com... | 24.866842 | 25279508 | 0 | 0 | 60.208969 | None | None | None | ... | None | None | 1957 | None | 28175497 | 0 | -1 | {"architect":"Eliel Muoniovaara","source:archi... | way | NaN |
| 1 | 0.0 | {"toilets:wheelchair":"yes","was:name":"Antin ... | 24.883369 | 27392509 | 0 | 0 | 60.181183 | None | None | None | ... | None | None | None | None | 26405360 | 0 | -1 | None | way | NaN |
| 2 | 0.0 | {"cuisine":"nepalese","takeaway":"yes"} | 25.042477 | 50812719 | 0 | 0 | 60.206657 | Helsinki | FI | 4 | ... | None | None | None | None | 15505662 | 0 | -1 | None | way | NaN |
| 3 | 0.0 | {"wheelchair":"yes"} | 25.030569 | 50818866 | 0 | 0 | 60.195324 | Helsinki | None | 14 | ... | None | None | 1991 | None | 10637173 | 0 | -1 | {"building:maintenance:operator":"Lassila&Tika... | way | NaN |
| 4 | 0.0 | {"outdoor_seating":"yes","takeaway":"no","whee... | 25.041740 | 50820888 | 0 | 0 | 60.190361 | None | None | None | ... | None | None | None | None | 47855788 | 0 | -1 | {"fixme":"shape"} | way | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1385 | NaN | {"description":"Suomen paras pihvipaikka - ja ... | NaN | 583343945 | 0 | -1 | NaN | Espoo | None | 16 | ... | None | None | None | None | 583343945 | 0 | -1 | {"description":"Suomen paras pihvipaikka - ja ... | way | NaN |
| 1385 | NaN | {"description":"Suomen paras pihvipaikka - ja ... | NaN | 583343945 | 0 | -1 | NaN | Espoo | None | 16 | ... | None | None | None | None | 583343944 | 0 | -1 | None | way | NaN |
| 1386 | NaN | {"contact:website":"http://loylyhelsinki.fi/",... | NaN | 625576876 | 0 | -1 | NaN | Helsinki | None | 4 | ... | None | None | None | None | 466092407 | 0 | -1 | None | way | NaN |
| 1386 | NaN | {"contact:website":"http://loylyhelsinki.fi/",... | NaN | 625576876 | 0 | -1 | NaN | Helsinki | None | 4 | ... | None | None | None | None | 625576876 | 0 | -1 | {"contact:website":"http://loylyhelsinki.fi/",... | way | NaN |
| 1387 | NaN | None | NaN | 832179600 | 0 | -1 | NaN | Sipoo | None | 262 | ... | None | None | None | None | 832179600 | 0 | -1 | None | way | NaN |

1368 rows × 76 columns
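
Columns suffixed `_left` and `_right`, as in the table above, are what a geopandas spatial join produces when the two joined layers share column names. Conceptually, a spatial join pairs each row of one layer with every row of the other layer that satisfies a spatial relationship. The sketch below shows that idea in pure Python for the simplest case, points inside polygons, assuming an inner join. It is only a conceptual illustration: geopandas relies on shapely predicates and a spatial index rather than nested loops, and all ids and coordinates here are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given by `ring`?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def spatial_join(points, polygons):
    """Inner join: one output row per (point, containing polygon) pair."""
    joined = []
    for point in points:
        for polygon in polygons:
            if point_in_polygon(point["lon"], point["lat"], polygon["ring"]):
                joined.append({"id_left": point["id"], "id_right": polygon["id"]})
    return joined


# Hypothetical layers: three points and two overlapping squares.
points = [
    {"id": 1, "lon": 1.0, "lat": 1.0},  # inside both squares
    {"id": 2, "lon": 3.0, "lat": 3.0},  # inside the larger square only
    {"id": 3, "lon": 9.0, "lat": 9.0},  # outside both, so dropped by the join
]
polygons = [
    {"id": 10, "ring": [(0, 0), (2, 0), (2, 2), (0, 2)]},
    {"id": 20, "ring": [(0, 0), (4, 0), (4, 4), (0, 4)]},
]

result = spatial_join(points, polygons)
for row in result:
    print(row)
```

Point 1 appears twice in `result` because it falls inside two polygons. The same effect explains why index values such as 1385 and 1386 are repeated in the table above: one row on the left matched more than one row on the right.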