{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "PBuSNMIUmJ8r" }, "source": [ "# Introduction to pyjeo" ] }, { "cell_type": "markdown", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "iwhEWdc8mJ80", "outputId": "ca7c9db2-f86f-4e73-f30d-9646681353c3" }, "source": [ "**Material provided by Dr. Pieter Kempeneers (European Commission, Joint Research Centre)**\n", "\n", "[pyjeo](https://pyjeo.readthedocs.io) is the follow up of [PKTOOLS](http://pktools.nongnu.org/html/index.html), a suite of utilities written in C++ for image processing with a focus on remote sensing applications. It is distributed under a General Public License (GPLv3) as a Python package.\n", "\n", "In a nutshell, the main differences between pyjeo and pktools from a user's perspective are:\n", "\n", "- pyjeo is a Python package should be run in a **Python environment**, whereas pktools applications are run from the command line (e.g., in a bash shell)\n", "- pyjeo runs with images entirely **in memory**, whereas pktools runs most applications line per line. This makes pyjeo considerably faster, but with a larger memory footprint. However, there are some methods implemented in pyjeo to reduce the memory footprint by tiling the image\n", "\n", "\n", "Run the following script to perform the installation of pyjeo\n", "\n", " sudo apt update\n", " sudo apt upgrade -y \n", " cd ~/Downloads\n", " wget https://raw.githubusercontent.com/selvaje/SE_data/main/exercise/install_pyjeo.sh\n", " sudo install_pyjeo.sh\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import Python modules" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "from datetime import datetime\n", "import pyjeo as pj" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Call the inline help function for [pj.geometry.warp](https://pyjeo.readthedocs.io/en/latest/3_reference.html#geometry.warp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(pj.geometry.warp)" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Jim: a geospatial raster dataset object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Open an image and show its properties" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wb2l2rt5mJ81", "outputId": "4262b7a0-0fd9-4a42-ec86-84457c885e03" }, "outputs": [], "source": [ "jim = pj.Jim('geodata/vegetation/ETmean08-11.tif')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.nrOfCol()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A Jim object is a geospatial dataset within a coordinate reference system" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.getProjection()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From [gdal](https://gdal.org/tutorials/geotransforms_tut.html#introduction-to-geotransforms): \"A geotransform is an affine transformation from the image coordinate space (row, column), also known as (pixel, line) to the georeferenced coordinate space (projected or geographic coordinates)\"\n", "\n", "Geotransform is an array with 6 items:\n", "\n", "[0] top left x\n", "\n", "[1] w-e pixel resolution\n", "\n", "[2] rotation, 0 if image is “north up”\n", "\n", "[3] top left y\n", "\n", "[4] rotation, 0 if image is “north up”\n", "\n", "[5] n-s pixel resolution" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.getGeoTransform()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have a single band and single plane raster object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.nrOfBand()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.nrOfPlane()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Open a second GeoTIFF image of the same dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gpp = pj.Jim('geodata/vegetation/GPPmean08-11.tif')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Stack this image to the existing bands of jim, creating a multi-band jim object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.geometry.stackBand(gpp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.nrOfBand()" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Exercise 1: stackBand function vs. method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new jim object `jim_stacked` using the [stackBand](https://pyjeo.readthedocs.io/en/latest/3_reference.html#geometry.stackBand) function instead of the [stackBand](https://pyjeo.readthedocs.io/en/latest/3_reference.html#geometry._Geometry.stackBand) method. It should contain two bands, one for `ETmean08-11.tif` and one for `GPPmean08-11.tif`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Labeled dimensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's label the dimensions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandnames = ['ET', 'GPP']\n", "planenames = [datetime.strptime('2019-08-11','%Y-%m-%d')]\n", "jim.properties.setDimension({'band' : bandnames, 'plane' : planenames})" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.getDimension('band')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.getDimension('plane')" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Exercise 2: cropBand using numbered and labeled index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Crop second band as a new image (index starts from 0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim1 = pj.geometry.cropBand(jim, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new jim object `gpp` that contains the second band using the [cropBand]() function and the labeled index ('GPP')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gpp = pj.geometry.cropBand(jim, 'GPP')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check if results are identical" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim1.properties.isEqual(gpp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim1.properties.getDimension()" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Working with multi-plane Jim objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First convert the bands to planes using [band2plane](https://pyjeo.readthedocs.io/en/latest/3_reference.html#geometry._Geometry.band2plane)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.geometry.band2plane()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The labels of band are copied to plane" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.getDimension()" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Calculate composites via reducePlane" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reduce the planes by calculating the mean of all planes" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "jim_mean = pj.geometry.reducePlane(jim, 'mean')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Where is the RuntimeWarning coming from?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.stats.getStats()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim[jim < 0] = 0\n", "jim.stats.getStats()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim_mean = pj.geometry.reducePlane(jim, 'mean')" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "### Calculate custom composites via callback function in reducePlane" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def getMax(reduced, plane):\n", " return pj.pixops.supremum(reduced, plane)\n", "jim_max = pj.geometry.reducePlane(jim, getMax)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Crop the plane `ET` from `jim` using the method [cropPlane](https://pyjeo.readthedocs.io/en/latest/3_reference.html#geometry._Geometry.cropPlane). Check the resulting `jim` is a single plane and single band Jim object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.geometry.cropPlane('ET')\n", "print(jim.properties.nrOfPlane())\n", "print(jim.properties.nrOfBand())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.geometry.plane2band()\n", "bandnames = ['ET']\n", "planenames = [datetime.strptime('2019-08-11','%Y-%m-%d')]\n", "jim.properties.setDimension({'band' : bandnames, 'plane' : planenames})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bridging to third party libraries: Numpy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "jim.np()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the type of jim.np()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(jim.np())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use [sum](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.sum.html#numpy-ndarray-sum) function of Numpy to assert there are no pixels where `jim_max < jim_mean`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(jim_max < jim_mean).np().sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get items\n", "\n", "With pyjeo we create the masks in memory in a \"pythonic\" way using [get items](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#get-jim-items) without the need to write temporary files." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get sub-dataset based on pixel coordinates (first 3 rows and columns, starting from 0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim[0:3,0:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Last 3 rows and columns, show geographic bounding box [ulx, uly, lrx, lry]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim[-3:,-3:].properties.getBBox()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim[0:3,0:3].np()" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "### Set items\n", "\n", "In pyjeo, we can apply the mask in a \"pythonic\" way using [set items](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#set-jim-items)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rOCybsZamJ83", "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774", "scrolled": false }, "outputs": [], "source": [ "jim[0:3,0:3] = 0\n", "jim[0:5,0:5].np()" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**Display image using matplotlib**\n", "\n", "We can show an image with matplotlib by providing a Numpy representation of the dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# plt.gray() # show the filtered result in grayscale\n", "fig = plt.figure(figsize=(10,10))\n", "ax1 = fig.add_subplot(111)\n", "ax1.imshow(jim.np())\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Calculate median filter via scipy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy import ndimage\n", "npimage = ndimage.median_filter(jim.np(), size = 5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# plt.gray() # show the filtered result in grayscale\n", "fig = plt.figure(figsize=(10,10))\n", "ax1 = fig.add_subplot(111)\n", "ax1.imshow(npimage)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Exercise 4: Create a geospatial Jim object with CRS from a Numpy array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**A numpy array is not a geospatial dataset with a spatial coordinate reference system !**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ajim = pj.np2jim(npimage)\n", "print(ajim.properties.getProjection())\n", "print(ajim.properties.getGeoTransform())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use [setProjection](https://pyjeo.readthedocs.io/en/latest/3_reference.html#properties._Properties.setProjection) and [getProjection](https://pyjeo.readthedocs.io/en/latest/3_reference.html#properties._Properties.getProjection) to set the Coordinate Reference System (CRS) of ajim" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Calculating the median_filter in place using the output parameter" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ndimage.median_filter(jim.np(), output = jim.np(), size = 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an alternative, we can set the Numpy representation of a Jim object in place" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.np()[:] = ndimage.median_filter(jim.np(), size = 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bridging to third party libraries: Xarray" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.xr()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**An Xarray dataset is defined with a spatial coordinate reference system !**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ajim = pj.xr2jim(jim.xr())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(ajim.properties.getProjection())\n", "print(ajim.properties.getGeoTransform())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At the basis we find a Numpy data array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(jim.xr().ET.data)" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "### Calculate median filter via Xarray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Processing can take several minutes..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# xr_median = jim.xr().ET.rolling(x = 3, y = 3, center=True).median()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# xr_median.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Notice that XArray member functions return the processed result and do not alter the input image !**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bridging JimVect to third party libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sample = pj.JimVect('geodata/shp/polygons.sqlite')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "pdf = pd.DataFrame(sample.dict())\n", "pdf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data extraction and regional statistics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "extracted = pj.geometry.extract(sample, jim, rule='mean', output='/vsimem/mean.json', oformat='GeoJSON')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pdf = pd.DataFrame(extracted.dict())\n", "pdf" ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "02_pktools.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }