{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "PBuSNMIUmJ8r", "tags": [] }, "source": [ "# Performing raster and vector operations in Python using pyjeo" ] }, { "cell_type": "markdown", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "iwhEWdc8mJ80", "outputId": "ca7c9db2-f86f-4e73-f30d-9646681353c3" }, "source": [ "**Material provided by Dr. Pieter Kempeneers**\n", "\n", "[pyjeo](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/) is the follow up of [PKTOOLS](http://pktools.nongnu.org/html/index.html), a suite of utilities written in C++ for image processing with a focus on remote sensing applications. It is distributed under a General Public License (GPLv3) as a Python package.\n", "\n", "The examples in this notebook replicate the exercises on [pktools](http://spatial-ecology.net/docs/build/html/PKTOOLS/pktools_osgeo.html) in order to appreciate the difference and still analogy for those that are familiar with pktools.\n", "\n", "In a nutshell, the main differences between pyjeo and pktools from a user's perspective are:\n", "\n", "- pyjeo is a Python package should be run in a **Python environment**, whereas pktools applications are run from the command line (e.g., in a bash shell)\n", "- pyjeo runs with images entirely **in memory**, whereas pktools runs most applications line per line. This makes pyjeo considerably faster, but with a larger memory footprint. However, there are some methods implemented in pyjeo to reduce the memory footprint by tiling the image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Creating masks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**pktools**\n", "We create three masks using different threshold values with [pkgetmask](http://pktools.nongnu.org/html/md_pkgetmask.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%bash\n", "pkgetmask -co COMPRESS=DEFLATE -co ZLEVEL=9 -min 1 -max 2 -data 1 -nodata 0 -ot Byte -i geodata/vegetation/ETmean08-11.tif -o geodata/vegetation/ETmean08-11_01_trhA.tif\n", "pkgetmask -co COMPRESS=DEFLATE -co ZLEVEL=9 -min 5 -max 8 -data 1 -nodata 0 -ot Byte -i geodata/vegetation/ETmean08-11.tif -o geodata/vegetation/ETmean08-11_01_trhB.tif\n", "pkgetmask -co COMPRESS=DEFLATE -co ZLEVEL=9 -min 0 -max 10 -data 0 -nodata 1 -ot Byte -i geodata/vegetation/ETmean08-11.tif -o geodata/vegetation/ETmean08-11_01_trhC.tif" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**pyjeo**\n", "\n", "With pyjeo we create the masks in memory in a \"pythonic\" way using [get items](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#get-jim-items) without the need to write temporary files." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pyjeo as pj" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wb2l2rt5mJ81", "outputId": "4262b7a0-0fd9-4a42-ec86-84457c885e03" }, "outputs": [], "source": [ "fn = Path('geodata/vegetation/ETmean08-11.tif')\n", "jim = pj.Jim(fn)\n", "\n", "#get mask\n", "mask1 = (jim>=1) & (jim<=2)\n", "mask2 = (jim>=5) & (jim<=8)\n", "mask3 = (jim<0) | (jim>10)" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "## Applying masks" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "**pktools** \n", "\n", "Use the prepared mask and apply to the image with [pksetmask](http://pktools.nongnu.org/html/md_pksetmask.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%bash\n", "pksetmask -co COMPRESS=DEFLATE -co ZLEVEL=9 \\\n", "-m geodata/vegetation/ETmean08-11_01_trhA.tif -msknodata 1 -nodata -9 \\\n", "-m geodata/vegetation/ETmean08-11_01_trhB.tif -msknodata 1 -nodata -5 \\\n", "-m geodata/vegetation/ETmean08-11_01_trhC.tif -msknodata 1 -nodata -10 \\\n", "-i geodata/vegetation/ETmean08-11.tif -o geodata/vegetation/ETmean08-11_01_msk.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**pyjeo**\n", "\n", "In pyjeo, we can apply the mask in a \"pythonic\" way using [set items](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#set-jim-items)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rOCybsZamJ83", "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774", "scrolled": true }, "outputs": [], "source": [ "jim[mask1] = -9\n", "jim[mask2] = -5\n", "jim[mask3] = -10" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "However, we can even skip the intermediate step of creating the mask:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rOCybsZamJ83", "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774", "scrolled": true }, "outputs": [], "source": [ "jim = pj.Jim(fn)\n", "\n", "jim[(jim<0) | (jim>10)] = -10\n", "jim[(jim>=5) & (jim<=8)] = -5\n", "jim[(jim>=1) & (jim<=2)] = -9" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "The result can then be written on disk if needed:" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "## Exercise: check if pktools and pyjeo results are equal" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim0 = pj.Jim('geodata/vegetation/ETmean08-11_01_msk.tif')\n", "jim0.properties.isEqual(jim)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.isEqual(pj.Jim('geodata/vegetation/ETmean08-11_01_msk.tif'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rOCybsZamJ83", "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774", "scrolled": true }, "outputs": [], "source": [ "jim.io.write('geodata/vegetation/ETmean08-11_01_msk_pyjeo.tif', co = ['COMPRESS=DEFLATE', 'ZLEVEL=9'])" ] }, { "cell_type": "markdown", "metadata": { "id": "xWjYbcGimJ84" }, "source": [ "## Composite images" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "**pktools** \n", "\n", "Create a mask to apply during the composite" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qMEVO1RjmJ84", "outputId": "99d0d72f-4de1-43a8-d9f6-58fc221a5d30" }, "outputs": [], "source": [ "%%bash\n", "pkgetmask -co COMPRESS=DEFLATE -co ZLEVEL=9 -min 0 -max 25 -data 0 -nodata 1 -ot Byte -i geodata/LST/LST_MOYDmax_month1.tif -o geodata/LST/LST_MOYDmax_month1-msk.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "c82Wn5bqmJ84" }, "source": [ "Calculate mean and standard deviation with several images with [pkcomposite](http://pktools.nongnu.org/html/md_pkcomposite.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "th7g2X8omJ85", "outputId": "58bd6a7d-b1dd-4456-ec73-846df8af7c76" }, "outputs": [], "source": [ "%%bash\n", "pkcomposite $(for file in geodata/LST/LST_MOYDmax_month??.tif geodata/LST/LST_MOYDmax_month?.tif; do echo -i $file; done) \\\n", "-m geodata/LST/LST_MOYDmax_month1-msk.tif -msknodata 0 -cr mean -dstnodata 0 \\\n", "-co COMPRESS=LZW -co ZLEVEL=9 -o geodata/LST/LST_MOYDmax_monthMean.tif\n", "\n", "pkcomposite $(for file in geodata/LST/LST_MOYDmax_month?.tif geodata/LST/LST_MOYDmax_month??.tif; do echo -i $file; done) \\\n", "-m geodata/LST/LST_MOYDmax_month1-msk.tif -msknodata 0 -cr stdev -dstnodata -1 \\\n", "-co COMPRESS=LZW -co ZLEVEL=9 -o geodata/LST/LST_MOYDmax_monthStdev.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "8Jlxls9pmJ85" }, "source": [ "An alternative way is to use [pkstatprofile](http://pktools.nongnu.org/html/pkstatprofile.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "foJoTUH0mJ85", "outputId": "40b097b4-c4a2-4ead-9a7d-c8726f825f76", "scrolled": true }, "outputs": [], "source": [ "%%bash\n", "# Create a multiband vrt\n", "gdalbuildvrt -overwrite -separate geodata/LST/LST_MOYDmax_month.vrt geodata/LST/LST_MOYDmax_month?.tif geodata/LST/LST_MOYDmax_month??.tif\n", "# Calculate mean and standard deviation\n", "pkstatprofile -co COMPRESS=LZW -nodata 0 -f mean -f stdev -i geodata/LST/LST_MOYDmax_month.vrt -o geodata/LST/LST_MOYDmax_month_mean_stdev.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**pyjeo**\n", "\n", "In pyjeo, we can can composite images using [reducePlane](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=reduceplane#geometry._Geometry.reducePlane)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "First we create a Jim object for all monthly GeoTIFF files containing LST data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# iterate over files in\n", "# that directory\n", "from datetime import datetime\n", "files1 = sorted(Path('geodata/LST/').glob('LST_MOYDmax_month?.tif'))\n", "files2 = sorted(Path('geodata/LST/').glob('LST_MOYDmax_month??.tif'))\n", "\n", "#create single band multi-plane image\n", "mask = pj.Jim()\n", "jim = pj.Jim()\n", "for file in files1 + files2:\n", " month = file.name.replace('LST_MOYDmax_month','').replace('.tif','')\n", " date = datetime.strptime('2019-' + month + '-01','%Y-%m-%d')\n", " jim.geometry.stackPlane(pj.Jim(file))\n", " jim.properties.setDimension(date, 'plane', append = True)\n", "jim.properties.setDimension(['LST'], 'band')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.properties.getDimension()" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "## Exercise \n", "use the Xarray representation of Jim to display details on the dimension and variables of the jim object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.xr()" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "We then create a mask based on the first month (plane is 0) and calculate the mean composite" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "print(\"crop\")\n", "mask = pj.geometry.cropPlane(jim, 0)\n", "print(\"mask\")\n", "mask = (mask >= 0) & (mask <= 25)\n", "print(\"set mask\")\n", "jim[mask] = 0\n", "print(\"reduce plane\")\n", "mean = pj.geometry.reducePlane(jim, rule = 'mean', nodata = 0)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "## Exercise \n", "The reducePlane operation takes a long time to process. The reason is the nodata value that complicates the calculation considerably. Create a new composite `mean_alldata` that does take into account all pixel values (without considering nodata values). Notice how the execution time is reduced..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mean_alldata = pj.geometry.reducePlane(jim, rule = 'mean')" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**Bridging pyjeo to Numpy**\n", "\n", "Pyjeo has been designed to allow for [bridging](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/2_tutorial.html#bridging-jim-to-third-party-packages) Jim raster image objects to third party libraries such as Numpy.\n", "\n", "We will re-use the mask Jim object to store the results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "mask.pixops.convert('GDT_Float32')\n", "jim.np()[jim.np()==0] = np.nan\n", "\n", "mask.np()[:] = np.nanmean(jim.np(), axis=0)\n", "mask.geometry.stackBand(mask)\n", "mask.np(1)[:] = np.nanstd(jim.np(), axis=0)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "We can avoid NaN in the resulting image by replacing it with 0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask.np()[:] = np.nan_to_num(np.nanmean(jim.np(), axis=0), nan=0)\n", "mask.np(1)[:] = np.nan_to_num(np.nanstd(jim.np(), axis=0), nan=0)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "Even better is to avoid duplication of data to reduce memory footprint" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask.np()[:] = np.nan_to_num(np.nanmean(jim.np(), axis=0), copy = False, nan=0)\n", "mask.geometry.stackBand(mask)\n", "mask.np(1)[:] = np.nan_to_num(np.nanstd(jim.np(), axis=0), copy = False, nan=0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "plt.gray() # show the filtered result in grayscale\n", "fig = plt.figure(figsize=(20,20))\n", "ax1 = fig.add_subplot(131) # left side\n", "ax2 = fig.add_subplot(132) # middle\n", "ax3 = fig.add_subplot(133) # right side\n", "ax1.imshow(jim.np()[0,:,:])\n", "ax2.imshow(mask.np()[:,:])\n", "ax3.imshow(mask.np(1)[:,:])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**Bridging pyjeo to Xarray**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.xr()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xrmean = jim.xr().where(jim.xr()!=0).mean(dim = 'time', skipna=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xrstd = jim.xr().where(jim.xr()!=0).std(dim = 'time', skipna=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim.xr().isel(time = 0).LST.plot(cmap='gray')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xrmean.LST.plot(cmap='gray')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "xrstd.LST.plot(cmap='gray')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Xarray supports many operations on time series. For instance, we can alculate the seasonal median of LST. \n", "Hint: check this website on [Digital Earth Africa](https://docs.digitalearthafrica.org/en/latest/sandbox/notebooks/Frequently_used_code/Working_with_time.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Xarray has implemented convenient wrappers for plotting. Plot the 12 months of LST via Xarray representation of Jim" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "jim.xr().LST.plot(col='time', col_wrap=6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise\n", "Xarray has implemented convenient wrappers for plotting. Plot the four seasons in four different colums (see same website on [Digital Earth Africa](https://docs.digitalearthafrica.org/en/latest/sandbox/notebooks/Frequently_used_code/Working_with_time.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "ds_seasonal.LST.plot(col='season', col_wrap=4)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds_seasonal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using tiling mechanism" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tiletotal = 4\n", "overlap = 0\n", "def reduceTile(tileindex):\n", " jim = pj.Jim()\n", " for file in files1 + files2:\n", " print(file)\n", " month = file.name.replace('LST_MOYDmax_month','').replace('.tif','')\n", " date = datetime.strptime('2019-' + month + '-01','%Y-%m-%d')\n", " jim.geometry.stackPlane(pj.Jim(file, tileindex = tileindex, tiletotal = tiletotal, overlap = overlap))\n", " jim.properties.setDimension(date, 'plane', append = True)\n", " jim.properties.setDimension(['LST'], 'band')\n", " print(\"crop\")\n", " mask = pj.geometry.cropPlane(jim, 0)\n", " print(\"mask\")\n", " mask = (mask >= 0) & (mask <= 25)\n", " print(\"set mask\")\n", " jim[mask] = 0\n", " return pj.geometry.reducePlane(jim, rule = 'mean', nodata = 0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.gray() # show the filtered result in grayscale\n", "fig = plt.figure(figsize=(20,20))\n", "ax = []\n", "ax.append(fig.add_subplot(221))\n", "ax.append(fig.add_subplot(222))\n", "ax.append(fig.add_subplot(223))\n", "ax.append(fig.add_subplot(224))\n", "\n", "for tileindex in range(4):\n", " ax[tileindex].imshow(reduceTile(tileindex).np())\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "Do0OMoGLmJ86" }, "source": [ "## Filter images" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "**pktools** \n", "\n", "Aggregating and filtering images using [pkfilter](http://pktools.nongnu.org/html/pkfilter.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RH6nU0xTmJ86", "outputId": "656143a2-ce5f-42d8-9cd9-0d5a41efe287" }, "outputs": [], "source": [ "%%bash\n", "# mean aggregation \n", "pkfilter -co COMPRESS=DEFLATE -co ZLEVEL=9 -nodata 0 -ot Float32 -dx 10 -dy 10 -f mean -d 10 -i geodata/LST/LST_MOYDmax_monthMean.tif -o geodata/LST/LST_MOYDmax_monthMean_aggreate10mean.tif\n", "# mean circular moving window\n", "pkfilter -co COMPRESS=DEFLATE -co ZLEVEL=9 -nodata 0 -ot Float32 -dx 11 -dy 11 -f mean -circ -i geodata/LST/LST_MOYDmax_monthMean.tif -o geodata/LST/LST_MOYDmax_monthMean_circular11mean.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**pyjeo**\n", "\n", "In pyjeo, we can can use [filter](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=filter2d#ngbops._NgbOps.filter2d) method, but more functions are available via bridging pyjeo to third party libraries such as [scipy]()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pyjeo as pj\n", "from pathlib import Path\n", "from scipy import ndimage\n", "import numpy as np\n", "fn = Path('geodata/LST/LST_MOYDmax_monthMean.tif')\n", "jim = pj.Jim(fn)\n", "taps = np.ones((10, 10))\n", "mean = pj.ngbops.firfilter2d(jim, taps=taps, norm=True, pad='symmetric')\n", "print(mean.properties.nrOfCol(), mean.properties.nrOfRow())\n", "mean = mean[::10,::10]\n", "print(mean.properties.nrOfCol(), mean.properties.nrOfRow())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.gray() # show the filtered result in grayscale\n", "fig = plt.figure(figsize=(20,20))\n", "ax1 = fig.add_subplot(121) # left side\n", "ax2 = fig.add_subplot(122) # right side\n", "ax1.imshow(jim.np())\n", "ax2.imshow(mean.np())\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def unit_circle(r):\n", " A = np.arange(-r,r+1)**2\n", " dists = np.sqrt(A[:,None] + A)\n", " return (dists<=r).astype(int)\n", "unit_circle(10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mean = pj.ngbops.firfilter2d(jim, taps=unit_circle(10), norm=True, pad='symmetric')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.gray() # show the filtered result in grayscale\n", "fig = plt.figure(figsize=(20,20))\n", "ax1 = fig.add_subplot(121) # left side\n", "ax2 = fig.add_subplot(122) # right side\n", "ax1.imshow(jim.np())\n", "ax2.imshow(mean.np())\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "More functions are available via bridging pyjeo to third party libraries such as Multidimensional image processing from [scipy](https://docs.scipy.org/doc/scipy/tutorial/ndimage.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim_filtered = pj.Jim(jim)\n", "jim_filtered.np()[:] = ndimage.gaussian_filter(jim.np(), sigma = 2)\n", "ndimage.gaussian_filter(jim.np(), sigma = 2, output=jim.np())\n", "assert jim.properties.isEqual(jim_filtered)" ] }, { "cell_type": "markdown", "metadata": { "id": "vvikECXdmJ87" }, "source": [ "## Images statistics" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "**pktools** \n", "\n", "Aggregating and filtering images using [pkstat](http://pktools.nongnu.org/html/pkstat.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PuVVTgQpmJ87", "outputId": "9997f3ee-750e-4191-9456-c58d8107b24c" }, "outputs": [], "source": [ "%%bash\n", "pkstat -hist -src_min 0 -i geodata/temperature/ug_bio_3.tif > geodata/temperature/ug_bio_3.hist\n", "head geodata/temperature/ug_bio_3.hist" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6UzuZRORmJ87", "outputId": "f1bda511-4de5-4ade-ffc9-ffcdaa20c0aa", "scrolled": true }, "outputs": [], "source": [ "%%bash\n", "pkstat -hist -nbin 20 -src_min 0 -i geodata/vegetation/GPPstdev08-11.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**pyjeo**\n", "\n", "In pyjeo, we can can use [getStats](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=getstats#stats._Stats.getStats)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fn = Path('geodata/temperature/ug_bio_3.tif')\n", "jim = pj.Jim(fn)\n", "stats = jim.stats.getStats('histogram', src_min = 0)\n", "print(stats['bin'][0:10])\n", "print(stats['histogram'][0:10])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fn = Path('geodata/vegetation/GPPstdev08-11.tif')\n", "jim = pj.Jim(fn)\n", "stats = jim.stats.getStats('histogram', src_min = 0, nbin = 20)\n", "for index, bin in enumerate(stats['bin']):\n", " print(bin, stats['histogram'][index])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig = plt.figure(figsize = (10, 5))\n", "plt.bar(stats['bin'],stats['histogram'])\n", "plt.xlabel(\"pixel value\")\n", "plt.ylabel(\"abs frequency\")\n", "plt.title(\"Histogram of pixel values\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "OYjlbLPnmJ87" }, "source": [ "## Images reclassification" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "**pktools** \n", "\n", "Aggregating and filtering images using [pkreclass](http://pktools.nongnu.org/html/pkreclass.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2oFcBEgHmJ88", "outputId": "25dad608-fc04-4585-a3a0-2b56f9e13f59" }, "outputs": [], "source": [ "%%bash\n", "pkstat -hist -i geodata/temperature/ug_bio_3.tif | grep -v \" 0\" | awk '{ if ($1<75) {print $1, 0} else {print $1 , 1}}' > geodata/temperature/reclass_ug_bio_3.txt\n", "pkreclass -co COMPRESS=DEFLATE -co ZLEVEL=9 -code geodata/temperature/reclass_ug_bio_3.txt -i geodata/temperature/ug_bio_3.tif -o geodata/temperature/reclass_ug_bio_3.tif" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**pyjeo**\n", "\n", "In pyjeo, we can can use [reclass](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=reclass#classify._Classify.reclass)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fn = 'geodata/temperature/ug_bio_3.tif'\n", "jim = pj.Jim(fn)\n", "print(jim.properties.getDataType())\n", "print(jim.stats.getStats())\n", "stats = jim.stats.getStats('histogram')\n", "\n", "for index, bin in enumerate(stats['bin']):\n", " if stats['histogram'][index] > 0:\n", " print(bin, stats['histogram'][index])\n", "\n", "if -9999 in jim.np():\n", " print(\"value -9999 is found\")\n", "classes0 = [c for c in stats['bin'] if stats['histogram'][stats['bin'].index(c)] > 0 and c < 75]\n", "classes1 = [c for c in stats['bin'] if stats['histogram'][stats['bin'].index(c)] > 0 and c >= 75]\n", "reclasses0 = np.zeros_like(classes0).tolist()\n", "reclasses1 = np.ones_like(classes1).tolist()\n", "print(classes0 + classes1)\n", "print(reclasses0 + reclasses1)\n", "\n", "reclass = pj.classify.reclass(jim, classes = classes0 + classes1, reclasses = reclasses0 + reclasses1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, we can do it much simpler:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jim[jim < 75] = 0\n", "jim[jim >= 75] = 1\n", "print(reclass.properties.isEqual(jim))" ] }, { "cell_type": "markdown", "metadata": { "id": "UelAeY4JmJ88" }, "source": [ "## Zonal statistic (polygon extraction)" ] }, { "cell_type": "markdown", "metadata": { "id": "Al7egxCLmJ81" }, "source": [ "**pktools** \n", "\n", "Aggregating and filtering images using [extractogr](http://pktools.nongnu.org/html/pkextractogr.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zGYUSaTnmJ88", "outputId": "f651593c-e991-4d5f-d2aa-4304dee10f0c" }, "outputs": [], "source": [ "%%bash\n", "rm -f geodata/shp/polygons_stat.*\n", "pkextractogr -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/polygons.sqlite -o geodata/shp/polygons_stat.sqlite\n", "pkextractogr -f \"ESRI Shapefile\" -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/polygons.sqlite -o geodata/shp/polygons_stat.shp\n", "\n", "# we can also create a csv that can be manipulate later on with awk\n", "rm -f geodata/shp/polygons_stat.csv\n", "pkextractogr -f CSV -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/polygons.sqlite -o geodata/shp/polygons_stat.csv" ] }, { "cell_type": "markdown", "metadata": { "id": "xo_AY1bamJ89" }, "source": [ "**Zonal statistic (point extraction)**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import functools\n", "import time\n", "\n", "def timer(func):\n", " @functools.wraps(func)\n", " def wrapper_timer(*args, **kwargs):\n", " tic = time.perf_counter()\n", " value = func(*args, **kwargs)\n", " toc = time.perf_counter()\n", " elapsed_time = toc - tic\n", " print(f\"Elapsed time: {elapsed_time:0.4f} seconds\")\n", " return value\n", " return wrapper_timer" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dQQp0FR9mJ89", "outputId": "72cb35dd-25f6-473e-c446-fd5bf6d34bb5", "scrolled": true }, "outputs": [], "source": [ "%%bash \n", "# at point location\n", "rm -f geodata/shp/point_stat.csv\n", "pkextractogr -f CSV -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/presence.shp -o geodata/shp/point_stat.csv\n", "# at point location + 1 pixel around \n", "rm -f geodata/shp/point-buf_stat.csv\n", "pkextractogr -f CSV -buf 2 -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/presence.shp -o geodata/shp/point-buf_stat.csv" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "**pyjeo**\n", "\n", "In pyjeo, we can can use [extract](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#geometry._GeometryVect.extract)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "output in SQLite format" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fn = 'geodata/vegetation/GPPmean08-11.tif'\n", "vfn = 'geodata/shp/polygons.sqlite'\n", "jim = pj.Jim(fn)\n", "jim[jim<0]=-1\n", "print(jim.stats.getStats())\n", "v = pj.JimVect(vfn)\n", "output = 'geodata/shp/temp.sqlite'\n", "Path(output).unlink(missing_ok = True)\n", "extracted1 = pj.geometry.extract(v, jim, output=output, rule=['mean', 'stdev', 'min'], srcnodata = -1)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "output in ESRI Shapefile format" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output = 'geodata/shp/temp.shp'\n", "Path(output).unlink(missing_ok = True)\n", "extracted2 = pj.geometry.extract(v, jim, rule=['mean', 'stdev', 'min'], output='geodata/shp/temp.shp', oformat='ESRI Shapefile', srcnodata = -1)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "calculate in memory and get result in dictionary" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "extracted3 = pj.geometry.extract(v, jim, rule=['allpoints'], output='temp1', oformat='Memory', srcnodata = -1)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "In pandas format" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "pd.DataFrame(extracted3.dict())" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "Extract point data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vfn = 'geodata/shp/presence.shp'\n", "v = pj.JimVect(vfn)\n", "extracted4 = pj.geometry.extract(v, jim, rule=['mean'], output='point', oformat='Memory', srcnodata = -1)" ] }, { "cell_type": "markdown", "metadata": { "id": "UC8VPm5CmJ83" }, "source": [ "Extract points with buffer to calculate mean and standard deviation and minimum" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "buffer = jim.properties.getDeltaX()*1\n", "extracted5 = pj.geometry.extract(v, jim, rule=['mean', 'stdev', 'min'], output='point_buf', oformat='Memory', srcnodata = -1, buffer = buffer)\n", "pd.DataFrame(extracted5.dict())" ] }, { "cell_type": "markdown", "metadata": { "id": "gYgDnke_mJ89" }, "source": [ "**Remove all the output**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "mWC8G93emJ8-" }, "outputs": [], "source": [ "%%bash\n", "rm -f geodata/vegetation/GPPcv08-11.tif geodata/LST/*_crop.tif geodata/vegetation/ETmean08-11_crop_trh.tif geodata/vegetation/ETmean08-11_crop_trh.txt vegetation/ETmean08-11_crop.txt geodata/vegetation/ETmosaic.vrt geodata/vegetation/ETmosaic.tif geodata/vegetation/stack_??.tifgeodata/vegetation/stack.vrtgeodata/vegetation/tiles.*geodata/vegetation/ETmean08-11_crop_proximity.tifgeodata/vegetation/ETmean08-11_crop_buffer.tif geodata/dem/slope.tifgeodata/dem/aspect.tif geodata/dem/tri.tifgeodata/dem/tpi.tifgeodata/dem/roughness.tifgeodata/vegetation/ETmean08-11_01_trh?.tifgeodata/LST/LST_MOYDmax_month1-msk.tifgeodata/LST/LST_MOYDmax_monthStdev.tifgeodata/LST/LST_MOYDmax_monthMean.tifgeodata/LST/LST_MOYDmax_month_mean_stdev.tifgeodata/LST/LST_MOYDmax_month.vrtgeodata/LST/LST_MOYDmax_monthMean_aggreate10mean.tifgeodata/LST/LST_MOYDmax_monthMean_circular11mean.tif geodata/temperature/reclass_ug_bio_3.tifgeodata/temperature/reclass_ug_bio_3.txtgeodata/shp/polygons_stat.csvgeodata/shp/point-buf_stat.csvgeodata/shp/point_stat.csvgeodata/shp/polygons_stat.*geodata/shp/TM_LARGE_BORDERS.shp.* geodata/shp/TM_UGANDA_BORDERS-0.3.* geodata/vegetation/ETmean08-11_crop.txt geodata/vegetation/ETmean08-11_01_msk_pyjeo.tif geodata/shp/temp.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "02_pktools.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" } }, "nbformat": 4, "nbformat_minor": 4 }