{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PBuSNMIUmJ8r",
    "tags": []
   },
   "source": [
    "# Performing raster and vector operations in Python using pyjeo"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "iwhEWdc8mJ80",
    "outputId": "ca7c9db2-f86f-4e73-f30d-9646681353c3"
   },
   "source": [
    "**Material provided by Dr. Pieter Kempeneers**\n",
    "\n",
    "[pyjeo](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/) is the follow up of [PKTOOLS](http://pktools.nongnu.org/html/index.html), a suite of utilities written in C++ for image processing with a focus on remote sensing applications. It is distributed under a General Public License (GPLv3) as a Python package.\n",
    "\n",
    "The examples in this notebook replicate the exercises on [pktools](http://spatial-ecology.net/docs/build/html/PKTOOLS/pktools_osgeo.html) in order to appreciate the difference and still analogy for those that are familiar with pktools.\n",
    "\n",
    "In a nutshell, the main differences between pyjeo and pktools from a user's perspective are:\n",
    "\n",
    "- pyjeo is a Python package should be run in a **Python environment**, whereas pktools applications are run from the command line (e.g., in a bash shell)\n",
    "- pyjeo runs with images entirely **in memory**, whereas pktools runs most applications line per line. This makes pyjeo considerably faster, but with a larger memory footprint. However, there are some methods implemented in pyjeo to reduce the memory footprint by tiling the image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "## Creating masks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**pktools**\n",
    "We create three masks using different threshold values with [pkgetmask](http://pktools.nongnu.org/html/md_pkgetmask.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "pkgetmask  -co COMPRESS=DEFLATE -co ZLEVEL=9 -min  1  -max  2 -data 1 -nodata 0 -ot Byte  -i  geodata/vegetation/ETmean08-11.tif  -o geodata/vegetation/ETmean08-11_01_trhA.tif\n",
    "pkgetmask  -co COMPRESS=DEFLATE -co ZLEVEL=9 -min  5  -max  8 -data 1 -nodata 0 -ot Byte  -i  geodata/vegetation/ETmean08-11.tif  -o geodata/vegetation/ETmean08-11_01_trhB.tif\n",
    "pkgetmask  -co COMPRESS=DEFLATE -co ZLEVEL=9 -min  0  -max  10 -data 0 -nodata 1 -ot Byte  -i  geodata/vegetation/ETmean08-11.tif  -o geodata/vegetation/ETmean08-11_01_trhC.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**pyjeo**\n",
    "\n",
    "With pyjeo we create the masks in memory in a \"pythonic\" way using [get items](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#get-jim-items) without the need to write temporary files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pyjeo as pj"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "wb2l2rt5mJ81",
    "outputId": "4262b7a0-0fd9-4a42-ec86-84457c885e03"
   },
   "outputs": [],
   "source": [
    "fn = Path('geodata/vegetation/ETmean08-11.tif')\n",
    "jim = pj.Jim(fn)\n",
    "\n",
    "#get mask\n",
    "mask1 = (jim>=1) & (jim<=2)\n",
    "mask2 = (jim>=5) & (jim<=8)\n",
    "mask3 = (jim<0) | (jim>10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "## Applying masks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "**pktools** \n",
    "\n",
    "Use the prepared mask and apply to the image with [pksetmask](http://pktools.nongnu.org/html/md_pksetmask.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "pksetmask -co COMPRESS=DEFLATE -co ZLEVEL=9 \\\n",
    "-m geodata/vegetation/ETmean08-11_01_trhA.tif  -msknodata 1 -nodata  -9 \\\n",
    "-m geodata/vegetation/ETmean08-11_01_trhB.tif  -msknodata 1 -nodata  -5 \\\n",
    "-m geodata/vegetation/ETmean08-11_01_trhC.tif  -msknodata 1 -nodata -10 \\\n",
    "-i geodata/vegetation/ETmean08-11.tif -o geodata/vegetation/ETmean08-11_01_msk.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**pyjeo**\n",
    "\n",
    "In pyjeo, we can apply the mask in a \"pythonic\" way using [set items](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#set-jim-items)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rOCybsZamJ83",
    "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "jim[mask1] = -9\n",
    "jim[mask2] = -5\n",
    "jim[mask3] = -10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "However, we can even skip the intermediate step of creating the mask:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rOCybsZamJ83",
    "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "jim = pj.Jim(fn)\n",
    "\n",
    "jim[(jim<0) | (jim>10)] = -10\n",
    "jim[(jim>=5) & (jim<=8)] = -5\n",
    "jim[(jim>=1) & (jim<=2)] = -9"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "The result can then be written on disk if needed:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "## Exercise: check if pktools and pyjeo results are equal"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim0 = pj.Jim('geodata/vegetation/ETmean08-11_01_msk.tif')\n",
    "jim0.properties.isEqual(jim)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim.properties.isEqual(pj.Jim('geodata/vegetation/ETmean08-11_01_msk.tif'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rOCybsZamJ83",
    "outputId": "970b90d4-5153-4b54-8940-dea47cfc8774",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "jim.io.write('geodata/vegetation/ETmean08-11_01_msk_pyjeo.tif', co = ['COMPRESS=DEFLATE', 'ZLEVEL=9'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "xWjYbcGimJ84"
   },
   "source": [
    "## Composite images"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "**pktools** \n",
    "\n",
    "Create a mask to apply during the composite"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "qMEVO1RjmJ84",
    "outputId": "99d0d72f-4de1-43a8-d9f6-58fc221a5d30"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pkgetmask  -co COMPRESS=DEFLATE -co ZLEVEL=9 -min 0 -max 25 -data 0 -nodata 1 -ot Byte -i geodata/LST/LST_MOYDmax_month1.tif -o geodata/LST/LST_MOYDmax_month1-msk.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "c82Wn5bqmJ84"
   },
   "source": [
    "Calculate mean and standard deviation with several images with [pkcomposite](http://pktools.nongnu.org/html/md_pkcomposite.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "th7g2X8omJ85",
    "outputId": "58bd6a7d-b1dd-4456-ec73-846df8af7c76"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pkcomposite $(for file in geodata/LST/LST_MOYDmax_month??.tif geodata/LST/LST_MOYDmax_month?.tif; do echo -i $file; done) \\\n",
    "-m geodata/LST/LST_MOYDmax_month1-msk.tif -msknodata 0 -cr mean   -dstnodata 0 \\\n",
    "-co  COMPRESS=LZW -co ZLEVEL=9 -o geodata/LST/LST_MOYDmax_monthMean.tif\n",
    "\n",
    "pkcomposite $(for file in geodata/LST/LST_MOYDmax_month?.tif geodata/LST/LST_MOYDmax_month??.tif; do echo -i $file; done) \\\n",
    "-m geodata/LST/LST_MOYDmax_month1-msk.tif -msknodata 0 -cr stdev   -dstnodata -1 \\\n",
    "-co  COMPRESS=LZW -co ZLEVEL=9 -o geodata/LST/LST_MOYDmax_monthStdev.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8Jlxls9pmJ85"
   },
   "source": [
    "An alternative way is to use [pkstatprofile](http://pktools.nongnu.org/html/pkstatprofile.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "foJoTUH0mJ85",
    "outputId": "40b097b4-c4a2-4ead-9a7d-c8726f825f76",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "# Create a multiband vrt\n",
    "gdalbuildvrt -overwrite -separate geodata/LST/LST_MOYDmax_month.vrt geodata/LST/LST_MOYDmax_month?.tif geodata/LST/LST_MOYDmax_month??.tif\n",
    "# Calculate mean and standard deviation\n",
    "pkstatprofile -co  COMPRESS=LZW -nodata 0 -f mean -f stdev  -i geodata/LST/LST_MOYDmax_month.vrt -o geodata/LST/LST_MOYDmax_month_mean_stdev.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**pyjeo**\n",
    "\n",
    "In pyjeo, we can can composite images using [reducePlane](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=reduceplane#geometry._Geometry.reducePlane)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "First we create a Jim object for all monthly GeoTIFF files containing LST data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# iterate over files in\n",
    "# that directory\n",
    "from datetime import datetime\n",
    "files1 = sorted(Path('geodata/LST/').glob('LST_MOYDmax_month?.tif'))\n",
    "files2 = sorted(Path('geodata/LST/').glob('LST_MOYDmax_month??.tif'))\n",
    "\n",
    "#create single band multi-plane image\n",
    "mask = pj.Jim()\n",
    "jim = pj.Jim()\n",
    "for file in files1 + files2:\n",
    "    month = file.name.replace('LST_MOYDmax_month','').replace('.tif','')\n",
    "    date = datetime.strptime('2019-' + month + '-01','%Y-%m-%d')\n",
    "    jim.geometry.stackPlane(pj.Jim(file))\n",
    "    jim.properties.setDimension(date, 'plane', append = True)\n",
    "jim.properties.setDimension(['LST'], 'band')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim.properties.getDimension()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "## Exercise  \n",
    "use the Xarray representation of Jim to display details on the dimension and variables of the jim object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim.xr()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "We then create a mask based on the first month (plane is 0) and calculate the mean composite"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "print(\"crop\")\n",
    "mask = pj.geometry.cropPlane(jim, 0)\n",
    "print(\"mask\")\n",
    "mask = (mask >= 0) & (mask <= 25)\n",
    "print(\"set mask\")\n",
    "jim[mask] = 0\n",
    "print(\"reduce plane\")\n",
    "mean = pj.geometry.reducePlane(jim, rule = 'mean', nodata = 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "## Exercise  \n",
    "The reducePlane operation takes a long time to process. The reason is the nodata value that complicates the calculation considerably. Create a new composite `mean_alldata` that does take into account all pixel values (without considering nodata values). Notice how the execution time is reduced..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mean_alldata = pj.geometry.reducePlane(jim, rule = 'mean')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**Bridging pyjeo to Numpy**\n",
    "\n",
    "Pyjeo has been designed to allow for [bridging](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/2_tutorial.html#bridging-jim-to-third-party-packages) Jim raster image objects to third party libraries such as Numpy.\n",
    "\n",
    "We will re-use the mask Jim object to store the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "mask.pixops.convert('GDT_Float32')\n",
    "jim.np()[jim.np()==0] = np.nan\n",
    "\n",
    "mask.np()[:] = np.nanmean(jim.np(), axis=0)\n",
    "mask.geometry.stackBand(mask)\n",
    "mask.np(1)[:] = np.nanstd(jim.np(), axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "We can avoid NaN in the resulting image by replacing it with 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mask.np()[:] = np.nan_to_num(np.nanmean(jim.np(), axis=0), nan=0)\n",
    "mask.np(1)[:] = np.nan_to_num(np.nanstd(jim.np(), axis=0), nan=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "Even better is to avoid duplication of data to reduce memory footprint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mask.np()[:] = np.nan_to_num(np.nanmean(jim.np(), axis=0), copy = False, nan=0)\n",
    "mask.geometry.stackBand(mask)\n",
    "mask.np(1)[:] = np.nan_to_num(np.nanstd(jim.np(), axis=0), copy = False, nan=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "plt.gray()  # show the filtered result in grayscale\n",
    "fig = plt.figure(figsize=(20,20))\n",
    "ax1 = fig.add_subplot(131)  # left side\n",
    "ax2 = fig.add_subplot(132)  # middle\n",
    "ax3 = fig.add_subplot(133)  # right side\n",
    "ax1.imshow(jim.np()[0,:,:])\n",
    "ax2.imshow(mask.np()[:,:])\n",
    "ax3.imshow(mask.np(1)[:,:])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**Bridging pyjeo to Xarray**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim.xr()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xrmean = jim.xr().where(jim.xr()!=0).mean(dim = 'time', skipna=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xrstd = jim.xr().where(jim.xr()!=0).std(dim = 'time', skipna=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim.xr().isel(time = 0).LST.plot(cmap='gray')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xrmean.LST.plot(cmap='gray')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "xrstd.LST.plot(cmap='gray')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Xarray supports many operations on time series. For instance, we can alculate the seasonal median of LST.  \n",
    "Hint: check this website on [Digital Earth Africa](https://docs.digitalearthafrica.org/en/latest/sandbox/notebooks/Frequently_used_code/Working_with_time.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Xarray has implemented convenient wrappers for plotting. Plot the 12 months of LST via Xarray representation of Jim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "jim.xr().LST.plot(col='time', col_wrap=6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise\n",
    "Xarray has implemented convenient wrappers for plotting. Plot the four seasons in four different colums (see same website on [Digital Earth Africa](https://docs.digitalearthafrica.org/en/latest/sandbox/notebooks/Frequently_used_code/Working_with_time.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "ds_seasonal.LST.plot(col='season', col_wrap=4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds_seasonal"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using tiling mechanism"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tiletotal = 4\n",
    "overlap = 0\n",
    "def reduceTile(tileindex):\n",
    "    jim = pj.Jim()\n",
    "    for file in files1 + files2:\n",
    "        print(file)\n",
    "        month = file.name.replace('LST_MOYDmax_month','').replace('.tif','')\n",
    "        date = datetime.strptime('2019-' + month + '-01','%Y-%m-%d')\n",
    "        jim.geometry.stackPlane(pj.Jim(file, tileindex = tileindex, tiletotal = tiletotal, overlap = overlap))\n",
    "        jim.properties.setDimension(date, 'plane', append = True)\n",
    "    jim.properties.setDimension(['LST'], 'band')\n",
    "    print(\"crop\")\n",
    "    mask = pj.geometry.cropPlane(jim, 0)\n",
    "    print(\"mask\")\n",
    "    mask = (mask >= 0) & (mask <= 25)\n",
    "    print(\"set mask\")\n",
    "    jim[mask] = 0\n",
    "    return pj.geometry.reducePlane(jim, rule = 'mean', nodata = 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.gray()  # show the filtered result in grayscale\n",
    "fig = plt.figure(figsize=(20,20))\n",
    "ax = []\n",
    "ax.append(fig.add_subplot(221))\n",
    "ax.append(fig.add_subplot(222))\n",
    "ax.append(fig.add_subplot(223))\n",
    "ax.append(fig.add_subplot(224))\n",
    "\n",
    "for tileindex in range(4):\n",
    "    ax[tileindex].imshow(reduceTile(tileindex).np())\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Do0OMoGLmJ86"
   },
   "source": [
    "## Filter images"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "**pktools** \n",
    "\n",
    "Aggregating and filtering images using [pkfilter](http://pktools.nongnu.org/html/pkfilter.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "RH6nU0xTmJ86",
    "outputId": "656143a2-ce5f-42d8-9cd9-0d5a41efe287"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "# mean aggregation \n",
    "pkfilter -co COMPRESS=DEFLATE -co ZLEVEL=9 -nodata 0 -ot Float32 -dx 10 -dy 10 -f mean -d 10 -i geodata/LST/LST_MOYDmax_monthMean.tif -o geodata/LST/LST_MOYDmax_monthMean_aggreate10mean.tif\n",
    "# mean circular moving window\n",
    "pkfilter -co COMPRESS=DEFLATE -co ZLEVEL=9 -nodata 0 -ot Float32 -dx 11 -dy 11 -f mean -circ  -i geodata/LST/LST_MOYDmax_monthMean.tif -o geodata/LST/LST_MOYDmax_monthMean_circular11mean.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**pyjeo**\n",
    "\n",
    "In pyjeo, we can can use [filter](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=filter2d#ngbops._NgbOps.filter2d) method, but more functions are available via bridging pyjeo to third party libraries such as [scipy]()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pyjeo as pj\n",
    "from pathlib import Path\n",
    "from scipy import ndimage\n",
    "import numpy as np\n",
    "fn = Path('geodata/LST/LST_MOYDmax_monthMean.tif')\n",
    "jim = pj.Jim(fn)\n",
    "taps = np.ones((10, 10))\n",
    "mean = pj.ngbops.firfilter2d(jim, taps=taps, norm=True, pad='symmetric')\n",
    "print(mean.properties.nrOfCol(), mean.properties.nrOfRow())\n",
    "mean = mean[::10,::10]\n",
    "print(mean.properties.nrOfCol(), mean.properties.nrOfRow())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.gray()  # show the filtered result in grayscale\n",
    "fig = plt.figure(figsize=(20,20))\n",
    "ax1 = fig.add_subplot(121)  # left side\n",
    "ax2 = fig.add_subplot(122)  # right side\n",
    "ax1.imshow(jim.np())\n",
    "ax2.imshow(mean.np())\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def unit_circle(r):\n",
    "    A = np.arange(-r,r+1)**2\n",
    "    dists = np.sqrt(A[:,None] + A)\n",
    "    return (dists<=r).astype(int)\n",
    "unit_circle(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mean = pj.ngbops.firfilter2d(jim, taps=unit_circle(10), norm=True, pad='symmetric')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.gray()  # show the filtered result in grayscale\n",
    "fig = plt.figure(figsize=(20,20))\n",
    "ax1 = fig.add_subplot(121)  # left side\n",
    "ax2 = fig.add_subplot(122)  # right side\n",
    "ax1.imshow(jim.np())\n",
    "ax2.imshow(mean.np())\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "More functions are available via bridging pyjeo to third party libraries such as Multidimensional image processing from [scipy](https://docs.scipy.org/doc/scipy/tutorial/ndimage.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim_filtered = pj.Jim(jim)\n",
    "jim_filtered.np()[:] = ndimage.gaussian_filter(jim.np(), sigma = 2)\n",
    "ndimage.gaussian_filter(jim.np(), sigma = 2, output=jim.np())\n",
    "assert jim.properties.isEqual(jim_filtered)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vvikECXdmJ87"
   },
   "source": [
    "## Images statistics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "**pktools** \n",
    "\n",
    "Aggregating and filtering images using [pkstat](http://pktools.nongnu.org/html/pkstat.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "PuVVTgQpmJ87",
    "outputId": "9997f3ee-750e-4191-9456-c58d8107b24c"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pkstat -hist  -src_min 0  -i geodata/temperature/ug_bio_3.tif > geodata/temperature/ug_bio_3.hist\n",
    "head geodata/temperature/ug_bio_3.hist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "6UzuZRORmJ87",
    "outputId": "f1bda511-4de5-4ade-ffc9-ffcdaa20c0aa",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pkstat -hist  -nbin  20 -src_min 0  -i geodata/vegetation/GPPstdev08-11.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**pyjeo**\n",
    "\n",
    "In pyjeo, we can can use [getStats](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=getstats#stats._Stats.getStats)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fn = Path('geodata/temperature/ug_bio_3.tif')\n",
    "jim = pj.Jim(fn)\n",
    "stats = jim.stats.getStats('histogram', src_min = 0)\n",
    "print(stats['bin'][0:10])\n",
    "print(stats['histogram'][0:10])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fn = Path('geodata/vegetation/GPPstdev08-11.tif')\n",
    "jim = pj.Jim(fn)\n",
    "stats = jim.stats.getStats('histogram', src_min = 0, nbin = 20)\n",
    "for index, bin in enumerate(stats['bin']):\n",
    "    print(bin, stats['histogram'][index])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure(figsize = (10, 5))\n",
    "plt.bar(stats['bin'],stats['histogram'])\n",
    "plt.xlabel(\"pixel value\")\n",
    "plt.ylabel(\"abs frequency\")\n",
    "plt.title(\"Histogram of pixel values\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OYjlbLPnmJ87"
   },
   "source": [
    "## Images reclassification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "**pktools** \n",
    "\n",
    "Aggregating and filtering images using [pkreclass](http://pktools.nongnu.org/html/pkreclass.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "2oFcBEgHmJ88",
    "outputId": "25dad608-fc04-4585-a3a0-2b56f9e13f59"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pkstat -hist -i geodata/temperature/ug_bio_3.tif | grep -v \" 0\" | awk '{ if ($1<75) {print $1, 0} else {print $1 , 1}}' > geodata/temperature/reclass_ug_bio_3.txt\n",
    "pkreclass -co COMPRESS=DEFLATE -co ZLEVEL=9 -code geodata/temperature/reclass_ug_bio_3.txt -i geodata/temperature/ug_bio_3.tif  -o geodata/temperature/reclass_ug_bio_3.tif"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**pyjeo**\n",
    "\n",
    "In pyjeo, we can can use [reclass](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html?highlight=reclass#classify._Classify.reclass)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fn = 'geodata/temperature/ug_bio_3.tif'\n",
    "jim = pj.Jim(fn)\n",
    "print(jim.properties.getDataType())\n",
    "print(jim.stats.getStats())\n",
    "stats = jim.stats.getStats('histogram')\n",
    "\n",
    "for index, bin in enumerate(stats['bin']):\n",
    "    if stats['histogram'][index] > 0:\n",
    "        print(bin, stats['histogram'][index])\n",
    "\n",
    "if -9999 in jim.np():\n",
    "    print(\"value -9999 is found\")\n",
    "classes0 = [c for c in stats['bin'] if stats['histogram'][stats['bin'].index(c)] > 0 and c < 75]\n",
    "classes1 = [c for c in stats['bin'] if stats['histogram'][stats['bin'].index(c)] > 0 and c >= 75]\n",
    "reclasses0 = np.zeros_like(classes0).tolist()\n",
    "reclasses1 = np.ones_like(classes1).tolist()\n",
    "print(classes0 + classes1)\n",
    "print(reclasses0 + reclasses1)\n",
    "\n",
    "reclass = pj.classify.reclass(jim, classes = classes0 + classes1, reclasses = reclasses0 + reclasses1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, we can do it much simpler:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "jim[jim < 75] = 0\n",
    "jim[jim >= 75] = 1\n",
    "print(reclass.properties.isEqual(jim))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UelAeY4JmJ88"
   },
   "source": [
    "## Zonal statistic (polygon extraction)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Al7egxCLmJ81"
   },
   "source": [
    "**pktools** \n",
    "\n",
    "Aggregating and filtering images using [extractogr](http://pktools.nongnu.org/html/pkextractogr.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "zGYUSaTnmJ88",
    "outputId": "f651593c-e991-4d5f-d2aa-4304dee10f0c"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "rm -f geodata/shp/polygons_stat.*\n",
    "pkextractogr -srcnodata -339999995214436424907732413799364296704   -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/polygons.sqlite -o geodata/shp/polygons_stat.sqlite\n",
    "pkextractogr -f \"ESRI Shapefile\" -srcnodata -339999995214436424907732413799364296704   -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/polygons.sqlite -o geodata/shp/polygons_stat.shp\n",
    "\n",
    "# we can also create a csv that can be manipulate later on with awk\n",
    "rm  -f geodata/shp/polygons_stat.csv\n",
    "pkextractogr -f CSV -srcnodata -339999995214436424907732413799364296704   -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/polygons.sqlite -o geodata/shp/polygons_stat.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "xo_AY1bamJ89"
   },
   "source": [
    "**Zonal statistic (point extraction)**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import functools\n",
    "import time\n",
    "\n",
    "def timer(func):\n",
    "    @functools.wraps(func)\n",
    "    def wrapper_timer(*args, **kwargs):\n",
    "        tic = time.perf_counter()\n",
    "        value = func(*args, **kwargs)\n",
    "        toc = time.perf_counter()\n",
    "        elapsed_time = toc - tic\n",
    "        print(f\"Elapsed time: {elapsed_time:0.4f} seconds\")\n",
    "        return value\n",
    "    return wrapper_timer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "dQQp0FR9mJ89",
    "outputId": "72cb35dd-25f6-473e-c446-fd5bf6d34bb5",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%bash \n",
    "# at point location\n",
    "rm -f geodata/shp/point_stat.csv\n",
    "pkextractogr -f CSV -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/presence.shp -o geodata/shp/point_stat.csv\n",
    "# at point location + 1 pixel around \n",
    "rm -f geodata/shp/point-buf_stat.csv\n",
    "pkextractogr -f CSV -buf 2 -srcnodata -339999995214436424907732413799364296704 -r mean -r stdev -r min -i geodata/vegetation/GPPmean08-11.tif -s geodata/shp/presence.shp -o geodata/shp/point-buf_stat.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "**pyjeo**\n",
    "\n",
    "In pyjeo, we can can use [extract](https://jeodpp.jrc.ec.europa.eu/services/processing/pyjeohelp/3_reference.html#geometry._GeometryVect.extract)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "output in SQLite format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fn = 'geodata/vegetation/GPPmean08-11.tif'\n",
    "vfn = 'geodata/shp/polygons.sqlite'\n",
    "jim = pj.Jim(fn)\n",
    "jim[jim<0]=-1\n",
    "print(jim.stats.getStats())\n",
    "v = pj.JimVect(vfn)\n",
    "output = 'geodata/shp/temp.sqlite'\n",
    "Path(output).unlink(missing_ok = True)\n",
    "extracted1 = pj.geometry.extract(v, jim, output=output, rule=['mean', 'stdev', 'min'], srcnodata = -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "output in ESRI Shapefile format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "output = 'geodata/shp/temp.shp'\n",
    "Path(output).unlink(missing_ok = True)\n",
    "extracted2 = pj.geometry.extract(v, jim, rule=['mean', 'stdev', 'min'], output='geodata/shp/temp.shp', oformat='ESRI Shapefile', srcnodata = -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "calculate in memory and get result in dictionary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "extracted3 = pj.geometry.extract(v, jim, rule=['allpoints'], output='temp1', oformat='Memory', srcnodata = -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "In pandas format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "pd.DataFrame(extracted3.dict())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "Extract point data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "vfn = 'geodata/shp/presence.shp'\n",
    "v = pj.JimVect(vfn)\n",
    "extracted4 = pj.geometry.extract(v, jim, rule=['mean'], output='point', oformat='Memory', srcnodata = -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UC8VPm5CmJ83"
   },
   "source": [
    "Extract points with buffer to calculate mean and standard deviation and minimum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "buffer = jim.properties.getDeltaX()*1\n",
    "extracted5 = pj.geometry.extract(v, jim, rule=['mean', 'stdev', 'min'], output='point_buf', oformat='Memory', srcnodata = -1, buffer = buffer)\n",
    "pd.DataFrame(extracted5.dict())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gYgDnke_mJ89"
   },
   "source": [
    "**Remove all the output**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "mWC8G93emJ8-"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "rm -f  geodata/vegetation/GPPcv08-11.tif geodata/LST/*_crop.tif geodata/vegetation/ETmean08-11_crop_trh.tif geodata/vegetation/ETmean08-11_crop_trh.txt vegetation/ETmean08-11_crop.txt geodata/vegetation/ETmosaic.vrt geodata/vegetation/ETmosaic.tif geodata/vegetation/stack_??.tifgeodata/vegetation/stack.vrtgeodata/vegetation/tiles.*geodata/vegetation/ETmean08-11_crop_proximity.tifgeodata/vegetation/ETmean08-11_crop_buffer.tif geodata/dem/slope.tifgeodata/dem/aspect.tif geodata/dem/tri.tifgeodata/dem/tpi.tifgeodata/dem/roughness.tifgeodata/vegetation/ETmean08-11_01_trh?.tifgeodata/LST/LST_MOYDmax_month1-msk.tifgeodata/LST/LST_MOYDmax_monthStdev.tifgeodata/LST/LST_MOYDmax_monthMean.tifgeodata/LST/LST_MOYDmax_month_mean_stdev.tifgeodata/LST/LST_MOYDmax_month.vrtgeodata/LST/LST_MOYDmax_monthMean_aggreate10mean.tifgeodata/LST/LST_MOYDmax_monthMean_circular11mean.tif geodata/temperature/reclass_ug_bio_3.tifgeodata/temperature/reclass_ug_bio_3.txtgeodata/shp/polygons_stat.csvgeodata/shp/point-buf_stat.csvgeodata/shp/point_stat.csvgeodata/shp/polygons_stat.*geodata/shp/TM_LARGE_BORDERS.shp.* geodata/shp/TM_UGANDA_BORDERS-0.3.* geodata/vegetation/ETmean08-11_crop.txt geodata/vegetation/ETmean08-11_01_msk_pyjeo.tif geodata/shp/temp.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "02_pktools.ipynb",
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}