pyjeo: an open source image processing library in Python

GEO-OPEN-HACK-2024 25/06/2024

Pieter Kempeneers, Joint Research Centre, Ispra

Time table

Time

Subject

09:30-09:45

Introduction: installation and design

09:45-10:00

The Jim data model

10:00-10:30

Briding pyjeo to third party libraries

10:30-10:45

Coffee break

10:45-11:15

Exercise 1: parallel processing through tiling

11:15-12:30

Exercise 2: parallel processing though multi-threading

12:30-13:30

Lunch break

Installation and usage

  • on your local computer

  • on SURF (Easybuild and pip)

Let’s start importing pyjeo

Design

The pyjeo package is grouped in modules. A module combines a number of operations that belong together.

[2]:
from IPython.display import Image
Image("../images/modules.png" , width = 400, height = 500)
[2]:
../_images/PKTOOLS_pyjeo_introduction2_7_0.png

Documentation

The documentation is online (readthedocs).

Call for inline help on a module

[ ]:
help(pj.geometry.warp)

Jim data model

The Jim 3D data cube is a contiguous array of data in memory (x, y, z), with:

  • x: columns

  • y: rows

  • z: planes

Bands represent different data cubes within the same Jim object.

Single band image

[ ]:
jim = pj.Jim('../data/example.tif')
[16]:
Image("../images/singleband.png" , width = 500, height = 500)
[16]:
../_images/PKTOOLS_pyjeo_introduction2_14_0.png
[ ]:
jim.properties.nrOfCol()
[ ]:
jim.properties.nrOfRow()
[ ]:
jim.properties.nrOfBand()

Multi-band image

Multi-band GeoTIFF images are read per default as multi-band images

[ ]:
jim = pj.Jim('../data/multiband.tif')
[ ]:
jim.properties.nrOfBand()
[15]:
Image("../images/multiband.png" , width = 500, height = 500)
[15]:
../_images/PKTOOLS_pyjeo_introduction2_21_0.png

Multi-plane image

Multi-band GeoTIFF images are read as 2D multi-band Jim objects as a default. To read them as 3D Jim objects, use the band2plane parameter.

[ ]:
jim = pj.Jim('../data/multiband.tif', band2plane = True)
[ ]:
jim.properties.nrOfBand()
[ ]:
jim.properties.nrOfPlane()
[5]:
Image("../images/multiband.png" , width = 500, height = 500)
[5]:
../_images/PKTOOLS_pyjeo_introduction2_26_0.png

Multi-plane multi-band image

Are a combination of multi-plane and multi-band. Multi-band 3D GeoTIFF images are currently not supported. We will see how to read these objects from NetCDF files next.

[17]:
Image("../images/cube.png" , width = 500, height = 500)
[17]:
../_images/PKTOOLS_pyjeo_introduction2_28_0.png

Bridging pyjeo to third party libraries

pyjeo Jim objects can be converted to:

We will use the xarray module to read netCDF files.

[ ]:
import xarray as xr
[ ]:
x = xr.load_dataset('../data/example.nc')

The xarray Dataset provides names for its dimensions (‘x’, ‘y’, ‘time’) and variables (‘B1’, ‘B2’)

[ ]:
x
[ ]:
jim = pj.xr2jim(x)
[ ]:
jim.properties.nrOfBand()
[ ]:
jim.properties.nrOfPlane()
[18]:
Image("../images/cube.png"  , width = 500, height = 500)
[18]:
../_images/PKTOOLS_pyjeo_introduction2_38_0.png

The variable names for the bands and the time dimension are retained when converted to a Jim object

[ ]:
jim.properties.getDimension()

Calculate NDVI (see also Exercise 1)

[ ]:
jim.pixops.NDVI(red = 'B1', nir = 'B2', name = 'NDVI', addBand = True)
[ ]:
jim.properties.getDimension()

Representing the Jim object as a Xarray Dataset, DataArray, and Numpy array

jim.xr() -> xarray Dataset representation of the Jim object jim

[ ]:
jim.xr()

jim.xr().B1 -> xarray DataArray for band B1 (without variable name)

[ ]:
jim.xr().B1

jim.xr().B1.data -> Numpy array representation of the xarray DataArray (without meta-data for geospatial information)

[ ]:
jim.xr().B1.data

Numpy array representation of a Jim object (without memory copy)

[ ]:
jim.np(band = 0)

Access Jim elements

Load a multiband image as 3D Jim object (converting bands to planes)

[19]:
Image("../images/multiband.png"  , width = 500, height = 500)
[19]:
../_images/PKTOOLS_pyjeo_introduction2_53_0.png
[20]:
Image("../images/multiplane.png", width = 500, height = 500)
[20]:
../_images/PKTOOLS_pyjeo_introduction2_54_0.png

Get the first plane

  • The order of dimensions: plane, y, x

  • Bands are not dimensions in a Jim object

  • Returns a copy of the Jim object

  • Indices start from 0 (the second plane has index 1)

  • Colons express ranges of indices ([0:2] selects indices 0 and 1, [:] selects all indices of that dimension)

  • The following code is similar to the geometry function: pj.geometry.cropPlane(0)

[ ]:
jim[0,:,:]
[21]:
Image("../images/singleband.png", width = 500, height = 500)
[21]:
../_images/PKTOOLS_pyjeo_introduction2_57_0.png

Exercise: get the first block of 5 x 5 pixels within the second plane

  • The order of dimensions: plane, y, x

  • Bands are not dimensions in a Jim object

[ ]:
jim[1,0:5,0:5].np()

Processing Jim objects

Methods and Functions

Functions

  • Functions that operate on objects must have the objects passed as arguments

  • Functions leave their arguments unaltered

  • A new object is returned

[ ]:
jim = pj.Jim('../data/multiband.tif')
[ ]:
jim.properties.nrOfBand()
[22]:
Image("../images/multiband.png", width = 500, height = 500)
[22]:
../_images/PKTOOLS_pyjeo_introduction2_65_0.png
[ ]:
jim_cropped = pj.geometry.cropBand(jim, 0)
jim_cropped.properties.nrOfBand()
[23]:
Image("../images/singleband.png"  , width = 500, height = 500)
[23]:
../_images/PKTOOLS_pyjeo_introduction2_67_0.png
[ ]:
jim.properties.nrOfBand()
[24]:
Image("../images/multiband.png" , width = 500, height = 500)
[24]:
../_images/PKTOOLS_pyjeo_introduction2_69_0.png

jim is unaltered

Methods

  • Methods directly operate on objects, i.e., instances of a class

  • Methods can change objects in-place (overwrite input)

  • No(ne) object is returned

[ ]:
jim = pj.Jim('../data/multiband.tif')
[ ]:
jim.properties.nrOfBand()
[25]:
Image("../images/multiband.png" , width = 500, height = 500)
[25]:
../_images/PKTOOLS_pyjeo_introduction2_74_0.png
[ ]:
jim.geometry.cropBand(0)
jim.properties.nrOfBand()
[26]:
Image("../images/singleband.png" , width = 500, height = 500)
[26]:
../_images/PKTOOLS_pyjeo_introduction2_76_0.png

jim is changed in-place

[ ]:
jim.np(band = 0)
[ ]:
jim.geometry.cropBand(0)
[ ]:
jim.properties.nrOfBand()

jim has been changed inplace

[27]:
Image("../images/singleband.png" , width = 500, height = 500)
[27]:
../_images/PKTOOLS_pyjeo_introduction2_82_0.png

Processing Jim images with built in functions

[ ]:
jim = pj.Jim('../data/modis_ndvi_2010.tif', band2plane = True)
[28]:
Image("../images/multiband.png" , width = 500, height = 500)
[28]:
../_images/PKTOOLS_pyjeo_introduction2_85_0.png
[ ]:
jim.geometry.reducePlane('median')
[29]:
Image("../images/singleband.png" , width = 500, height = 500)
[29]:
../_images/PKTOOLS_pyjeo_introduction2_87_0.png

Processing Jim images with third party libraries

[ ]:
from scipy import ndimage
jim.np()[:] = ndimage.gaussian_filter(jim.np(), 2)
[ ]:
jim = pj.Jim('../data/modis_ndvi_2010.tif', band2plane = True)
[30]:
Image("../images/multiplane.png" , width = 500, height = 500)
[30]:
../_images/PKTOOLS_pyjeo_introduction2_91_0.png
[ ]:
jim.xr()

Naming dimensions manually

[ ]:
jim.properties.setDimension({'plane': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], 'band' : ['NDVI']})

Naming dimensions is particularly useful when converting Jim objects to xarray Datasets

[ ]:
jim.xr()

We can use the great plot capabilities of xarray

[ ]:
jim.xr().NDVI.plot(col = 'time')
[ ]:

We can use the geometry function cropPlane using either indices or named dimensions (e.g, ‘Jul’)

[ ]:
july = pj.geometry.cropPlane(jim, 'Jul')