SDM1: Montane woodcreeper - Random Forest model using GRASS

The r.learn.ml add-on uses the Python scikit-learn package to train machine learning models and to run predictions on raster layers.

Open a bash terminal, move to the exercise directory, download the notebook, and open it in JupyterLab:

cd /media/sf_LVM_shared/my_SE_data/exercise
wget https://raw.githubusercontent.com/selvaje/SE_data/master/exercise/SDM1_MWood_GRASSmodel.ipynb
jupyter lab SDM1_MWood_GRASSmodel.ipynb

Install the r.learn.ml add-on.

Preparation

cd /home/user
git clone https://github.com/OSGeo/grass-addons.git grass_addons
sudo apt install subversion
grass  --text /home/user/my_SE_data/exercise/grassdb/south_america/PERMANENT --exec g.extension extension=r.learn.ml url=/home/user/grass_addons/src/raster/r.learn.ml
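
To confirm that the add-on is available after the install step, a check along the following lines may help. This is only a sketch and not part of the original exercise: it assumes the same mapset path as the install command, relies on `g.extension -a` to list locally installed add-ons, and the variable names `MAPSET` and `addon_status` are made up for illustration. If GRASS or the mapset is not present, the check is skipped rather than failing.

```shell
# Hypothetical post-install check; the path is the one used by the install command
MAPSET=/home/user/my_SE_data/exercise/grassdb/south_america/PERMANENT

if command -v grass >/dev/null 2>&1 && [ -d "$MAPSET" ]; then
    # g.extension -a lists the add-ons installed locally
    if grass --text "$MAPSET" --exec g.extension -a 2>/dev/null | grep -qF 'r.learn.ml'; then
        addon_status="installed"
    else
        addon_status="missing"
    fi
else
    # GRASS or the mapset is not available on this machine
    addon_status="not checked"
fi
echo "r.learn.ml: $addon_status"
```

If the status is "missing", re-running the g.extension command above and reading its messages is the first thing to try.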
[11]:
%%bash

grass  --text --tmp-location /media/sf_LVM_shared/my_SE_data/exercise/geodata//dem/SA_elevation_mn_GMTED2010_mn_crop_msk.tif --exec <<'EOF'

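# set a small region for testing the script
# NOTE: r.external -e extends the region to each linked raster (the log reports
# "Region for the current mapset updated"), so re-run g.region after the links
# if the small test window should still apply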
g.region  n=0  s=-10 e=-60 w=-70


r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/cloud/SA_meanannual_crop_msk.tif output=SA_meanannual --o --q
r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/cloud/SA_intra_crop_msk.tif          output=SA_intra --o --q

for var in pr tasmin tasmax; do
    for stat in stdev mean; do
        r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/climate/CHELSA_${var}_1981-2010_V.2.1_land_crop_${stat}_msk.tif output=SA_${var}_${stat}
    done
done

r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/dem/SA_elevation_mn_GMTED2010_mn_crop_msk.tif          output=SA_elevation --o --q
r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/dem/SA_elevation_mn_GMTED2010_mn_crop_tri_msk.tif      output=SA_tri --o --q
r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/dem/SA_elevation_mn_GMTED2010_mn_crop_aspect_msk.tif   output=aspect --o --q
# aspect is circular, so split it into cosine and sine (r.mapcalc trig functions take degrees)
r.mapcalc "SA_cos_aspect = cos(aspect)"
r.mapcalc "SA_sin_aspect = sin(aspect)"

r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/dem/SA_elevation_mn_GMTED2010_mn_crop_slope_msk.tif    output=SA_slope --o --q

r.external -e input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/vegetation/SA_tree_mn_percentage_GFC2013_crop_msk.tif              output=SA_tree --o --q

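# import the presence/absence points: columns 1 and 2 are the coordinates, PA is the class label, skip=1 drops the header line
v.in.ascii input=/media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_presence_absence.txt  output=pres_abs format=point separator=" " x=1 y=2 skip=1 columns=" x double precision, y double precision, PA integer"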
v.info -c pres_abs

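# group all predictor rasters (SA_*) so that r.learn.ml can use them as features; the raw aspect map is left out
i.group group=group_rast input=$(g.list type=raster pattern="SA_*" separator=comma)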

g.list rast -p
g.list group -p
i.group group=group_rast -l

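# train with 10-fold cross-validation (cv=10), report feature importances (-f), and write the predicted class raster
r.learn.ml -f cv=10 group=group_rast  trainingpoints=pres_abs  field=PA  output=rf_classification classifier=RandomForestClassifier n_estimators=400 n_jobs=2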
r.out.gdal --o -c -m -f createopt="COMPRESS=DEFLATE,ZLEVEL=9" type=Byte format=GTiff nodata=255  input=rf_classification  output=/media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_presence_absence.tif


# -p also writes one probability raster per class (rf_class_prob_0 and rf_class_prob_1)
r.learn.ml -p group=group_rast  trainingpoints=pres_abs  field=PA  output=rf_class_prob classifier=RandomForestClassifier n_estimators=200 n_jobs=2
g.list rast -p
r.out.gdal --o -c -m -f createopt="COMPRESS=DEFLATE,ZLEVEL=9" type=Float32 format=GTiff nodata=-9999  input=rf_class_prob_0  output=/media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_prob_absence.tif
r.out.gdal --o -c -m -f createopt="COMPRESS=DEFLATE,ZLEVEL=9" type=Float32 format=GTiff nodata=-9999  input=rf_class_prob_1  output=/media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_prob_presence.tif



EOF
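
The two r.mapcalc calls in the cell replace the aspect raster with its cosine and sine. Aspect is a circular variable (0 and 360 degrees point in the same direction), and the two components give the model a representation without that jump. The sketch below reproduces the transform outside GRASS with awk for a few sample angles. The helper name `aspect_to_cos_sin` is hypothetical, and awk needs radians whereas r.mapcalc trigonometric functions take degrees.

```shell
# Hypothetical helper, for illustration only: prints "cos sin" for an aspect in degrees
aspect_to_cos_sin() {
    # awk's cos()/sin() expect radians, so convert from degrees first
    LC_ALL=C awk -v a="$1" 'BEGIN {
        pi = atan2(0, -1)
        r = a * pi / 180
        printf "%.3f %.3f\n", cos(r), sin(r)
    }'
}

# 0 and 360 degrees map to the same point on the unit circle (up to rounding)
for aspect in 0 90 180 270 360; do
    printf '%3d deg -> cos sin: %s\n' "$aspect" "$(aspect_to_cos_sin "$aspect")"
done
```

Values that are mathematically zero may print as `-0.000` because of floating-point rounding.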
INTEGER|cat
DOUBLE PRECISION|x
DOUBLE PRECISION|y
INTEGER|PA
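
The column listing above matches the three columns declared in the v.in.ascii call (x, y, PA) plus the category column that GRASS adds. Before importing, the text file can be checked for malformed rows. The sketch below runs on a tiny invented sample, not on the real data: the header text and the three rows are made up, and it assumes PA is coded as 0 (absence) or 1 (presence), which is suggested by the rf_class_prob_0 and rf_class_prob_1 outputs.

```shell
# Tiny made-up sample in the assumed layout: one header line, then "x y PA"
sample=$(mktemp)
cat > "$sample" <<'SAMPLE'
x y PA
-65.50 -5.25 1
-62.00 -8.75 0
-68.10 -2.40 1
SAMPLE

# Skip the header (as skip=1 does) and count rows that do not have exactly
# three space-separated fields or whose PA value is not 0 or 1
bad_rows=$(awk 'NR > 1 && (NF != 3 || $3 !~ /^[01]$/) { n++ } END { print n + 0 }' "$sample")
data_rows=$(awk 'NR > 1 { n++ } END { print n + 0 }' "$sample")
echo "data rows: $data_rows, malformed rows: $bad_rows"
rm -f "$sample"
```

To check the real input, point the two awk commands at the woodcreper_presence_absence.txt path used in the cell instead of the temporary sample.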
----------------------------------------------
raster files available in mapset <PERMANENT>:
SA_cos_aspect   SA_meanannual   SA_sin_aspect   SA_tasmax_stdev SA_tree
SA_elevation    SA_pr_mean      SA_slope        SA_tasmin_mean  SA_tri
SA_intra        SA_pr_stdev     SA_tasmax_mean  SA_tasmin_stdev aspect

----------------------------------------------
imagery group files available in mapset <PERMANENT>:
group_rast

group <group_rast> references the following raster maps
-------------
<SA_cos_aspect@PERMANENT>      <SA_elevation@PERMANENT>
<SA_intra@PERMANENT>           <SA_meanannual@PERMANENT>
<SA_pr_mean@PERMANENT>         <SA_pr_stdev@PERMANENT>
<SA_sin_aspect@PERMANENT>      <SA_slope@PERMANENT>
<SA_tasmax_mean@PERMANENT>     <SA_tasmax_stdev@PERMANENT>
<SA_tasmin_mean@PERMANENT>     <SA_tasmin_stdev@PERMANENT>
<SA_tree@PERMANENT>            <SA_tri@PERMANENT>
-------------
----------------------------------------------
raster files available in mapset <PERMANENT>:
SA_cos_aspect       SA_pr_stdev         SA_tasmin_mean      rf_class_prob
SA_elevation        SA_sin_aspect       SA_tasmin_stdev     rf_class_prob_0
SA_intra            SA_slope            SA_tree             rf_class_prob_1
SA_meanannual       SA_tasmax_mean      SA_tri              rf_classification
SA_pr_mean          SA_tasmax_stdev     aspect

Starting GRASS GIS...
Creating new GRASS GIS location <tmploc>...
Cleaning up temporary files...

          __________  ___   __________    _______________
         / ____/ __ \/   | / ___/ ___/   / ____/  _/ ___/
        / / __/ /_/ / /| | \__ \\_  \   / / __ / / \__ \
       / /_/ / _, _/ ___ |___/ /__/ /  / /_/ // / ___/ /
       \____/_/ |_/_/  |_/____/____/   \____/___//____/

Welcome to GRASS GIS 8.2.1
GRASS GIS homepage:                      https://grass.osgeo.org
This version running through:            Bash Shell (/bin/bash)
Help is available with the command:      g.manual -i
See the licence terms with:              g.version -c
See citation options with:               g.version -x
Start the GUI with:                      g.gui wxpython
When ready to quit enter:                exit

Reading band 1 of 1...
Link to raster map <SA_pr_stdev> created.
Default region for this location updated
Region for the current mapset updated
Reading band 1 of 1...
Link to raster map <SA_pr_mean> created.
Default region for this location updated
Region for the current mapset updated
Reading band 1 of 1...
Link to raster map <SA_tasmin_stdev> created.
Default region for this location updated
Region for the current mapset updated
Reading band 1 of 1...
Link to raster map <SA_tasmin_mean> created.
Default region for this location updated
Region for the current mapset updated
Reading band 1 of 1...
Link to raster map <SA_tasmax_stdev> created.
Default region for this location updated
Region for the current mapset updated
Reading band 1 of 1...
Link to raster map <SA_tasmax_mean> created.
Default region for this location updated
Region for the current mapset updated
Scanning input for column types...
Number of columns: 3
Number of data rows: 52255
Importing points...

Populating table...
Building topology for vector map <pres_abs@PERMANENT>...
Registering primitives...
Displaying column types/names for database connection of layer <1>:
Adding raster map <SA_cos_aspect@PERMANENT> to group
Adding raster map <SA_elevation@PERMANENT> to group
Adding raster map <SA_intra@PERMANENT> to group
Adding raster map <SA_meanannual@PERMANENT> to group
Adding raster map <SA_pr_mean@PERMANENT> to group
Adding raster map <SA_pr_stdev@PERMANENT> to group
Adding raster map <SA_sin_aspect@PERMANENT> to group
Adding raster map <SA_slope@PERMANENT> to group
Adding raster map <SA_tasmax_mean@PERMANENT> to group
Adding raster map <SA_tasmax_stdev@PERMANENT> to group
Adding raster map <SA_tasmin_mean@PERMANENT> to group
Adding raster map <SA_tasmin_stdev@PERMANENT> to group
Adding raster map <SA_tree@PERMANENT> to group
Adding raster map <SA_tri@PERMANENT> to group
Group <group_rast> references the following raster maps:
Extracting training data
Removing samples with NaN values in the raster feature variables...

Fitting model using RandomForestClassifier

Cross validation global performance measures......:
accuracy: 0.946 +/-SD 0.003
precision: 0.947 +/-SD 0.005
recall: 0.976 +/-SD 0.004
f1: 0.961 +/-SD 0.002
kappa: 0.877 +/-SD 0.008
balanced_accuracy: 0.931 +/-SD 0.005
roc_auc: 0.967 +/-SD 0.003
matthews_corrcoef: 0.878 +/-SD 0.008
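
Several of these measures are related, which gives a quick consistency check. F1 is the harmonic mean of precision and recall, and for a two-class problem balanced accuracy is the mean of recall and specificity. The sketch below applies both relations to the averages reported above. It assumes that recall refers to the presence class (1), and because the reported values are averages over the folds, the F1 relation holds only approximately.

```shell
# Cross-validation averages copied from the log above
precision=0.947
recall=0.976
balanced_accuracy=0.931

# F1 = 2 * P * R / (P + R)
f1=$(LC_ALL=C awk -v p="$precision" -v r="$recall" \
    'BEGIN { printf "%.3f", 2 * p * r / (p + r) }')

# balanced accuracy = (recall + specificity) / 2, solved for specificity
specificity=$(LC_ALL=C awk -v b="$balanced_accuracy" -v r="$recall" \
    'BEGIN { printf "%.3f", 2 * b - r }')

echo "F1 from mean precision and recall: $f1"
echo "Implied specificity: $specificity"
```

The recomputed F1 agrees with the reported value of 0.961. The implied specificity is lower than the recall, which suggests that absences are misclassified more often than presences.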

Feature importances
id Raster Importance
0 SA_cos_aspect@PERMANENT 0.0004
1 SA_elevation@PERMANENT 0.1231
2 SA_intra@PERMANENT 0.0061
3 SA_meanannual@PERMANENT 0.0301
4 SA_pr_mean@PERMANENT 0.0288
5 SA_pr_stdev@PERMANENT 0.001
6 SA_sin_aspect@PERMANENT 0.0006
7 SA_slope@PERMANENT 0.0008
8 SA_tasmax_mean@PERMANENT 0.0366
9 SA_tasmax_stdev@PERMANENT 0.0009
10 SA_tasmin_mean@PERMANENT 0.0024
11 SA_tasmin_stdev@PERMANENT 0.028
12 SA_tree@PERMANENT 0.0322
13 SA_tri@PERMANENT 0.0022
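
The table is printed in raster order, so the ranking is easier to read after sorting on the importance column. The sketch below does that with standard shell tools on the values copied from the log; the variable names are chosen for illustration. It also sums the importances. The sum is well below 1, so these do not look like normalized impurity-based importances, and the r.learn.ml manual is the place to check how the -f flag computes them.

```shell
# Feature-importance table copied from the r.learn.ml output (id, raster, importance)
importances='0 SA_cos_aspect@PERMANENT 0.0004
1 SA_elevation@PERMANENT 0.1231
2 SA_intra@PERMANENT 0.0061
3 SA_meanannual@PERMANENT 0.0301
4 SA_pr_mean@PERMANENT 0.0288
5 SA_pr_stdev@PERMANENT 0.001
6 SA_sin_aspect@PERMANENT 0.0006
7 SA_slope@PERMANENT 0.0008
8 SA_tasmax_mean@PERMANENT 0.0366
9 SA_tasmax_stdev@PERMANENT 0.0009
10 SA_tasmin_mean@PERMANENT 0.0024
11 SA_tasmin_stdev@PERMANENT 0.028
12 SA_tree@PERMANENT 0.0322
13 SA_tri@PERMANENT 0.0022'

# Sort numerically on the third column, largest first, and show the top five
ranked=$(printf '%s\n' "$importances" | LC_ALL=C sort -k3,3nr)
printf '%s\n' "$ranked" | head -n 5

# Name of the top-ranked predictor and the sum of all importances
top=$(printf '%s\n' "$ranked" | head -n 1 | awk '{ print $2 }')
total=$(printf '%s\n' "$importances" | LC_ALL=C awk '{ s += $3 } END { printf "%.4f", s }')
echo "Top predictor: $top (sum of importances: $total)"
```

Elevation ranks first by a wide margin, followed by mean maximum temperature, tree cover, mean annual cloud cover, and mean precipitation.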

Predicting classification/regression raster...
Checking GDAL data type and nodata value...

Using GDAL data type <Byte>
Exporting raster data to GTiff format...

r.out.gdal complete. File
</media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_presence_absence.tif>
created.
Group <group_rast> references the following raster maps:
Extracting training data
Removing samples with NaN values in the raster feature variables...

Fitting model using RandomForestClassifier

Predicting classification/regression raster...
Predicting class probabilities...
Checking GDAL data type and nodata value...

Using GDAL data type <Float32>
Exporting raster data to GTiff format...

r.out.gdal complete. File
</media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_prob_absence.tif>
created.
Checking GDAL data type and nodata value...

Using GDAL data type <Float32>
Exporting raster data to GTiff format...

r.out.gdal complete. File
</media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM/woodcreper_prob_presence.tif>
created.
Cleaning up default sqlite database ...
Cleaning up temporary files...
Done.

Goodbye from GRASS GIS
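
Once the run has finished, the three exported GeoTIFFs can be inspected without opening GRASS again. The sketch below is not part of the original exercise: it assumes `gdalinfo` is installed and that the files were written to the paths given in the r.out.gdal calls, and it skips any file that is missing. The names `SDM_DIR` and `checked` are illustrative.

```shell
# Output directory used by the r.out.gdal calls in the cell
SDM_DIR=/media/sf_LVM_shared/my_SE_data/exercise/geodata/SDM

checked=0
for name in woodcreper_presence_absence woodcreper_prob_absence woodcreper_prob_presence; do
    f="$SDM_DIR/$name.tif"
    if command -v gdalinfo >/dev/null 2>&1 && [ -f "$f" ]; then
        echo "== $name"
        # -mm computes the min/max of each band; keep only the lines of interest
        gdalinfo -mm "$f" | grep -E 'Type=|Computed Min/Max|NoData'
        checked=$((checked + 1))
    else
        echo "skipping $f (file or gdalinfo not available)"
    fi
done
echo "inspected $checked of 3 rasters"
```

The classification raster should be of type Byte with NoData 255, and the two probability rasters should be Float32 with values between 0 and 1.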
