2.12. Hemalatha Velappan: Classification of different tree species plantations using deep learning


2.12.1. The goal of this work is to develop a model that identifies planted forests and the tree species growing in them. The model is developed using:

2.12.1.1. (1) known locations of planted forests, based on literature and personal communications,

2.12.1.2. (2) image analysis and feature extraction of planted trees, and

2.12.1.3. (3) spectral signatures unique to each species.

[1]:
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
import scipy
from osgeo import ogr
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
[52]:
# Change to the directory containing the plantation shapefile

%cd /media/sf_LVM_shared/my_SE_data/Plantation_datasets/Peru_Plantation_Shapefile-Updated
/media/sf_LVM_shared/my_SE_data/Plantation_datasets/Peru_Plantation_Shapefile-Updated
[9]:
!gdalinfo -mm SentinelMap.tif
Driver: GTiff/GeoTIFF
Files: SentinelMap.tif
Size is 898, 770
Coordinate System is:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (-76.889769608227439,-7.186342609899349)
Pixel Size = (0.000269494585236,-0.000269494585236)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left  ( -76.8897696,  -7.1863426) ( 76d53'23.17"W,  7d11'10.83"S)
Lower Left  ( -76.8897696,  -7.3938534) ( 76d53'23.17"W,  7d23'37.87"S)
Upper Right ( -76.6477635,  -7.1863426) ( 76d38'51.95"W,  7d11'10.83"S)
Lower Right ( -76.6477635,  -7.3938534) ( 76d38'51.95"W,  7d23'37.87"S)
Center      ( -76.7687665,  -7.2900980) ( 76d46' 7.56"W,  7d17'24.35"S)
Band 1 Block=256x256 Type=UInt16, ColorInterp=Gray
  Description = B1
    Computed Min/Max=1005.000,8417.000
Band 2 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B2
    Computed Min/Max=696.000,9929.000
Band 3 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B3
    Computed Min/Max=471.000,10214.000
Band 4 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B4
    Computed Min/Max=269.000,11283.000
Band 5 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B5
    Computed Min/Max=269.000,10690.000
Band 6 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B6
    Computed Min/Max=419.000,11095.000
Band 7 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B7
    Computed Min/Max=437.000,11857.000
Band 8 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B8
    Computed Min/Max=390.000,11342.000
Band 9 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B8A
    Computed Min/Max=355.000,12318.000
Band 10 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B9
    Computed Min/Max=56.000,2030.000
Band 11 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B10
    Computed Min/Max=3.000,31.000
Band 12 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B11
    Computed Min/Max=78.000,13317.000
Band 13 Block=256x256 Type=UInt16, ColorInterp=Undefined
  Description = B12
    Computed Min/Max=24.000,16040.000
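
The same size and per-band min/max checks can be reproduced in Python; a minimal sketch using the GDAL bindings (the notebook imports only osgeo.ogr, so the gdal module is imported here):

from osgeo import gdal

ds = gdal.Open("SentinelMap.tif")
print(ds.RasterXSize, ds.RasterYSize, ds.RasterCount)  # 898 770 13
for b in range(1, ds.RasterCount + 1):
    band = ds.GetRasterBand(b)
    bmin, bmax = band.ComputeRasterMinMax(False)  # full scan, like gdalinfo -mm
    print(band.GetDescription(), bmin, bmax)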

2.12.1.3.1. Performing zonal statistics on the polygon shapefile with respect to the Sentinel-2 satellite image

[3]:
!pkextractogr -f CSV -i SentinelMap.tif -s Peru_XY/Peru_with_XY.shp -r allpoints  -r mean -r stdev -o extracted2.csv
processing layer Peru_with_XY
0...10...20...30...40...50...60...70...80...90...100 - done.
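
For reference, similar per-polygon statistics could also be computed with the rasterstats Python package; a hypothetical sketch, not part of the original workflow (pkextractogr additionally exports every pixel value via -r allpoints):

from rasterstats import zonal_stats

# mean and standard deviation of the first band within each plantation polygon;
# the remaining bands would be processed the same way in a loop
stats = zonal_stats("Peru_XY/Peru_with_XY.shp", "SentinelMap.tif", stats=["mean", "std"])
print(stats[0])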
[3]:
predictors = pd.read_csv("extracted2.csv")
predictors.head(10)
[3]:
OBJECTID FUENTE/SOU DOCREG FECREG OBSERV ZONUTM ORIGEN TIPCOM NUMREG NOMTIT/Tit ... b4 b5 b6 b7 b8 b9 b10 b11 b12 Classification
0 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 798 1965 2724 2588 2981 475 11 1301 493 10
1 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 796 2069 2740 2661 3053 475 11 1269 460 10
2 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 805 2114 2717 2656 3130 447 10 1309 485 10
3 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 741 1731 2253 2031 2501 447 10 1197 440 10
4 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 854 2127 2660 2651 3144 461 9 1496 580 10
5 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 756 1901 2482 2427 2841 461 9 1302 492 10
6 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 739 1792 2374 2293 2648 438 9 1283 518 10
7 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 797 2108 2745 2731 3185 488 10 1364 502 10
8 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 738 1755 2448 2214 2763 472 9 1178 439 10
9 99039 GORE San Martín APP-RNPF 27:59.0 ara autorizar el aprovechamiento la U.O.G.F. ... 18 1 1 22-SAM/REG-PLT-2019-050 CARBAJAL VIGO, SEBASTIAN ... 817 1925 2571 2396 2900 472 9 1404 533 10

10 rows × 56 columns

2.12.1.3.2. The following are the tree species and their corresponding label numbers (see the mapping sketch after this list)

2.12.1.3.2.1. Acrocarpus fraxinifolius 1
2.12.1.3.2.2. Calycophyllum spruceanum 2
2.12.1.3.2.3. Cedrela Mixed 3
2.12.1.3.2.4. Guazuma crinita 4
2.12.1.3.2.5. Miconia barbeyana 5
2.12.1.3.2.6. Ochroma pyramidale 6
2.12.1.3.2.7. Other Mixed 7
2.12.1.3.2.8. Swietenia Cedrela Mixed 8
2.12.1.3.2.9. Swietenia macrophylla 9
2.12.1.3.2.10. Swietenia Mixed 10
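
For reference, the same mapping as a Python dict (a convenience sketch; the code below works with the numeric labels directly):

species_labels = {
    "Acrocarpus fraxinifolius": 1,
    "Calycophyllum spruceanum": 2,
    "Cedrela Mixed": 3,
    "Guazuma crinita": 4,
    "Miconia barbeyana": 5,
    "Ochroma pyramidale": 6,
    "Other Mixed": 7,
    "Swietenia Cedrela Mixed": 8,
    "Swietenia macrophylla": 9,
    "Swietenia Mixed": 10,
}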
[4]:
Desired_columns = ['X', 'Y', 'b0', 'b1','b2', 'b3','b4', 'b5', 'b6', 'b7', 'b8', 'b9', 'b10', 'b11', 'b12', 'Classification']
print(Desired_columns)
['X', 'Y', 'b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7', 'b8', 'b9', 'b10', 'b11', 'b12', 'Classification']
[5]:
Desired_Output = predictors[Desired_columns]
Desired_Output.head()
[5]:
X Y b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 Classification
0 -76.75757 -7.1862 1230 932 792 507 798 1965 2724 2588 2981 475 11 1301 493 10
1 -76.75757 -7.1862 1230 922 787 479 796 2069 2740 2661 3053 475 11 1269 460 10
2 -76.75757 -7.1862 1227 930 806 495 805 2114 2717 2656 3130 447 10 1309 485 10
3 -76.75757 -7.1862 1227 916 761 476 741 1731 2253 2031 2501 447 10 1197 440 10
4 -76.75757 -7.1862 1232 943 823 535 854 2127 2660 2651 3144 461 9 1496 580 10
[6]:
Desired_Output = Desired_Output.to_numpy()

2.12.1.3.3. The input and output variables are split into training and testing sets in a 70:30 ratio

2.12.1.4. The first 14 columns (X, Y, and bands b0–b11) are used as the X variables; note that the slicing below leaves out b12. The final column, Classification, holds the categorical labels and is the target (Y) variable

[7]:
#Split the data 70:30; columns 0-13 (X, Y, b0-b11) are the predictors, column 15 (Classification) is the target
X_train, X_test, y_train, y_test = train_test_split(Desired_Output[:,:14], Desired_Output[:,15], test_size=0.30, random_state=0)
X_train = torch.FloatTensor(X_train)
y_train = torch.LongTensor(y_train)-1 # shift labels 1-10 to 0-9 for CrossEntropyLoss
X_test = torch.FloatTensor(X_test)
y_test = torch.LongTensor(y_test)-1
print('X_train.shape: {}, X_test.shape: {}, y_train.shape: {}, y_test.shape: {}'.format(X_train.shape, X_test.shape, y_train.shape, y_test.shape))
X_train.shape: torch.Size([6942, 14]), X_test.shape: torch.Size([2976, 14]), y_train.shape: torch.Size([6942]), y_test.shape: torch.Size([2976])

2.12.1.4.1. A feedforward module with two hidden layers is built, with different activation functions applied to the layers (ReLU on the hidden layers, tanh on the output)

[8]:
# Creating a feedforward module: two hidden ReLU layers and an output layer

class Feedforward(torch.nn.Module):
    def __init__(self, input_size, hidden_size, output_size=10):
        super(Feedforward, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.fc1 = torch.nn.Linear(self.input_size, self.hidden_size)
        self.fc2 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        self.relu = torch.nn.ReLU()
        self.fc3 = torch.nn.Linear(self.hidden_size, output_size)
        self.sigmoid = torch.nn.Sigmoid()  # defined but not used in forward()
        self.tanh = torch.nn.Tanh()
    def forward(self, x):
        hidden = self.relu(self.fc1(x))
        hidden = self.relu(self.fc2(hidden))
        # tanh squashes the outputs to [-1, 1]; CrossEntropyLoss expects
        # unbounded raw logits, so this choice limits what the model can learn
        output = self.tanh(self.fc3(hidden))

        return output
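
Since CrossEntropyLoss combines log-softmax with the negative log-likelihood, a variant that returns the raw logits would be the more conventional choice; a hypothetical drop-in replacement for forward(), not run in this notebook:

    def forward(self, x):
        hidden = self.relu(self.fc1(x))
        hidden = self.relu(self.fc2(hidden))
        return self.fc3(hidden)  # raw, unbounded logits for CrossEntropyLoss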
[9]:
model = Feedforward(14, 256) #input_size = 14 and hidden_size = 256
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
loss_function = nn.CrossEntropyLoss()
[10]:
epochs = 27  # higher values caused the Python kernel to crash, and the loss stabilizes after epoch 26
aggregated_losses = []

for i in range(epochs):
    i += 1
    y_pred = model(X_train)
    single_loss = loss_function(y_pred, y_train)
    # store the scalar value; appending the tensor itself would keep the whole
    # autograd graph alive and steadily exhaust memory (a likely cause of the crashes)
    aggregated_losses.append(single_loss.item())

    if i%25 == 1:
        print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')

    optimizer.zero_grad()
    single_loss.backward()
    optimizer.step()

print(f'epoch: {i:3} loss: {single_loss.item():10.10f}')
epoch:   1 loss: 2.96000528
epoch:  26 loss: 2.80332470
epoch:  27 loss: 2.8033246994
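
matplotlib is imported at the top but never used; plotting the collected losses confirms the plateau after the first few epochs. A minimal sketch:

plt.plot(range(1, epochs + 1), aggregated_losses)
plt.xlabel('epoch')
plt.ylabel('training cross-entropy loss')
plt.show()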
[11]:
with torch.no_grad():
    y_val = model(X_test)
    loss = loss_function(y_val, y_test)
print(f'Loss: {loss:.8f}')
Loss: 2.82178712
[12]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

print(confusion_matrix(y_test, y_val.argmax(dim=1)))
print(classification_report(y_test,y_val.argmax(dim=1)))
print(accuracy_score(y_test, y_val.argmax(dim=1)))
[[  0  52   0   0   0   0   0   0   0]
 [  0  17   0   0   0   0   0   0   0]
 [  0  10   0   0   0   0   0   0   0]
 [  0 269   0   0   0   0   0   0   0]
 [  0 869   0   0   0   0   0   0   0]
 [  0 922   0   0   0   0   0   0   0]
 [  0 293   0   0   0   0   0   0   0]
 [  0   8   0   0   0   0   0   0   0]
 [  0 536   0   0   0   0   0   0   0]]
              precision    recall  f1-score   support

           0       0.00      0.00      0.00        52
           1       0.01      1.00      0.01        17
           2       0.00      0.00      0.00        10
           3       0.00      0.00      0.00       269
           5       0.00      0.00      0.00       869
           6       0.00      0.00      0.00       922
           7       0.00      0.00      0.00       293
           8       0.00      0.00      0.00         8
           9       0.00      0.00      0.00       536

    accuracy                           0.01      2976
   macro avg       0.00      0.11      0.00      2976
weighted avg       0.00      0.01      0.00      2976

0.00571236559139785
/usr/lib/python3/dist-packages/sklearn/metrics/_classification.py:1272: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
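
The confusion matrix shows the network collapsing onto a single class: every test sample is predicted as class 1 (species label 2, Calycophyllum spruceanum, after the -1 shift). This can be confirmed directly with a quick check (not in the original run):

# counts of each predicted class (0-9) over the test set
print(torch.bincount(y_val.argmax(dim=1), minlength=10))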

2.12.1.4.2. Creating a single-layer perceptron model

[13]:
# Create the model
class Perceptron(torch.nn.Module):
    def __init__(self,input_size, output_size,use_activation_fn=None):
        super(Perceptron, self).__init__()
        self.fc = nn.Linear(input_size,output_size)
        self.relu = torch.nn.ReLU() # instead of Heaviside step fn
        self.sigmoid = torch.nn.Sigmoid()
        self.tanh = torch.nn.Tanh()
        self.use_activation_fn=use_activation_fn
    def forward(self, x):
        output = self.fc(x)
        if self.use_activation_fn=='sigmoid':
            output = self.sigmoid(output) # adds the non-linearity; try training your Perceptron with and without it
        elif self.use_activation_fn=='tanh':
            output = self.tanh(output)
        elif self.use_activation_fn=='relu':
            output = self.relu(output)

        return output
[46]:
model = Perceptron(input_size=14, output_size=10, use_activation_fn='tanh')
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr = 0.01)
[47]:
epochs = 300
aggregated_losses = []

for i in range(epochs):
    i += 1
    y_pred = model(X_train)
    single_loss = criterion(y_pred, y_train)  # use the criterion defined above
    aggregated_losses.append(single_loss.item())  # store the scalar, not the graph-carrying tensor

    if i%25 == 1:
        print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')

    optimizer.zero_grad()
    single_loss.backward()
    optimizer.step()

print(f'epoch: {i:3} loss: {single_loss.item():10.10f}')
epoch:   1 loss: 2.86867380
epoch:  26 loss: 2.86806893
epoch:  51 loss: 2.86806870
epoch:  76 loss: 2.86806870
epoch: 101 loss: 2.86806870
epoch: 126 loss: 2.86806870
epoch: 151 loss: 2.86801815
epoch: 176 loss: 2.86790514
epoch: 201 loss: 2.86790514
epoch: 226 loss: 2.86790514
epoch: 251 loss: 2.86790514
epoch: 276 loss: 2.86790514
epoch: 300 loss: 2.8679051399
[48]:
with torch.no_grad():
    y_val = model(X_test)
    loss = criterion(y_val, y_test)
print(f'Loss: {loss:.8f}')
Loss: 2.87435365
[49]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

print(confusion_matrix(y_test, y_val.argmax(dim=1)))
print(classification_report(y_test,y_val.argmax(dim=1)))
print(accuracy_score(y_test, y_val.argmax(dim=1)))
[[ 52   0   0   0   0   0   0   0   0]
 [ 17   0   0   0   0   0   0   0   0]
 [ 10   0   0   0   0   0   0   0   0]
 [269   0   0   0   0   0   0   0   0]
 [869   0   0   0   0   0   0   0   0]
 [922   0   0   0   0   0   0   0   0]
 [293   0   0   0   0   0   0   0   0]
 [  8   0   0   0   0   0   0   0   0]
 [536   0   0   0   0   0   0   0   0]]
              precision    recall  f1-score   support

           0       0.02      1.00      0.03        52
           1       0.00      0.00      0.00        17
           2       0.00      0.00      0.00        10
           3       0.00      0.00      0.00       269
           5       0.00      0.00      0.00       869
           6       0.00      0.00      0.00       922
           7       0.00      0.00      0.00       293
           8       0.00      0.00      0.00         8
           9       0.00      0.00      0.00       536

    accuracy                           0.02      2976
   macro avg       0.00      0.11      0.00      2976
weighted avg       0.00      0.02      0.00      2976

0.01747311827956989

2.12.2. Results:

2.12.2.1. Neither the single-layer perceptron nor the multi-layer feedforward model achieved useful accuracy in distinguishing the different tree plantations from the Sentinel-2 dataset; in both cases the network collapsed onto a single predicted class. There could be multiple reasons for this. Since Sentinel-2 spectral bands make up most of the X variables, and the tree species are spectrally very similar, the model may have had too little signal with which to separate the species. Poor choices of modelling parameters, such as the activation functions, the learning rate and the number of epochs, could also have affected the model. In the future, more input variables such as texture metrics and Sentinel-1 bands could be added to the model to provide more discriminative information, and different modelling parameters could be explored to improve the accuracy.
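
One such variation (a hypothetical sketch, not run here): the raw band values span thousands of digital numbers, so standardizing the inputs with training-set statistics and switching to the Adam optimizer may give gradient descent a better-conditioned problem. This assumes the Feedforward class and the train/test tensors defined above:

# standardize features using training-set statistics only
mean, std = X_train.mean(dim=0), X_train.std(dim=0)
X_train_n = (X_train - mean) / std
X_test_n = (X_test - mean) / std

model = Feedforward(14, 256)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X_train_n), y_train)
    loss.backward()
    optimizer.step()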

[55]:
!jupyter nbconvert --to html /media/sf_LVM_shared/my_SE_data/exercise/Final_Project.ipynb
[NbConvertApp] Converting notebook /media/sf_LVM_shared/my_SE_data/exercise/Final_Project.ipynb to html
[NbConvertApp] Writing 642237 bytes to /media/sf_LVM_shared/my_SE_data/exercise/Final_Project.html