Estimation of tree height using GEDI dataset - Perceptron tree prediction - 2023

jupyter-notebook Tree_Height_06Perceptron_pred_2023.ipynb
[12]:
# Ref: https://medium.com/biaslyai/pytorch-introduction-to-neural-network-feedforward-neural-network-model-e7231cff47cb

'''
Packages (installation from scratch)
conda create -n geo_comp2 python=3.8
conda activate geo_comp2
conda install pytorch==1.12.0 torchvision==0.13.0 cudatoolkit=11.3 -c pytorch
conda install -c conda-forge jupyterlab
conda install -c conda-forge matplotlib
conda install -c anaconda scikit-learn
conda install -c anaconda pandas
conda install -c anaconda scipy

Alternative PyTorch installs:
if you have GPUs: conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
for CPU only:     conda install pytorch torchvision torchaudio cpuonly -c pytorch
older CUDA 10.2:  conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
'''
import os
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import scipy
import pandas as pd
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from scipy import stats

"""
Fix random seeds.
"""
seed=31
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
np.random.seed(seed)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
Using device: cuda

We have already seen how to design a simple Perceptron to perform regression, and we observed that the normalization of the data and target should match the structure of our model. However, just guaranteeing that the data is in the correct range is not enough. Let’s see how it works.
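
Before diving in, here is a minimal sketch (an illustration added here, not a cell from the original run) of why the output activation and the target range have to agree: a sigmoid output is bounded to (0, 1) and a tanh output to (-1, 1), so a target left in raw units can never be matched.

import torch

x = torch.linspace(-5, 5, 5)
print(torch.sigmoid(x))  # values strictly inside (0, 1) -> targets should be scaled to [0, 1]
print(torch.tanh(x))     # values strictly inside (-1, 1) -> targets should be scaled to [-1, 1]
# A raw target such as tree height in cm (up to ~14000 here) lies far outside both ranges,
# so an MSE loss against such an output could never reach zero.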

[4]:
# #What are we trying to beat
# data = pd.read_csv('./tree_height/txt/eu_x_y_height_predictors.txt',  sep=" ")
# # data = pd.read_csv('./tree_height/txt/eu_x_y_height.txt',  sep=" ")
# print(data.shape)
# print(data.head())

# y_true = data['h']
# y_pred = data['forest_height']

# slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(y_pred, y_true)

# fig,ax=plt.subplots(1,1,figsize=(5,5))
# ax.scatter(y_pred, y_true)
# ax.set_xlabel('Prediction')
# ax.set_ylabel('True')
# ax.set_title('slope: {:.4f}, r_value: {:.4f}'.format(slope, r_value))
# plt.show()

Let’s start by loading the data we used in the previous class

[5]:
### Try the tree height prediction with a Perceptron
# data = pd.read_csv('./tree_height/txt/eu_x_y_height_predictors.txt',  sep=" ")
# data = pd.read_csv('./tree_height/txt/eu_x_y_height.txt',  sep=" ")
data = pd.read_csv("./tree_height_2/txt/eu_x_y_height_predictors_select.txt", sep=" ",  index_col=False)
print(data.shape)
print(data.head())
(1267239, 23)
   ID         X          Y        h  BLDFIE_WeigAver  CECSOL_WeigAver  \
0   1  6.050001  49.727499  3139.00             1540               13
1   2  6.050002  49.922155  1454.75             1491               12
2   3  6.050002  48.602377   853.50             1521               17
3   4  6.050009  48.151979  3141.00             1526               16
4   5  6.050010  49.588410  2065.25             1547               14

   CHELSA_bio18  CHELSA_bio4  convergence        cti  ...  forestheight  \
0          2113         5893   -10.486560 -238043120  ...            23
1          1993         5912    33.274361 -208915344  ...            19
2          2124         5983     0.045293 -137479792  ...            21
3          2569         6130   -33.654274 -267223072  ...            27
4          2108         5923    27.493824 -107809368  ...            25

   glad_ard_SVVI_max  glad_ard_SVVI_med  glad_ard_SVVI_min  northness  \
0         276.871094          46.444092         347.665405   0.042500
1         -49.526367          19.552734        -130.541748   0.182780
2          93.257324          50.743652         384.522461   0.036253
3         542.401367         202.264160         386.156738   0.005139
4         136.048340         146.835205         198.127441   0.028847

   ORCDRC_WeigAver  outlet_dist_dw_basin  SBIO3_Isothermality_5_15cm  \
0                9                780403                   19.798992
1               16                772777                   20.889412
2               14                898820                   20.695877
3               15                831824                   19.375000
4               17                796962                   18.777500

   SBIO4_Temperature_Seasonality_5_15cm  treecover
0                            440.672211         85
1                            457.756195         85
2                            481.879700         62
3                            479.410278         85
4                            457.880066         85

[5 rows x 23 columns]
[6]:
#Keep just a few columns (X, Y, h)
data = data.iloc[:,1:4]
print(data.shape)
print(data.head())
(1267239, 3)
          X          Y        h
0  6.050001  49.727499  3139.00
1  6.050002  49.922155  1454.75
2  6.050002  48.602377   853.50
3  6.050009  48.151979  3141.00
4  6.050010  49.588410  2065.25

Once again, let’s take a look at the distributions and range of our data

[7]:
#Explore the raw data
data = data.to_numpy()
fig,ax = plt.subplots(1,3,figsize=(15,5))
ax[0].hist(data[:,0],50)
ax[1].hist(data[:,1],50)
ax[2].hist(data[:,2],50)
[7]:
(array([8.96640e+04, 5.91560e+04, 6.27730e+04, 7.68820e+04, 9.90510e+04,
        1.20582e+05, 1.36417e+05, 1.50469e+05, 1.55594e+05, 1.31737e+05,
        9.07160e+04, 4.94050e+04, 2.35680e+04, 1.03910e+04, 4.47900e+03,
        1.93800e+03, 9.18000e+02, 5.20000e+02, 3.57000e+02, 2.92000e+02,
        2.79000e+02, 1.95000e+02, 2.02000e+02, 1.98000e+02, 1.64000e+02,
        1.59000e+02, 1.41000e+02, 1.14000e+02, 1.22000e+02, 9.30000e+01,
        1.02000e+02, 9.70000e+01, 6.80000e+01, 6.90000e+01, 5.20000e+01,
        3.40000e+01, 3.30000e+01, 3.80000e+01, 2.40000e+01, 2.20000e+01,
        2.40000e+01, 1.80000e+01, 1.40000e+01, 1.60000e+01, 1.00000e+01,
        9.00000e+00, 1.40000e+01, 5.00000e+00, 6.00000e+00, 8.00000e+00]),
 array([   84.75 ,   372.675,   660.6  ,   948.525,  1236.45 ,  1524.375,
         1812.3  ,  2100.225,  2388.15 ,  2676.075,  2964.   ,  3251.925,
         3539.85 ,  3827.775,  4115.7  ,  4403.625,  4691.55 ,  4979.475,
         5267.4  ,  5555.325,  5843.25 ,  6131.175,  6419.1  ,  6707.025,
         6994.95 ,  7282.875,  7570.8  ,  7858.725,  8146.65 ,  8434.575,
         8722.5  ,  9010.425,  9298.35 ,  9586.275,  9874.2  , 10162.125,
        10450.05 , 10737.975, 11025.9  , 11313.825, 11601.75 , 11889.675,
        12177.6  , 12465.525, 12753.45 , 13041.375, 13329.3  , 13617.225,
        13905.15 , 14193.075, 14481.   ]),
 <a list of 50 Patch objects>)
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_9_1.png

Now we normalize the data

[8]:
#Normalize the data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
data = scaler.fit_transform(data)

fig,ax = plt.subplots(1,3,figsize=(15,5))
ax[0].hist(data[:,0],50)
ax[1].hist(data[:,1],50)
ax[2].hist(data[:,2],50)
[8]:
(array([8.96640e+04, 5.91560e+04, 6.27730e+04, 7.68820e+04, 9.90510e+04,
        1.20582e+05, 1.36417e+05, 1.50469e+05, 1.55594e+05, 1.31886e+05,
        9.05670e+04, 4.94050e+04, 2.35680e+04, 1.03910e+04, 4.47900e+03,
        1.93800e+03, 9.18000e+02, 5.20000e+02, 3.57000e+02, 2.92000e+02,
        2.79000e+02, 1.95000e+02, 2.02000e+02, 1.98000e+02, 1.64000e+02,
        1.59000e+02, 1.41000e+02, 1.14000e+02, 1.22000e+02, 9.30000e+01,
        1.02000e+02, 9.70000e+01, 6.80000e+01, 6.90000e+01, 5.20000e+01,
        3.40000e+01, 3.30000e+01, 3.80000e+01, 2.40000e+01, 2.20000e+01,
        2.40000e+01, 1.80000e+01, 1.40000e+01, 1.60000e+01, 1.00000e+01,
        9.00000e+00, 1.40000e+01, 5.00000e+00, 6.00000e+00, 8.00000e+00]),
 array([0.  , 0.02, 0.04, 0.06, 0.08, 0.1 , 0.12, 0.14, 0.16, 0.18, 0.2 ,
        0.22, 0.24, 0.26, 0.28, 0.3 , 0.32, 0.34, 0.36, 0.38, 0.4 , 0.42,
        0.44, 0.46, 0.48, 0.5 , 0.52, 0.54, 0.56, 0.58, 0.6 , 0.62, 0.64,
        0.66, 0.68, 0.7 , 0.72, 0.74, 0.76, 0.78, 0.8 , 0.82, 0.84, 0.86,
        0.88, 0.9 , 0.92, 0.94, 0.96, 0.98, 1.  ]),
 <a list of 50 Patch objects>)
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_11_1.png
[9]:
#Split the data
X_train, X_test, y_train, y_test = train_test_split(data[:,:2], data[:,2], test_size=0.30, random_state=0)
X_train = torch.FloatTensor(X_train)
y_train = torch.FloatTensor(y_train)
X_test = torch.FloatTensor(X_test)
y_test = torch.FloatTensor(y_test)
print('X_train.shape: {}, X_test.shape: {}, y_train.shape: {}, y_test.shape: {}'.format(X_train.shape, X_test.shape, y_train.shape, y_test.shape))
X_train.shape: torch.Size([887067, 2]), X_test.shape: torch.Size([380172, 2]), y_train.shape: torch.Size([887067]), y_test.shape: torch.Size([380172])

Finally, let’s create and initialize our Perceptron

[10]:
# Create the model
class Perceptron(torch.nn.Module):
    def __init__(self,input_size, output_size,use_activation_fn=None):
        super(Perceptron, self).__init__()
        self.fc = nn.Linear(input_size,output_size)
        self.relu = torch.nn.ReLU() # instead of Heaviside step fn
        self.sigmoid = torch.nn.Sigmoid()
        self.tanh = torch.nn.Tanh()
        self.use_activation_fn=use_activation_fn
    def forward(self, x):
        output = self.fc(x)
        if self.use_activation_fn=='sigmoid':
            output = self.sigmoid(output) # To add the non-linearity. Try training your Perceptron with and without the non-linearity
        elif self.use_activation_fn=='tanh':
            output = self.tanh(output)
        elif self.use_activation_fn=='relu':
            output = self.relu(output)

        return output
[13]:
#Just in case, remember to always delete previously existing models before starting a new session
if 'model' in globals():
    print('Deleting previous model')
    del model, criterion, optimizer
model = Perceptron(input_size=2, output_size=1, use_activation_fn='sigmoid')
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr = 0.01)

Let’s train and check our model’s performance

[14]:
model.train()
epochs = 5000
all_loss=[]
for epoch in range(epochs):
    optimizer.zero_grad()
    # Forward pass
    y_pred = model(X_train)
    # Compute Loss
    loss = criterion(y_pred.squeeze(), y_train)

    # Backward pass
    loss.backward()
    optimizer.step()

    all_loss.append(loss.item())

model.eval()
with torch.no_grad():
    y_pred = model(X_test)
    after_train = criterion(y_pred.squeeze(), y_test)
    print('Test loss after Training' , after_train.item())

    y_pred = y_pred.detach().numpy().squeeze()
    slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(y_pred, y_test)

    # Fit line
    # x = np.arange(-150,150)

    fig,ax=plt.subplots(1,2,figsize=(10,5))
    ax[1].plot(all_loss)
    ax[0].scatter(y_pred, y_test)
    # ax[0].plot(x, intercept + slope*x, 'r', label='fitted line')
    ax[0].set_xlabel('Prediction')
    ax[0].set_ylabel('True')
    ax[0].set_title('slope: {:.3f}, r_value: {:.3f}'.format(slope, r_value))
Test loss after Training 0.005871600937098265
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_17_1.png
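
As a side note, r2_score was imported above but never used; here is a minimal sketch of how it could complement the r_value shown in the plot title, assuming the y_test tensor and the y_pred array from the cell above:

from sklearn.metrics import r2_score

# Coefficient of determination of the predictions themselves (can be negative for poor fits),
# unlike r_value, which is the Pearson correlation of the linear fit between y_pred and y_test.
print('R2:', r2_score(y_test.numpy(), y_pred))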

Well, they all performed badly. What can we do to make this model better? One thing that immediately jumps out is that this data is very skewed. There are many ways to normalize skewed data. Check out this great post about normalization and popular power transformations. Here I will introduce another one: the Quantile Transformation. Also, if you would like to see a great comparison between transformations, check out this sklearn post.

For now, let’s apply the quantile transformation just to our target variable (i.e., tree height).
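
For comparison, one of the power transformations mentioned above could be used in the same spot. A minimal sketch (not executed in this notebook), assuming the raw h column before any scaling:

from sklearn.preprocessing import PowerTransformer

# Yeo-Johnson power transform (handles zero and negative values as well as positive ones);
# like the QuantileTransformer below, it would be fit on the target column only.
pt = PowerTransformer(method="yeo-johnson", standardize=True)
# data[:,2] = pt.fit_transform(data[:,2].reshape(-1,1)).squeeze()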

[15]:
from sklearn.preprocessing import QuantileTransformer
qt = QuantileTransformer(
    n_quantiles=500, output_distribution="normal", random_state=0
)

# X_train, X_test = train_test_split(data, test_size=0.5)
# data = qt.fit_transform(data)
data[:,2] = qt.fit_transform(data[:,2].reshape(-1,1)).squeeze()
[16]:
#Normalize the data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
data = scaler.fit_transform(data)
# Rescale the target to [-1, 1] so it matches the tanh output range used later
data[:,2] = ((data[:,2]-data[:,2].min())/(data[:,2].max()-data[:,2].min()))*2-1
# data[:,2] = data[:,2]/np.quantile(np.abs(data[:,2]),0.99)
data[:,2] = data[:,2]/data[:,2].max()
# data = data/np.quantile(np.abs(data),0.99)
# data = data/data.max()
# print(data.shape)
fig,ax = plt.subplots(1,3,figsize=(15,5))
ax[0].hist(data[:,0],50)
ax[1].hist(data[:,1],50)
ax[2].hist(data[:,2],50)
[16]:
(array([6.40000e+01, 0.00000e+00, 0.00000e+00, 0.00000e+00, 8.00000e+00,
        5.00000e+00, 1.60000e+01, 4.10000e+01, 1.04000e+02, 4.76000e+02,
        1.75900e+03, 2.88100e+03, 4.45900e+03, 6.93700e+03, 9.86300e+03,
        1.22210e+04, 2.18220e+04, 3.01310e+04, 4.54470e+04, 5.62710e+04,
        6.77220e+04, 7.39930e+04, 8.85820e+04, 1.03028e+05, 1.08294e+05,
        1.03491e+05, 9.61700e+04, 9.32570e+04, 8.00070e+04, 6.65440e+04,
        5.62140e+04, 4.55490e+04, 3.18110e+04, 1.88250e+04, 1.54770e+04,
        1.08020e+04, 6.05500e+03, 4.45700e+03, 2.01500e+03, 1.54800e+03,
        4.38000e+02, 1.48000e+02, 4.60000e+01, 1.80000e+01, 3.00000e+00,
        5.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 2.35000e+02]),
 array([-1.  , -0.96, -0.92, -0.88, -0.84, -0.8 , -0.76, -0.72, -0.68,
        -0.64, -0.6 , -0.56, -0.52, -0.48, -0.44, -0.4 , -0.36, -0.32,
        -0.28, -0.24, -0.2 , -0.16, -0.12, -0.08, -0.04,  0.  ,  0.04,
         0.08,  0.12,  0.16,  0.2 ,  0.24,  0.28,  0.32,  0.36,  0.4 ,
         0.44,  0.48,  0.52,  0.56,  0.6 ,  0.64,  0.68,  0.72,  0.76,
         0.8 ,  0.84,  0.88,  0.92,  0.96,  1.  ]),
 <a list of 50 Patch objects>)
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_21_1.png
[17]:
# Let's use all the data as one big minibatch

#Split the data
X_train, X_test, y_train, y_test = train_test_split(data[:,:2], data[:,2], test_size=0.30, random_state=0)
X_train = torch.FloatTensor(X_train)
y_train = torch.FloatTensor(y_train)
X_test = torch.FloatTensor(X_test)
y_test = torch.FloatTensor(y_test)
print('X_train.shape: {}, X_test.shape: {}, y_train.shape: {}, y_test.shape: {}'.format(X_train.shape, X_test.shape, y_train.shape, y_test.shape))
print('X_train.min: {}, X_test.min: {}, y_train.min: {}, y_test.min: {}'.format(X_train.min(), X_test.min(), y_train.min(), y_test.min()))
print('X_train.max: {}, X_test.max: {}, y_train.max: {}, y_test.max: {}'.format(X_train.max(), X_test.max(), y_train.max(), y_test.max()))
X_train.shape: torch.Size([887067, 2]), X_test.shape: torch.Size([380172, 2]), y_train.shape: torch.Size([887067]), y_test.shape: torch.Size([380172])
X_train.min: 0.0, X_test.min: 0.0, y_train.min: -1.0, y_test.min: -1.0
X_train.max: 1.0, X_test.max: 1.0, y_train.max: 1.0, y_test.max: 1.0
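
The cell above feeds all the data as one big minibatch. A minimal sketch (not used in this notebook) of how the same tensors could instead be wrapped in a DataLoader to train on smaller mini-batches; the batch size is an illustrative choice:

from torch.utils.data import TensorDataset, DataLoader

train_ds = TensorDataset(X_train, y_train)
train_loader = DataLoader(train_ds, batch_size=4096, shuffle=True)  # batch size is arbitrary here

# for xb, yb in train_loader:
#     y_pred = model(xb.to(device))
#     loss = criterion(y_pred.squeeze(), yb.to(device))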
[23]:
# model.train()
epochs = 2000
hid_dim_range = [1]
lr_range = [0.5,0.1,0.01]#,0.05,0.001]

for hid_dim in hid_dim_range:
    for lr in lr_range:
        print('\nhid_dim: {}, lr: {}'.format(hid_dim, lr))
        if 'model' in globals():
            print('Deleting previous model')
            del model, criterion, optimizer
        model = Perceptron(2, hid_dim,use_activation_fn='tanh').to(device)
        criterion = torch.nn.MSELoss()
        optimizer = torch.optim.SGD(model.parameters(), lr = lr)

        all_loss_train=[]
        all_loss_val=[]
        for epoch in range(epochs):
            model.train()
            optimizer.zero_grad()
            # Forward pass
            y_pred = model(X_train.to(device))
            # Compute Loss
            loss = criterion(y_pred.squeeze(), y_train.to(device))

            # Backward pass
            loss.backward()
            optimizer.step()

            all_loss_train.append(loss.item())

            model.eval()
            with torch.no_grad():
                y_pred = model(X_test.to(device))
                # Compute Loss
                loss = criterion(y_pred.squeeze(), y_test.to(device))
                all_loss_val.append(loss.item())

                if epoch%100==0:
                    y_pred = y_pred.cpu().detach().numpy().squeeze()
                    slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(y_pred, y_test)
                    print('Epoch {}, train_loss: {:.4f}, val_loss: {:.4f}, r_value: {:.4f}'.format(epoch,all_loss_train[-1],all_loss_val[-1],r_value))

        fig,ax=plt.subplots(1,2,figsize=(10,5))
        ax[0].plot(all_loss_train)
        ax[0].plot(all_loss_val)

        ax[1].scatter(y_pred.cpu(), y_test.cpu())
        ax[1].set_xlabel('Prediction')
        ax[1].set_ylabel('True')
        ax[1].set_title('slope: {:.4f}, r_value: {:.4f}'.format(slope, r_value))
        plt.show()

hid_dim: 1, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.2698, val_loss: 0.0818, r_value: -0.0894
Epoch 100, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 200, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 300, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 500, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 700, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 900, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1000, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1100, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1200, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1300, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1500, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1700, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1900, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_23_1.png

hid_dim: 1, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.6674, val_loss: 0.6119, r_value: 0.0545
Epoch 100, train_loss: 0.0375, val_loss: 0.0375, r_value: 0.0678
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0834
Epoch 300, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0876
Epoch 400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0887
Epoch 500, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0890
Epoch 600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0891
Epoch 700, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 900, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1000, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1100, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1200, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1300, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1500, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1700, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
Epoch 1900, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0892
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_23_3.png

hid_dim: 1, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.2034, val_loss: 0.1965, r_value: -0.0886
Epoch 100, train_loss: 0.0656, val_loss: 0.0652, r_value: -0.0892
Epoch 200, train_loss: 0.0594, val_loss: 0.0590, r_value: -0.0890
Epoch 300, train_loss: 0.0546, val_loss: 0.0543, r_value: -0.0889
Epoch 400, train_loss: 0.0509, val_loss: 0.0506, r_value: -0.0888
Epoch 500, train_loss: 0.0479, val_loss: 0.0476, r_value: -0.0887
Epoch 600, train_loss: 0.0455, val_loss: 0.0453, r_value: -0.0886
Epoch 700, train_loss: 0.0437, val_loss: 0.0435, r_value: -0.0885
Epoch 800, train_loss: 0.0423, val_loss: 0.0421, r_value: -0.0883
Epoch 900, train_loss: 0.0412, val_loss: 0.0410, r_value: -0.0881
Epoch 1000, train_loss: 0.0403, val_loss: 0.0402, r_value: -0.0878
Epoch 1100, train_loss: 0.0396, val_loss: 0.0395, r_value: -0.0875
Epoch 1200, train_loss: 0.0391, val_loss: 0.0390, r_value: -0.0869
Epoch 1300, train_loss: 0.0387, val_loss: 0.0386, r_value: -0.0861
Epoch 1400, train_loss: 0.0384, val_loss: 0.0383, r_value: -0.0848
Epoch 1500, train_loss: 0.0382, val_loss: 0.0381, r_value: -0.0824
Epoch 1600, train_loss: 0.0380, val_loss: 0.0379, r_value: -0.0775
Epoch 1700, train_loss: 0.0378, val_loss: 0.0378, r_value: -0.0654
Epoch 1800, train_loss: 0.0377, val_loss: 0.0376, r_value: -0.0317
Epoch 1900, train_loss: 0.0376, val_loss: 0.0376, r_value: 0.0292
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_23_5.png

Feel free to try other hyperparameters, but we have already seen that there is only so much a Perceptron can do. It’s time to scale up this problem. (BACK TO SLIDES)

Feedforward neural network


[24]:
# Try with FF
class Feedforward(torch.nn.Module):
    def __init__(self, input_size, hidden_size, output_size=1):
        super(Feedforward, self).__init__()
        self.input_size = input_size
        self.hidden_size  = hidden_size
        self.fc1 = torch.nn.Linear(self.input_size, self.hidden_size)
        self.fc2 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        # self.fc3 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        # self.fc4 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        self.relu = torch.nn.ReLU()
        self.fc3 = torch.nn.Linear(self.hidden_size, output_size)
        self.sigmoid = torch.nn.Sigmoid()
        self.tanh = torch.nn.Tanh()
    def forward(self, x):
        hidden = self.relu(self.fc1(x))
        hidden = self.relu(self.fc2(hidden))
        # hidden = self.relu(self.fc3(hidden))
        # hidden = self.relu(self.fc4(hidden))
        output = self.tanh(self.fc3(hidden))

        return output
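
To make the jump from a single Perceptron concrete, here is a small sketch (assuming the Perceptron and Feedforward classes defined above) that counts trainable parameters:

def count_params(m):
    # total number of trainable parameters
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

print(count_params(Perceptron(2, 1)))      # 3 parameters (2 weights + 1 bias)
print(count_params(Feedforward(2, 128)))   # 17025 parameters with hidden_size=128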
[37]:
# model.train()
epochs = 2000
hid_dim_range = [128,256,512]
lr_range = [0.75,0.5,0.1,0.01,0.05,0.001]

#Let's create a place to save these models, so we can reload them later
path_to_save_models = './models'
if not os.path.exists(path_to_save_models):
    os.makedirs(path_to_save_models)

for hid_dim in hid_dim_range:
    for lr in lr_range:
        print('\nhid_dim: {}, lr: {}'.format(hid_dim, lr))
        if 'model' in globals():
            print('Deleting previous model')
            del model, criterion, optimizer
        model = Feedforward(2, hid_dim).to(device)
        criterion = torch.nn.MSELoss()
        optimizer = torch.optim.SGD(model.parameters(), lr = lr)

        all_loss_train=[]
        all_loss_val=[]
        all_r_train=[]
        all_r_val=[]
        for epoch in range(epochs):
            model.train()
            optimizer.zero_grad()
            # Forward pass
            y_pred = model(X_train.to(device))
            # Compute Loss
            loss = criterion(y_pred.squeeze(), y_train.to(device))

            # Backward pass
            loss.backward()
            optimizer.step()

            all_loss_train.append(loss.item())
            y_pred = y_pred.cpu().detach().numpy().squeeze()
            _, _, r_value_train, _, _ = scipy.stats.linregress(y_pred, y_train)
            all_r_train.append(r_value_train)

            model.eval()
            with torch.no_grad():
                y_pred = model(X_test.to(device))
                # Compute Loss
                loss = criterion(y_pred.squeeze(), y_test.to(device))
                all_loss_val.append(loss.item())

                y_pred = y_pred.cpu().detach().numpy().squeeze()
                slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(y_pred, y_test)
                all_r_val.append(r_value)

                if epoch%200==0:
                    print('Epoch {}, train_loss: {:.4f}, val_loss: {:.4f}, r_value: {:.4f}'.format(epoch,all_loss_train[-1],all_loss_val[-1],r_value))

        fig,ax=plt.subplots(1,3,figsize=(15,5))
        ax[0].plot(np.log10(all_loss_train))
        ax[0].plot(np.log10(all_loss_val))
        ax[0].set_title('Loss')

        ax[1].plot(all_r_train)
        ax[1].plot(all_r_val)
        ax[1].set_title('Pearson Corr')

        ax[2].scatter(y_pred, y_test.cpu())
        ax[2].set_xlabel('Prediction')
        ax[2].set_ylabel('True')
        ax[2].set_title('slope: {:.4f}, r_value: {:.4f}'.format(slope, r_value))
        plt.show()

        name_to_save = os.path.join(path_to_save_models,'model_SGD_' + str(epochs) + '_lr' + str(lr) + '_hid_dim' + str(hid_dim))
        print('Saving model to ', name_to_save)
        model_state = {
                    'epoch': epoch + 1,
                    'state_dict': model.state_dict(),
                    'optimizer' : optimizer.state_dict(),
            }
        torch.save(model_state, name_to_save +'.pt')

hid_dim: 128, lr: 0.75
Deleting previous model
Epoch 0, train_loss: 0.0454, val_loss: 0.4352, r_value: 0.0631
Epoch 200, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.0945
Epoch 400, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1037
Epoch 600, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1094
Epoch 800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1149
Epoch 1000, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1206
Epoch 1200, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1267
Epoch 1400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1396
Epoch 1600, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.1543
Epoch 1800, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.1659
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_1.png
Saving model to  ./models/model_SGD_2000_lr0.75_hid_dim128

hid_dim: 128, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.0487, val_loss: 0.4800, r_value: 0.0552
Epoch 200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1086
Epoch 400, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1185
Epoch 600, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1298
Epoch 800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1370
Epoch 1000, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1464
Epoch 1200, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1546
Epoch 1400, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1608
Epoch 1600, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1650
Epoch 1800, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1683
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_3.png
Saving model to  ./models/model_SGD_2000_lr0.5_hid_dim128

hid_dim: 128, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.0570, val_loss: 0.0402, r_value: 0.0208
Epoch 200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1067
Epoch 400, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1161
Epoch 600, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1234
Epoch 800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1294
Epoch 1000, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1349
Epoch 1200, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1402
Epoch 1400, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1449
Epoch 1600, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1493
Epoch 1800, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1534
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_5.png
Saving model to  ./models/model_SGD_2000_lr0.1_hid_dim128

hid_dim: 128, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0806
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0877
Epoch 400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0932
Epoch 600, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.0978
Epoch 800, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1000
Epoch 1000, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1015
Epoch 1200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1029
Epoch 1400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1041
Epoch 1600, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1053
Epoch 1800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1064
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_7.png
Saving model to  ./models/model_SGD_2000_lr0.01_hid_dim128

hid_dim: 128, lr: 0.05
Deleting previous model
Epoch 0, train_loss: 0.0401, val_loss: 0.0379, r_value: -0.0494
Epoch 200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1058
Epoch 400, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1110
Epoch 600, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1164
Epoch 800, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1197
Epoch 1000, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1228
Epoch 1200, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1253
Epoch 1400, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1276
Epoch 1600, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1298
Epoch 1800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1319
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_9.png
Saving model to  ./models/model_SGD_2000_lr0.05_hid_dim128

hid_dim: 128, lr: 0.001
Deleting previous model
Epoch 0, train_loss: 0.0405, val_loss: 0.0404, r_value: 0.0694
Epoch 200, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0754
Epoch 400, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.0784
Epoch 600, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.0808
Epoch 800, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0828
Epoch 1000, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0845
Epoch 1200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0860
Epoch 1400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0873
Epoch 1600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0884
Epoch 1800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0893
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_11.png
Saving model to  ./models/model_SGD_2000_lr0.001_hid_dim128

hid_dim: 256, lr: 0.75
Deleting previous model
Epoch 0, train_loss: 0.0393, val_loss: 0.4495, r_value: -0.0478
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.1175
Epoch 400, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.1349
Epoch 600, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.1429
Epoch 800, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.1477
Epoch 1000, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.1511
Epoch 1200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.1536
Epoch 1400, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.1558
Epoch 1600, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.1577
Epoch 1800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1592
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_13.png
Saving model to  ./models/model_SGD_2000_lr0.75_hid_dim256

hid_dim: 256, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.0378, val_loss: 0.0639, r_value: 0.0716
Epoch 200, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1494
Epoch 400, train_loss: 0.0366, val_loss: 0.0365, r_value: 0.1743
Epoch 600, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1793
Epoch 800, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1833
Epoch 1000, train_loss: 0.0368, val_loss: 0.0367, r_value: 0.1862
Epoch 1200, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1886
Epoch 1400, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1907
Epoch 1600, train_loss: 0.0367, val_loss: 0.0366, r_value: 0.1926
Epoch 1800, train_loss: 0.0367, val_loss: 0.0366, r_value: 0.1942
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_15.png
Saving model to  ./models/model_SGD_2000_lr0.5_hid_dim256

hid_dim: 256, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.0399, val_loss: 0.0468, r_value: 0.0123
Epoch 200, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1260
Epoch 400, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1341
Epoch 600, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1406
Epoch 800, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1465
Epoch 1000, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1516
Epoch 1200, train_loss: 0.0368, val_loss: 0.0367, r_value: 0.1561
Epoch 1400, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1602
Epoch 1600, train_loss: 0.0367, val_loss: 0.0366, r_value: 0.1640
Epoch 1800, train_loss: 0.0366, val_loss: 0.0366, r_value: 0.1673
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_17.png
Saving model to  ./models/model_SGD_2000_lr0.1_hid_dim256

hid_dim: 256, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.0468, val_loss: 0.0427, r_value: 0.0789
Epoch 200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1083
Epoch 400, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1199
Epoch 600, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1256
Epoch 800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1288
Epoch 1000, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1312
Epoch 1200, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1332
Epoch 1400, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1352
Epoch 1600, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1372
Epoch 1800, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1390
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_19.png
Saving model to  ./models/model_SGD_2000_lr0.01_hid_dim256

hid_dim: 256, lr: 0.05
Deleting previous model
Epoch 0, train_loss: 0.0537, val_loss: 0.0397, r_value: -0.0593
Epoch 200, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1251
Epoch 400, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1323
Epoch 600, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1380
Epoch 800, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1434
Epoch 1000, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1475
Epoch 1200, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1514
Epoch 1400, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1549
Epoch 1600, train_loss: 0.0368, val_loss: 0.0367, r_value: 0.1579
Epoch 1800, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1608
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_21.png
Saving model to  ./models/model_SGD_2000_lr0.05_hid_dim256

hid_dim: 256, lr: 0.001
Deleting previous model
Epoch 0, train_loss: 0.0478, val_loss: 0.0473, r_value: 0.0918
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0910
Epoch 400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0931
Epoch 600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0953
Epoch 800, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.0974
Epoch 1000, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.0993
Epoch 1200, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1010
Epoch 1400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1027
Epoch 1600, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1042
Epoch 1800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1056
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_23.png
Saving model to  ./models/model_SGD_2000_lr0.001_hid_dim256

hid_dim: 512, lr: 0.75
Deleting previous model
Epoch 0, train_loss: 0.0963, val_loss: 1.0358, r_value: 0.0391
Epoch 200, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0392
Epoch 400, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0392
Epoch 600, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0394
Epoch 800, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0394
Epoch 1000, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0396
Epoch 1200, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0397
Epoch 1400, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0399
Epoch 1600, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0401
Epoch 1800, train_loss: 1.0363, val_loss: 1.0358, r_value: 0.0403
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_25.png
Saving model to  ./models/model_SGD_2000_lr0.75_hid_dim512

hid_dim: 512, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.0477, val_loss: 1.0095, r_value: 0.0546
Epoch 200, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1452
Epoch 400, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1620
Epoch 600, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1669
Epoch 800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1699
Epoch 1000, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1725
Epoch 1200, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1749
Epoch 1400, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1769
Epoch 1600, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1787
Epoch 1800, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1804
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_27.png
Saving model to  ./models/model_SGD_2000_lr0.5_hid_dim512

hid_dim: 512, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.0394, val_loss: 0.0739, r_value: 0.0666
Epoch 200, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1298
Epoch 400, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1405
Epoch 600, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1489
Epoch 800, train_loss: 0.0368, val_loss: 0.0367, r_value: 0.1565
Epoch 1000, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1628
Epoch 1200, train_loss: 0.0366, val_loss: 0.0366, r_value: 0.1681
Epoch 1400, train_loss: 0.0366, val_loss: 0.0365, r_value: 0.1725
Epoch 1600, train_loss: 0.0365, val_loss: 0.0365, r_value: 0.1761
Epoch 1800, train_loss: 0.0365, val_loss: 0.0364, r_value: 0.1791
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_29.png
Saving model to  ./models/model_SGD_2000_lr0.1_hid_dim512

hid_dim: 512, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.0389, val_loss: 0.0385, r_value: -0.0888
Epoch 200, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1009
Epoch 400, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1133
Epoch 600, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1194
Epoch 800, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1233
Epoch 1000, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1274
Epoch 1200, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1304
Epoch 1400, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1327
Epoch 1600, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1347
Epoch 1800, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1366
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_31.png
Saving model to  ./models/model_SGD_2000_lr0.01_hid_dim512

hid_dim: 512, lr: 0.05
Deleting previous model
Epoch 0, train_loss: 0.0421, val_loss: 0.0434, r_value: 0.0455
Epoch 200, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1327
Epoch 400, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1436
Epoch 600, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1507
Epoch 800, train_loss: 0.0368, val_loss: 0.0367, r_value: 0.1562
Epoch 1000, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1609
Epoch 1200, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1649
Epoch 1400, train_loss: 0.0367, val_loss: 0.0366, r_value: 0.1680
Epoch 1600, train_loss: 0.0366, val_loss: 0.0366, r_value: 0.1704
Epoch 1800, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1720
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_33.png
Saving model to  ./models/model_SGD_2000_lr0.05_hid_dim512

hid_dim: 512, lr: 0.001
Deleting previous model
Epoch 0, train_loss: 0.0465, val_loss: 0.0455, r_value: -0.0101
Epoch 200, train_loss: 0.0377, val_loss: 0.0376, r_value: 0.0107
Epoch 400, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0807
Epoch 600, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0939
Epoch 800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0984
Epoch 1000, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1008
Epoch 1200, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1027
Epoch 1400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1043
Epoch 1600, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1059
Epoch 1800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1075
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_28_35.png
Saving model to  ./models/model_SGD_2000_lr0.001_hid_dim512
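
The checkpoints above only store state dictionaries, so loading one later requires rebuilding the architecture first. A minimal sketch, assuming the Feedforward class above and one of the file names just printed:

checkpoint = torch.load('./models/model_SGD_2000_lr0.5_hid_dim256.pt', map_location=device)
model = Feedforward(2, 256).to(device)
model.load_state_dict(checkpoint['state_dict'])
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
optimizer.load_state_dict(checkpoint['optimizer'])
model.eval()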
[38]:
# Try deeper FF
class Feedforward(torch.nn.Module):
    def __init__(self, input_size, hidden_size):
        super(Feedforward, self).__init__()
        self.input_size = input_size
        self.hidden_size  = hidden_size
        self.fc1 = torch.nn.Linear(self.input_size, self.hidden_size)
        self.fc2 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        self.fc3 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        self.fc4 = torch.nn.Linear(self.hidden_size, self.hidden_size)
        self.relu = torch.nn.ReLU()
        self.fc5 = torch.nn.Linear(self.hidden_size, 1)
        self.sigmoid = torch.nn.Sigmoid()
        self.tanh = torch.nn.Tanh()
    def forward(self, x):
        hidden = self.relu(self.fc1(x))
        hidden = self.relu(self.fc2(hidden))
        hidden = self.relu(self.fc3(hidden))
        hidden = self.relu(self.fc4(hidden))
        output = self.tanh(self.fc5(hidden))

        return output
[44]:
# model.train()
epochs = 2000
hid_dim_range = [128,256,512]
lr_range = [0.5,0.1,0.01,0.05]

for hid_dim in hid_dim_range:
    for lr in lr_range:
        print('\nhid_dim: {}, lr: {}'.format(hid_dim, lr))
        if 'model' in globals():
            print('Deleting previous model')
            del model, criterion, optimizer
        model = Feedforward(2, hid_dim).to(device)
        criterion = torch.nn.MSELoss()
        optimizer = torch.optim.SGD(model.parameters(), lr = lr)

        all_loss_train=[]
        all_loss_val=[]
        all_r_train=[]
        all_r_val=[]
        for epoch in range(epochs):
            model.train()
            optimizer.zero_grad()
            # Forward pass
            y_pred = model(X_train.to(device))
            # Compute Loss
            loss = criterion(y_pred.squeeze(), y_train.to(device))

            # Backward pass
            loss.backward()
            optimizer.step()

            all_loss_train.append(loss.item())
            y_pred = y_pred.cpu().detach().numpy().squeeze()
            _, _, r_value_train, _, _ = scipy.stats.linregress(y_pred, y_train)
            all_r_train.append(r_value_train)

            model.eval()
            with torch.no_grad():
                y_pred = model(X_test.to(device))
                # Compute Loss
                loss = criterion(y_pred.squeeze(), y_test.to(device))
                all_loss_val.append(loss.item())

                y_pred = y_pred.cpu().detach().numpy().squeeze()
                slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(y_pred, y_test)
                all_r_val.append(r_value)

                if epoch%200==0:
                    print('Epoch {}, train_loss: {:.4f}, val_loss: {:.4f}, r_value: {:.4f}'.format(epoch,all_loss_train[-1],all_loss_val[-1],r_value))


        fig,ax=plt.subplots(1,3,figsize=(15,5))
        ax[0].plot(np.log10(all_loss_train))
        ax[0].plot(np.log10(all_loss_val))
        ax[0].set_title('Loss')

        ax[1].plot(all_r_train)
        ax[1].plot(all_r_val)
        ax[1].set_title('Pearson Corr')

        ax[2].scatter(y_pred, y_test.cpu())
        ax[2].set_xlabel('Prediction')
        ax[2].set_ylabel('True')
        ax[2].set_title('slope: {:.4f}, r_value: {:.4f}'.format(slope, r_value))
        plt.show()

        name_to_save = os.path.join(path_to_save_models,'model_SGD_Deeper' + str(epochs) + '_lr' + str(lr) + '_hid_dim' + str(hid_dim))
        print('Saving model to ', name_to_save)
        model_state = {
                    'epoch': epoch + 1,
                    'state_dict': model.state_dict(),
                    'optimizer' : optimizer.state_dict(),
            }
        torch.save(model_state, name_to_save +'.pt')

hid_dim: 128, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.0503, val_loss: 0.0428, r_value: -0.0330
Epoch 200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1042
Epoch 400, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1125
Epoch 600, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1212
Epoch 800, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1307
Epoch 1000, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1403
Epoch 1200, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1507
Epoch 1400, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1604
Epoch 1600, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1686
Epoch 1800, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1752
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_1.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.5_hid_dim128

hid_dim: 128, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.0437, val_loss: 0.0401, r_value: -0.0549
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0946
Epoch 400, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.0976
Epoch 600, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1006
Epoch 800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1029
Epoch 1000, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1048
Epoch 1200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1068
Epoch 1400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1089
Epoch 1600, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1111
Epoch 1800, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1131
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_3.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.1_hid_dim128

hid_dim: 128, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.0484, val_loss: 0.0477, r_value: -0.0230
Epoch 200, train_loss: 0.0375, val_loss: 0.0375, r_value: 0.0971
Epoch 400, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0967
Epoch 600, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0958
Epoch 800, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.0956
Epoch 1000, train_loss: 0.0374, val_loss: 0.0374, r_value: 0.0958
Epoch 1200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0962
Epoch 1400, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0966
Epoch 1600, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0971
Epoch 1800, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0977
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_5.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.01_hid_dim128

hid_dim: 128, lr: 0.05
Deleting previous model
Epoch 0, train_loss: 0.0399, val_loss: 0.0391, r_value: 0.1013
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.1016
Epoch 400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1008
Epoch 600, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1022
Epoch 800, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1036
Epoch 1000, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1047
Epoch 1200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1058
Epoch 1400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1067
Epoch 1600, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1076
Epoch 1800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1089
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_7.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.05_hid_dim128

hid_dim: 256, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.0380, val_loss: 0.0379, r_value: -0.0041
Epoch 200, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1165
Epoch 400, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1286
Epoch 600, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1419
Epoch 800, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1547
Epoch 1000, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1651
Epoch 1200, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1733
Epoch 1400, train_loss: 0.0368, val_loss: 0.0367, r_value: 0.1790
Epoch 1600, train_loss: 0.0367, val_loss: 0.0366, r_value: 0.1836
Epoch 1800, train_loss: 0.0366, val_loss: 0.0366, r_value: 0.1875
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_9.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.5_hid_dim256

hid_dim: 256, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.0406, val_loss: 0.0385, r_value: 0.0544
Epoch 200, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1062
Epoch 400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1109
Epoch 600, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1143
Epoch 800, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1170
Epoch 1000, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1204
Epoch 1200, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1233
Epoch 1400, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1267
Epoch 1600, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1299
Epoch 1800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1329
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_11.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.1_hid_dim256

hid_dim: 256, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.0381, val_loss: 0.0380, r_value: -0.0578
Epoch 200, train_loss: 0.0376, val_loss: 0.0375, r_value: 0.0614
Epoch 400, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0917
Epoch 600, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.0954
Epoch 800, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0965
Epoch 1000, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0973
Epoch 1200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0981
Epoch 1400, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0988
Epoch 1600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0994
Epoch 1800, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.0999
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_13.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.01_hid_dim256

hid_dim: 256, lr: 0.05
Deleting previous model
Epoch 0, train_loss: 0.0396, val_loss: 0.0388, r_value: -0.0103
Epoch 200, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.0993
Epoch 400, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1028
Epoch 600, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1056
Epoch 800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1081
Epoch 1000, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1098
Epoch 1200, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1117
Epoch 1400, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1134
Epoch 1600, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1147
Epoch 1800, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1161
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_15.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.05_hid_dim256

hid_dim: 512, lr: 0.5
Deleting previous model
Epoch 0, train_loss: 0.0380, val_loss: 0.0386, r_value: 0.0792
Epoch 200, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1242
Epoch 400, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1427
Epoch 600, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1593
Epoch 800, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1723
Epoch 1000, train_loss: 0.0367, val_loss: 0.0367, r_value: 0.1812
Epoch 1200, train_loss: 0.0366, val_loss: 0.0365, r_value: 0.1874
Epoch 1400, train_loss: 0.0365, val_loss: 0.0364, r_value: 0.1923
Epoch 1600, train_loss: 0.0364, val_loss: 0.0364, r_value: 0.1958
Epoch 1800, train_loss: 0.0363, val_loss: 0.0363, r_value: 0.1988
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_17.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.5_hid_dim512

hid_dim: 512, lr: 0.1
Deleting previous model
Epoch 0, train_loss: 0.0382, val_loss: 0.0379, r_value: -0.0825
Epoch 200, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1135
Epoch 400, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1212
Epoch 600, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1263
Epoch 800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1311
Epoch 1000, train_loss: 0.0370, val_loss: 0.0369, r_value: 0.1357
Epoch 1200, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1402
Epoch 1400, train_loss: 0.0369, val_loss: 0.0369, r_value: 0.1446
Epoch 1600, train_loss: 0.0369, val_loss: 0.0368, r_value: 0.1490
Epoch 1800, train_loss: 0.0368, val_loss: 0.0368, r_value: 0.1534
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_19.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.1_hid_dim512

hid_dim: 512, lr: 0.01
Deleting previous model
Epoch 0, train_loss: 0.0378, val_loss: 0.0377, r_value: -0.0290
Epoch 200, train_loss: 0.0375, val_loss: 0.0374, r_value: 0.1043
Epoch 400, train_loss: 0.0374, val_loss: 0.0373, r_value: 0.1052
Epoch 600, train_loss: 0.0373, val_loss: 0.0373, r_value: 0.1059
Epoch 800, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1067
Epoch 1000, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1074
Epoch 1200, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1081
Epoch 1400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1089
Epoch 1600, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1096
Epoch 1800, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1103
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_21.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.01_hid_dim512

hid_dim: 512, lr: 0.05
Deleting previous model
Epoch 0, train_loss: 0.0406, val_loss: 0.0390, r_value: 0.0797
Epoch 200, train_loss: 0.0373, val_loss: 0.0372, r_value: 0.1048
Epoch 400, train_loss: 0.0372, val_loss: 0.0372, r_value: 0.1098
Epoch 600, train_loss: 0.0372, val_loss: 0.0371, r_value: 0.1141
Epoch 800, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1171
Epoch 1000, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1196
Epoch 1200, train_loss: 0.0371, val_loss: 0.0371, r_value: 0.1221
Epoch 1400, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1244
Epoch 1600, train_loss: 0.0371, val_loss: 0.0370, r_value: 0.1267
Epoch 1800, train_loss: 0.0370, val_loss: 0.0370, r_value: 0.1289
../_images/CASESTUDY_Tree_Height_06Perceptron_pred_2023_30_23.png
Saving model to  ./models/model_SGD_Deeper2000_lr0.05_hid_dim512
[ ]: