
TITLE

Evaluation of a bias correction method for the satellite-based precipitation (CMORPH) product

(Ahmed Ibrahim, Haris Akram Bhatti, Norhakim Yusof, Omar Mohamed)

INTRODUCTION

General framework of the analysis

Background: Hydrological studies, particularly those related to rainfall-runoff modeling and flood forecasting, rely upon precipitation data. The ideal source of precipitation data is a dense network of rain gauge stations operated over a long time span. In practice, ground-based rainfall observation networks are sparse due to limited financial resources and socio-economic issues, and rainfall records are often incomplete due to poor maintenance. Under such conditions, gauges cannot capture rainfall at high spatio-temporal resolution. An alternative and reliable data source is therefore needed for rainfall-runoff modeling.

With the advancement of remote sensing, satellite-based precipitation estimates are gaining traction in hydrology, particularly in rainfall-runoff modeling. However, satellite-based estimates are commonly affected by systematic errors (bias). This bias persists when the estimates are aggregated over time and hence may cause large uncertainty in hydrologic modeling. The success of a simulation study depends on the accuracy of the input rainfall that drives the hydrologic model, and the adequacy of satellite-based precipitation as a replacement for gauge data in hydrologic models is still questionable.

Keywords: Bias, CMORPH, precipitation, hydrology

Project objectives

  1. To evaluate the bias between in-situ (gauge) rainfall and satellite (CMORPH) data
  2. To apply bias correction to the CMORPH data

METADATA

  • Daily CMORPH precipitation (raster) data (in Excel format) at 8 km spatial resolution, 2003-2010
  • Daily in-situ precipitation (point) data (in Excel format) for six stations, 2003-2010

METHOD

In this study, we estimated a multiplicative bias correction factor for the daily uncorrected CMORPH data: the ratio between gauge and uncorrected CMORPH observations is calculated and multiplied with the uncorrected data to obtain the bias-corrected rainfall.
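As a minimal sketch (with hypothetical rainfall values), the correction for a single sampling window reduces to:

# Multiplicative bias correction over one sampling window (hypothetical values)
gauge  <- c(0.0, 5.2, 12.1, 0.4)  # gauge observations (mm/day)
cmorph <- c(0.0, 7.8, 15.0, 1.1)  # uncorrected CMORPH estimates (mm/day)

bias_factor <- sum(gauge) / sum(cmorph)  # ratio of gauge to CMORPH window totals
corrected   <- cmorph * bias_factor      # bias-corrected CMORPH (mm/day)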

  1. We applied two methods, a moving window (MW) and a sequential window (SW), to remove bias in the CMORPH product. The MW method is tested with three different schemes (forward window, central window and backward window). Different sampling window lengths (3, 5, 7, …, 31 days) were tested for the MW and SW methods to reduce the effect of sampling variability and of scale discrepancies between gauge and CMORPH observations; the window definitions are sketched after this list.
  2. Each method is assessed with statistical indices, i.e. RMSE, standard deviation and correlation (summarized in a Taylor diagram), and the optimal method (with its sampling window) is selected for further use (figure 1).
  3. Pixel-based bias estimates for the selected bias removal scheme are then spatially interpolated using IDW interpolation to yield a spatial field of bias factors. The uncorrected CMORPH field is multiplied by this bias field to obtain the bias-corrected CMORPH estimates.
  4. To validate the bias-corrected CMORPH rainfall field, the cross-validation method recommended by the World Meteorological Organization is applied. This procedure withholds one gauge at a time and calculates the bias field without that particular gauge. It is repeated systematically for all 6 gauges, so that each gauge is withheld exactly once. The bias-corrected CMORPH estimates at the withheld gauge are then compared against the corresponding gauge observations to examine the performance of the proposed technique.
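A simplified sketch of the window definitions is given below (assumed indexing; note that the Window() function further down collapses a window to a single day where it would run past the series edges, rather than clipping it as done here):

# Window bounds for day i, window size w and series length n (simplified sketch)
window_bounds <- function(i, w, n, type = c("FW", "CW", "BW")) {
  type <- match.arg(type)
  b <- switch(type,
              FW = c(i, i + w - 1),             # forward: today and the next w-1 days
              CW = c(i - w %/% 2, i + w %/% 2), # central: w days centred on today
              BW = c(i - w + 1, i))             # backward: today and the previous w-1 days
  pmin(pmax(b, 1), n) # clip to the valid day range 1..n
}
window_bounds(10, 7, 2922, "CW") # returns c(7, 13): a 7-day central window
# The sequential (SW) scheme instead splits the series into consecutive
# non-overlapping blocks of w days and assigns one bias factor per block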

Computational implementation

The R code below implements the workflow in three parts. The first part reads paired CMORPH pixel and gauge point series from Excel files, computes the bias factors for all four window schemes (moving forward, moving centre, moving backward and sequential) and all window sizes (3-31 days), and writes both the factors and the corrected values to CSV files. The second part plots the Taylor diagrams used to select the optimal scheme and window size. The third part interpolates the daily bias factors to a spatial field with IDW, extracts the pixel values at the withheld station, and produces the comparison plots for the cross-validation.

# Functions to read the data and implement the window methods
# This code reads the data from Excel files (two files: image data and point data),
# implements the window functions with different sizes,
# corrects the data using the window values and the original data, and
# writes the outputs to two files (one for window values and one for the corrected values)
 
# Function to round decimal numbers
 
specify_decimal <- function(x, k) format(round(x, k), nsmall=k)
 
# gdata provides read.xls() for reading Excel files
library(gdata)
 
# Main logic
# 	read Excel Files
# 	process data
# 	Write output
ReadData <- function(path1)
{
  # Assign the output path and create the output directory
  outpath=paste(path1,"Output/",sep = "")
  dir.create(outpath)

  # List all pixel files (image data); each should have a matching "_point" file
  files=list.files(path = path1, pattern = "_pixel\\.xlsx$")

  # Loop over all pixel files
  for(fi in 1:length(files))
  {
    # Extract the base file name (drop the 11 characters of "_pixel.xlsx")
    stop=nchar(files[[fi]])-11
    filename=substr(files[[fi]], 1, stop)

    filepath1=paste(path1,filename,sep = "")
    pixelpath=paste(filepath1,"pixel.xlsx",sep = "_")
    pixeldataframe <- read.xls(pixelpath)

    # Skip this file if the matching point file does not exist
    pointpath=paste(filepath1,"point.xlsx",sep = "_")
    if(!file.exists(pointpath))
      next

    pointdataframe <- read.xls(pointpath)

    ncols=ncol(pixeldataframe)
    nrows=nrow(pixeldataframe)

    # Get the years (first row) and days (first column)
    Years=as.vector(pixeldataframe[1,2:ncols])
    yearno=length(Years)
    Days=pixeldataframe[2:nrows,1]
    dayno=length(Days)

    # Create 4 empty vectors, one entry per day per year
    DaysVector <- rep(NA, dayno*yearno)
    YearsVector <- rep(NA, dayno*yearno)
    PixelsVector <- rep(NA, dayno*yearno)
    PointsVector <- rep(NA, dayno*yearno)

    # Fill the vectors with the data, stacking the years one after another
    for (col in 1:yearno) {
      sind=(col-1)*dayno+1
      eind=col*dayno
      DaysVector[sind:eind]=Days
      YearsVector[sind:eind]=Years[[col]]
      PixelsVector[sind:eind]=as.vector(pixeldataframe[2:nrows,col+1])
      PointsVector[sind:eind]=as.vector(pointdataframe[2:nrows,col+1])
    }

    # Create the data frame with the 4 vectors and process it
    dfd=data.frame(DaysVector,YearsVector,PixelsVector,PointsVector)
    wholedata<-MakeWindow(dfd)

    # Write the window (bias factor) values
    filepath=paste(outpath,filename,sep = "")
    output=paste(filepath,"all.csv",sep = "_")
    write.csv(wholedata[[1]], file = output)

    # Write the corrected data
    print("correcting data")
    output1=paste(filepath,"Corrected.csv",sep = "_")
    write.csv(wholedata[[2]], file = output1)
  }
  return(0)
}
 
MakeWindow<-function(dframe)
{
  # Copy the frame: 'dframe' will hold the bias factors, 'dframecpy' the corrected values
  dframecpy=dframe
  # Bias factor to use when the CMORPH window total is zero (see Window())
  nanva=1

  # Create the window sizes array {3, 5, ..., 31}
  windows<-seq(3,31, 2)
  # Window methods: moving forward, moving centre, moving backward, sequential
  methods=c("MF","MC","MB","S")

  # For each method and, within each method, for each window size
  for(ty in 1:4)
  {
    for(wind in windows)
    {
      # Assign the column name by concatenating the method name and window size
      x=paste(methods[ty],wind)
      # Compute the window (bias factor) value and the corrected value;
      # column 3 holds the CMORPH pixel series, column 4 the gauge series
      lists<-Window(dframe[1:nrow(dframe), 3],dframe[1:nrow(dframe), 4],wind,ty,nanva)

      # Add the bias factor column to the first frame
      dframe[,x] =lists[[1]]
      # Add the corrected value column to the second frame
      dframecpy[,x]=lists[[2]]
    }
  }

  # Return both frames as a list
  ret=list(dframe,dframecpy)
  return(ret)
}
 
# Window
# Computes the multiplicative bias factor over a window and the corrected CMORPH values.
# Parameters:
#   lst1: CMORPH pixel series
#   lst2: gauge point series
#   win: window length (days)
#   type: window type (1: moving forward, 2: moving centre, 3: moving backward, 4: sequential)
#   nanval: bias factor to use when the CMORPH window total is zero
# Output: list with two items; the first holds the bias factor per day for the given
# window type and length, the second the corrected data
 
Window<-function(lst1,lst2,win,type,nanval)
{
#Initialize two vectors (results and correction)
  ResFactor <- rep(NA, length(lst1))
  Correctedvalue <- rep(NA, length(lst1))
#for moving window and Forward
   if(type==1)
  {
 
    for(var in 1:length(lst1))
    {
      # Compute the window bounds
      swin=var
      ewin=var+win-1

      # If the window runs past the end of the series, collapse it to the current day
      if(ewin >length(lst1)){ ewin=swin}
 
	  #Initialize the variables
      sum1=0
      sum2=0
      res=0
 
	for(winvar in swin:ewin)
	{
	  sum1=sum1+lst1[[winvar]]
	  sum2=sum2+lst2[[winvar]]
	}
 
      # Bias factor: gauge window total divided by CMORPH window total
      if(sum2==0) {res=0}
      else if(sum1==0) {res=nanval}
      else {res=sum2/sum1 }

      # Round the bias factor
      ResFactor[[var]]=specify_decimal(res, 2)
      # Correct the data (when the factor is 1, the gauge value is used directly)
      if(res==1)
      Correctedvalue[[var]]=lst2[[var]]
      else
      Correctedvalue[[var]]=specify_decimal(lst1[[var]]*res,2)
 
    }
 
  }
  else if(type==2) #for moving window and Center
  {
 
 
    for(var in 1:length(lst1))
    {
      #compute the window
      swin=var-as.integer(win/2)
      ewin=var+as.integer(win/2)
 
      # Initialize the variables
      sum1=0
      sum2=0
      res=0
      # If the window runs past either end of the series, collapse it to the current day
      if(swin<1 || ewin >length(lst1))
      {
      swin=var
      ewin=var
      }
	for(winvar in swin:ewin)
	{
 
	  sum1=sum1+lst1[[winvar]]
	  sum2=sum2+lst2[[winvar]]
	}
 
	if(sum2==0) {res=0}
	else if(sum1==0) {res=nanval}
	else {res=sum2/sum1 }
 
      ResFactor[[var]]=specify_decimal(res, 2)
       #Correct Data
      if(res==1)
      Correctedvalue[[var]]=lst2[[var]]
      else
      Correctedvalue[[var]]=specify_decimal(lst1[[var]]*res,2)
    }
 
  }
   else if(type==3) #for moving window and Backward
  {
 
    for(var in 1:length(lst1))
    {
      # Compute the window bounds
      swin=var-win+1
      ewin=var

      sum1=0
      sum2=0
      res=0
      # If the window runs past the start of the series, collapse it to the current day
      if(swin<1)
      {
      swin=var
      }
	for(winvar in swin:ewin)
	{
	  sum1=sum1+lst1[[winvar]]
	  sum2=sum2+lst2[[winvar]]
	}
	if(sum2==0) {res=0}
	else if(sum1==0) {res=nanval}
	else {res=sum2/sum1 }
 
      ResFactor[[var]]=specify_decimal(res, 2)
       #Correct Data
      if(res==1)
      Correctedvalue[[var]]=lst2[[var]]
      else
      Correctedvalue[[var]]=specify_decimal(lst1[[var]]*res,2)
    }
 
  }
 
  else if(type==4) #for Sequential
  {
    #Initialize the counter
    var=1
    while(var <=length(lst1))
    {
      #compute the window
      swin=var
      ewin=var+win-1
 
 
      sum1=0
      sum2=0
      res=0
      # If the final block runs past the end of the series, collapse it to the current day
      if(ewin >length(lst1))
      {
      ewin=var
      }
 
	for(winvar in swin:ewin)
	{
 
	  sum1=sum1+lst1[[winvar]]
	  sum2=sum2+lst2[[winvar]]
	}
 
	if(sum2==0) {res=0}
	else if(sum1==0) {res=nanval}
	else {res=sum2/sum1 }
 
      # Assign the factor and the corrected value to every day in the block
      for(winvar in swin:ewin)
      {
        ResFactor[[winvar]]=specify_decimal(res, 2)
        # Correct Data (note the index: every day in the block is corrected)
        if(res==1)
        Correctedvalue[[winvar]]=lst2[[winvar]]
        else
        Correctedvalue[[winvar]]=specify_decimal(lst1[[winvar]]*res,2)
      }
 
	var=ewin+1
 
    }
 
  }
 
  # create the  list to be returned
  retres = list(ResFactor,Correctedvalue)
  return(retres)
 
}
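For illustration, a single call of Window() on a short hypothetical series (3-day forward moving window, nanval = 1):

# Window(CMORPH series, gauge series, window length, type, nanval)
Window(c(2, 4, 0, 1, 3, 5), c(1, 5, 0, 2, 2, 6), 3, 1, 1)
# [[1]] holds the bias factors per day, [[2]] the corrected CMORPH values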
 
# Script entry point: read the data directory from the command-line arguments

args <- commandArgs(trailingOnly = TRUE)
cat("Done", ReadData(args[1]), "\n")
 
## This code plots Taylor's diagram
 
require (plotrix)
 
t1<-read.table("Tay_Zege.txt")
 
# Reference (gauge) series in column 1; uncorrected CMORPH in column 2
taylor.diagram(t1[,1],t1[,2],add = FALSE, col = "red",pch = 18, pos.cor = TRUE, xlab = "Standard deviation (mm)", ylab = "Standard deviation (mm)", main = "", show.gamma = T,ngamma = 6, gamma.col = 8, sd.arcs = 1, ref.sd = T,grad.corr.lines = c(0.2, 0.4, 0.6, 0.8, 0.9), pcex = 1,cex.axis = 1, normalize = FALSE)

# Columns 3-47 hold the three moving-window schemes with 15 window sizes each:
# columns 3-17 are drawn with pch 0 (squares), 18-32 with pch 1 (circles) and
# 33-47 with pch 2 (triangles); one colour per window size (3, 5, ..., 31)
cols <- c("blue4","blueviolet","cornflowerblue","cyan","cadetblue",
          "red","red4","orange","orange4","green","greenyellow",
          "grey","khaki","khaki4","pink")
for (s in 0:2) {
  for (w in 1:15) {
    taylor.diagram(t1[,1], t1[, 2 + s*15 + w], add = TRUE, col = cols[w], pch = s)
  }
}
 
lpos1<-1.32*sd(t1[,1])
lpos2<-1.8*sd(t1[,1])
# Add a legend: the uncorrected product plus the 15 window sizes
legend(lpos1, lpos2,
       legend = c("Uncor-CMORPH","3","5","7","9","11","13","15","17","19","21","23","25","27","29","31"),
       pch = c(18, rep(2, 15)),
       col = c("red", cols))
 
## R code for Time-Series Precipitation idw interpolation ##
 
library(gstat)
library(rgdal)
library(maptools)
 
data1<- read.csv("Data_6m5stn.csv", header=TRUE)
#stn <- read.csv("Station_coor.csv", header=TRUE)
 
stn <- data1[2:4]
 
stn$x = as.numeric((stn[["LON"]]))
stn$y = as.numeric((stn[["LAT"]]))
 
coordinates(stn) = ~x+y
proj4string(stn) = "+proj=utm +zone=37 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"
 
## Generate the interpolation grid (200 x 200 cells over the 36-39°E, 10-13°N extent) ##

pixels <- 200

stn.grd <- expand.grid(x=seq(from=36,to=39,length.out=pixels),
                        y=seq(from=10,to=13,length.out=pixels))

grd.pts <- SpatialPixels(SpatialPoints(stn.grd))
ngrid <- as(grd.pts, "SpatialGrid")

proj4string(ngrid) = "+proj=longlat +datum=WGS84 +no_defs"
 
 
##IDW For-loop## 
 
library(raster)
 
wordir <- "/home/user/Project"
 
for (i in 1:30) {
  # Build the formula "<day column name> ~ 1" for the i-th daily bias-factor column
  col <- paste(names(data1)[i+4], "~1", sep="")
  idw.out <- idw(as.formula(col), stn, ngrid)
  r <- raster(idw.out["var1.pred"])

  plot(r, col=topo.colors(100), legend=FALSE, axes=FALSE)
  title(paste("Precipitation_idw", 20030600+i, sep = ""))

  minr <- as.numeric(format(round(minValue(r), 2), nsmall=2))
  maxr <- as.numeric(format(round(maxValue(r), 2), nsmall=2))

  r.range <- c(minr, maxr)
  plot(r, legend.only=TRUE, col=topo.colors(100),
       legend.width=1, legend.shrink=0.75,
       axis.args=list(at=seq(minr, maxr, 0.2),
                      labels=seq(minr, maxr, 0.2),
                      cex.axis=0.6),
       legend.args=list(text='Precipitation (corrected)', side=4, font=2, line=2.5, cex=0.8), border="grey50")

  # Save the interpolated field as a GeoTIFF raster
  writeRaster(r, filename=paste(wordir, "/img/Preci_idw_", 20030600+i, ".tif", sep = ""), format='GTiff', overwrite=TRUE)

  # Save the plot as a JPEG
  dev.copy(jpeg, file = paste(wordir, "/img/Preci_idw_", 20030600+i, ".jpg", sep = ""),
           height = 17, width = 17, units = "cm", quality = 100,
           res=150, bg="white") ; dev.off()

  rm(idw.out)
}
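The corrected CMORPH field (method step 3) is then the product of the uncorrected field and the interpolated bias-factor field. A minimal sketch, assuming a hypothetical GeoTIFF of the uncorrected CMORPH field for the same day:

# Multiply one day's uncorrected CMORPH raster by the interpolated bias field;
# "CMORPH_20030601.tif" is a hypothetical file name for the uncorrected field
bias <- raster(paste(wordir, "/img/Preci_idw_20030601.tif", sep = ""))
cmorph <- raster("CMORPH_20030601.tif")
corrected <- cmorph * resample(bias, cmorph) # align the two grids, then multiply
writeRaster(corrected, "CMORPH_corrected_20030601.tif", format='GTiff', overwrite=TRUE)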
 
##Extract pixel value from point coordinate##
 
setwd("/home/user/Project/img")
im <- list.files(pattern='\\.tif$') # all saved IDW rasters in this folder
no <- length(im)

imst <- stack(im[1:no])
 
x <- 37.255 # x,y coordinates of station 6
y <- 11.93

data_stn6 <- data.frame(x,y)
coordinates(data_stn6) <- c("x", "y")

proj4string(data_stn6) = "+proj=longlat +datum=WGS84 +no_defs"
 
ex <- extract(imst, data_stn6)
 
tbl <- t(ex) # transpose to one column of pixel values

write.table(tbl, file = "Pixel_stn6_1.csv", sep = ",", col.names = "BF_Pixel", row.names =F,
            qmethod = "double") # save the extracted pixel values
 
##Combine with uncorrected pixel value and station 6 data##
 
data2<- read.csv("Zege_uncorrected_pixel.csv", header=TRUE) # uncorrected pixel & station 6 data
data_ex<- read.csv("Pixel_stn6_1.csv", header=TRUE)
 
raw_pix <- data2[1:30,1] # uncorrected pixel
 
stn6 <- data2[1:30,2] # station 6 values
 
 
data_ex$Un_Pix <- raw_pix
data_ex$Stn6 <- stn6
 
 
 
##plot line graph for comparison (Pixel vs Station)## 
 
##Pixel correction##
 
 
bf <- as.numeric(data_ex[,1])     # bias factor at the station pixel
Un_pix <- as.numeric(data_ex[,2]) # uncorrected CMORPH pixel value

Corr <- bf * Un_pix               # corrected pixel value

data_ex$Corr_Pixel <- Corr

write.table(data_ex, file = "Line_plot.csv", sep = ",",
            col.names = c("BF_Pixel", "Un_Pixel", "Point", "Corr_Pixel"), row.names=F,
            qmethod = "double")
 
library(ggplot2)
library(reshape)
 
x  <- seq(1, 30, 1)
Corrected_CMORPH <- data_ex[,4]
Uncorrected_CMORPH <- data_ex[,2]
Gauge_Observation <- data_ex[,3]
df <- data.frame(x,Uncorrected_CMORPH, Corrected_CMORPH,Gauge_Observation)
 
d2 <- melt(df, id="x") #reshape the table structure into id, variable, value columns
ggplot(d2, aes(x, value, colour=variable)) + 
  geom_line() +
  geom_line() +
  scale_colour_manual(values=c("red", "blue", "green"), name="Type of line") +
  ggtitle(expression(atop("Comparison between Corrected Pixel vs Station 6")))+
  scale_x_continuous("June 2003") + 
  scale_y_continuous("Precipitation (mm)")
 
## Final Taylor's diagram for cross validation by leave one out method
 
require (plotrix)
 
t1<-read.table("Final_Tay_1")
taylor.diagram(t1[,1], t1[,2], add = FALSE, col = "red", pch = 18, pos.cor = FALSE,
               main = "", show.gamma = TRUE, ngamma = 6, gamma.col = 8, sd.arcs = 1,
               ref.sd = TRUE, grad.corr.lines = c(0.2, 0.4, 0.6, 0.8, 0.9),
               pcex = 1, cex.axis = 1, normalize = FALSE)
 
taylor.diagram(t1[,1],t1[,3],add=T,col = "blue",pch = 18)
 
lpos1<-0.4*sd(t1[,1])
lpos2<-2.1*sd(t1[,1])
# add a legend
legend(lpos1,lpos2,legend=c("Uncor-CMORPH","Corrected-CMORPH"),pch=c(18,18),col=c("red","blue"))

RESULTS and DISCUSSION

  • Fig 1. Taylor diagram showing the statistical indices of the MW approach for the Zege rain gauge station. Triangles, circles and squares represent the FW, CW and BW schemes, respectively; the sampling windows are shown in different colours.

Similar plots are drawn for the 5 other stations and for the SW approach (see step 2 of the method).

  • Fig 2. Spatial field of bias factors after IDW interpolation. Results are shown for June 1, 2003, with Zege station withheld. The uncorrected CMORPH field is multiplied by this bias field to obtain the bias-corrected CMORPH estimates.

Similar plots are drawn for the whole of June 2003 (see step 3 of the method).

  • Fig 3. Time-series comparison of gauge observations (the benchmark) with the uncorrected and corrected CMORPH data. The analysis is shown for the pixel overlying Zege station for the period June 1 - June 30, 2003 (see step 4 of the method).

  • Fig 4. Taylor diagram showing the cross-validation results at Zege rain gauge station for the period June 1 - June 30, 2003 (see step 4 of the method).
