--- title: 'Getting started with irg' author: 'Alec L. Robitaille' date: '`r Sys.Date()`' output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Getting started with irg} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r knitrsetup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = '#>', eval = FALSE ) options(scipen = 9999) ``` The `irg` package opts for a tabular calculation of the instantaneous rate of green-up (IRG) as opposed to a raster based approach. Sampling MODIS imagery is left up to the user and a prerequisite for all functions. The main input (`DT`) for all functions is a [`data.table`](https://github.com/Rdatatable/data.table) of an NDVI time series. The sampling unit (`id`) is flexible (a decision for the user) though we would anticipate points or polygons, or maybe a pixel. All functions leverage the speed of `data.table` to efficiently filter, scale and model NDVI time series, and calculate IRG. ## Installation Install with CRAN ```{r, eval = FALSE} # Install install.packages('irg') ``` or R-universe ```{r, eval = FALSE} # Enable the robitalec universe options(repos = c( robitalec = 'https://robitalec.r-universe.dev', CRAN = 'https://cloud.r-project.org')) # Install install.packages('irg') ``` ### Packages `irg` depends on three packages (and `stats`): * [`data.table`](https://github.com/Rdatatable/data.table) for all tabular processing * [`RcppRoll`](https://github.com/kevinushey/RcppRoll) for fast rolling medians in `filter_roll`. * [`chk`](https://github.com/pre-processing-r/chk) for internal checks. ### Input data `irg` requires an NDVI time series in a `data.table`. Though names can be different and specified at input, but the default names and required columns are: * id: sampling unit identification (point id, polygon id, ...) * yr: year of sample * DayOfYear: julian day * NDVI: sampled NDVI value * SummaryQA: Quality reliability of pixel SummaryQA details: * 0: Good data, use with confidence * 1: Marginal data, useful but look at detailed QA for more information * 2: Pixel covered with snow/ice * 3: Pixel is cloudy Let's take a look at the example data. ```{r extdata, eval = TRUE} library(irg) library(data.table) ndvi <- fread(system.file('extdata', 'sampled-ndvi-MODIS-MOD13Q1.csv', package = 'irg')) # or look at the help page ?ndvi ``` ```{r printdata, eval = TRUE, echo = FALSE} knitr::kable(ndvi[90:95]) ``` #### data.frame If your data is a `data.frame`, convert it by reference: ```{r setdt} # Pretend DF <- as.data.frame(ndvi) # Convert by reference setDT(DF) ``` #### Sampling NDVI Though `irg` is not involved in the sampling step, it is important that the input data matches the package's expectations. We used the incredible [Google Earth Engine](https://earthengine.google.com/) to sample MODIS NDVI (MOD13Q1.006). There are also R packages specific to MODIS ([`MODIStsp`](https://github.com/ropensci/MODIStsp)) and general purpose raster operations ([`raster`](https://github.com/rspatial/raster)), and an R interface to Earth Engine [`rgee`](https://github.com/r-spatial/rgee). Update: we recently added the `use_example_ee_script()` function which offers an example script for extracting NDVI in Earth Engine. There are two versions, one for sampling MODIS MOD13Q1 and another for sampling Landsat 8. #### Temporal extent Filtering steps in `irg` use a baseline 'winterNDVI' and upper quantile as described by Bischoff et al. (2012). These steps require multiple years of sampled NDVI for each `id`. In addition, make sure to include all samples throughout the year, leaving the filtering for `irg`. ## Workflow ```{r functGraphViz, eval = TRUE, echo = FALSE, self_contained = FALSE} library(DiagrammeR) g <- grViz( " digraph irg_functions { graph [rankdir=LR, compound=TRUE, fontsize = 28] node[shape=none, fontsize=28] subgraph cluster_filt{ label= '1)'; labeljust='l'; Filtering -> filter_ndvi [dir=none] filter_ndvi -> filter_qa [dir=none] filter_ndvi -> filter_winter [dir=none] filter_ndvi -> filter_roll [dir=none] filter_ndvi -> filter_top [dir=none] filter_top -> filter_roll -> filter_winter -> filter_qa [dir=back] {rank=same; filter_qa; filter_winter; filter_roll; filter_top} } subgraph cluster_scal{ label= '2)' labeljust='l'; Scaling -> scale_doy [dir=none] Scaling -> scale_ndvi [dir=none] } subgraph cluster_mod{ label= '3)' labeljust='l'; Modeling -> model_start [dir=none] Modeling -> model_params [dir=none] Modeling -> model_ndvi [dir=none] model_ndvi -> model_params -> model_start [dir=back] {rank=same; model_ndvi; model_params; model_start} } subgraph cluster_irg{ label= '4)' labeljust='l'; IRG -> calc_irg [dir=none] } Filtering -> Scaling -> Modeling -> IRG # irg -> Filtering # irg -> Scaling # irg -> Modeling # irg -> IRG } ", width = 700, height = 600) g ``` ```{r, eval = TRUE, echo = FALSE} fs <- data.table(functions = as.character(lsf.str('package:irg')))[, arguments := paste(unlist(formalArgs(functions)), collapse = ', ' ), by = functions] ``` There are `r nrow(fs[grepl('filter', functions)])` filtering functions, `r nrow(fs[grepl('scale', functions)])` scaling functions, `r nrow(fs[grepl('model', functions)])` modeling functions and `r nrow(fs[grepl('irg', functions)])` IRG functions. The `irg::irg` function is a wrapper for all steps - filtering, scaling, modeling and calculating IRG in one step. At this point, only defaults. Here's 5 rows from the result. For options, head to the steps below. ```{r, eval = TRUE} out <- irg(ndvi) ``` ```{r, eval = TRUE, echo = FALSE} knitr::kable(out[between(t, 0.4, 0.5)][1:5, .(id, yr, t, fitted, irg)]) ``` ## Filtering There are `r nrow(fs[grepl('filter', functions)])` filtering functions. ```{r, echo = FALSE, eval = TRUE} # fs[grepl('qa', functions), order := 1] # fs[grepl('winter', functions), order := 2] # fs[grepl('roll', functions), order := 3] # fs[grepl('top', functions), order := 4] knitr::kable(fs[grepl('filter', functions), .(functions, arguments)]) ``` ```{r, eval = FALSE} # Load data.table library(data.table) library(irg) # Read in example data ndvi <- fread(system.file('extdata', 'sampled-ndvi-MODIS-MOD13Q1.csv', package = 'irg')) # Filter NDVI time series filter_qa(ndvi, qa = 'SummaryQA', good = c(0, 1)) filter_winter(ndvi, probs = 0.025, limits = c(60L, 300L), doy = 'DayOfYear', id = 'id') filter_roll(ndvi, window = 3L, id = 'id', method = 'median') filter_top(ndvi, probs = 0.925, id = 'id') ``` ## Scaling Two scaling functions are use to scale the day of year column and filtered NDVI time series between 0-1. ```{r} # Scale variables scale_doy(ndvi, doy = 'DayOfYear') scale_ndvi(ndvi) ``` ## Modeling Three functions are used to model the NDVI times series to a double logistic curve, as described by Bischoff et al. (2012). $$fitted = \frac{1}{1 + e^ \frac{xmidS - t}{scalS}} - \frac{1}{1 + e^ \frac{xmidA - t}{scalA}}$$ Two options from this point are available: fitting NDVI and calculating IRG for observed data only, or for the full year. To calculate for every day of every year, specify `returns = 'models'` in `model_params`, `observed = FALSE` in `model_ndvi` and assign the output of `model_ndvi`. ```{r} # Guess starting parameters model_start(ndvi, id = 'id', year = 'yr') # Double logistic model parameters given starting parameters for nls mods <- model_params( ndvi, returns = 'models', id = 'id', year = 'yr', xmidS = 'xmidS_start', xmidA = 'xmidA_start', scalS = 0.05, scalA = 0.01 ) # Fit double log to NDVI fit <- model_ndvi(mods, observed = FALSE) ``` Alternatively, to calculate for the observed data only, specify `returns = 'columns'` in `model_params` and `observed = TRUE` in `model_ndvi`. ```{r} # Guess starting parameters model_start(ndvi, id = 'id', year = 'yr') # Double logistic model parameters given starting parameters for nls model_params( ndvi, returns = 'columns', id = 'id', year = 'yr', xmidS = 'xmidS_start', xmidA = 'xmidA_start', scalS = 0.05, scalA = 0.01 ) # Fit double log to NDVI model_ndvi(ndvi, observed = TRUE) ``` ## IRG $$IRG = \frac{e ^ \frac{t + xmidS}{scalS}}{2 scalS e ^ \frac{t + xmidS}{scalS} + scalS e ^ \frac{2t}{scalS} + scalS e ^ \frac{2midS}{scalS}}$$ Finally, calculate IRG: ```{r} # Calculate IRG for each day of the year calc_irg(fit) # or for observed data calc_irg(ndvi) ```