Extracting values from raster image using terra package in R

terra

visualization

code

Author

Nyamisi Peter

Published

May 10, 2024

Introduction

When working with raster images, you may need to obtain the values of certain points from the raster for further analysis. This is often achieved by extracting specific values at particular points. The R packages terra (Hijmans, 2024) and tidyterra (Hernangómez, 2023) provide tools to efficient extract these values, whether you generate random points or have pre-defined locations. Below is a simplified overview of the process:

Before starting, ensure you have both terra and tidyterra installed and loaded into your R environment. Other packages such as sf also will be used in this post.

require(tidyterra)
require(terra)
require(sf)
require(tidyverse)

Data

We will use chlorophyll-a data collected along the Pemba Channel in Tanzania coastal waters. The data will be loaded using rast() function of terra package (Hijmans, 2024).

pemba = rast("pemba_crop.nc") |> 
  rename(flags = 1, chl = 2, land_mask = 3)

pemba

class       : SpatRaster 
dimensions  : 244, 244, 3  (nrow, ncol, nlyr)
resolution  : 0.004496807, 0.004496807  (x, y)
extent      : 38.90161, 39.99883, -5.699434, -4.602213  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 
source      : pemba_crop.nc 
names       : flags, chl, land_mask

We may then plot our data and visualize the distribution of the values in a map;

ggplot() +
  geom_spatraster(data = pemba |> 
                    filter(chl < 1) |> 
                    select(lyr = chl))+
  scale_fill_gradientn(colours = oce::oceColors9A(120))+
  coord_sf(expand = F) +
  theme(legend.position = "none",
        axis.title = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank())

Define your points

To generate random points within our water-covered study area, we will use the spatSample() function from the terra package. Focusing on the chl layer of our dataset, we will produce 100 random points using the method = "random" argument. To ensure that only points within the water body are considered, we will employ na.rm = TRUE to exclude any points falling on land where chlorophyll data is absent (represented as NA). The resulting output will be a SpatVector point geometry containing the generated 100 points and their corresponding chlorophyll values.

 pemba.data = spatSample(pemba |> select(chl), 
                  size = 100, 
                  method = "random",
                  na.rm = T, 
                  as.points = T,
                  values = T)

pemba.data

 class       : SpatVector 
 geometry    : points 
 dimensions  : 100, 1  (geometries, attributes)
 extent      : 39.04326, 39.98309, -5.697185, -4.635939  (xmin, xmax, ymin, ymax)
 coord. ref. : lon/lat WGS 84 
 names       :    chl
 type        :  <num>
 values      : 0.2281
               0.1174
               0.1519

Plotting points

Next, we will visualize our points on a map to observe their distribution. As shown in Figure 1, all our randomly generated points fall within the water, with none located on land.

ggplot()+
  ggspatial::annotation_map_tile(type = "osm", zoom = 8)+
  ggspatial::layer_spatial(data = pemba.data)+
  scale_x_continuous(breaks = seq(39.2,40,0.4), 
                     labels = metR::LonLabel(lon = seq(39.2,40,0.4)))+
  scale_y_continuous(breaks = seq(-5.6,-4.6,0.4),
                     labels = metR::LatLabel(lat = seq(-5.6,-4.6,0.4)))

Figure 1: The map of Pemba Channel showing the generated random points

The extracted spatvector points include chlorophyll values for each point, as illustrated in Figure 2.

pemba.data |> 
  ggplot() +
  geom_spatvector(aes(color = chl, size = chl))+
  scale_x_continuous(breaks = seq(39.2,40,0.4), 
                     labels = metR::LonLabel(lon = seq(39.2,40,0.4)))+
  scale_y_continuous(breaks = seq(-5.6,-4.6,0.4),
                     labels = metR::LatLabel(lat = seq(-5.6,-4.6,0.4)))+
  theme_bw()

Figure 2: The value of chlorophyll-a extracted from each generated random point

Extracting points as dataframe

Along with extracting your data in spatvector format, you may occasionally prefer to have your values in a dataframe. This can be achieved directly during data extraction. Instead of using the argument as.point = T, which returns the data as a spatvector, use the argument as.df = T to obtain the data in a dataframe format. Additionally, include the argument xy = T to capture the longitude and latitude coordinates for each point.

pemba.df = spatSample(pemba |> select(chl), 
                  size = 200, 
                  method = "random",
                  na.rm = T, 
                  # as.points = T,
                  as.df = T,
                  values = T,
                  xy = T) |> 
  rename(lon = x, lat = y) |> 
  as_tibble()

pemba.df

# A tibble: 200 × 3
     lon   lat   chl
   <dbl> <dbl> <dbl>
 1  39.4 -4.90 0.145
 2  39.4 -5.49 0.135
 3  39.5 -5.30 0.149
 4  39.5 -5.67 0.125
 5  39.5 -5.25 0.138
 6  39.2 -5.31 0.152
 7  39.4 -5.02 0.118
 8  39.6 -5.67 0.143
 9  39.3 -5.54 0.169
10  39.3 -5.66 0.175
# ℹ 190 more rows

Now that your points are in a dataframe, you can use them for any analysis you wish to conduct!

Thank you for following and do not miss the next post!!

Consultated references

Hernangómez, D., 2023. Using the tidyverse with terra objects: The tidyterra package. Journal of Open Source Software 8, 5751. https://doi.org/10.21105/joss.05751

Hijmans, R.J., 2024. Terra: Spatial data analysis.