library(sf)
library(dplyr)
library(ggplot2)
library(rnaturalearth)
library(rnaturalearthdata)
data(world.cities, package = "maps")

world <- ne_countries(scale = "medium", returnclass = "sf")
afr_capitals <- world.cities %>% filter(capital == 1) %>% 
  st_as_sf(coords = c("long", "lat"), crs = 4326) %>% 
  st_intersection(., world %>% filter(continent == "Africa"))
p <- world %>% filter(continent == "Africa") %>% 
  ggplot() + geom_sf(aes(fill = name), lwd = 0.2) + 
  geom_sf(data = afr_capitals, col = "blue", size = 0.5) + 
  scale_fill_grey(guide = "none") + theme_minimal()  ## guide = FALSE is deprecated
ggsave(here::here("external/slides/figures/africa_capitals.png"), plot = p,
       width = 5, height = 4, dpi = 300, bg = "transparent")

Today’s topics

  • control structures with emphasis on *apply functions

*apply

  • A special form of looping
  • Intended for applying a function to each element of a data object, often an anonymous function (one defined inline, without a name).
  • 3 main kinds: sapply, lapply, apply
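
To make the "special form of looping" concrete, here is a minimal sketch comparing an explicit for loop with an equivalent sapply call. The vector `v` and the names `out_loop` and `out_sapply` are made up for illustration and are not part of the lecture data.

```r
v <- c(2, 4, 6)  ## illustrative input

## Explicit loop: pre-allocate the result, then fill it element by element
out_loop <- numeric(length(v))
for (i in seq_along(v)) {
  out_loop[i] <- v[i] * 2
}

## Same computation with sapply and an anonymous function
out_sapply <- sapply(v, function(x) x * 2)

identical(out_loop, out_sapply)  ## both approaches give the same vector
```

The sapply version needs no index variable and no pre-allocated result, which is the main reason to prefer it for simple element-wise work.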

sapply

sapply iterates over its input and returns a vector when each result is a single value (otherwise it returns a matrix or a list).

v <- 1:10
sapply(v, function(x) x + 10) ## adds 10 to each element in v.
 [1] 11 12 13 14 15 16 17 18 19 20
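
The shape of what sapply returns depends on what the function returns for each element. The sketch below, using made-up object names `a`, `b` and `d`, shows the three common cases.

```r
v <- 1:3

## One value per element: the result is a vector
a <- sapply(v, function(x) x + 10)

## Two values per element: the result is a matrix, one column per element
b <- sapply(v, function(x) c(x, x^2))

## Results of differing lengths cannot be simplified: the result is a list
d <- sapply(v, function(x) seq_len(x))

class(a)
dim(b)
class(d)
```

If you need a predictable return type regardless of the input, lapply (which always returns a list) is the safer choice.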

Use { } for more complicated functions, BUT be careful to close the brackets in the right order: } ends the function body, then ) ends the sapply call.

v1 <- 1:10
v2 <- sapply(v1, function(x){
  y <- x^2 
  return(y)
}) ## } closes the function body, ) closes sapply
print(v2)
 [1]   1   4   9  16  25  36  49  64  81 100

sapply

If you don’t call return(), the function returns the value of the last expression it evaluates.

v1 <- 1:10
v2 <- sapply(v1, function(x){
  y <- x^2  ## y will be returned
}) #
print(v2)
 [1]   1   4   9  16  25  36  49  64  81 100
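
The same rule applies when the last expression is not an assignment. In this small sketch (the name `parity` is hypothetical), the value of the if/else expression is what each call returns.

```r
v1 <- 1:6
parity <- sapply(v1, function(x) {
  if (x %% 2 == 0) "even" else "odd"  ## value of the if/else is returned
})
parity
```

An explicit return() is still worth writing when a function has several exit points, since it makes the intended result obvious to the reader.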

lapply

  • Similar to sapply, except that the result is always returned as a list.
  • Useful if you need to store more complex objects (data.frame, plot, raster, etc.)

v1 <- 1:10
v2 <- lapply(v1, function(x){
  y <- x^2  ## y will be returned
}) #
print(v2)
[[1]]
[1] 1

[[2]]
[1] 4

[[3]]
[1] 9

[[4]]
[1] 16

[[5]]
[1] 25

[[6]]
[1] 36

[[7]]
[1] 49

[[8]]
[1] 64

[[9]]
[1] 81

[[10]]
[1] 100
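
As a sketch of the "more complex objects" case, the example below has each iteration return a small data frame, then combines the list into one data frame with do.call(rbind, ...). The names `dfs` and `combined` and the column names are made up for illustration.

```r
v1 <- 1:3
dfs <- lapply(v1, function(x) {
  data.frame(x = x, square = x^2)  ## one single-row data frame per element
})

## Combine the list of data frames into a single data frame
combined <- do.call(rbind, dfs)
combined
```

Keeping the intermediate list is often handy in itself, for example when each element is a plot or a raster that cannot be stacked into a table.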

apply

apply works well for 2D data (a matrix or an all-numeric data frame), when you want to apply a function over each row or each column.

v1 <- sample(1:100, 10)
v2 <- sample(1:100, 10)
DF <- data.frame(v1, v2) ## data frame columns will take names of vectors
DF
   v1 v2
1  56 97
2  70  1
3  87 11
4  79 86
5  55 42
6  85 69
7  30 70
8  57 71
9  90  3
10 52 50

Use apply to get the max value of each column. The second argument (MARGIN) set to 2 means “apply function to columns”.

colMax <- apply(DF, 2, FUN = max)
colMax
v1 v2 
90 97 

Use apply to get the max value of each row. The second argument (MARGIN) set to 1 means “apply function to rows”.

rowMax <- apply(DF, 1, FUN = max)
rowMax
 [1] 97 70 87 86 55 85 70 71 90 52
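
apply also accepts an anonymous function, just like sapply. The sketch below uses a small, fixed data frame (`DF2`, hypothetical values chosen so the results are easy to check by hand) to compute the spread between the largest and smallest value along each margin.

```r
DF2 <- data.frame(v1 = c(5, 20, 13), v2 = c(9, 2, 13))  ## hypothetical data

## Margin 1: one result per row (largest minus smallest value in the row)
rowSpread <- apply(DF2, 1, function(r) max(r) - min(r))

## Margin 2: one result per column, named after the columns
colSpread <- apply(DF2, 2, function(cl) max(cl) - min(cl))

rowSpread
colSpread
```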

We can use apply or sapply to create a new column in a data frame.

DF$rowMax <- apply(DF, 1, FUN = max)
DF
   v1 v2 rowMax
1  56 97     97
2  70  1     70
3  87 11     87
4  79 86     86
5  55 42     55
6  85 69     85
7  30 70     70
8  57 71     71
9  90  3     90
10 52 50     52
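
For comparison, one possible sapply version of the same idea iterates over row indices instead of rows. This is a sketch on a small hypothetical data frame (`DF3`), not the sampled data above.

```r
DF3 <- data.frame(v1 = c(5, 20, 13), v2 = c(9, 2, 13))  ## hypothetical data

## Iterate over row indices; take the max of that row's v1 and v2 values
DF3$rowMax <- sapply(seq_len(nrow(DF3)), function(i) {
  max(DF3$v1[i], DF3$v2[i])
})
DF3
```

Naming the columns explicitly means the result does not change if other columns are added to the data frame later, whereas apply(DF, 1, max) always uses every column present.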