class: center, middle, inverse, title-slide

.title[
# Geospatial Analysis with R
]
.subtitle[
## Class 11
]

---

<img src="figures/tanzam_rainfall.png" width="100%" style="display: block; margin: auto;" />

---

```r
library(dplyr)
library(ggplot2)
library(sf)

# Boundaries: Tanzania, the SAGCOT region, and Zambia (districts dissolved)
tz <- st_read(here::here("external/data/tanzania.geojson"))
sagcot <- st_read(here::here("external/data/sagcotcl.geojson"))
zambia <- st_read(
  system.file("extdata/districts.geojson", package = "geospaar")
) %>% st_union()

# WorldClim precipitation layers, summed across layers
prec <- geodata::worldclim_global(var = "prec", res = 2.5,
                                  path = "external/data/")
precsum <- terra::app(prec, sum)

# Crop and mask rainfall to the combined SAGCOT and Zambia area
tz_sagcot <- terra::vect(st_union(sagcot, zambia))
prec_tzzam <- terra::mask(terra::crop(precsum, tz_sagcot), tz_sagcot)
prec_stars <- stars::st_as_stars(prec_tzzam)

# Plot rainfall with the boundaries drawn on top, then save the figure
p <- ggplot() +
  geom_sf(data = tz) +
  stars::geom_stars(data = prec_stars) +
  scale_fill_viridis_c(name = "Rainfall (mm)", na.value = "transparent") +
  geom_sf(data = zambia, fill = "transparent") +
  geom_sf(data = sagcot, fill = "transparent") +
  labs(x = NULL, y = NULL) +
  theme_linedraw()
ggsave(p, filename = "docs/figures/tanzam_rainfall.png", height = 4,
       width = 7, units = "in", dpi = 300)
```

---

## Reading and writing data

### File paths

Let's read in a CSV a few different ways.

Full path: clear for you, but bad for sharing code, since the path only exists on your machine.

```r
data_tib <- read.csv(
  "/Users/lestes/Dropbox/teaching/geog246346/geospaar/inst/extdata/cdf_corn.csv"
)
str(data_tib)
```

```
## 'data.frame': 5999 obs. of 20 variables:
## $ tenant : chr "planet" "planet" "planet" "planet" ...
## $ site_id : chr "UNLEN_CSP1_IMZ1" "UNLEN_CSP1_IMZ1" "UNLEN_CSP1_IMZ1" "UNLEN_CSP1_IMZ1" ...
## $ local : chr "2020-06-05 10:00:00-05:00" "2020-06-06 10:00:00-05:00" "2020-06-07 10:00:00-05:00" "2020-06-08 10:00:00-05:00" ...
## $ lat : num 41.2 41.2 41.2 41.2 41.2 ...
## $ long : num -96.5 -96.5 -96.5 -96.5 -96.5 ...
## $ site_group : chr "Lincoln" "Lincoln" "Lincoln" "Lincoln" ...
## $ Tair : num 27.4 25.5 29.6 27.7 26.7 ...
## $ Tabove : num 27.93 17.43 9.18 8.58 18.71 ...
## $ Tbelow : num 28 24.3 27 25.9 26.6 ...
## $ CGDD : int 345 363 381 393 402 414 425 439 455 471 ...
## $ b1r : num 0.098 0.052 0.0493 0.0484 0.0452 ...
## $ b2r : num 0.0822 0.0762 0.0739 0.0722 0.0676 ...
## $ b3r : num 0.2013 0.0911 0.0863 0.0848 0.0803 ...
## $ b4r : num 0.1447 0.1001 0.0991 0.0958 0.0899 ...
## $ b5r : num 0.0946 0.225 0.2378 0.246 0.2434 ...
## $ b6r : num 0.139 0.282 0.332 0.348 0.334 ...
## $ b7r : num 0.123 0.233 0.259 0.265 0.259 ...
## $ NDVI : num -0.0209 0.4754 0.54 0.5686 0.5755 ...
## $ CropType : chr "Corn" "Corn" "Corn" "Corn" ...
## $ PlantingDate: chr "04/20/2020" "04/20/2020" "04/20/2020" "04/20/2020" ...
```

---

```r
data_tib <- readr::read_csv(
  "/Users/lestes/Dropbox/teaching/geog246346/geospaar/inst/extdata/cdf_corn.csv"
)
```

```
## Rows: 5999 Columns: 20
## ── Column specification ────────────────────────────────────────────
## Delimiter: ","
## chr (5): tenant, site_id, site_group, CropType, PlantingDate
## dbl (14): lat, long, Tair, Tabove, Tbelow, CGDD, b1r, b2r, b3r,...
## dttm (1): local
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
```

```r
str(data_tib)
```

```
## spc_tbl_ [5,999 × 20] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ tenant : chr [1:5999] "planet" "planet" "planet" "planet" ...
## $ site_id : chr [1:5999] "UNLEN_CSP1_IMZ1" "UNLEN_CSP1_IMZ1" "UNLEN_CSP1_IMZ1" "UNLEN_CSP1_IMZ1" ...
## $ local : POSIXct[1:5999], format: "2020-06-05 15:00:00" ...
## $ lat : num [1:5999] 41.2 41.2 41.2 41.2 41.2 ...
## $ long : num [1:5999] -96.5 -96.5 -96.5 -96.5 -96.5 ...
## $ site_group : chr [1:5999] "Lincoln" "Lincoln" "Lincoln" "Lincoln" ...
## $ Tair : num [1:5999] 27.4 25.5 29.6 27.7 26.7 ...
## $ Tabove : num [1:5999] 27.93 17.43 9.18 8.58 18.71 ...
## $ Tbelow : num [1:5999] 28 24.3 27 25.9 26.6 ...
## $ CGDD : num [1:5999] 345 363 381 393 402 414 425 439 455 471 ...
## $ b1r : num [1:5999] 0.098 0.052 0.0493 0.0484 0.0452 ...
## $ b2r : num [1:5999] 0.0822 0.0762 0.0739 0.0722 0.0676 ...
## $ b3r : num [1:5999] 0.2013 0.0911 0.0863 0.0848 0.0803 ...
## $ b4r : num [1:5999] 0.1447 0.1001 0.0991 0.0958 0.0899 ...
## $ b5r : num [1:5999] 0.0946 0.225 0.2378 0.246 0.2434 ...
## $ b6r : num [1:5999] 0.139 0.282 0.332 0.348 0.334 ...
## $ b7r : num [1:5999] 0.123 0.233 0.259 0.265 0.259 ...
## $ NDVI : num [1:5999] -0.0209 0.4754 0.54 0.5686 0.5755 ...
## $ CropType : chr [1:5999] "Corn" "Corn" "Corn" "Corn" ...
## $ PlantingDate: chr [1:5999] "04/20/2020" "04/20/2020" "04/20/2020" "04/20/2020" ...
## - attr(*, "spec")=
## .. cols(
## ..   tenant = col_character(),
## ..   site_id = col_character(),
## ..   local = col_datetime(format = ""),
## ..   lat = col_double(),
## ..   long = col_double(),
## ..   site_group = col_character(),
## ..   Tair = col_double(),
## ..   Tabove = col_double(),
## ..   Tbelow = col_double(),
## ..   CGDD = col_double(),
## ..   b1r = col_double(),
## ..   b2r = col_double(),
## ..   b3r = col_double(),
## ..   b4r = col_double(),
## ..   b5r = col_double(),
## ..   b6r = col_double(),
## ..   b7r = col_double(),
## ..   NDVI = col_double(),
## ..   CropType = col_character(),
## ..   PlantingDate = col_character()
## .. )
## - attr(*, "problems")=<externalptr>
```

---

## Working directory `"."`

- `"."` is the working directory. Check it with `getwd()` (from the console)
- Usually set to the project folder

```r
getwd()  ## if in an RMD, this will show the folder of the RMD
```

```
## [1] "/Users/lestes/Dropbox/teaching/geog246346/geospaar/docs"
```

Use `.` to start a file path from the working directory

```r
list.files(".")  ## files in working directory
```

```r
data_tib <- readr::read_csv("./inst/extdata/cdf_corn.csv")
```

- Use ".."
to go up one folder level

```r
list.files(".")  ## files in working directory
list.files("..")  ## files in folder one level up
```

---

## User directory `"~"`

- Set by an environment variable
- Use the command below to see its value

```r
path.expand("~")
```

```
## [1] "/Users/lestes"
```

```r
data_tib <- readr::read_csv(
  "~/Dropbox/teaching/geog246346/geospaar/inst/extdata/cdf_corn.csv"
)
```

---

## Writing files

- Use `write.csv` or `readr::write_csv` to write

```r
readr::write_csv(data_tib, file = "temp.csv")  ## by default writes to the working directory
```

---

## Saving/loading files

- If you want to save an R object, like a `data.frame`, `tibble` etc.
- Use `save`, with an `.rda` extension

```r
save(data_tib, file = "temp.rda")  ## by default writes to the working directory
```

```r
data_tib <- NULL
load(file = "temp.rda")  ## loads object back into the environment
```

---

## Dates with `lubridate`

- The main function you want to use is `as_date`, which converts a character string to a `Date`.

```r
library(lubridate)
date1 <- as_date("2022-03-01")  ## date in standard YYYY-MM-DD format
print(date1)
```

```
## [1] "2022-03-01"
```

---

## Dates with `lubridate`

- More challenging with unclear date formats.

```r
date2 <- as_date("3/1/22")  ## is month or day first?
```

```
## Warning: All formats failed to parse. No formats found.
```

```r
date2
```

```
## [1] NA
```

Include the format as shown below. See [formats in this link](https://epirhandbook.com/en/working-with-dates.html)

```r
date2 <- as_date("3/1/22", format = "%m/%d/%y")
date2
```

```
## [1] "2022-03-01"
```

We can also write dates out in a desired format

```r
date2_char <- format(date2, format = "%A %B %d, %Y")
date2_char
```

```
## [1] "Tuesday March 01, 2022"
```

---

## Date formats

<img src="figures/class10_date_formats.png" width="100%" style="display: block; margin: auto;" />

---

How can we read in this date?
```r
date3 <- as_date("Apr 3, 1999", format = "...")
date3
```

```
## [1] NA
```

---

## Exercises

- Use `lapply` to make three `data.frame`s captured in a list `l`, each with two columns: `v1`, 20 integers randomly sampled (with replacement) from 1:10, and `v2`, 20 lowercase letters randomly selected using `sample`.
- The iterator in the `lapply` should be 10, 20, 30, which become the random seeds for the sampling (in the body of the `lapply`)
- After making `l`, use a `for` loop to iterate through each element of `l`, writing each out to a folder `external/data/` in your project.
- Change the name of each as part of the iteration, so that `l[[1]]` is written out as `external/data/dataset1.csv`, etc. Hint: you can use `paste0` to make each file path and name.
- After writing these out, use another `lapply` to read the three datasets back into a new list `l2`. Bonus: Use `dir` to programmatically read in the file paths from your `external/data` folder.
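
---

The `paste0` hint and the `dir` bonus can be sketched roughly as below. This is only an illustration, not a solution: the temporary folder `paths_demo` and the `example` file names are hypothetical, and the files are created empty rather than written from data.

```r
# Hypothetical folder inside R's temporary directory (illustration only)
out_dir <- file.path(tempdir(), "paths_demo")
dir.create(out_dir, showWarnings = FALSE)

# paste0 is vectorized: one call builds one path per index
paths <- paste0(out_dir, "/example", 1:3, ".csv")

# Create empty placeholder files at those paths
file.create(paths)

# dir (an alias for list.files) returns the paths matching a pattern
found <- dir(out_dir, pattern = "example[0-9]+\\.csv$", full.names = TRUE)
basename(found)
```

Setting `full.names = TRUE` returns paths that can be passed straight to a reading function, whereas the default returns file names only.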