第 34 章 空间数据可视化

王江浩 北京城市实验室

Robert J. Hijmans 75 开发了 raster 包用于网格空间数据的读、写、操作、分析和建模,同时维护了空间数据分析的网站 https://www.rspatial.org。Edzer Pebesma 76 和 Roger Bivand 等创建了 sp 包定义了空间数据类型和方法,提供了大量的空间数据操作方法,同时维护了空间数据对象 sp 的绘图网站 https://edzer.github.io/sp/,他们也一起合作写了新书 Spatial Data Science,提供了在线 网页版 书籍及其 源代码。Edzer Pebesma 后来开发了 sf 包重新定义了空间数据对象和操作方法,并维护了空间数据分析、建模和可视化网站 https://www.r-spatial.org/

课程案例学习

  1. 2018-Introduction to Geospatial Raster and Vector Data with R 空间数据分析课程
  2. Peter Ellis 新西兰大选和普查数据 More cartograms of New Zealand census data: district and city level
  3. 2017-Mapping oil production by country in R 石油产量在全球的分布
  4. 2017-How to highlight countries on a map 高亮地图上的国家
  5. 2017-Mapping With Sf: Part 3
  6. Data Visualization Shiny Apps 数据可视化核密度估计 In this app I identify crime hotspots using a bivariate density estimation strategy
  7. Association of Statisticians of American Religious Bodies (ASARB) viridis USA map
  8. 出租车行车轨迹数据
  9. Geospatial processing with Clickhouse-CARTO Blog

34.1 空间数据

空间数据存储在数据库中,比如 PostGIS,它是对象关系数据库 PostgreSQL 在空间数据库方面的扩展。

34.1.1 data.frame

data("quakes")
quakes
##         lat   long depth mag stations
## 1    -20.42 181.62   562 4.8       41
## 2    -20.62 181.03   650 4.2       15
## 3    -26.00 184.10    42 5.4       43
## 4    -17.97 181.66   626 4.1       19
## 5    -20.42 181.96   649 4.0       11
....

34.1.2 sp

sp-gallery

空间数据对象,以类 sp 方式存储 [88]

library(sp)
data("meuse")
coordinates(meuse) <- ~x+y
proj4string(meuse) <- CRS("+init=epsg:28992")
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition
class(meuse)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
## Warning in proj4string(meuse): CRS object has comment, which is lost in output
## [1] "+proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs"
plot(meuse, axes = TRUE)
sp 对象

图 10.7: sp 对象

library(rgdal)
crs.longlat <- CRS("+init=epsg:4326")
meuse.longlat <- spTransform(meuse, crs.longlat)
plot(meuse.longlat, axes = TRUE)
sp 对象

图 10.8: sp 对象

library(maptools)
fname <- system.file("shapes/sids.shp", package = "maptools")
p4s <- CRS("+proj=longlat +datum=NAD27")
nc <- readShapePoly(fname, proj4string = p4s)
## Warning: readShapePoly is deprecated; use rgdal::readOGR or sf::st_read
plot(nc, axes = TRUE, col = grey(1 - nc$SID79 / 57))

maptools 提供的 readShapePoly 函数去读取 shp 文件的方式已经过时,推荐使用 rgdal::readOGR 或者 sf::st_read 方式读取

# Trellis maps
arrow <- list("SpatialPolygonsRescale",
  layout.north.arrow(2),
  offset = c(-76, 34), scale = 0.5, which = 2
)
spplot(nc, c("SID74", "SID79"),
  as.table = TRUE,
  scales = list(draw = T), sp.layout = arrow
)

34.1.3 sf

library(sf)
library(ggplot2)
nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
nc2 <- nc |> 
  dplyr::select(SID74, SID79) |> 
  tidyr::gather(VAR, SID, -geom)
ggplot() +
  geom_sf(data = nc2, aes(fill = SID)) +
  facet_wrap(~VAR, ncol = 1)

34.1.4 raster

raster 包定义了获取和操作空间 raster 类型数据集的类和方法,rasterVis 补充加强了 raster 包在数据可视化和交互方面的功能。可视化是基于 lattice 的

raster 包的开发已经被作者 Robert J. Hijmans 迁移到 Github 上啦,官方文档 https://www.rspatial.org/

星号 * 标记的是 S3 方法

methods(plot)
##  [1] plot,ANY,ANY-method                       
##  [2] plot,color,ANY-method                     
##  [3] plot,Extent,missing-method                
##  [4] plot,Raster,ANY-method                    
##  [5] plot,Raster,Raster-method                 
##  [6] plot,Spatial,missing-method               
##  [7] plot,SpatialGrid,missing-method           
##  [8] plot,SpatialGridDataFrame,missing-method  
##  [9] plot,SpatialLines,missing-method          
## [10] plot,SpatialMultiPoints,missing-method    
## [11] plot,SpatialPixels,missing-method         
## [12] plot,SpatialPixelsDataFrame,missing-method
## [13] plot,SpatialPoints,missing-method         
## [14] plot,SpatialPolygons,missing-method       
## [15] plot.acf*                                 
## [16] plot.bclust*                              
## [17] plot.classIntervals*                      
## [18] plot.data.frame*                          
## [19] plot.decomposed.ts*                       
## [20] plot.default                              
## [21] plot.dendrogram*                          
## [22] plot.density*                             
## [23] plot.ecdf                                 
## [24] plot.factor*                              
## [25] plot.formula*                             
## [26] plot.function                             
## [27] plot.ggplot*                              
## [28] plot.gtable*                              
## [29] plot.hcl_palettes*                        
## [30] plot.hclust*                              
## [31] plot.histogram*                           
## [32] plot.HoltWinters*                         
## [33] plot.ica*                                 
## [34] plot.isoreg*                              
## [35] plot.lm*                                  
## [36] plot.medpolish*                           
## [37] plot.mlm*                                 
## [38] plot.ppr*                                 
## [39] plot.prcomp*                              
## [40] plot.princomp*                            
## [41] plot.profile.nls*                         
## [42] plot.R6*                                  
## [43] plot.raster*                              
## [44] plot.sf*                                  
## [45] plot.sfc_CIRCULARSTRING*                  
## [46] plot.sfc_GEOMETRY*                        
## [47] plot.sfc_GEOMETRYCOLLECTION*              
## [48] plot.sfc_LINESTRING*                      
## [49] plot.sfc_MULTILINESTRING*                 
## [50] plot.sfc_MULTIPOINT*                      
## [51] plot.sfc_MULTIPOLYGON*                    
## [52] plot.sfc_POINT*                           
## [53] plot.sfc_POLYGON*                         
## [54] plot.sfg*                                 
## [55] plot.shingle*                             
## [56] plot.SOM*                                 
## [57] plot.somgrid*                             
## [58] plot.spec*                                
## [59] plot.stars*                               
## [60] plot.stars_proxy*                         
## [61] plot.stepfun                              
## [62] plot.stft*                                
## [63] plot.stl*                                 
## [64] plot.svm*                                 
## [65] plot.table*                               
## [66] plot.trans*                               
## [67] plot.trellis*                             
## [68] plot.ts                                   
## [69] plot.tskernel*                            
## [70] plot.TukeyHSD*                            
## [71] plot.tune*                                
## [72] plot.units*                               
## [73] plot.wk_crc*                              
## [74] plot.wk_rct*                              
## [75] plot.wk_wkb*                              
## [76] plot.wk_wkt*                              
## [77] plot.wk_xy*                               
## see '?methods' for accessing help and source code

查看函数的定义

getAnywhere(plot.raster)
## A single object matching 'plot.raster' was found
## It was found in the following places
##   registered S3 method for plot from namespace graphics
##   namespace:graphics
## with value
## 
## function (x, y, xlim = c(0, ncol(x)), ylim = c(0, nrow(x)), xaxs = "i", 
##     yaxs = "i", asp = 1, add = FALSE, ...) 
## {
##     if (!add) {
##         plot.new()
##         plot.window(xlim = xlim, ylim = ylim, asp = asp, xaxs = xaxs, 
##             yaxs = yaxs)
##     }
##     rasterImage(x, 0, 0, ncol(x), nrow(x), ...)
## }
## <bytecode: 0x55db4c04eac8>
## <environment: namespace:graphics>

rasterImage 函数来绘制图像,如果想知道 rasterImage 的内容可以继续看 getAnywhere(rasterImage)

getAnywhere(rasterImage)
## A single object matching 'rasterImage' was found
## It was found in the following places
##   package:graphics
##   namespace:graphics
## with value
## 
## function (image, xleft, ybottom, xright, ytop, angle = 0, interpolate = TRUE, 
##     ...) 
## {
##     .External.graphics(C_raster, if (inherits(image, "nativeRaster")) image else as.raster(image), 
##         as.double(xleft), as.double(ybottom), as.double(xright), 
##         as.double(ytop), as.double(angle), as.logical(interpolate), 
##         ...)
##     invisible()
## }
## <bytecode: 0x55db4b9fbb48>
## <environment: namespace:graphics>

通过查看函数的帮助 ?rasterImage ,我们需要重点关注一下 参数 image 传递的 raster 对象

plot(c(100, 250), c(300, 450), type = "n", xlab = "", ylab = "")
image <- as.raster(matrix(0:1, ncol = 5, nrow = 3))
rasterImage(image, 100, 300, 150, 350, interpolate = FALSE)
rasterImage(image, 100, 400, 150, 450)
rasterImage(image, 200, 300, 200 + xinch(.5), 300 + yinch(.3),
  interpolate = FALSE
)
rasterImage(image, 200, 400, 250, 450, 
            angle = 15, interpolate = FALSE)
raster 图像

图 34.1: raster 图像

library(raster)
meuse.test <- raster(x = system.file("external/test.grd", package="raster"))
class(meuse.test)
## [1] "RasterLayer"
## attr(,"package")
## [1] "raster"
plot(meuse.test, legend = F)
raster 对象

图 10.14: raster 对象

34.1.5 stars

Edzer Pebesma 开发了 stars 包

# https://resources.rstudio.com/rstudio-conf-2019/spatial-data-science-in-the-tidyverse
library(abind)
library(sf)
library(cubelyr)
library(stars)
x <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))

ggplot() +
  geom_stars(data = x) +
  coord_equal() +
  facet_wrap(~band) +
  theme_bw() +
  scale_fill_viridis_c() +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))

34.2 可视化

34.2.1 斐济地震带分布

相比于 plotlyecharts4r 更加轻量,这得益于 JavaScript 库 Apache ECharts。 前者 MIT 协议,后者采用 Apache-2.0 协议,都可以商用。Apache ECharts 是 Apache 旗下顶级开源项目,由百度前端技术团队贡献,中文文档也比较全,学习起来门槛会低一些。

library(echarts4r)
quakes |>
  e_charts(x = long) |>
  e_geo(
    roam = TRUE,
    boundingCoords = list(
      c(185, -10),
      c(165, -40)
    )
  ) |>
  e_scatter(
    serie = lat,
    size = mag, # 点的大小映射到震级
    # legend = F, # 是否移除图例
    name = "斐济地震带",
    coord_system = "geo"
  ) |>
  e_visual_map(
    serie = mag, scale = e_scale,
    inRange = list(color = terrain.colors(10))
  ) |>
  e_tooltip()

图 34.2: 斐济地震带

34.2.2 美国各城镇失业率

# 数据来源 https://datasets.flowingdata.com/unemployment09.csv
unemp <- read.csv(
  file = "http://datasets.flowingdata.com/unemployment09.csv",
  header = FALSE, stringsAsFactors = FALSE
)
names(unemp) <- c(
  "id", "state_fips", "county_fips", "name", "year",
  "?", "?", "?", "rate"
)
unemp$county <- tolower(gsub(" County, [A-Z]{2}", "", unemp$name))
unemp$state <- gsub("^.*([A-Z]{2}).*$", "\\1", unemp$name)

county_df <- map_data("county")
names(county_df) <- c("long", "lat", "group", "order", "state_name", "county")
county_df$state <- state.abb[match(county_df$state_name, tolower(state.name))]
county_df$state_name <- NULL

state_df <- map_data("state")
# Combine together
choropleth <- merge(county_df, unemp, by = c("state", "county"))
choropleth <- choropleth[order(choropleth$order), ]
choropleth$rate_d <- cut(choropleth$rate, breaks = c(seq(0, 10, by = 2), 35))

library(ggthemes)
ggplot(choropleth, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = rate_d), colour = alpha("white", 1 / 4), size = 0.2) +
  geom_polygon(data = state_df, colour = "white", fill = NA) +
  scale_fill_brewer(palette = "PuRd") +
  labs(
    fill = "ratio", title = "ratio of unemployment by county, 2009",
    caption = "data source: http://datasets.flowingdata.com/unemployment09.csv"
  ) +
  coord_map("polyconic") +
  theme_map()
2009年美国各城镇失业率

图 34.3: 2009年美国各城镇失业率

# 来自帮助文档 ?map 
library(mapproj)    # mapproj is used for  projection="polyconic"
# color US county map by 2009 unemployment rate
# match counties to map using FIPS county codes
# Based on J's solution to the "Choropleth Challenge"
# http://blog.revolutionanalytics.com/2009/11/choropleth-challenge-result.html

# load data
# unemp includes data for some counties not on the "lower 48 states" county
# map, such as those in Alaska, Hawaii, Puerto Rico, and some tiny Virginia cities
data(unemp)
data(county.fips)

# define color buckets
colors <- c("#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77", "#980043")
unemp$colorBuckets <- as.numeric(cut(unemp$unemp, c(0, 2, 4, 6, 8, 10, 100)))
leg.txt <- c("<2%", "2-4%", "4-6%", "6-8%", "8-10%", ">10%")

# align data with map definitions by (partial) matching state,county
# names, which include multiple polygons for some counties
cnty.fips <- county.fips$fips[match(
  map("county", plot = FALSE)$names,
  county.fips$polyname
)]
colorsmatched <- unemp$colorBuckets[match(cnty.fips, unemp$fips)]

# draw map
map("county",
  col = colors[colorsmatched], fill = TRUE, resolution = 0,
  lty = 0, projection = "polyconic"
)
map("state",
  col = "white", fill = FALSE, add = TRUE, lty = 1, lwd = 0.2,
  projection = "polyconic"
)
title("unemployment by county, 2009")
legend("topright", leg.txt, horiz = TRUE, fill = colors)

美国各地区失业率地图,配不同颜色, colormap 适合给静态图配色

参考文献

[88]
E. J. Pebesma and R. S. Bivand, “Classes and methods for spatial data in R,” R News, vol. 5, no. 2, pp. 9–13, 2005,Available: https://cran.r-project.org/doc/Rnews/Rnews_2005-2.pdf