我有一个大约 20,000 行的数据框,其中包含分布在世界各地的点的纬度和经度。我需要确定这些点的人口密度,因此我考虑使用 GHSL 人口数据。然而,我不确定如何从 GHSL 人口数据中提取该信息,或者是否可能。 (
由于您尚未提供任何示例数据、有关使用哪个 GHSL 年份或您的首选语言的信息,因此这里是使用示例数据的 R 表示。
我使用了 30arcsec 分辨率 GHSL(~100m 像元大小),因为 3arcsec 选项太大,我的计算机无法处理。如果您需要更精细的分辨率,可以逐个下载 GHSL 图块,但这会很费力。鉴于您的点数据是全球范围内的,我怀疑 ~100m 就足够了。另请注意,此示例假设您的坐标为 WGS84/EPSG:4326。
首先,加载所需的包并创建示例点df:
library(rnaturalearth)
library(terra)
library(sf)
library(dplyr)
library(ggplot2)
# Load world polygons and project to WGS84 Pseudo Mercator/EPSG:3857
world <- ne_countries() %>%
filter(!admin == "Antarctica") %>%
st_transform(3857)
# Create example points df using world and convert coordinates to WGS84:EPSG4326
set.seed(1)
df <- st_sample(world, size = 20000, type = "random") %>%
st_as_sf() %>%
st_transform(4326) %>%
mutate(ID = 1:n(),
lon = st_coordinates(.)[,1],
lat = st_coordinates(.)[,2]) %>%
data.frame() %>%
select(-x)
head(df)
# ID lon lat
# 1 1 26.2272108 -1.853224
# 2 2 146.9548044 65.990806
# 3 3 -105.8491530 73.182944
# 4 4 -116.4395691 70.634154
# 5 5 97.1429112 34.367544
# 6 6 -0.8282728 46.360984
接下来,将 GHSL 加载为 SpatRaster:
# Load WGS84/EPSG:4326 GHSL, 30 arcsec, 202 from working directory,
# previously unzipped and downloaded from
# https://human-settlement.emergency.copernicus.eu/download.php
ghsl <- rast("data/GHS_BUILT_S_E2020_GLOBE_R2023A_4326_30ss_V1_0.tif")
ghsl <- ghsl * 1
names(ghsl) <- "pop_density_2020"
ghsl
# class : SpatRaster
# dimensions : 21384, 43201, 1 (nrow, ncol, nlyr)
# resolution : 0.008333333, 0.008333333 (x, y)
# extent : -180.0012, 180.0071, -89.10042, 89.09958 (xmin, xmax, ymin, ymax)
# coord. ref. : lon/lat WGS 84 (EPSG:4326)
# source : spat_391c2813558f_14620.tif
# varname : GHS_BUILT_S_E2020_GLOBE_R2023A_4326_30ss_V1_0
# name : pop_density_2020
# min value : 0
# max value : 848108
现在从 df 创建一个 sf 对象,从 ghsl 中提取单元格值,并将结果连接到 sf_points:
# Create sf from df
sf_points <- st_as_sf(df, coords = c("lon", "lat"), crs = 4326)
# Return population density per point
pop_den <- extract(ghsl, sf_points)
# Join population density to sf_points
sf_points <- left_join(sf_points, pop_den, by = "ID")
# Examine result
filter(sf_points, pop_density_2020 > 0)
# Simple feature collection with 2542 features and 2 fields
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: -155.8613 ymin: -55.07594 xmax: 177.6207 ymax: 72.03816
# Geodetic CRS: WGS 84
# First 10 features:
# ID pop_density_2020 geometry
# 1 22 1365 POINT (-91.87298 35.95215)
# 2 45 21 POINT (21.92427 41.46684)
# 3 53 19 POINT (-80.00786 37.48202)
# 4 62 165 POINT (76.22779 19.11223)
# 5 65 88 POINT (101.8265 24.57473)
# 6 68 1712 POINT (-114.7794 42.88764)
# 7 72 366 POINT (-87.25635 33.31995)
# 8 73 76958 POINT (127.4405 36.84436)
# 9 81 720 POINT (-79.93223 -1.357552)
# 10 107 30 POINT (15.16695 -13.33818)