sf st_distance in a parallelized foreach loop

Problem description (votes: 0, answers: 1)



id  variable  sfc point object           zipcode
1   10        POINT (543611.8 6389285)   2324
2   15        POINT (513611.8 6349285)   2324
3   12        POINT (533611.8 6359285)   2329

About 2 million observations


id  variable  sfc polygon object
1   10        POLYGON Z ((455302.7 6252026 9.09, 455292.6 6252034 9.09, 455274.8 6252036 9.9, 455246 6252113 14.25, 455286.1 6252124 14.15, 455293.5 6252126 14.13, 455317.8 6252068 14.13, 455331.5 6252073 14.13, 455345.5 6252044 14.78, 455302.7 6252026 9.09))


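I have worked out how to compute the distances between the two, and I have also split Total_df so that the computation runs on smaller subsets determined by zipcode.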
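The distance step described above can be tried on toy data before running it on 2 million points. This is a minimal sketch, not the asker's actual data: the coordinates, the single square polygon, and the names `pts`, `polys` and `min_distance` are made up for illustration, and no CRS is set, so the distances are plain planar numbers.

```r
library(sf)

# Toy stand-in for the point table (all values are made up)
pts <- st_as_sf(
  data.frame(id = 1:3,
             zipcode = c(2324, 2324, 2329),
             x = c(0, 3, 10),
             y = c(0, 0, 0)),
  coords = c("x", "y")
)

# Toy stand-in for the polygon table: one square spanning x 4..6, y -1..1
square <- st_polygon(list(rbind(c(4, -1), c(6, -1), c(6, 1), c(4, 1), c(4, -1))))
polys <- st_sf(id = 1, geometry = st_sfc(square))

# Distance matrix: one row per point, one column per polygon
d <- st_distance(pts, polys)

# The row-wise minimum is the distance from each point to its nearest polygon
pts$min_distance <- apply(d, 1, min)
```

Subsetting `pts` by `zipcode` before calling `st_distance` keeps each distance matrix small, which is the reason for the per-zipcode loop in the code that follows.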




library(sf)
library(doParallel)  # also attaches foreach

registerDoParallel(cores = 6)

# Use foreach to loop over list.dfs in parallel
foreach(d = 1:length(list.dfs), .packages = "sf", .combine = 'c') %dopar% {
  # Get the data frame at position 'd' in the list
  df <- list.dfs[[d]]
  # Open a list to store combined inner results
  grand_list <- list()
  # Initialize an empty list to store the results of the inner loop
  inner_results <- list()
  # Zip codes present in Total_df
  zipcode <- sort(unique(Total_df$zipcode))

  # Use a regular for loop to iterate over zipcode
  for (i in zipcode) {
    cat(i, "\n")
    start_time <- Sys.time()
    # Subset the data
    subset_df <- Total_df[Total_df$zipcode == i, ]
    if (nrow(subset_df) > 0) {
      # Calculate distances
      distances <- sf::st_distance(subset_df, df)
      # Define the 'miin' function, or replace it with an appropriate function
      miin <- function(x) min(x, na.rm = TRUE)
      # Calculate minimum distances
      min_distances <- apply(distances, 1, miin)
      # Store minimum distances in a new column
      subset_df$min_distances <- min_distances
    }
    end_time <- Sys.time()
    print(paste("Time for municipality Forest", i, ": ", end_time - start_time))
    # Store the updated subset_df in the inner_results list
    inner_results[[i]] <- subset_df
  }
  # Combine the results of the inner loop using do.call
  grand_list[[d]] <- do.call(rbind, inner_results)
}


r geospatial distance parallel-foreach


library(sf)
library(doParallel)  # also attaches foreach

registerDoParallel(cores = 6)

# Use foreach to loop over list.dfs in parallel; the value of the last
# expression in each iteration is collected into grand_list
grand_list <- foreach(df = list.dfs, .packages = "sf") %dopar% {

  # Initialize an empty list to store the results of the inner loop
  inner_results <- list()

  # Zip codes present in Total_df
  zipcode <- sort(unique(Total_df$zipcode))

  # Use a regular for loop to iterate over zipcode
  for (i in zipcode) {
    cat(i, "\n")
    start_time <- Sys.time()

    # Subset the data
    subset_df <- Total_df[Total_df$zipcode == i, ]

    if (nrow(subset_df) > 0) {
      # Calculate distances
      distances <- sf::st_distance(subset_df, df)

      # Define the 'miin' function, or replace it with an appropriate function
      miin <- function(x) min(x, na.rm = TRUE)

      # Calculate minimum distances
      min_distances <- apply(distances, 1, miin)

      # Store minimum distances in a new column
      subset_df$min_distances <- min_distances
    }

    end_time <- Sys.time()
    print(paste("Time for municipality Forest", i, ": ", end_time - start_time))

    # Store the updated subset_df in the inner_results list
    inner_results[[i]] <- subset_df
  }

  # Combine the results of the inner loop using do.call;
  # this is the value returned for this iteration
  do.call(rbind, inner_results)
}



Tip: use %do% instead of %dopar% to debug the code, and run only the first two values:

grand_list <- foreach(df = list.dfs[1:2], .packages = "sf") %do% { ... }
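To see what this debugging pattern returns, it may help to run it on plain data frames first. The following is a small sketch under assumptions: `toy_list` is a hypothetical stand-in for `list.dfs`, and a simple `min()` replaces the distance computation. It only illustrates that `foreach` collects the last value of each iteration into a list, which is why the result is assigned to a variable rather than written to an outer list from inside the loop body.

```r
library(foreach)

# Hypothetical stand-in for list.dfs: a list of small data frames
toy_list <- list(
  data.frame(id = 1:2, value = c(10, 15)),
  data.frame(id = 3:4, value = c(12, 7)),
  data.frame(id = 5:6, value = c(1, 2))
)

# Sequential run over the first two elements only; the last expression
# in the body becomes that iteration's entry in the result list
res <- foreach(df = toy_list[1:2]) %do% {
  df$min_value <- min(df$value)
  df
}

# Combine the per-element results into one data frame
combined <- do.call(rbind, res)
```

Once the sequential version behaves as expected, switching `%do%` back to `%dopar%` (with a registered backend) should give the same list, computed on the workers.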

