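I am collating data from different sources into multiple data frames. Each data frame has dimensions R_k x C_k. I would then like to export each data frame (identified by a unique label such as Object1_Time1) to the correspondingly named range in Excel (matching the unique label, i.e. Object1_Time1). Is something like this possible using R?
I have looked into openxlsx and xlsx, and neither seems to support this natively. Would it be feasible to use R to isolate the cell range behind each Excel named range, and then use those cells as the start and end columns when exporting the data frames?
I have provided a reproducible example with data below - I just haven't got it to work yet.
The range values are computed and stored as follows: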
# Load the Excel workbook
wb <- loadWorkbook("...\\Test_Excel.xlsx")
# Get information about all named ranges
range_info <- getNamedRegions(wb)
# Extract position attribute as a vector
positions <- attr(range_info, "position")
# Drop values after (inclusive) the ":" character
positions_cleaned <- sub(":.*", "", positions)
col_start <- gsub("\\d", "", positions_cleaned)
row_start <- as.numeric(gsub("\\D", "", positions_cleaned))
# Function to convert Excel column letters to numeric values
excel_col_to_numeric <- function(col) {
  col <- toupper(col)
  result <- 0
  for (i in 1:nchar(col)) {
    result <- result * 26 + (as.numeric(charToRaw(substr(col, i, i))) - as.numeric(charToRaw("A")) + 1)
  }
  return(result)
}
# Convert to numeric values
column_start_numeric <- lapply(col_start, excel_col_to_numeric)
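The parsing steps above can also be written as one vectorised helper that returns both corners of each range. The following is a minimal base-R sketch, not a definitive implementation: `col_letters_to_int` and `parse_a1_range` are hypothetical names, and it assumes the position strings look like "B3:K12" (optionally with `$` signs). openxlsx also exports `col2int()`, which could replace the letter conversion.

```r
# Convert Excel column letters to numbers (A = 1, Z = 26, AA = 27, ...).
col_letters_to_int <- function(x) {
  vapply(toupper(x), function(s) {
    digits <- utf8ToInt(s) - utf8ToInt("A") + 1L
    # Treat the letters as base-26 digits, most significant first
    as.integer(sum(digits * 26^(rev(seq_along(digits)) - 1)))
  }, integer(1), USE.NAMES = FALSE)
}

# Split A1-style ranges into numeric start/end columns and rows.
parse_a1_range <- function(positions) {
  clean <- gsub("$", "", positions, fixed = TRUE)  # drop absolute-reference markers
  first <- sub(":.*$", "", clean)                  # top-left cell
  last  <- sub("^.*:", "", clean)                  # bottom-right cell (same cell if no ":")
  data.frame(
    col_start = col_letters_to_int(gsub("[0-9]", "", first)),
    row_start = as.integer(gsub("[^0-9]", "", first)),
    col_end   = col_letters_to_int(gsub("[0-9]", "", last)),
    row_end   = as.integer(gsub("[^0-9]", "", last))
  )
}

parse_a1_range(c("B3:K12", "AA1:AB2"))
```

Each input string becomes one row of the result, so "B3:K12" gives a start column of 2 and a start row of 3.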
The main body of the code is as follows:
library(writexl)
library(openxlsx)
library(dplyr)
library(purrr)
library(MASS)
library(tidyr)
# Create some sample dataframes
set.seed(123)
# Number of time periods and countries
num_time_periods <- 10
num_countries <- 10
# Generate panel data for the first data set
panel_data_1 <- expand.grid(TimePeriod = 1:num_time_periods, Country = 1:num_countries) %>%
  mutate(Value = rnorm(n()))
# Reshape the data to wide format for the first set
panel_data_wide_1 <- spread(panel_data_1, key = TimePeriod, value = Value)
# Generate panel data for the second data set
panel_data_2 <- expand.grid(TimePeriod = 1:num_time_periods, Country = 1:num_countries) %>%
  mutate(Value = rnorm(n()))
# Reshape the data to wide format for the second set
panel_data_wide_2 <- spread(panel_data_2, key = TimePeriod, value = Value)
# Store dataframes in a list with names based on named ranges
df_list <- list(NamedRange1 = panel_data_wide_1, NamedRange2 = panel_data_wide_2)
# Export each dataframe to an EXISTING named range in Excel
write_to_excel <- function(df, sheet_name = "Panel1", range_name) {
  wb <- loadWorkbook("..\\Test_Excel.xlsx")
  # Check if the sheet exists before writing data
  if (sheet_name %in% getSheetNames(wb)) {
    writeData(wb, sheet = sheet_name, x = df, startCol = 1, startRow = 1)
    defineName(wb, name = range_name, formula = sheet_name)
    saveWorkbook(wb, "..\\Test_Excel.xlsx", overwrite = TRUE)
  } else {
    warning(paste("Sheet", sheet_name, "does not exist. Data not exported."))
  }
}
# Iterate over the list and export each dataframe
iwalk(df_list, ~write_to_excel(.x, range_name = .y))
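This is an example of the Excel worksheet I want to paste into, with the named ranges extending over the green shaded areas.
Using openxlsx, I came up with the following: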
# define function to extract named range sheet/row/col
fn_namedRng <- function(wb, rname, part) {
  if (!part %in% c("sh", "col", "row", "xy")) {
    stop("Argument 'part' must be one of: 'sh','col','row' or 'xy'.")
  }
  i = which(c(getNamedRegions(wb) == rname))
  sheet = c(attr(getNamedRegions(wb), "sheet"))[i]
  rng = c(attr(getNamedRegions(wb), "position"))[i]
  col1 = col2int(gsub("[0-9]", "", gsub(":.*$", "", rng)))
  col2 = col2int(gsub("[0-9]", "", gsub(".*:", "", rng)))
  row1 = as.integer(gsub("[A-Z]", "", gsub(":.*$", "", rng)))
  row2 = as.integer(gsub("[A-Z]", "", gsub(".*:", "", rng)))
  if (part == "sh") {
    result = sheet
  } else if (part == "col") {
    result = col1
  } else if (part == "row") {
    result = row1
  } else if (part == "xy") {
    result = c(col1, row1)
  }
  return(result)
}
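Then call the usual writeData, passing in the results of the custom function. It will dump the data frame at the top-left corner of the named range. You do not need to specify start and end columns/rows.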
# write the data frame at the top-left cell of the named range
# (colNames = FALSE omits the header row)
nRng = "YourNamedRange"
writeData(
  wb = wb,
  sheet = fn_namedRng(wb, nRng, "sh"),
  xy = fn_namedRng(wb, nRng, "xy"),
  x = yourDataFrame, colNames = FALSE
)
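Since the call above only passes a top-left anchor, a data frame larger than the named range would simply be written past its edges. A size check before writing can catch that. The following is a minimal base-R sketch under stated assumptions: `range_dims` and `fits_in_range` are hypothetical helpers, and `position` is assumed to be a single string such as one element of `attr(getNamedRegions(wb), "position")`.

```r
# Number of rows and columns spanned by an A1-style range such as "B2:K11".
range_dims <- function(position) {
  cells <- strsplit(gsub("$", "", position, fixed = TRUE), ":", fixed = TRUE)[[1]]
  cells <- rep_len(cells, 2)  # a single-cell range has identical corners
  letters_to_int <- function(s) {
    d <- utf8ToInt(toupper(s)) - utf8ToInt("A") + 1L
    sum(d * 26^(rev(seq_along(d)) - 1))
  }
  cols <- vapply(gsub("[0-9]", "", cells), letters_to_int, numeric(1), USE.NAMES = FALSE)
  rows <- as.numeric(gsub("[^0-9]", "", cells))
  c(rows = rows[2] - rows[1] + 1, cols = cols[2] - cols[1] + 1)
}

# TRUE if `df` (plus an optional header row) fits inside the range.
fits_in_range <- function(df, position, col_names = FALSE) {
  need <- c(rows = nrow(df) + as.integer(col_names), cols = ncol(df))
  all(need <= range_dims(position))
}

df <- data.frame(a = 1:10, b = 1:10)
fits_in_range(df, "B2:C11")                    # 10 x 2 data, 10 x 2 range
fits_in_range(df, "B2:C11", col_names = TRUE)  # the header row needs an 11th row
```

For the wide panel data in the question (10 countries by 11 columns including Country), the named range would need at least 10 rows and 11 columns when written without a header.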
You could similarly build a wrapper function around writeData to minimise the need to call the function separately for each address element.
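One way such a wrapper might look is sketched below. `write_to_named_region` is a hypothetical name, not part of openxlsx. The demo workbook, sheet "Panel1" and region "NamedRange1" are built in memory purely for illustration, and the sketch assumes the region name is unique within the workbook.

```r
library(openxlsx)

# Hypothetical wrapper: write `x` starting at the top-left cell of the
# named region `name`. Extra arguments (e.g. colNames) go to writeData().
write_to_named_region <- function(wb, name, x, ...) {
  regions <- getNamedRegions(wb)
  i <- which(regions == name)
  if (length(i) != 1L) {
    stop("Expected exactly one named region called '", name, "', found ", length(i), ".")
  }
  # Keep only the top-left cell of the region, e.g. "B3" from "B3:D6"
  position <- gsub("$", "", attr(regions, "position")[i], fixed = TRUE)
  top_left <- sub(":.*$", "", position)
  writeData(
    wb,
    sheet    = attr(regions, "sheet")[i],
    x        = x,
    startCol = col2int(gsub("[0-9]", "", top_left)),
    startRow = as.integer(gsub("[^0-9]", "", top_left)),
    ...
  )
  invisible(wb)
}

# Illustration only: an in-memory workbook with one named region at B3:D6
wb <- createWorkbook()
addWorksheet(wb, "Panel1")
createNamedRegion(wb, sheet = "Panel1", cols = 2:4, rows = 3:6, name = "NamedRange1")
write_to_named_region(wb, "NamedRange1", head(cars, 3), colNames = FALSE)
```

With a named list such as `df_list` from the question, the same call could be made once per element, for example `for (nm in names(df_list)) write_to_named_region(wb, nm, df_list[[nm]], colNames = FALSE)`, followed by a single `saveWorkbook()`.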