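I'm trying to read a file with fread and select columns by both index and column name. I can do this in the tidyverse (readr's read_csv), but not with data.table, which I have only just started learning. Any idea how? I could not find a solution online or in the help files. A dummy example: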
library(readr)
library(data.table)
# Create dummy data
DT <- data.table(ID = 1:50,
                 Code = sample(LETTERS[1:4], 50, replace = TRUE),
                 # rep() yields 200 values, so ID and Code are recycled to 200 rows
                 State = rep(c("Alabama","Indiana","Texas","Nevada"), 50))
# Export to csv
write_csv(DT,"test.csv")
rm(DT)
# Import
DT <- fread("test.csv", select = "ID") # col name
DT <- fread("test.csv", select = c(2)) # col index
DT <- fread("test.csv", select = c("ID") | c(2)) # both = ERROR
DT <- fread("test.csv")[c("ID") | c(2)] # Error too (NOT IDEAL since loading all data anyway)
# Tidyverse approach (read_csv is from readr; col_select uses tidyselect syntax)
DT <- read_csv("test.csv", col_select = c("ID") | c(2)) # Works!
Slightly adapted from https://stackoverflow.com/a/62207245/3358272: read only the header first, then build the selection from the column names.
cols <- colnames(fread("test.csv", nrows=0))
cols
# [1] "ID" "Code" "State"
fread("test.csv", select=which(cols %in% "ID" | seq_along(cols) %in% 2))
#         ID   Code
#      <int> <char>
#   1:     1      D
#   2:     2      D
#   3:     3      B
#   4:     4      C
#   5:     5      D
#   6:     6      A
#   7:     7      C
#   8:     8      B
#   9:     9      B
#  10:    10      A
#  ---
# 191:    41      D
# 192:    42      B
# 193:    43      D
# 194:    44      D
# 195:    45      D
# 196:    46      C
# 197:    47      D
# 198:    48      A
# 199:    49      A
# 200:    50      D
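
If you need this in more than one place, the same header-first trick can be wrapped in a small function. This is only a sketch: resolve_cols is a hypothetical helper name, not part of data.table, and the temporary file is there just to keep the example self-contained. It reads the header with nrows = 0, turns the requested names into positions, and merges them with the requested indices, so fread still loads only the selected columns.

```r
library(data.table)

# Hypothetical helper (not part of data.table): turn a mix of column names
# and 1-based positions into integer positions for fread(select = ).
resolve_cols <- function(file, names = character(), positions = integer()) {
  header <- names(fread(file, nrows = 0))  # reads the header only, no data rows
  unknown <- setdiff(names, header)
  if (length(unknown) > 0) {
    stop("Unknown column(s): ", paste(unknown, collapse = ", "))
  }
  if (any(positions < 1 | positions > length(header))) {
    stop("Column position out of range")
  }
  # Union of both kinds of selection, in file order, without duplicates
  sort(unique(c(match(names, header), as.integer(positions))))
}

# Small throwaway file so the example runs on its own
tmp <- tempfile(fileext = ".csv")
fwrite(data.table(ID = 1:3, Code = c("A", "B", "C"), State = "Texas"), tmp)

DT <- fread(tmp, select = resolve_cols(tmp, names = "ID", positions = 2))
names(DT)
# [1] "ID"   "Code"
```

The header is parsed twice (once for the names, once for the real read), but the first pass touches no data rows, so the extra cost should be negligible even for large files.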