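I am looking for a script that detects and merges header rows in R (see below) when, as in the example, the header spans several rows. A general answer should:
1. determine the number of header rows (two or more);
2. fill the gaps in the headers (see the NAs in the example);
3. merge all header rows into a single one.
So far I can only do this manually, as shown below. It should be possible to do it for headers with an arbitrary number of rows.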
text1<-"NA h_row1a NA NA NA h_row1b NA NA NA
NA h_row2a NA h_row2b NA h_row2c NA h_row2d NA
NA h_row3a h_row3b h_row3c h_row3d h_row3e h_row3f h_row3g h_row3h
element1 2 24% 25 40 23 44% 76 34
element2 3 26% 40 86 233 12% 55 12"
table1<-read.table(text=text1, skip=3,header=FALSE)
cat(text1, file = "ex.data")
header<-scan("ex.data", nlines = 1, what = character(), sep="", na.strings = "NA")
library(zoo)
header<-na.locf(header, na.rm=FALSE) # this fills the header gaps
header2 <- scan("ex.data", skip = 1, nlines = 1, what = character(), sep="", na.strings = "NA")
header2<-na.locf(header2, na.rm=FALSE)
header3 <- scan("ex.data", skip = 2, nlines = 1, what = character(), sep="", na.strings = "NA")
names(table1) <- paste0(header, header2, header3)
table1
# NANANA h_row1ah_row2ah_row3a h_row1ah_row2ah_row3b h_row1ah_row2bh_row3c h_row1ah_row2bh_row3d h_row1bh_row2ch_row3e h_row1bh_row2ch_row3f, etc.
#1 element1 2 24% 25 40 23 44%, etc.
#2 element2 3 26% , etc.
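You could do it as follows. It uses rle to see how many leading rows cannot be coerced to numeric, and assumes those rows are the headers. I have also set the first column as the rownames; I am not sure whether you want that. You may also want to convert the remaining values to numeric once this is done, since at that point they are still character.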
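To illustrate the header-detection idea on its own, here is a minimal sketch. The vector is_data is a made-up example standing for "this row contains at least one value that can be read as a number"; rle then gives the length of the leading run of FALSE values, which is taken as the number of header rows. The extra check on the first run's value is an addition of this sketch, so that a table with no header rows yields 0 instead of counting data rows.

```r
# Hypothetical flags: FALSE = no numeric-looking value in the row (header),
# TRUE = at least one numeric-looking value (data row)
is_data <- c(FALSE, FALSE, FALSE, TRUE, TRUE)

runs <- rle(is_data)
runs$lengths  # lengths of the runs of consecutive equal values
runs$values   # the value each run consists of

# Count the first run as header rows only if that run is FALSE
headrows <- if (!runs$values[1]) runs$lengths[1] else 0L
headrows
```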
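library(zoo) # provides na.locf
tab <- read.table(text=text1, header=FALSE, stringsAsFactors = FALSE)
#estimate no of header rows: the leading run of rows with no value coercible to numeric
headrows <- rle(apply(tab, 1, function(x) any(!is.na(suppressWarnings(as.numeric(x))))))$lengths[1]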
#fill in blanks in headers
tab[1:headrows,] <- t(apply(tab[1:headrows,],1,na.locf,na.rm=FALSE))
names(tab) <- apply(tab[1:headrows,],2,paste0,collapse="_")
tab <- tab[-c(1:headrows),] #remove header rows (now set as column names)
rownames(tab) <- tab[,1]
tab <- tab[,-1] #remove first column (now set as rownames)
tab
h_row1a_h_row2a_h_row3a h_row1a_h_row2a_h_row3b h_row1a_h_row2b_h_row3c h_row1a_h_row2b_h_row3d
element1 2 24% 25 40
element2 3 26% 40 86
h_row1b_h_row2c_h_row3e h_row1b_h_row2c_h_row3f h_row1b_h_row2d_h_row3g h_row1b_h_row2d_h_row3h
element1 23 44% 76 34
element2 233 12% 55 12
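If the remaining values should end up numeric, as mentioned above, one possible sketch is the following. The helper name to_numeric_cols is hypothetical, and it assumes that a trailing % should simply be stripped (so "24%" becomes 24, not 0.24); adjust this if you need proportions. The small demo data frame is only a stand-in for the character table produced above.

```r
# Hypothetical helper: strip a trailing "%" and convert every column to numeric
to_numeric_cols <- function(df) {
  # df[] keeps the data.frame structure, including row and column names
  df[] <- lapply(df, function(col) as.numeric(sub("%$", "", col)))
  df
}

# Stand-in for the all-character table, with element names as rownames
demo <- data.frame(a = c("2", "3"), b = c("24%", "26%"),
                   row.names = c("element1", "element2"),
                   stringsAsFactors = FALSE)
demo <- to_numeric_cols(demo)
str(demo)
```

With the table from the answer, the call would be tab <- to_numeric_cols(tab). Any value that is neither a plain number nor a number followed by % would become NA with a warning, which is a useful signal that a column needs separate handling.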