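I have a data frame with 7 numbers in each row, and I want to write a for or while loop that tells me when one row is identical to another row.
Data frame: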
1st 2nd 3rd 4th 5th 6th 7th
1 5 32 34 38 39 49 8
2 10 20 21 33 40 44 34
3 10 20 26 28 35 48 13
4 14 19 23 36 44 46 7
5 9 24 25 27 36 38 41
6 7 13 14 20 29 32 28
7 11 22 24 28 29 38 20
8 1 11 29 33 36 44 37
9 9 12 25 31 43 44 5
10 1 5 6 31 39 46 44
11 14 19 23 36 44 46 7
Expected output:
4 14 19 23 36 44 46 7
11 14 19 23 36 44 46 7
I tried this code:
lapply(df, function(i) all(df[i, ] == df[1:nrow(df), ]))
but it does not give the right result. Please advise, thanks.
A base R option would be
unique(Filter(Negate(is.null), lapply(seq_len(nrow(df)), function(i) {
  i1 <- rowSums(df[i, ][col(df)] == df) == ncol(df)
  if (sum(i1) > 1) df[i1, ]
})))
#[[1]]
# 1st 2nd 3rd 4th 5th 6th 7th
#4 14 19 23 36 44 46 7
#11 14 19 23 36 44 46 7
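Since the question asks for a for loop, the same row-by-row comparison can also be written out explicitly. This is only a minimal sketch, not the method used above: the names `m` and `is_dup` are made up for illustration, and the data is rebuilt from the question so the snippet runs on its own.

```r
# Data copied from the question; the first column becomes the row names.
df <- read.table(text =
"1st 2nd 3rd 4th 5th 6th 7th
1 5 32 34 38 39 49 8
2 10 20 21 33 40 44 34
3 10 20 26 28 35 48 13
4 14 19 23 36 44 46 7
5 9 24 25 27 36 38 41
6 7 13 14 20 29 32 28
7 11 22 24 28 29 38 20
8 1 11 29 33 36 44 37
9 9 12 25 31 43 44 5
10 1 5 6 31 39 46 44
11 14 19 23 36 44 46 7",
header = TRUE)

# Work on a matrix so that comparing two rows is a plain vector comparison.
m <- as.matrix(df)

# is_dup[i] becomes TRUE when row i is identical to some other row.
is_dup <- logical(nrow(m))
for (i in seq_len(nrow(m))) {
  for (j in seq_len(nrow(m))) {
    if (i != j && all(m[i, ] == m[j, ])) {
      is_dup[i] <- TRUE
      break  # one match is enough for row i
    }
  }
}

which(is_dup)   # 4 11
df[is_dup, ]
```

The nested loop compares every pair of rows, so it is quadratic in the number of rows; that is fine for 11 rows, but slow on large data frames.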
If we are only interested in the duplicated rows:
df[duplicated(df)|duplicated(df, fromLast = TRUE),]
# 1st 2nd 3rd 4th 5th 6th 7th
#4 14 19 23 36 44 46 7
#11 14 19 23 36 44 46 7
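To see why both `duplicated()` calls are needed, it may help to look at each one separately. The sketch below rebuilds the data from the question; the names `later_copies` and `earlier_copies` are illustrative only.

```r
# Data copied from the question; the first column becomes the row names.
df <- read.table(text =
"1st 2nd 3rd 4th 5th 6th 7th
1 5 32 34 38 39 49 8
2 10 20 21 33 40 44 34
3 10 20 26 28 35 48 13
4 14 19 23 36 44 46 7
5 9 24 25 27 36 38 41
6 7 13 14 20 29 32 28
7 11 22 24 28 29 38 20
8 1 11 29 33 36 44 37
9 9 12 25 31 43 44 5
10 1 5 6 31 39 46 44
11 14 19 23 36 44 46 7",
header = TRUE)

# Scanning top to bottom, only the second and later copies are flagged.
later_copies <- duplicated(df)
which(later_copies)     # 11

# Scanning bottom to top, only the earlier copies are flagged.
earlier_copies <- duplicated(df, fromLast = TRUE)
which(earlier_copies)   # 4

# Combining both flags keeps every member of a duplicated group.
df[later_copies | earlier_copies, ]
```

With `duplicated(df)` alone, only row 11 would be returned, and the first occurrence (row 4) would be dropped.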
An option using dplyr::group_by_all() can be very handy:
library(dplyr)
df %>% group_by_all() %>%
filter(n()>1) # n()>1 will make sure to return only rows having duplicates
# # A tibble: 2 x 7
# # Groups: X1st, X2nd, X3rd, X4th, X5th, X6th, X7th [1]
# X1st X2nd X3rd X4th X5th X6th X7th
# <int> <int> <int> <int> <int> <int> <int>
# 1 14 19 23 36 44 46 7
# 2 14 19 23 36 44 46 7
Data:
df <- read.table(text =
"1st 2nd 3rd 4th 5th 6th 7th
1 5 32 34 38 39 49 8
2 10 20 21 33 40 44 34
3 10 20 26 28 35 48 13
4 14 19 23 36 44 46 7
5 9 24 25 27 36 38 41
6 7 13 14 20 29 32 28
7 11 22 24 28 29 38 20
8 1 11 29 33 36 44 37
9 9 12 25 31 43 44 5
10 1 5 6 31 39 46 44
11 14 19 23 36 44 46 7",
header = TRUE)