Getting the half-speed time from a dataset [closed]

Question  Votes: -1  Answers: 3

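I apologize in advance; this is my first post.

I have the following dataset. For each vessel, I want the time at which it reaches half of its maximum speed (e.g. 6 seconds for vessel 1, 4 seconds for vessel 2). I have 20 different vessels and need this half-speed time for each of them. Thanks in advance.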

vessel  time    speed 
vessel1 1   6
vessel1 2   7
vessel1 3   10
vessel1 4   16
vessel1 5   18
vessel1 6   20
vessel1 7   22
vessel1 8   24
vessel1 9   36
vessel1 10  37
vessel1 11  38
vessel1 12  38
vessel1 13  39
vessel1 14  40
vessel1 15  41
vessel1 16  41
vessel1 17  41
vessel2 1   7
vessel2 2   9
vessel2 3   12
vessel2 4   15
vessel2 5   20
vessel2 6   22
vessel2 7   23
vessel2 8   24
vessel2 9   25
vessel2 10  26
vessel2 11  27
vessel3 12  28
vessel4 13  29
vessel5 14  30
r
3 Answers
1 vote

Set up the example data:

df <- read.table(text="
vessel  time    speed 
vessel1 1   6
vessel1 2   7
vessel1 3   10
vessel1 4   16
vessel1 5   18
vessel1 6   20
vessel1 7   22
vessel1 8   24
vessel1 9   36
vessel1 10  37
vessel1 11  38
vessel1 12  38
vessel1 13  39
vessel1 14  40
vessel1 15  41
vessel1 16  41
vessel1 17  41
vessel2 1   7
vessel2 2   9
vessel2 3   12
vessel2 4   15
vessel2 5   20
vessel2 6   22
vessel2 7   23
vessel2 8   24
vessel2 9   25
vessel2 10  26
vessel2 11  27
vessel3 12  28
vessel4 13  29
vessel5 14  30"
, header = TRUE)

Using a dplyr approach:

library(dplyr)

df %>%
  group_by(vessel) %>%              # for each vessel
  filter(speed >= max(speed)/2) %>% # keep all speed values above or equal to the half of max speed
  filter(speed == min(speed)) %>%   # keep the minimum speed value
  ungroup()                         # forget the grouping

# # A tibble: 5 x 3
#    vessel  time speed
#    <fctr> <int> <int>
# 1 vessel1     7    22
# 2 vessel2     4    15
# 3 vessel3    12    28
# 4 vessel4    13    29
# 5 vessel5    14    30
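
If two readings of a vessel share the lowest speed at or above the half-speed threshold, the second filter keeps both rows. A minimal sketch of a variant that returns exactly one time per vessel, assuming the earliest such reading is the one wanted; the toy data and the column name half_time are illustrative and not from the question:

```r
library(dplyr)

# Toy data (not from the question): vessel "a" has two readings with speed 5.
toy <- data.frame(
  vessel = c("a", "a", "a", "a", "b", "b", "b"),
  time   = c(1, 2, 3, 4, 1, 2, 3),
  speed  = c(2, 5, 5, 10, 1, 1, 4)
)

half_times <- toy %>%
  group_by(vessel) %>%                                       # for each vessel
  summarise(half_time = min(time[speed >= max(speed) / 2]))  # earliest time at or above half of max speed

half_times
```

Here summarise() collapses each group to a single row, so no ungroup() or tie handling is needed afterwards.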

0 votes

The code below should do it (to read your own file, replace read.table with read.csv("yourfilename.csv")):

mydf <- read.table(text="
vessel1 1   6
vessel1 2   7
vessel1 3   10
vessel1 4   16
vessel1 5   18
vessel1 6   20
vessel1 7   22
vessel1 8   24
vessel1 9   36
vessel1 10  37
vessel1 11  38
vessel1 12  38
vessel1 13  39
vessel1 14  40
vessel1 15  41
vessel1 16  41
vessel1 17  41
vessel2 1   7
vessel2 2   9
vessel2 3   12
vessel2 4   15
vessel2 5   20
vessel2 6   22
vessel2 7   23
vessel2 8   24
vessel2 9   25
vessel2 10  26
vessel2 11  27
vessel3 12  28
vessel4 13  29
vessel5 14  30"
)

colnames(mydf) = c("Vessel","Time","Speed")

middle_Speed_Time = by(mydf,mydf[,"Vessel"],function(x){ min(x[x[,3] >= max(x[,3])/2,2]) } )

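The first line reads in the data.

The second line just gives the columns standardized names for convenience.

The third line goes through the data.frame vessel by vessel and finds the first time point at which the vessel is at or above half of its maximum speed (i.e. speed >= max(speed)/2).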
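
As a quick check of the third line, here is a sketch that runs it on a subset of the question's data (all of vessel2 plus the single vessel3 row) and flattens the result into an ordinary data frame. The name half_times and the column HalfTime are assumptions made for illustration:

```r
# Subset of the question's data: all of vessel2 plus the single vessel3 row.
mydf <- data.frame(
  Vessel = c(rep("vessel2", 11), "vessel3"),
  Time   = c(1:11, 12),
  Speed  = c(7, 9, 12, 15, 20, 22, 23, 24, 25, 26, 27, 28)
)

# Same computation as above: per vessel, the smallest Time whose Speed
# is at or above half of that vessel's maximum Speed.
middle_Speed_Time <- by(mydf, mydf[, "Vessel"], function(x) {
  min(x[x[, 3] >= max(x[, 3]) / 2, 2])
})

# Flatten the "by" object into a plain data frame (hypothetical names).
half_times <- data.frame(
  Vessel   = names(middle_Speed_Time),
  HalfTime = as.numeric(middle_Speed_Time)
)

half_times
```

For this subset the half-speed times are 4 for vessel2 and 12 for vessel3, which agrees with the dplyr output in the first answer.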


0 votes

You can do:

d <- read.table(header=TRUE, text=
"vessel  time    speed 
vessel1 1   6
vessel1 2   7
vessel1 3   10
vessel1 4   16
vessel1 5   18
vessel1 6   20
vessel1 7   22
vessel1 8   24
vessel1 9   36
vessel1 10  37
vessel1 11  38
vessel1 12  38
vessel1 13  39
vessel1 14  40
vessel1 15  41
vessel1 16  41
vessel1 17  41
vessel2 1   7
vessel2 2   9
vessel2 3   12
vessel2 4   15
vessel2 5   20
vessel2 6   22
vessel2 7   23
vessel2 8   24
vessel2 9   25
vessel2 10  26
vessel2 11  27
vessel3 12  28
vessel4 13  29
vessel5 14  30")

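# Position of the last reading at or below half of the maximum speed.
# Assumes each vessel's readings are sorted by time and speed never decreases.
which.halfspeed <- function(speed) {
  half <- max(speed)/2
  sum(speed<=half)   # count of readings at or below half speed
}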

index <- tapply(d$speed, d$vessel, which.halfspeed)
index
mapply('[', split(d$time, d$vessel), index)
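
Note that which.halfspeed selects the last reading at or below half speed (time 6 for vessel1, time 3 for vessel2) and yields index 0, hence an empty result, for a vessel with a single reading. A sketch of a variant, assuming the first reading at or above half speed is what is wanted instead; first.halfspeed is a hypothetical name, and the data is a subset of the question's:

```r
# Subset of the question's data: all of vessel2 plus the single vessel3 row.
d <- data.frame(
  vessel = c(rep("vessel2", 11), "vessel3"),
  time   = c(1:11, 12),
  speed  = c(7, 9, 12, 15, 20, 22, 23, 24, 25, 26, 27, 28)
)

# Hypothetical variant: position of the first reading at or above half of max speed.
# The maximum itself always qualifies, so the result is never empty.
first.halfspeed <- function(speed) {
  half <- max(speed) / 2
  which(speed >= half)[1]
}

index <- tapply(d$speed, d$vessel, first.halfspeed)
half_time <- mapply(`[`, split(d$time, d$vessel), index)
half_time
```

With this variant the result is a plain named vector (4 for vessel2, 12 for vessel3) rather than a list containing empty elements.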