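I run a film festival that holds a "Best Film" competition every year. Each year there can be up to 50 films and any number of judges.
I plan to use R to analyze the individual judges' scores, then publish a report in Markdown and distribute it to them.
Here is a data frame with sample raw data: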
Film = c("Vertigo", "Fargo", "Platoon", "Beetlejuice", "Zodiac")
Mean = c(7.45, 6.98, 7.5, 7.2, 5.82)
TrimmedMean = c(7.42, 7.04,7.36,7.05,6.26)
Judge1 = c(5.56,6.00,7.00,6.22,5.33)
Judge2 = c(6.89,8.81,8.83,7.22,6.78)
Judge3 = c(9.25,6.38,6.71,6.88,6.50)
Judge4 = c(8.00,7.67,7.44,6.89,7.22)
Judge5 = c(5.11,4.89,7.11,8.78,6.44)
Judge6 = c(9.89,8.11,7.89,7.22,2.67)
RawData = data.frame(Film,Mean,TrimmedMean,Judge1,Judge2,Judge3,Judge4,Judge5,Judge6)
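For the purposes of this exercise, the Mean and TrimmedMean columns can be treated as additional "judges".
I want to rank the films by each judge's scores and then drop the numeric values, leaving a ranked list of films like this: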
Mean = c('Platoon','Vertigo','Beetlejuice','Fargo','Zodiac')
TrimmedMean = c('Vertigo','Platoon','Beetlejuice','Fargo','Zodiac')
Judge1 = c('Platoon','Beetlejuice','Fargo','Vertigo','Zodiac')
Judge2 = c('Platoon','Fargo','Beetlejuice','Vertigo','Zodiac')
Judge3 = c('Vertigo','Beetlejuice','Platoon','Zodiac','Fargo')
Judge4 = c('Vertigo','Fargo','Platoon','Zodiac','Beetlejuice')
Judge5 = c('Beetlejuice','Platoon','Zodiac','Vertigo','Fargo')
Judge6 = c('Vertigo','Fargo','Platoon','Beetlejuice','Zodiac')
SortedData = data.frame(Mean,TrimmedMean,Judge1,Judge2,Judge3,Judge4,Judge5,Judge6)
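The master dataset from which I generate the first table has data going back 5 years, and I can subset it by year. I want to be able to analyze different years without changing the code each year, other than specifying which year to subset. The code should adapt to each year's dataset, which will have a different number of judges and films.
Any ideas?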
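I think the function you're looking for is order(). It returns the positions of a vector's elements in sorted order, that is, the indices you would use to sort it.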
order(c(1.65, 6.92, 7.98, 3.42, 5.75), decreasing=TRUE)
# [1] 3 2 5 4 1
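To see why those indices are useful, here is a small sketch that applies them to a parallel vector of labels. The scores and labels vectors are made up for illustration and are not part of the question's data.

```r
# Hypothetical scores and matching labels (illustration only)
scores <- c(1.65, 6.92, 7.98, 3.42, 5.75)
labels <- c("a", "b", "c", "d", "e")

# order() returns positions, highest score first
idx <- order(scores, decreasing = TRUE)

# Indexing by those positions rearranges any parallel vector
ranked_labels <- labels[idx]
ranked_scores <- scores[idx]

# Indexing the scores themselves gives the same result as sort()
same_as_sort <- identical(ranked_scores, sort(scores, decreasing = TRUE))
```

Ties keep their original relative order, so two films with equal scores stay in the order they appear in the data.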
We can use this ordering to look up the titles in the Film column for each judge.
data.frame(lapply(RawData[, -1], function(x) RawData$Film[order(x, decreasing=TRUE)]))
#          Mean TrimmedMean      Judge1      Judge2      Judge3      Judge4      Judge5      Judge6
# 1     Platoon     Vertigo     Platoon     Platoon     Vertigo     Vertigo Beetlejuice     Vertigo
# 2     Vertigo     Platoon Beetlejuice       Fargo Beetlejuice       Fargo     Platoon       Fargo
# 3 Beetlejuice Beetlejuice       Fargo Beetlejuice     Platoon     Platoon      Zodiac     Platoon
# 4       Fargo       Fargo     Vertigo     Vertigo      Zodiac      Zodiac     Vertigo Beetlejuice
# 5      Zodiac      Zodiac      Zodiac      Zodiac       Fargo Beetlejuice       Fargo      Zodiac
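The same result, but more elegant (I think) and faster. sapply() returns a matrix of indices with one column per judge, and assigning into SortedData[] fills the data frame column by column: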
SortedData <- RawData[, -1]
SortedData[] <- RawData$Film[sapply(RawData[, -1], order, decreasing=TRUE)]
SortedData
#          Mean TrimmedMean      Judge1      Judge2      Judge3      Judge4      Judge5      Judge6
# 1     Platoon     Vertigo     Platoon     Platoon     Vertigo     Vertigo Beetlejuice     Vertigo
# 2     Vertigo     Platoon Beetlejuice       Fargo Beetlejuice       Fargo     Platoon       Fargo
# 3 Beetlejuice Beetlejuice       Fargo Beetlejuice     Platoon     Platoon      Zodiac     Platoon
# 4       Fargo       Fargo     Vertigo     Vertigo      Zodiac      Zodiac     Vertigo Beetlejuice
# 5      Zodiac      Zodiac      Zodiac      Zodiac       Fargo Beetlejuice       Fargo      Zodiac
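To reuse this across years without editing the code, the same idea could be wrapped in a function. This is only a sketch under assumptions: the helper name rank_films, a master data frame with columns named Year and Film, and the toy data below are all hypothetical, since the question does not show the layout of the master dataset. Every column other than Year and Film is treated as a judge, so the number of judges and films can vary.

```r
# Hypothetical helper: rank films for one year of a master data frame.
# Assumes columns named Year and Film; all other columns are judges.
rank_films <- function(master, year) {
  one_year <- master[master$Year == year, , drop = FALSE]
  judges <- setdiff(names(one_year), c("Year", "Film"))
  # Drop judges who scored no film that year (all NA)
  scored <- vapply(one_year[judges], function(x) !all(is.na(x)), logical(1))
  judges <- judges[scored]
  # For each judge, list the film titles from highest to lowest score
  ranked <- lapply(one_year[judges], function(x) {
    as.character(one_year$Film)[order(x, decreasing = TRUE)]
  })
  data.frame(ranked, stringsAsFactors = FALSE)
}

# Made-up master data: two years, different films, Judge2 absent in 2021
master <- data.frame(
  Year   = c(2020, 2020, 2020, 2021, 2021),
  Film   = c("A", "B", "C", "D", "E"),
  Judge1 = c(5, 7, 6, 8, 9),
  Judge2 = c(9, 8, 7, NA, NA)
)

r2020 <- rank_films(master, 2020)
r2021 <- rank_films(master, 2021)
```

Only the year argument changes between runs. A judge who scored some but not all films in a year is kept, and order() places the unscored films (NA) at the bottom of that judge's column.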