Correlation in R: my results disagree with the cor() function


data("pbc2.id", package = "JM") # Mayo Clinic Primary Biliary Cirrhosis Data
df <- pbc2.id
vars_num1 <- c("years", "age", "serBilir", "serChol", "albumin", "alkaline",
               "SGOT", "platelets", "prothrombin", "histologic", "status2")
cor(df[vars_num1], use = "complete.obs", method="pearson") # years vs age: -0.17719866
cor(df$years, df$age, use = "complete.obs", method="pearson") # -0.1631033

Other columns do give consistent results, for example serBilir vs serChol (0.39675890). I also coded the correlation myself to test it:

v <- function(x,y=x) mean(x*y) - mean(x)*mean(y)
my_corr <- function(x,y) v(x,y) / sqrt(v(x) * v(y))
my_corr(df$years, df$age) # -0.1631033
Why does cor(df[vars_num1], use = "complete.obs", method="pearson") give a different result?
	

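I think the problem comes from your NA values. With use = "complete.obs", cor() drops every row that has an NA in any of the columns it is given. In the first call that means any of the 11 columns in vars_num1; in the second call only years and age are checked, so more rows are kept. If you apply na.omit first, both calls give the same result: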

data("pbc2.id", package = "JM") # Mayo Clinic Primary Biliary Cirrhosis Data
df <- pbc2.id
vars_num1 <- c("years", "age", "serBilir", "serChol", "albumin", "alkaline",
               "SGOT", "platelets", "prothrombin", "histologic", "status2")
df = na.omit(df)
cor(df[vars_num1], use = "complete.obs", method="pearson") # years vs age: -0.17719866
cor(df$years, df$age, use = "complete.obs", method="pearson") # -0.17719866
df[vars_num1]
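
The same effect can be reproduced without the JM package. The following is a minimal sketch on made-up data: the data frame toy and its columns x, y and z are hypothetical and are not taken from pbc2.id. It also shows use = "pairwise.complete.obs", a standard option of cor() that handles each pair of columns separately and should therefore agree with the two-vector call.

```r
# Hypothetical toy data (not pbc2.id): only z contains missing values.
set.seed(1)
toy <- data.frame(x = rnorm(50), y = rnorm(50), z = rnorm(50))
toy$z[1:10] <- NA

# Matrix call: "complete.obs" drops every row with an NA in ANY column,
# so the x-y correlation is computed on the 40 complete rows only.
r_matrix <- cor(toy, use = "complete.obs")["x", "y"]

# Two-vector call: only x and y are checked, so all 50 rows are used.
r_pair <- cor(toy$x, toy$y, use = "complete.obs")

# Restricting to the complete rows by hand reproduces the matrix result.
keep <- complete.cases(toy)
r_manual <- cor(toy$x[keep], toy$y[keep])

# "pairwise.complete.obs" treats each pair of columns on its own
# and matches the two-vector call for x and y.
r_pairwise <- cor(toy, use = "pairwise.complete.obs")["x", "y"]

c(rows_all = nrow(toy), rows_complete = sum(keep))
all.equal(r_matrix, r_manual)
all.equal(r_pair, r_pairwise)
```

Which variant is appropriate depends on the analysis: "complete.obs" uses the same rows for every entry of the matrix, while "pairwise.complete.obs" uses as many rows as possible for each pair, so different entries can be based on different subsets of the data.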
r correlation pearson-correlation