Plotting a sigmoidal curve from a biochemical affinity assay to obtain the Kd value


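I measured the binding affinity of several biomolecules for one another (ligand and target). Ideally, this yields data that follow a sigmoidal curve. The inflection point represents the Kd value (dissociation constant), which indicates how strongly the molecules bind to each other.

The x-axis is the ligand concentration and the y-axis is the observed response, Fnorm, or another measured value (usually a number between 1 and 1000). The attached image shows what this looks like: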

example graph 1

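  1. I can plot my data points in ggplot2 without any problem and get a decent-looking graph. However, I also need an actual sigmoidal curve through the data points (with a 95% CI around it).

  2. In some cases the ligand cannot be concentrated any further, which means that part of one of the curve's "plateaus" is missing. I would like the sigmoidal curve to "guess", or estimate, what the complete data distribution would look like even though a small part of it is missing. The attached image shows what I mean:

example graph 2

Note that the lower-right part of the curve does not simply stop at the last data point. Instead, it "approximates" or "guesses" the rest of the sigmoidal curve from the final data points.

  3. I would also like to report some goodness-of-fit measure that tells people how well the sigmoidal model fits/describes the data. I am not sure whether something like McFadden's R2 is suitable, but something along those lines.

  4. Finally, I of course also need the inflection point as output (= the value on the x-axis).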

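I found some similar questions, but they were not quite what I need, and I was unable to adapt their solutions to my problem.

Here is dummy example data for three affinity measurements: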

ligand-conc (x): 5.289E-09, 1.058E-08, 2.115E-08, 4.231E-08, 8.462E-08, 1.692E-07, 3.385E-07, 6.769E-07, 1.354E-06, 2.708E-06, 5.415E-06, 1.083E-05, 2.166E-05, 4.332E-05, 8.665E-05, 1.733E-04

Fnorm exp.a (y): 792.6444, 792.8537, 788.0273, 793.9693, 792.3848, 792.311, 790.5109, 790.4974, 796.1723, 790.8627, 790.2954, 784.7171, 773.0447, 760.8085, 745.5512, 738.3463

Fnorm exp.b (y): 790.2453, 793.8565, 789.5286, 791.8368, 788.5138, 790.0382, 792.85, 789.1439, 790.3487, 792.1872, 786.6738, 780.0627, 775.8658, 762.8376, 747.4288, 737.8717

Fnorm exp.c (y):  788.2453, 790.5648, 792.8529, 790.1368, 793.5138, 791.7038, 788.85, 791.1439, 789.4487, 788.8872, 789.5674, 783.3063, 774.8658, 764.5838, 749.4288, 736.5872

This is what it looks like in Excel format: excel file

This is the code I have used so far:

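library(ggplot2)  # plotting
library(drc)      # drm() and L.4()
library(ggtext)   # element_markdown()
library(scales)   # label_number()

mydata <- read.csv("example")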

names(mydata)[names(mydata) == "ligand.conc"] <- "ligand" #different name of a column for convenience
mydata$ligand <- as.numeric(mydata$ligand)
mydata$ligand <- mydata$ligand*1000 #changing the unit of the concentration from M to mM
mydata$ligand <- mydata$ligand*1000 #changing the unit of the concentration from mM to µM
mydata$Fnorm <- as.numeric(mydata$Fnorm)

base = ggplot(mydata)

base + 
  geom_point(aes(x=ligand, y=Fnorm, color=experiment))+

  geom_smooth(aes(x=ligand, y=Fnorm),
  method = drm, method.args = list(fct = L.4()), se = FALSE)+

  theme_bw() +
  theme(                                                 
    axis.line.x.bottom = element_line(color = 'black'), 
    axis.line.y.left   = element_line(color = 'black'),
    axis.line.y.right  = element_line(color = 'black'),
    panel.grid.minor.x = element_blank(),
    panel.border       = element_blank(),
    axis.title.x = element_markdown(),
    axis.title.y = element_markdown(),
    axis.minor.ticks.length = rel(1),
    axis.text = element_text(color = "black",
                             size = 10),
    axis.ticks=element_line(linewidth=0.6),
    axis.ticks.length = unit(2.75, "pt"),
    ) +

  scale_color_manual(
    name="Replicate",
    labels = c("1", "2", "3"), 
    values = c("sienna1", "dodgerblue","grey43")) +
  
  coord_cartesian(ylim = c(720, 800), expand = TRUE)+
  
  scale_x_continuous(trans="log10",
                    expand = c(0, 0),
                    label = label_number(),
                    breaks = c(0.01, 0.1, 1, 10, 100, 1000),
                    guide = guide_axis_logticks(long = 2.3, mid = 1.65, short = 0.75),
                    limits = c(0.001,1000))+
  
  labs(
    y="Fnorm [%<sub>280</sub>]",
    x="Ligand conc. [µM]"
  )  # no trailing "+": the plot expression ends here and is printed automatically

This produces the following plot:

my graph

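The curve looks very rough/jagged and is not smooth. As described above, the curve is incomplete, because it would have to guess what the lower-right part looks like. It also lacks the 95% CI. In addition, I am not sure how to obtain the inflection point and a goodness-of-fit measure as output.

So far I have used the drc package (as described here: https://rstats4ag.org/dose-response-curves.html), but it calculates a so-called ED50 value, and I am not sure whether that is effectively equal to the inflection point and therefore to the Kd.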

To get the 95% CI around the curve, I tried replacing

geom_smooth(aes(x=ligand, y=Fnorm),
  method = drm, method.args = list(fct = L.4()), se = FALSE)+

with

geom_smooth(aes(x=ligand, y=Fnorm),
  method = drm, method.args = list(fct = L.4()), level=0.95)+

But this produces the following error:

`geom_smooth()` using formula = 'y ~ x'
Failed to fit group -1. Caused by error in `pred$fit`: $ operator is invalid for atomic vectors

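I am also not sure why I have to specify aes(x, y) twice in the same ggplot. Adding geom_smooth() to the plot without the aes() part returns the error:

`stat_smooth()` requires the following missing aesthetics: x and y.

Adding geom_smooth(aes(x,y)) to the code even produces:

Failed to fit group -1. Caused by error in `method()`: Convergence failed: singular convergence (7)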

r ggplot2 logistic-regression chemistry
1 Answer

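This is what I have so far. Although you can use fitting methods such as drc::drm() inside ggplot, in my experience it is usually easier to apply them outside and then feed the results into ggplot.

Set up the data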

lvec <- c(5.289E-09, 1.058e-08, 2.115e-08, 4.231e-08, 8.462e-08, 1.692e-07,
          3.385e-07, 6.769e-07, 1.354e-06, 2.708e-06, 5.415e-06, 1.083e-05,
          2.166e-05, 4.332e-05, 8.665e-05, 1.733e-04)
expa <- c(792.6444, 792.8537, 788.0273, 793.9693, 792.3848, 792.311, 790.5109, 790.4974, 796.1723, 790.8627, 790.2954, 784.7171, 773.0447, 760.8085, 745.5512, 738.3463)
expb <- c(790.2453, 793.8565, 789.5286, 791.8368, 788.5138, 790.0382, 792.85, 789.1439, 790.3487, 792.1872, 786.6738, 780.0627, 775.8658, 762.8376, 747.4288, 737.8717)
expc <- c(788.2453, 790.5648, 792.8529, 790.1368, 793.5138, 791.7038, 788.85, 791.1439, 789.4487, 788.8872, 789.5674, 783.3063, 774.8658, 764.5838, 749.4288, 736.5872)
dd <- data.frame(conc=lvec*1000,   # M -> mM; recycled across the three experiments
                 Fnorm = c(expa, expb, expc),
                 experiment = rep(c("a","b","c"), each = 16))

Fit the model and generate predictions with standard errors

library(drc)

fit1 <- drm(Fnorm ~ conc, data = dd, fct = LL.4())
## explicitly ask for predictions over a wider range than the data
pframe <- data.frame(
    conc = exp(seq(log(4e-6), log(10), length.out = 101)))
pframe <- cbind(pframe,
                predict(fit1, newdata = pframe, se.fit = TRUE))
names(pframe)[2:3] <- c("Fnorm", "Fnorm_se")

Plot

library(ggplot2)
ggplot(dd, aes(conc, Fnorm)) +
    geom_point(aes(colour = experiment)) +
    geom_line(data = pframe) +
    geom_ribbon(data = pframe, colour = NA,
                fill = "black",
                alpha = 0.1,
                aes(ymin = Fnorm - 2*Fnorm_se,
                    ymax = Fnorm + 2*Fnorm_se)) +
    scale_x_log10() +
    geom_hline(yintercept = coef(fit1)[2], lty = 2) +
    geom_vline(xintercept = coef(fit1)[4], lty = 2)

(modified from copilot): A graph with a scatter plot and a fitted curve. The x-axis shows concentration and is on a logarithmic scale, ranging from 1e-04 to 1e+00. The y-axis shows the binding affinity and ranges approximately from 720 to 800. There are three sets of data points from different experiments (a, b, c) represented by different colors. These data points are scattered across the graph but generally follow a descending trend from left to right. A black decreasing sigmoidal curve goes through the data points, with a shaded ribbon showing ± 2SE. Dashed lines show the inflection point and the lower bound.
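
The ribbon above uses ± 2 SE as a rough stand-in for a 95% band. As a hedged alternative, the sketch below asks predict() for the confidence limits directly. It assumes that predict() on a drc fit accepts interval = "confidence" and returns a matrix with columns named "Prediction", "Lower" and "Upper"; the column names lower and upper in pframe are just illustrative choices.

```r
library(drc)
library(ggplot2)

# Data from the question; concentrations converted from M to mM as in the answer.
lvec <- c(5.289e-09, 1.058e-08, 2.115e-08, 4.231e-08, 8.462e-08, 1.692e-07,
          3.385e-07, 6.769e-07, 1.354e-06, 2.708e-06, 5.415e-06, 1.083e-05,
          2.166e-05, 4.332e-05, 8.665e-05, 1.733e-04)
expa <- c(792.6444, 792.8537, 788.0273, 793.9693, 792.3848, 792.311, 790.5109,
          790.4974, 796.1723, 790.8627, 790.2954, 784.7171, 773.0447, 760.8085,
          745.5512, 738.3463)
expb <- c(790.2453, 793.8565, 789.5286, 791.8368, 788.5138, 790.0382, 792.85,
          789.1439, 790.3487, 792.1872, 786.6738, 780.0627, 775.8658, 762.8376,
          747.4288, 737.8717)
expc <- c(788.2453, 790.5648, 792.8529, 790.1368, 793.5138, 791.7038, 788.85,
          791.1439, 789.4487, 788.8872, 789.5674, 783.3063, 774.8658, 764.5838,
          749.4288, 736.5872)
dd <- data.frame(conc = rep(lvec * 1000, times = 3),
                 Fnorm = c(expa, expb, expc),
                 experiment = rep(c("a", "b", "c"), each = 16))

fit1 <- drm(Fnorm ~ conc, data = dd, fct = LL.4())

# Prediction grid, evenly spaced on the log scale and extending past the data.
pframe <- data.frame(conc = exp(seq(log(4e-6), log(10), length.out = 101)))

# Assumption: the returned matrix has columns "Prediction", "Lower", "Upper".
ci <- predict(fit1, newdata = pframe, interval = "confidence", level = 0.95)
pframe$Fnorm <- ci[, "Prediction"]
pframe$lower <- ci[, "Lower"]
pframe$upper <- ci[, "Upper"]

p <- ggplot(dd, aes(conc, Fnorm)) +
  geom_ribbon(data = pframe, aes(ymin = lower, ymax = upper),
              fill = "black", alpha = 0.1) +
  geom_line(data = pframe) +
  geom_point(aes(colour = experiment)) +
  scale_x_log10()
p
```

Because the grid extends beyond the highest measured concentration, the band widens where the lower plateau is extrapolated, which is a useful visual cue for how uncertain the "guessed" part of the curve is.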

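The inflection point is coef(fit1)[4] = 0.0494. The unit is mM, because conc was defined as lvec*1000, so this is about 49.4 µM.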
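
On the question of whether ED50 is the inflection point: in the four-parameter log-logistic curve fitted by LL.4(), the fourth coefficient e is the concentration at which the response lies halfway between the lower and upper limits, and on a log-concentration axis that is also where the curve is steepest. The base R sketch below illustrates this with made-up parameter values (not the fitted ones); the helper name ll4 is hypothetical.

```r
# Four-parameter log-logistic curve in the parameterisation used by drc::LL.4():
# b = slope, lower = lower limit (c), upper = upper limit (d), e = ED50.
ll4 <- function(x, b, lower, upper, e) {
  lower + (upper - lower) / (1 + exp(b * (log(x) - log(e))))
}

# Hypothetical parameter values, chosen only for illustration.
b <- 1
lower <- 730
upper <- 791
e <- 50

# At x = e the response is exactly halfway between the two plateaus.
half <- ll4(e, b, lower, upper, e)
midpoint <- (lower + upper) / 2

# On the log10(x) axis the curve is steepest at x = e (the inflection point).
logx <- seq(log10(e) - 2, log10(e) + 2, by = 0.01)
slope <- diff(ll4(10^logx, b, lower, upper, e)) / diff(logx)
steepest <- 10^(logx[which.max(abs(slope))] + 0.005)  # centre of steepest interval

print(c(half = half, midpoint = midpoint, steepest = steepest))
```

Whether that value can be read as Kd is a modelling assumption rather than something the fit can confirm: it is usually argued for simple 1:1 binding with the target concentration well below Kd, and it may not hold otherwise.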

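I'm not sure about goodness-of-fit statistics, although summary() does give the residual standard error (from which you should be able to compute an R^2 value...). Alternatively, compute the squared correlation between the fitted and the observed responses, cor(predict(fit1), dd$Fnorm)^2. Note that cor(predict(fit1), dd$conc)^2 = 0.902 correlates the fitted values with the concentrations instead, so it does not measure how well the curve fits.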
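
One common way to turn residuals into an R^2-like number is 1 - RSS/TSS. The helper below is a minimal base R sketch of that calculation; the function name r_squared and the toy numbers are made up for illustration, and R^2 for nonlinear fits should be treated as a descriptive summary rather than a formal test.

```r
# Pseudo-R^2 for any fitted curve: 1 - RSS / TSS.
r_squared <- function(observed, fitted) {
  rss <- sum((observed - fitted)^2)          # residual sum of squares
  tss <- sum((observed - mean(observed))^2)  # total sum of squares
  1 - rss / tss
}

# Toy numbers (invented for illustration, not the Fnorm data).
obs <- c(10, 12, 15, 19, 24)
fit <- c(10.5, 11.5, 15.5, 18.5, 24)
r2 <- r_squared(obs, fit)
print(r2)
```

For the model above, the call would be r_squared(dd$Fnorm, fitted(fit1)).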
