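I would like to create a scatter/line plot containing one series per patient, where the x/y pairs come from that patient's column. A small sample of the data is shown below: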
Patient ID            1       2
Controll_Percent_1    0.000   0.000
Controll_Percent_2    0.035   0.000
Controll_Percent_3    0.035   0.039
Controll_Percent_4    0.053   0.053
Controll_Percent_5    0.088   0.066
Controll_Percent_6    0.088   0.066
Controll_Percent_7    0.105   0.092
Controll_Percent_8    0.123   0.171
Controll_Percent_9    0.158   0.263
Controll_Percent_10   0.158   0.224
Controll_Percent_11           0.197
Controll_Percent_12           0.211
Controll_Percent_13           0.276
Controll_Percent_14
Controll_Percent_15
Controll_Percent_16
Controll_Percent_17
Controll_Days_1       33      32
Controll_Days_2       378     128
Controll_Days_3       575     502
Controll_Days_4       951     633
Controll_Days_5       1127    800
Controll_Days_6       1324    853
Controll_Days_7       1512    999
Controll_Days_8       1887    1122
Controll_Days_9       2141    1312
Controll_Days_10      2331    1467
Controll_Days_11              1657
Controll_Days_12              2022
Controll_Days_13              2393
Controll_Days_14
Controll_Days_15
Controll_Days_16
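Each "Controll_Percent" is paired with the "Controll_Days" of the same index, describing the result y at x days after the operation. In Excel this is easy to do by manually defining each x and y series in a scatter plot.
How can I do this in R or ggplot2?
As I understand it, you want days on the x axis, percent on the y axis, and a separate line for each patient. Here is one way to do it. I stored the intermediate results to make the data wrangling easier to follow.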
library(tidyverse)
DF <- read_csv("~/R/Play/Dummy.csv")
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#> dat <- vroom(...)
#> problems(dat)
#> Rows: 32 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Patient_ID
#> dbl (2): 1, 2
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
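# Rename the patient columns and drop rows where both patients are missing
DF2 <- DF |> rename("P1" = `1`, "P2" = `2`) |> filter(!is.na(P1) | !is.na(P2))
# Stack the two patient columns into Patient/value pairs
DF3 <- DF2 |> pivot_longer(cols = c("P1", "P2"), names_to = "Patient")
# Split e.g. "Controll_Percent_1" into Control, Measure and Index
DF4 <- DF3 |> separate("Patient_ID", into = c("Control", "Measure", "Index"))
# Spread Percent and Days into their own columns, one row per patient and index
DF5 <- DF4 |> pivot_wider(names_from = "Measure", values_from = "value")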
DF5
#> # A tibble: 26 × 5
#> Control Index Patient Percent Days
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 Controll 1 P1 0 33
#> 2 Controll 1 P2 0 32
#> 3 Controll 2 P1 0.035 378
#> 4 Controll 2 P2 0 128
#> 5 Controll 3 P1 0.035 575
#> 6 Controll 3 P2 0.039 502
#> 7 Controll 4 P1 0.053 951
#> 8 Controll 4 P2 0.053 633
#> 9 Controll 5 P1 0.088 1127
#> 10 Controll 5 P2 0.066 800
#> # ℹ 16 more rows
ggplot(DF5, aes(x = Days, y = Percent, color = Patient, group = Patient)) +
  geom_line() +
  geom_point()
#> Warning: Removed 3 rows containing missing values or values outside the scale range
#> (`geom_line()`).
#> Warning: Removed 3 rows containing missing values or values outside the scale range
#> (`geom_point()`).
Created on 2024-10-26 with reprex v2.1.1
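If you want to try this without the CSV file, here is a minimal sketch of the same reshaping as a single pipeline. The inline `tribble()` is a hypothetical subset of the data from the question (the first four measurements, with patient 1 left blank for the fourth to mimic the gaps), and the names `long` and `p` are mine. The "Removed 3 rows" warnings above most likely come from patient 1 having no values for measurements 11 to 13; dropping those rows with `drop_na()` before plotting should avoid them. The `facet_wrap()` call is an optional alternative that was not part of the answer above: it gives each patient a separate panel instead of a colored line.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical inline subset of the question's data (not the full file)
DF <- tibble::tribble(
  ~Patient_ID,          ~`1`,  ~`2`,
  "Controll_Percent_1", 0.000, 0.000,
  "Controll_Percent_2", 0.035, 0.000,
  "Controll_Percent_3", 0.035, 0.039,
  "Controll_Percent_4", NA,    0.053,
  "Controll_Days_1",    33,    32,
  "Controll_Days_2",    378,   128,
  "Controll_Days_3",    575,   502,
  "Controll_Days_4",    NA,    633
)

long <- DF |>
  # One row per (measurement name, patient)
  pivot_longer(cols = c(`1`, `2`), names_to = "Patient") |>
  # "Controll_Percent_1" -> Control = "Controll", Measure = "Percent", Index = "1"
  separate(Patient_ID, into = c("Control", "Measure", "Index"), sep = "_") |>
  # Percent and Days become columns, paired by patient and index
  pivot_wider(names_from = Measure, values_from = value) |>
  mutate(Index = as.integer(Index), Patient = paste0("P", Patient)) |>
  # Drop incomplete pairs so ggplot2 has nothing to remove
  drop_na(Percent, Days) |>
  arrange(Patient, Days)

p <- ggplot(long, aes(x = Days, y = Percent)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ Patient)  # one panel per patient
p
```

With the full data you would replace the `tribble()` with the `read_csv()` call from the answer; the rest of the pipeline should work unchanged as long as the patient columns are still named `1` and `2`.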