How to include an IF statement in a ggplot2 loop that skips all-NA cases

Question (votes: 0, answers: 2)

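Hi all, and thanks for reading my question.

I looked through similar threads but did not find a suitable solution. That may be down to the search terms I used, so my apologies if I missed something.

Here is my data (shortened a bit, but reproducible):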

country year        sector      UN              ETS
BG      2000        Energy      24076856.07     NA
BG      2001        Energy      27943916.88     NA
BG      2002        Energy      25263464.92     NA
BG      2003        Energy      27154117.22     NA
BG      2004        Energy      26936616.77     NA
BG      2005        Energy      27148080.12     NA
BG      2006        Energy      27444820.45     NA
BG      2007        Energy      30789683.97     31120644
BG      2008        Energy      32319694.49     30453798
BG      2009        Energy      29694118.01     27669012
BG      2010        Energy      31638282.52     29543392
BG      2011        Energy      36421966.96     34669936
BG      2012        Energy      31628708.27     30777290
BG      2013        Energy      27332059.98     27070570
BG      2014        Energy      29036437.07     28583008
BG      2015        Energy      30316871.19     29935784
BG      2016        Energy      27127914.93     26531704
BG      2017        Energy      NA              27966156
CH      2000        Energy      3171899.5       NA
CH      2001        Energy      3313509.6       NA
CH      2002        Energy      3390115.69      NA
CH      2003        Energy      3387122.65      NA
CH      2004        Energy      3682404.04      NA
CH      2005        Energy      3815915.41      NA
CH      2006        Energy      4031766.36      NA
CH      2007        Energy      3718892.16      NA
CH      2008        Energy      3837098.91      NA
CH      2009        Energy      3673731.74      NA
CH      2010        Energy      3846523.62      NA
CH      2011        Energy      3598219.48      NA
CH      2012        Energy      3640743.25      NA
CH      2013        Energy      3735935.29      NA
CH      2014        Energy      3607411.44      NA
CH      2015        Energy      3292576.93      NA
CH      2016        Energy      3380402.57      NA
CY      2000        Energy      2964656.86      NA
CY      2001        Energy      2847105.45      NA
CY      2002        Energy      3008827.44      NA
CY      2003        Energy      3235739.95      NA
CY      2004        Energy      3294769.3       NA
CY      2005        Energy      3483623.91      3471844
CY      2006        Energy      3665461.17      3653380
CY      2007        Energy      3814469.11      3801667
CY      2008        Energy      3980439.76      3967293
CY      2009        Energy      4005649.27      3992467
CY      2010        Energy      3880758.22      3868001
CY      2011        Energy      3722369.39      3728038
CY      2012        Energy      3557560.24      3545929
CY      2013        Energy      2839148.88      2829732
CY      2014        Energy      2950111.64      2940320
CY      2015        Energy      3032961.55      3023003
CY      2016        Energy      3310941.55      3300001
CY      2017        Energy      NA              3287834

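The code below runs smoothly and produces the plot it should. However, as soon as the loop reaches a country (here CH) that has only NA values in energy$ETS, the loop stops. What I need is an IF statement that handles this case and either skips to the next country (instead of aborting) or plots only energy$UN (that is, only the variable with available data, since energy$ETS contains nothing but NA values).

Important: I do not want to exclude all NA values. I only need the loop to keep running when it encounters a country with no energy$ETS values.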

ctry <- unique(energy$country)

# Color settings: colorblind-friendly palette
cols <- c("#999999", "#E69F00", "#56B4E9", "#009E73",           
"#F0E442", "#0072B2", "#D55E00", "#CC79A7")

for(i in (1:length(ctry))) {

  plot.df <- energy[energy$country==ctry[i],]
  ets.initial <- min(plot.df$year)
  x <- plot.df$UN[plot.df$year >= ets.initial & plot.df$year < 2017]
  y <- plot.df$ETS[plot.df$year >= ets.initial & plot.df$year < 2017]
  m1 <- round(summary(lm(y~x))$r.squared, 3)
  m2 <- round(lm(y~x-1)$coef, 3)

  p <- ggplot() +
    geom_line(data=plot.df,aes(x=plot.df$year, y=plot.df$UN, color='UN 1.A.1'), na.rm=TRUE) +
    geom_line(data=plot.df, aes(x=plot.df$year, y=plot.df$ETS, color='ETS 20')) +      
    annotate(geom='text', label=paste0("R^2==", m1), 
             x=2014, y=Inf, vjust=2, hjust=0, parse=TRUE, cex=3) +
    annotate(geom='text', label=paste0("beta==", m2),
             x=2014, y=Inf, vjust=4, hjust=-0.15, parse=TRUE, cex=3) +
    labs(x="Year", y="CO2 Emissions (metric tons)", z="",
         title=paste("Energy sector emissions for", ctry[i])) + 
    theme(plot.margin=unit(c(.5, .5, .5, .5), "cm")) +
    scale_color_manual(values = cols) +
    scale_y_continuous(labels = scales::comma) +
    scale_x_continuous(breaks = seq(2000, 2017, by=5)) +
    labs(color="Datasets")
    p            
    ggsave(p, filename=paste("H:/figures_energy/", ctry[i], ".png", sep=""),
           width=6.5, height=6)
}

Many thanks for any help!

Best,

Konstantin

r if-statement ggplot2
2 Answers
1
vote

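I implemented my comment (and did some general cleanup), and it works for me. I did not want to create a bunch of files, so I put the plots in a list instead of saving them. Make sure there is a line break between `p` and `ggsave(...)`; having them on the same line, as in your question, is a syntax error.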

library(ggplot2)

ctry <- unique(energy$country)

# Color settings: colorblind-friendly palette
cols <- c(
  "#999999",
  "#E69F00",
  "#56B4E9",
  "#009E73",
  "#F0E442",
  "#0072B2",
  "#D55E00",
  "#CC79A7"
)

plot_list = list()

for (i in (1:length(ctry))) {
  plot.df <- energy[energy$country == ctry[i], ]

  # Go to next iteration if ETS is all NA
  if(all(is.na(plot.df$ETS))) { 
    next 
  }

  # clean up modeling code. It is pointless to define the minimum and then
  # subset everything above the minimum. By definition, everything is already
  # above the minimum. It's also cleaner to subset the data frame and use 
  # the `data` argument of `lm`:
  mod.df = plot.df[plot.df$year < 2017, ]
  m1 <- round(summary(lm(ETS ~ UN, data = mod.df))$r.squared, 3)
  m2 <- round(lm(ETS ~ UN - 1, data = mod.df)$coef, 3)

  # Only using one data frame, so set it in the initial `ggplot()`, not
  # re-specify it in every layer. Similarly, set `aes(x = year)` once.
  p <- ggplot(data = plot.df, aes(x = year)) +
    # use bare column names in aes()
    geom_line(aes(y = UN, color = 'UN 1.A.1'), na.rm = TRUE) +
    geom_line(aes(y = ETS, color = 'ETS 20')) + 
    annotate(
      geom = 'text',
      label = paste0("R^2==", m1),
      x = 2014, y = Inf,
      vjust = 2, hjust = 0,
      parse = TRUE, 
      cex = 3
    ) +
    annotate(
      geom = 'text',
      label = paste0("beta==", m2),
      x = 2014, y = Inf,
      vjust = 4, hjust = -0.15,
      parse = TRUE,
      cex = 3
    ) +
    labs(
      x = "Year",
      y = "CO2 Emissions (metric tons)",
      z = "",
      title = paste("Energy sector emissions for", ctry[i])
    ) +
    theme(plot.margin = unit(c(.5, .5, .5, .5), "cm")) +
    scale_color_manual(values = cols) +
    scale_y_continuous(labels = scales::comma) +
    scale_x_continuous(breaks = seq(2000, 2017, by = 5)) +
    labs(color = "Datasets")
  plot_list[[i]] = p
}

Using this data:

energy = read.table(header = T, text = "country year        sector      UN              ETS
BG      2000        Energy      24076856.07     NA
BG      2001        Energy      27943916.88     NA
BG      2002        Energy      25263464.92     NA
BG      2003        Energy      27154117.22     NA
BG      2004        Energy      26936616.77     NA
BG      2005        Energy      27148080.12     NA
BG      2006        Energy      27444820.45     NA
BG      2007        Energy      30789683.97     31120644
BG      2008        Energy      32319694.49     30453798
BG      2009        Energy      29694118.01     27669012
BG      2010        Energy      31638282.52     29543392
BG      2011        Energy      36421966.96     34669936
BG      2012        Energy      31628708.27     30777290
BG      2013        Energy      27332059.98     27070570
BG      2014        Energy      29036437.07     28583008
BG      2015        Energy      30316871.19     29935784
BG      2016        Energy      27127914.93     26531704
BG      2017        Energy      NA              27966156
CH      2000        Energy      3171899.5       NA
CH      2001        Energy      3313509.6       NA
CH      2002        Energy      3390115.69      NA
CH      2003        Energy      3387122.65      NA
CH      2004        Energy      3682404.04      NA
CH      2005        Energy      3815915.41      NA
CH      2006        Energy      4031766.36      NA
CH      2007        Energy      3718892.16      NA
CH      2008        Energy      3837098.91      NA
CH      2009        Energy      3673731.74      NA
CH      2010        Energy      3846523.62      NA
CH      2011        Energy      3598219.48      NA
CH      2012        Energy      3640743.25      NA
CH      2013        Energy      3735935.29      NA
CH      2014        Energy      3607411.44      NA
CH      2015        Energy      3292576.93      NA
CH      2016        Energy      3380402.57      NA
CY      2000        Energy      2964656.86      NA
CY      2001        Energy      2847105.45      NA
CY      2002        Energy      3008827.44      NA
CY      2003        Energy      3235739.95      NA
CY      2004        Energy      3294769.3       NA
CY      2005        Energy      3483623.91      3471844
CY      2006        Energy      3665461.17      3653380
CY      2007        Energy      3814469.11      3801667
CY      2008        Energy      3980439.76      3967293
CY      2009        Energy      4005649.27      3992467
CY      2010        Energy      3880758.22      3868001
CY      2011        Energy      3722369.39      3728038
CY      2012        Energy      3557560.24      3545929
CY      2013        Energy      2839148.88      2829732
CY      2014        Energy      2950111.64      2940320
CY      2015        Energy      3032961.55      3023003
CY      2016        Energy      3310941.55      3300001
CY      2017        Energy      NA              3287834")
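
The question also mentions a second option: rather than skipping a country whose ETS column is entirely NA, plot only the UN series for it. Here is a minimal sketch of that idea, not part of the original answer. The helper name `make_energy_plot` and the small `energy_small` data frame (three rows each for CH and CY, copied from the data above) are illustrative assumptions, and the styling is trimmed to keep the example short.

```r
library(ggplot2)

# Hypothetical helper (an assumption, not from the original posts): builds the
# plot for one country and adds the ETS line and the R^2 annotation only when
# ETS contains at least one non-NA value.
make_energy_plot <- function(plot.df, country) {
  has_ets <- any(!is.na(plot.df$ETS))

  p <- ggplot(plot.df, aes(x = year)) +
    geom_line(aes(y = UN, color = "UN 1.A.1"), na.rm = TRUE) +
    labs(
      x = "Year",
      y = "CO2 Emissions (metric tons)",
      title = paste("Energy sector emissions for", country),
      color = "Datasets"
    )

  if (has_ets) {
    # Fit the model only when there is something to fit; lm() would fail
    # on a response that is all NA.
    mod.df <- plot.df[plot.df$year < 2017, ]
    r2 <- round(summary(lm(ETS ~ UN, data = mod.df))$r.squared, 3)
    p <- p +
      geom_line(aes(y = ETS, color = "ETS 20"), na.rm = TRUE) +
      annotate(
        "text", label = paste0("R^2==", r2),
        x = 2014, y = Inf, vjust = 2, hjust = 0, parse = TRUE, size = 3
      )
  }
  p
}

# Subset of the data above: CH has no ETS values, CY does.
energy_small <- data.frame(
  country = c("CH", "CH", "CH", "CY", "CY", "CY"),
  year    = c(2014, 2015, 2016, 2014, 2015, 2016),
  UN      = c(3607411.44, 3292576.93, 3380402.57,
              2950111.64, 3032961.55, 3310941.55),
  ETS     = c(NA, NA, NA, 2940320, 3023003, 3300001)
)

# One plot per country; CH gets the UN line only.
plots <- lapply(
  split(energy_small, energy_small$country),
  function(d) make_energy_plot(d, d$country[1])
)
```

Each element of `plots` can then be passed to `ggsave()` as in the question. Whether to skip a country or fall back to a UN-only plot depends on whether a figure without the ETS comparison is still useful to you.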

2
votes
for (i in seq_along(ctry)) {

  plot.df <- energy[energy$country == ctry[i], ]
  ets.initial <- min(plot.df$year)

  # Proceed only if plot.df$ETS contains at least one non-NA value
  if (FALSE %in% is.na(plot.df$ETS)) {
    # (produce plots and rest of output as planned)
  }
}

would be a solution using base R.
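
As a small illustration of the check used here, the sketch below compares `FALSE %in% is.na(x)` with two equivalent spellings that may read more directly. The helper name `has_data` is an assumption made for this example; the values come from the CH and CY rows of the question's data.

```r
# ETS values for a country with no data at all (like CH) and for one with
# partial data (CY, years 2004 to 2006).
ets_all_na  <- c(NA, NA, NA)
ets_partial <- c(NA, 3471844, 3653380)

# Hypothetical helper: TRUE when at least one value is not NA.
has_data <- function(x) FALSE %in% is.na(x)

has_data(ets_all_na)   # FALSE
has_data(ets_partial)  # TRUE

# Equivalent ways of writing the same condition:
any(!is.na(ets_partial))   # TRUE
!all(is.na(ets_partial))   # TRUE
```

The negated form, `all(is.na(x))`, is the condition used with `next` in the other answer, so both answers test the same thing from opposite directions.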
