Summary/overview of levels before and after recoding

Question

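I have recoded some factors with dplyr::recode, and I am looking for a clean way to make a LaTeX table that compares the new and old categories, i.e. levels.

Here is an illustration of the problem using cyl from mtcars. First some packages,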

# install.packages(c("tidyverse", "stargazer", "reporttools"))
library(tidyverse)

and the data I intend to use,

mcr <- mtcars %>% select(cyl) %>% as_tibble() 
mcr %>% print(n=5)
#> # A tibble: 32 x 1
#>     cyl
#> * <dbl>
#> 1  6.00
#> 2  6.00
#> 3  4.00
#> 4  6.00
#> 5  8.00
#> # ... with 27 more rows

Now I create two new factors, one with three categories, cyl_3col, and one with two, cyl_is_red, i.e.:

mcr_col <- mcr %>% as_tibble() %>%
    mutate(cyl_3col = factor(cyl, levels = c(4, 6, 8),labels = c("red", "blue", "green")),
           cyl_is_red = recode(cyl_3col, .default = 'is not red', 'red' = 'is red'))
mcr_col  %>% print(n=5)
#> # A tibble: 32 x 3
#>     cyl cyl_3col cyl_is_red
#>   <dbl> <fct>    <fct>     
#> 1  6.00 blue     is not red
#> 2  6.00 blue     is not red
#> 3  4.00 red      is red    
#> 4  6.00 blue     is not red
#> 5  8.00 green    is not red
#> # ... with 27 more rows

Now I would like to show how the categories in cyl_3col and cyl_is_red are related.

Maybe something like this,

#> cyl_is_red  cyl_3col 
#> is red               
#>             red      
#> is not red           
#>             blue     
#>             green

Or possibly something like this, where I imagine the "is not red" category spanning two rows with \multirow{} or something similar.

#>  cyl_3col   cyl_is_red
#> 1 red       is red    
#> 2 blue      is not red
#> 3 green     ----------

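Using stargazer or possibly some other TeX tool. I am very open as to how best to display the recoding. I assume someone before me has had this idea and found a clever way to code it?

I currently use something like mcr_col %>% count(cyl_3col, cyl_is_red), but I don't think it really works.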

Answer 1

pixiedust has a merge option.

---
title: "Untitled"
output: pdf_document
header-includes: 
- \usepackage{amssymb} 
- \usepackage{arydshln} 
- \usepackage{caption} 
- \usepackage{graphicx} 
- \usepackage{hhline} 
- \usepackage{longtable} 
- \usepackage{multirow} 
- \usepackage[dvipsnames,table]{xcolor} 
---

```{r}
library(pixiedust)
library(dplyr)

mcr <- mtcars %>% select(cyl) %>% as_tibble() 
mcr_col <- mcr %>% as_tibble() %>%
  mutate(cyl_3col = factor(cyl, levels = c(4, 6, 8),labels = c("red", "blue", "green")),
         cyl_is_red = recode(cyl_3col, .default = 'is not red', 'red' = 'is red'))

mcr_col %>% 
  count(cyl_3col, cyl_is_red) %>% 
  select(-n) %>% 
  dust(float = FALSE) %>% 
  sprinkle(cols = "cyl_is_red",
           rows = 2:3,
           merge = TRUE) %>% 
  sprinkle(sanitize = TRUE,
           part = "head")
```

Answer 2

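A slightly different way to approach the problem is to display the recoding as a plot rather than a table, which avoids having to produce LaTeX syntax. You could do something like this: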

library(ggplot2)

# Make some data with many levels: 26 letters, already recoded into three groups
tdf <- data.frame(cat1 = factor(letters), 
                  cat2 = factor(c(rep("Low", 9), rep("Mid", 9), rep("High", 8))))
# Reorder the levels of cat2 from alphabetical (High, Low, Mid) to Low, Mid, High
tdf$cat2 <- factor(tdf$cat2, levels(tdf$cat2)[c(2,3,1)])

# Now plot it as arrows running from the first encoding to the second
ggplot(tdf) + 
  geom_segment(data=tdf, aes(x=.05, xend = .45, y = cat1, yend = cat2), arrow = arrow()) + 
  geom_text(aes(x=0, y=cat1, label=cat1)) + 
  geom_text(aes(x=.5, y=cat2, label=cat2))+ 
  facet_wrap(~cat2, nrow = 3, scales = "free_y") + 
  theme_classic()+
  theme(axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank(),
        axis.title.y=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.line = element_blank(),
        strip.background = element_blank(),
        strip.text.y = element_blank()) +
  ggtitle("Variable Recodings")

With lots of variables, this may be easier on the reader's eyes.
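
The level reordering in the code above, levels(tdf$cat2)[c(2,3,1)], depends on factor() sorting character levels alphabetically by default. The following is a small base R sketch of that step on its own; the variable name cat2_explicit is only illustrative, and naming the levels explicitly is offered as a possibly more readable alternative, not as what the answer's author intended.

```r
# Same three groups as in the answer, built as a plain factor
cat2 <- factor(c(rep("Low", 9), rep("Mid", 9), rep("High", 8)))

# factor() sorts character levels alphabetically by default
default_levels <- levels(cat2)

# Index-based reordering, as in the answer: positions 2, 3, 1 of the default order
cat2_indexed <- factor(cat2, levels(cat2)[c(2, 3, 1)])

# Illustrative alternative: spell the desired order out by name
cat2_explicit <- factor(cat2, levels = c("Low", "Mid", "High"))

print(default_levels)
print(levels(cat2_indexed))
```

Both versions end up with the same level order; the explicit one does not break silently if a level is later renamed or added.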

Answer 3

If HTML works for you instead of LaTeX, then you may find that the tableHTML library has a lot of options.

Here is an example of what you can do with it:

library(tableHTML)

connections <- mcr_col %>% 
  count(cyl_3col, cyl_is_red) 


groups <- connections %>% 
  group_by(cyl_is_red) %>% 
  summarise(cnt = length(cyl_3col))


tableHTML(connections %>% 
            select(-n, -cyl_is_red), 
          rownames = FALSE,
          row_groups = list(groups$cnt, groups$cyl_is_red))
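
The row_groups argument above is given a list of group sizes and group labels. My understanding, which is an assumption worth checking against the tableHTML documentation, is that these are applied in the order the rows appear in the table. Under that assumption, a base R sketch for deriving the two vectors directly from the row order uses rle(); the small connections data frame below is typed in by hand to mirror the mapping from the question.

```r
# Mapping table in the row order count() would give (sorted by cyl_3col);
# entered by hand here so the sketch runs without dplyr
connections <- data.frame(
  cyl_3col   = c("red", "blue", "green"),
  cyl_is_red = c("is red", "is not red", "is not red")
)

# rle() collapses consecutive equal values into (length, value) pairs,
# so the sizes follow the actual row order
runs <- rle(connections$cyl_is_red)
row_groups <- list(runs$lengths, runs$values)

# Sanity check: the group sizes must cover every row exactly once
stopifnot(sum(row_groups[[1]]) == nrow(connections))
print(row_groups)
```

If the same label shows up in two separate runs, the rows are not sorted by group and should be reordered before building the table.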

Answer 4

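I'm still not sure how you want to generalize, but assuming there is one column (such as cyl) that you want to exclude from this analysis, how about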

> mcr_col  %>% select(-cyl) %>% distinct
# A tibble: 3 x 2
  cyl_3col cyl_is_red
  <fct>    <fct>     
1 blue     is not red
2 red      is red    
3 green    is not red

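This gives you a table of the distinct combinations, where the only column you need to specify is the one you want to exclude (probably the response).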
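
Since the question asks for a LaTeX table with \multirow, here is a minimal base R sketch that turns such a distinct mapping into tabular code. The helper names recode_to_latex and esc are hypothetical and not part of any package, the recoding is rebuilt with ifelse() rather than dplyr::recode so the sketch runs without extra packages, and the output assumes \usepackage{multirow} in the preamble.

```r
# Recreate the recoding from the question with base R only
cyl_3col <- factor(mtcars$cyl, levels = c(4, 6, 8),
                   labels = c("red", "blue", "green"))
cyl_is_red <- factor(ifelse(cyl_3col == "red", "is red", "is not red"),
                     levels = c("is red", "is not red"))

# One row per distinct (old level, new level) pair, ordered by the new level
map <- unique(data.frame(cyl_3col, cyl_is_red))
map <- map[order(map$cyl_is_red, map$cyl_3col), ]

# Hypothetical helper: escape underscores so names are valid LaTeX text
esc <- function(x) gsub("_", "\\_", as.character(x), fixed = TRUE)

# Hypothetical helper: one \multirow cell per new level, old levels beside it
recode_to_latex <- function(map, new, old) {
  body <- unlist(lapply(split(map, map[[new]], drop = TRUE), function(g) {
    first <- sprintf("\\multirow{%d}{*}{%s} & %s \\\\",
                     nrow(g), esc(g[[new]][1]), esc(g[[old]][1]))
    rest <- sprintf(" & %s \\\\", esc(g[[old]][-1]))
    c(first, rest, "\\hline")
  }), use.names = FALSE)
  c("\\begin{tabular}{ll}", "\\hline",
    sprintf("%s & %s \\\\", esc(new), esc(old)), "\\hline",
    body, "\\end{tabular}")
}

latex_lines <- recode_to_latex(map, new = "cyl_is_red", old = "cyl_3col")
cat(latex_lines, sep = "\n")
```

The helper only escapes underscores; levels containing other LaTeX special characters such as & or % would need additional escaping.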
