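I am trying to apply a function in a loop over a list of file paths in R, with the goal of appending the output to an on-disk (out-of-memory) SQLite database using dbplyr. The database is currently empty and has a predefined structure.
Here is the database structure (I don't know how to create a proper reproducible representation of the database connection):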
database
# Source: table<epi_table> [0 x 14]
# Database: sqlite 3.44.2 [/home/me/repositories/some_path/data/some_path.db]
# ℹ 14 variables: doc_index <int>, content_type <chr>, style_name <chr>, text <chr>, level <dbl>,
# num_id <int>, row_id <int>, is_header <int>, cell_id <dbl>, col_span <dbl>, row_span <int>,
# successful_import <int>, filepath <chr>, doc_vs_docx <chr>
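For anyone who wants to reproduce this, an empty table with the same column names can probably be approximated as below. This is only a sketch: the in-memory connection stands in for the real file-backed database, and the SQLite column types are my assumption based on the printed R types (`<int>` as INTEGER, `<dbl>` as REAL, `<chr>` as TEXT).

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: an in-memory database stands in for the real .db file.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create an empty table with the 14 columns shown in the printed structure.
# The SQLite types are guesses derived from the R types in that printout.
dbCreateTable(con, "epi_table", fields = c(
  doc_index = "INTEGER", content_type = "TEXT", style_name = "TEXT",
  text = "TEXT", level = "REAL", num_id = "INTEGER", row_id = "INTEGER",
  is_header = "INTEGER", cell_id = "REAL", col_span = "REAL",
  row_span = "INTEGER", successful_import = "INTEGER",
  filepath = "TEXT", doc_vs_docx = "TEXT"
))

# Lazy reference to the table, analogous to the `database` object above.
database <- tbl(con, "epi_table")
```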
The function I apply to each file path produces output like the following on each iteration:
new_row <- structure(list(doc_index = 1L, content_type = "paragraph", style_name = NA_character_,
text = "Some_text", level = NA_real_, num_id = NA_integer_,
row_id = NA_integer_, is_header = NA, cell_id = NA_real_,
col_span = NA_real_, row_span = NA_integer_, successful_import = TRUE,
filepath = "some_file.docx", doc_vs_docx = "docx"), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame"))
new_row
# A tibble: 1 × 14
doc_index content_type style_name text level num_id row_id is_header cell_id col_span row_span successful_import filepath doc_vs_docx
<int> <chr> <chr> <chr> <dbl> <int> <int> <lgl> <dbl> <dbl> <int> <lgl> <chr> <chr>
1 1 paragraph NA Some_text NA NA NA NA NA NA NA TRUE some_file.docx docx
I tried to insert the row into the database with the following code:
database |> rows_insert(new_row, conflict = 'ignore')
However, I get an error about the two objects having different classes:
Error in `auto_copy()`:
! `x` and `y` must share the same src.
ℹ `x` is a <tbl_SQLiteConnection/tbl_dbi/tbl_sql/tbl_lazy/tbl> object.
ℹ `y` is a <tbl_df/tbl/data.frame> object.
ℹ Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow).
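If I want to add rows from a regular tibble to a database-backed tbl object, do I need to convert the tibble into a tbl object on the same connection first? Or is dbplyr simply not well suited to this particular use case? I'm not very interested in learning proper SQL syntax right now, and I have always relied on dbplyr.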
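For context, this is what following the hint in the error message might look like, as a minimal sketch rather than a confirmed solution. It assumes a reduced, hypothetical version of the schema in an in-memory database, and it assumes that `filepath` plus `doc_index` identifies a row uniquely (the `by` argument is my guess, not something taken from the real table). `copy = TRUE` copies the tibble to a temporary table on the same connection, and `in_place = TRUE` asks `rows_insert()` to modify the underlying table instead of returning a lazy query.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: in-memory database with a reduced version of the schema.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbCreateTable(con, "epi_table", fields = c(
  doc_index = "INTEGER", content_type = "TEXT", text = "TEXT",
  successful_import = "INTEGER", filepath = "TEXT", doc_vs_docx = "TEXT"
))
database <- tbl(con, "epi_table")

# A local tibble like the one produced on each loop iteration.
new_row <- tibble(
  doc_index = 1L, content_type = "paragraph", text = "Some_text",
  successful_import = TRUE, filepath = "some_file.docx", doc_vs_docx = "docx"
)

# copy = TRUE moves new_row to the same source as `database`;
# in_place = TRUE writes to epi_table itself.
# Assumption: (filepath, doc_index) is the key used to detect conflicts.
rows_insert(
  database, new_row,
  by = c("filepath", "doc_index"),
  conflict = "ignore",
  copy = TRUE,
  in_place = TRUE
)

# Count the rows now stored in the table.
n_rows <- database |> count() |> pull(n)

dbDisconnect(con)
```

If matching on a key is not needed at all, `DBI::dbAppendTable(con, "epi_table", new_row)` may be a simpler way to append, since it writes the data frame directly without involving dbplyr.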