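I'd like to be able to filter markdown from the command line with Pandoc, based on a list of classes that is passed in. The rules are:
The markdown I'm using to test it looks like this: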
---
title: Example doc
---
## Images
![Generic image](generic_image.png)
![Generic image](generic_image.png){width=30px}
![Color only image](color_image.png){.color-only}
![Color only image](color_image.png){.color-only width=30px}
![Color only image](color_image.png){width=30px .color-only}
![BW only image](bw_image.png){.bw-only}
![BW only image](bw_image.png){.bw-only width=30px}
![BW only image](bw_image.png){width=30px .bw-only}
## Blocks
::: {.other}
Block that shouldn't be filtered.
:::
:::
Block that shouldn't be filtered.
:::
::: {.color-only}
Color only block.
:::
::: {.bw-only}
BW only block.
:::
## Spans
[Span that shouldn't be filtered]{.other}
[Color only span]{.color-only}
[BW only span]{.bw-only}
## Links
[Link that shouldn't be filtered](link.html)
[Link that shouldn't be filtered](link.html){.other}
[Color only link](link.html){.color-only}
[BW only link](link.html){.bw-only}
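I tried to write a Lua filter, but I'm new to this and simply don't have the knowledge. I also suspect a filter that does this may already exist, but I haven't found one. Can anyone point me in the right direction?
Thanks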
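I have a similar filter that seems to work, though with one very frustrating problem that I'll try to explain below.
My approach is to pass the names of the classes I want to keep as a comma-separated list in a metadata argument on the pandoc CLI. Inside the filter, I use LPeg to parse that list into a table; I then apply one filter to block elements and one to inline elements, and use the table to test for membership.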
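As a rough illustration of that approach, the parsing and the membership test can be sketched in plain Lua, without LPeg or any pandoc API. The names parse_keeplist and has_kept_class are hypothetical and are not part of the filter in this answer; trimming whitespace around the names is an added assumption.

```lua
-- Hypothetical helper: turn "other,bw-only" into a set,
-- i.e. { other = true, ["bw-only"] = true }.
local function parse_keeplist(arg)
  local set = {}
  -- [^,]+ matches each run of non-comma characters
  for name in arg:gmatch("[^,]+") do
    -- strip surrounding whitespace (assumption: "a, b" means "a,b")
    name = name:match("^%s*(.-)%s*$")
    if name ~= "" then
      set[name] = true
    end
  end
  return set
end

-- Hypothetical helper: true if there are no classes at all,
-- or if at least one class is in the set.
local function has_kept_class(classes, set)
  if classes == nil or #classes == 0 then
    return true
  end
  for _, name in ipairs(classes) do
    if set[name] then
      return true
    end
  end
  return false
end

local keep = parse_keeplist("other,bw-only")
print(has_kept_class({ "color-only" }, keep)) --> false
print(has_kept_class({ "bw-only" }, keep))    --> true
print(has_kept_class({}, keep))               --> true
```

Storing the names as keys of a table keeps the membership test a single lookup per class.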
Given this filter (classfilter.lua on the LUA_PATH):
-- Split the argument list with LPeg; the function is taken from the LPeg documentation:
-- https://www.inf.puc-rio.br/~roberto/lpeg/
local function split(s, sep)
  sep = lpeg.P(sep)
  local elem = lpeg.C((1 - sep) ^ 0)
  local p = lpeg.Ct(elem * (sep * elem) ^ 0) -- make a table capture
  return lpeg.match(p, s)
end

-- Set of class names to keep. It is filled in by the Meta filter
-- and is then available to the rest of the filters to consult.
local keeplist = {}

-- This function goes inside the Meta filter
local function collect_vars(m)
  for _, classname in pairs(split(m.keeplist, ",")) do
    keeplist[classname] = true
  end
end

local function keep_elem(_elem)
  -- keep if no class designation
  if not _elem.classes or #_elem.classes == 0 then
    return true
  end
  -- keep if class name in keeplist
  for _, classname in ipairs(_elem.classes) do
    if keeplist[classname] ~= nil then
      return true
    end
  end
  -- don't keep otherwise
  return false
end

local function filter_list_by_classname(_elems)
  -- walk the list backwards: removing an element while iterating forwards
  -- shifts the following element into the current slot, so it gets skipped
  for _elemidx = #_elems, 1, -1 do
    if not keep_elem(_elems[_elemidx]) then
      _elems:remove(_elemidx)
    end
  end
  return _elems
end

-- forcing the meta filter to run first
return { { Meta = collect_vars }, { Inlines = filter_list_by_classname, Blocks = filter_list_by_classname } }
and this pandoc CLI command:
pandoc -f markdown -t markdown --lua-filter=classfilter.lua orn.md -M 'keeplist=other,bw-only'
I get the following output:
## Images
![Generic image](generic_image.png)
![Generic image](generic_image.png){width="30px"}
<figure>
<figcaption>Color only image</figcaption>
</figure>
<figure>
<figcaption>Color only image</figcaption>
</figure>
<figure>
<figcaption>Color only image</figcaption>
</figure>
![BW only image](bw_image.png){.bw-only}
![BW only image](bw_image.png){.bw-only width="30px"}
![BW only image](bw_image.png){.bw-only width="30px"}
## Blocks
::: other
Block that shouldn't be filtered.
:::
::: Block that shouldn't be filtered. :::
::: bw-only
BW only block.
:::
## Spans
[Span that shouldn't be filtered]{.other}
[BW only span]{.bw-only}
## Links
[Link that shouldn't be filtered](link.html)
[Link that shouldn't be filtered](link.html){.other}
[BW only link](link.html){.bw-only}
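...which seems to be the desired output, except for the <figure> tags, which I couldn't get rid of. Assuming that my Inlines filter removes only the src from the Figure elements and leaves the caption intact, I tried iterating over the Figure and Image elements in a separate filter, to find Figure blocks with an empty content field and replace them with empty tables, but that doesn't change the result at all. I mean that adding the following to the filter_list_by_classname function, before the loop: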
_elems:walk({
  Figure = function(a_figure)
    if not a_figure.content[1].content[1] then
      return {}
    else
      return a_figure
    end
  end
})
did nothing.
So maybe this could be the start of a solution.
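One possible reason the walk attempt has no effect is that, as far as I can tell, walk returns the filtered list instead of changing _elems in place, so its result would have to be assigned to something. A minimal sketch of an alternative, assuming pandoc 3.x (where a standalone image becomes a Figure block): a second, separate filter that drops any Figure whose blocks were left without inline content. The file name drop_empty_figures.lua and the helper is_empty_figure are hypothetical, and this has not been checked against every figure layout.

```lua
-- drop_empty_figures.lua (hypothetical file name)
-- Assumes pandoc 3.x, where a Figure block has a `content` list of blocks.

-- True if no block inside the figure carries any content,
-- e.g. a Plain whose only Image was removed by an earlier filter.
local function is_empty_figure(fig)
  for _, blk in ipairs(fig.content) do
    -- blocks without a `content` list (code blocks, tables, ...)
    -- are treated as non-empty
    if blk.content == nil or #blk.content > 0 then
      return false
    end
  end
  return true
end

-- pandoc picks up global functions named after element types
function Figure(fig)
  if is_empty_figure(fig) then
    return {} -- an empty list deletes the block, caption included
  end
  return nil -- nil leaves the figure unchanged
end
```

Since filters given on the command line run in order, it would be placed after the class filter:
pandoc -f markdown -t markdown --lua-filter=classfilter.lua --lua-filter=drop_empty_figures.lua orn.md -M 'keeplist=other,bw-only'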