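After the great success of my previous question about bigram decomposition in M, I have run into a new problem.
I am trying to count the matching pairs between two lists of bigrams, and both approaches I have tried sometimes undercount; I cannot pin down why. For example, {"He","el","ll","lo"} and {"He","el","lo"} correctly give 3 matches (well, 6, but I deliberately double the count inside the function), whereas {"He","el","ll","lo"} and {"Hi","il","lo"} incorrectly give 0 matches instead of 1.
I apply List.Sort() to the inputs before calling my implementations, although the problem occurs regardless of the sorting.
Both of my functions are based on the counter in the Java implementation on the Dice similarity page on Wikibooks.
Excerpt here: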
int matches = 0, i = 0, j = 0;
while (i < n && j < m)
{
    if (sPairs[i] == tPairs[j])
    {
        matches += 2;
        i++;
        j++;
    }
    else if (sPairs[i] < tPairs[j])
        i++;
    else
        j++;
}
My initial hack resulted in a recursive function:
(x as list, y as list, i as number, j as number, matches) as number =>
let
    matcher = if x{i} = y{j} then matches + 2 else matches,
    ineq = if x{i} < y{j} then 1 else 0,
    Check = if i = List.Count(x) - 1 or j = List.Count(y) - 1 then matcher
        else if matcher > matches then @Counter1(x,y,i+1,j+1,matcher)
        else if ineq = 1 then @Counter1(x,y,i+1,j,matcher)
        else @Counter1(x,y,i,j+1,matcher)
in Check
To avoid recursion because of potential speed issues, I also had Copilot write me a function using List.Accumulate:
(x as list, y as list) as number =>
let
    n = List.Count(x),
    m = List.Count(y),
    matches = List.Accumulate(
        {0..n-1},
        [i = 0, j = 0, matches = 0],
        (state, current) =>
            if state[i] < n and state[j] < m then
                if x{state[i]} = y{state[j]} then
                    [i = state[i] + 1, j = state[j] + 1, matches = state[matches] + 2]
                else if x{state[i]} < y{state[j]} then
                    [i = state[i] + 1, j = state[j], matches = state[matches]]
                else
                    [i = state[i], j = state[j] + 1, matches = state[matches]]
            else
                state
    )[matches]
in
    matches
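As far as I can tell, both functions give the same output, and the second one certainly feels faster, although that means both also have the undercounting problem.
The only thing that comes to mind is that the Java code I adapted compares the letter pairs using their binary representation, whereas I am not sure how M compares the letter pairs in my functions.
Any help would be much appreciated!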
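I think your coding skills are far better than mine, but I will give it a try anyway. Being a lazy person, I chose the simplest approach I could think of: expand the second bigram list against every bigram in the first list and count the matches. I am sure it can be done more simply, but here is my simple way using Power Query: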
let
    // Load the original table; substitute other sources as needed
    Source = Table.FromRecords({
        [Column1 = "He", Column2 = "Hi"],
        [Column1 = "el", Column2 = "il"],
        [Column1 = "ll", Column2 = "lo"],
        [Column1 = "lo", Column2 = null]
    }),
    // Separate Column1 and Column2 into two tables
    Table1 = Table.SelectColumns(Source, {"Column1"}),
    Table2 = Table.SelectColumns(Source, {"Column2"}),
    // Remove null values from Table2, not strictly required
    Table2NonNull = Table.SelectRows(Table2, each [Column2] <> null),
    // Create a custom column in Table1 to add all rows of Table2 to each row of Table1
    CrossJoin = Table.AddColumn(Table1, "Column2", each Table2NonNull[Column2]),
    // Expand the new column to create the cross join effect
    ExpandedTable = Table.ExpandListColumn(CrossJoin, "Column2"),
    // Count matches: 1 where the two bigrams in a row are equal, 0 otherwise
    Custom = Table.AddColumn(ExpandedTable, "Custom", each if [Column1]=[Column2] then 1 else 0),
    // Sum all ones in the column
    TotalMatchCount = List.Sum(Custom[Custom])
in
    TotalMatchCount
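For reference, the two-pointer merge from the question can also be kept. Tracing the second example by hand suggests why both versions undercount: the List.Accumulate version runs only n steps although a merge can need up to n + m - 1 of them, and the recursive version stops as soon as either index reaches its last position, before that last element has been compared against the rest of the other list. Below is a minimal sketch of a fix, assuming both lists are already sorted in the order that M's < operator uses. The step list {1..n + m} and the name final are my own choices, not taken from the posts above.

```powerquery
(x as list, y as list) as number =>
let
    n = List.Count(x),
    m = List.Count(y),
    // Every step advances at least one index, so n + m steps are always enough.
    // The values in the step list are ignored; only its length matters.
    final = List.Accumulate(
        {1..n + m},
        [i = 0, j = 0, matches = 0],
        (state, current) =>
            // One list is exhausted: nothing left to compare, keep the state
            if state[i] >= n or state[j] >= m then
                state
            // Equal bigrams: count the match (doubled, as in the question)
            else if x{state[i]} = y{state[j]} then
                [i = state[i] + 1, j = state[j] + 1, matches = state[matches] + 2]
            // Otherwise advance the index that points at the smaller bigram
            else if x{state[i]} < y{state[j]} then
                [i = state[i] + 1, j = state[j], matches = state[matches]]
            else
                [i = state[i], j = state[j] + 1, matches = state[matches]]
    )
in
    final[matches]
```

Saved as a query named, say, Counter2 (a hypothetical name), Counter2({"He","el","ll","lo"}, {"Hi","il","lo"}) should return 2, i.e. one match with the doubled count. Note that when a bigram occurs more than once, this merge pairs each occurrence at most once, whereas the cross join above counts every combination, so the two approaches can disagree on such inputs.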