Merging sorted pairs

Question

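I have two (or more, but a solution for two generalizes to any number) 2-by-N matrices that represent points, with the x coordinates in the first row and the y coordinates in the second. The points are always sorted by increasing x coordinate. I want to merge these two matrices into one 3-by-N matrix such that if two points (one from each matrix) share the same x coordinate, they form a single column in the new matrix: the first row is the x coordinate, and the second and third rows are the two y coordinates. However, if a point in one matrix has an x coordinate that differs from every point in the other matrix, I still want a complete 3-element column, placed so that the x coordinates stay sorted, with the missing value from the other matrix replaced by the nearest value with a lower x coordinate (or NaN if there is none).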

This is best explained by example.

First matrix:

1  3  5  7  % x coordinate
1  2  3  4  % y coordinate

Second matrix:

2  3  4  7  8  % x coordinate
5  6  7  8  9  % y coordinate

Desired result:

1    2  3  4  5  7  8  % x coordinate
1    1  2  2  3  4  4  % y coordinate from first matrix
NaN  5  6  7  7  8  9  % y coordinate from second matrix

My question is: how can I do this efficiently in MATLAB/Octave and NumPy? (Efficiently, because I can always do it "by hand" with loops, but that doesn't seem right.)

Answer 1

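You can use interp1 with the 'previous' method (you could also choose 'nearest' if you don't care whether the value comes from a larger or smaller x), together with 'extrap' to allow extrapolation.

Define the matrices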

a=[...
1  3  5  7;... 
1  2  3  4];

b=[...
2  3  4  7  8;...
5  6  7  8  9];

Then find the interpolation points

x = unique([a(1,:),b(1,:)]);

and interpolate

[x ; interp1(a(1,:),a(2,:),x,'previous','extrap') ; interp1(b(1,:),b(2,:),x,'previous','extrap') ]
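
For the NumPy half of the question, the same previous-value lookup can probably be sketched with np.searchsorted. This is not part of the answer above, just an illustration under the question's assumption that each matrix is sorted by x; the helper name previous_value is made up for this example.

```python
import numpy as np

a = np.array([[1, 3, 5, 7],
              [1, 2, 3, 4]], dtype=float)
b = np.array([[2, 3, 4, 7, 8],
              [5, 6, 7, 8, 9]], dtype=float)

def previous_value(src, xq):
    """Hypothetical helper: for each query x, return the y of the last
    point in src whose x is <= the query, or NaN if there is none."""
    # Index of the last source x that is <= the query (-1 if none).
    idx = np.searchsorted(src[0], xq, side="right") - 1
    out = src[1][np.maximum(idx, 0)]  # fancy indexing returns a copy
    out[idx < 0] = np.nan             # nothing to the left: NaN
    return out

x = np.union1d(a[0], b[0])  # sorted, unique x coordinates
merged = np.vstack([x, previous_value(a, x), previous_value(b, x)])
print(merged)
```

For the example matrices this should reproduce the desired 3-by-7 result, including the NaN in the first column of the third row.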

Timing results:

I tested the algorithms with

n = 1e6;
a = cumsum(randi(3,2,n),2);
b = cumsum(randi(2,2,n),2);

and got:

Wolfie: 1.7473 s
Flawr: 0.4927 s
Mine: 0.2757 s

Answer 2

This version uses set operations:

a=[...
1  3  5  7;... 
1  2  3  4];

b=[...
2  3  4  7  8;...
5  6  7  8  9];

% compute union of x coordinates
c = union(a(1,:),b(1,:));

% find indices of x of a and b coordinates in c
[~,~,ia] = intersect(a(1,:),c); 
[~,~,ib] = intersect(b(1,:),c);

% create output matrix
d = NaN(3,numel(c));
d(1,:) = c;
d(2,ia) = a(2,:);
d(3,ib) = b(2,:);

% fill NaNs
m = isnan(d);
m(:,1) = false;
i = find(m(:,[2:end,1])); %if you have multiple consecutive nans you have to repeat these two steps
d(m) = d(i);

disp(d);
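
If runs of consecutive NaNs are expected, a forward fill that handles them in a single pass may be more convenient than repeating the shift step. Here is a rough NumPy sketch of that idea (the function name ffill_rows is an assumption, not something from this answer): it carries the column index of the last valid entry along each row with np.maximum.accumulate.

```python
import numpy as np

def ffill_rows(d):
    """Replace each NaN with the nearest non-NaN value to its left in the
    same row. Leading NaNs stay NaN because nothing precedes them."""
    cols = np.arange(d.shape[1])
    # Column index where the entry is valid, 0 where it is NaN.
    last_valid = np.where(~np.isnan(d), cols, 0)
    # Running maximum gives, per position, the latest valid column so far.
    last_valid = np.maximum.accumulate(last_valid, axis=1)
    rows = np.arange(d.shape[0])[:, None]
    return d[rows, last_valid]

# The matrix after scattering the y values, before filling the gaps.
d = np.array([[1, 2, 3, 4, 5, 7, 8],
              [1, np.nan, 2, np.nan, 3, 4, np.nan],
              [np.nan, 5, 6, 7, np.nan, 8, 9]])
filled = ffill_rows(d)
print(filled)

# Consecutive NaNs are filled in the same pass.
gaps = ffill_rows(np.array([[np.nan, 1.0, np.nan, np.nan, 4.0]]))
print(gaps)
```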

Answer 3

Using your example:

a = [1 3 5 7; 1 2 3 4];
b = [2 3 4 7 8; 5 6 7 8 9];

% Get the combined (unique, sorted) `x` coordinates 
output(1,:) = unique([a(1,:), b(1,:)]);
% Initialise y values to NaN
output(2:3, :) = NaN;   
% Add y values from `a` and `b` at their matching x coordinates
output(2, ismember(output(1,:),a(1,:))) = a(2,:);
output(3, ismember(output(1,:),b(1,:))) = b(2,:);
% Replace NaNs in columns `2:end` with the previous value. 
% A simple loop has the advantage of capturing multiple consecutive NaNs.
for ii = 2:size(output,2)
    colNaN = isnan(output(:, ii));
    output(colNaN, ii) = output(colNaN, ii-1);
end

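If you have more than 2 matrices (as your question suggests), then I'd note two things:

Store them in a cell array and loop over it to make the calls to ismember, instead of hard-coding one line per matrix.
The NaN-replacement loop is already vectorised over any number of rows.

Here is a generic solution for any number of matrices, demonstrated using a and b: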

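mats = {a, b};
% Concatenate all matrices side by side (expand the cell array with {:})
cmats = horzcat(mats{:});
% Combined (unique, sorted) x coordinates
output(1, :) = unique(cmats(1,:));
% One NaN-initialised row of y values per input matrix
output(2:numel(mats)+1, :) = NaN;
% numel, not size: 1:size(mats) would only visit the first matrix
for ii = 1:numel(mats)
    output(ii+1, ismember(output(1,:), mats{ii}(1,:))) = mats{ii}(2,:);
end
% Replace NaNs in columns 2:end with the previous column's value
for ii = 2:size(output,2)
    colNaN = isnan(output(:,ii));
    output(colNaN, ii) = output(colNaN, ii-1);
end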
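
A NumPy counterpart for an arbitrary number of matrices might look like the sketch below. It uses a searchsorted lookup instead of the ismember-plus-loop approach, so it is not a line-for-line port of the code above; the function name merge_sorted_pairs and the third matrix c are invented purely for illustration.

```python
import numpy as np

def merge_sorted_pairs(mats):
    """Merge any number of 2-by-N point matrices (each sorted by x) into
    one matrix: row 0 is the union of x values, and row k holds the y
    values of the k-th input, carried forward from the nearest lower x."""
    mats = [np.asarray(m, dtype=float) for m in mats]
    x = np.unique(np.concatenate([m[0] for m in mats]))
    rows = [x]
    for m in mats:
        # Index of the last point in m with x <= query (-1 if none).
        idx = np.searchsorted(m[0], x, side="right") - 1
        y = m[1][np.maximum(idx, 0)]
        y[idx < 0] = np.nan
        rows.append(y)
    return np.vstack(rows)

a = [[1, 3, 5, 7], [1, 2, 3, 4]]
b = [[2, 3, 4, 7, 8], [5, 6, 7, 8, 9]]
c = [[0, 4], [10, 20]]  # hypothetical third matrix, not from the question

result = merge_sorted_pairs([a, b, c])
print(result)
```

The loop runs once per matrix rather than once per column, so the per-point work stays inside NumPy.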
