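I am trying to use parfor for a large matrix computation, and I get an out-of-memory error during deserialization. Is there a way to reduce the memory usage? Here is a minimal example: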
clc; clear;
warpedImages = num2cell(uint8(randi([0,255], 1654, 6288, 3, 35)),1:3);
warpedImages = reshape(warpedImages,1,[]);
% Initialize
n = length(warpedImages);
sigmaN = 10;
sigmag = 0.1;
panoramasize = size(warpedImages{1});
Amat = cell(n);
Bvec = zeros(n,1);
IuppeIdx = nonzeros(triu(reshape(1:numel(Amat), size(Amat))));
Amat_temp = cell(1,length(IuppeIdx));
matSize = size(Amat);
% 4D warped images (Slicing)
wim_4d = cell2mat(reshape(warpedImages,1,1,1,[]));
% Get the Ibarijs and Nijs
parfor i = 1:length(IuppeIdx)
    % Index to subscripts
    [ii,jj] = ind2sub(matSize, IuppeIdx(i));
    if ii == jj
        diag_val_1 = 0;
        diag_val_2 = 0;
        Z = 1:n;
        Z(Z==ii) = [];
        for d = Z
            [Ibarij, Ibarji, Nij] = getIbarNij(panoramasize, wim_4d(:,:,:,ii), wim_4d(:,:,:,d));
            diag_val_1 = diag_val_1 + ( (Nij + Nij) .* Ibarij.^2 );
            diag_val_2 = diag_val_2 + Nij;
        end
        diag_val = diag_val_1 + (sigmaN^2/sigmag^2) * diag_val_2;
        B_val = (sigmaN^2/sigmag^2) * diag_val_2;
        Amat_temp{i} = diag_val;
        Bvec(i) = B_val;
    end
    if ii ~= jj
        [Ibarij,Ibarji,Nij] = getIbarNij(panoramasize, wim_4d(:,:,:,ii), wim_4d(:,:,:,jj));
        Amat_temp{i} = -(Nij+Nij) .* (Ibarij .* Ibarji);
    end
end
function [Ibarij,Ibarji,Nij] = getIbarNij(panoramasize, Imij, Imji)
    Ibarij = zeros(panoramasize,'uint8');
    Ibarji = zeros(panoramasize,'uint8');
    % Overlay the warpedImage onto the panorama.
    maski = imbinarize(rgb2gray(255 * Imij));
    maskj = imbinarize(rgb2gray(255 * Imji));
    % Find the overlap mask
    Nij_im = maski & maskj;
    Nij_im = imfill(Nij_im, 'holes');
    Nijidx = repmat(Nij_im, 1, 1, size(Imij,3));
    % Get the overlapping region RGB values for two images
    Ibarij(Nijidx) = Imij(Nijidx);
    Ibarji(Nijidx) = Imji(Nijidx);
    % Convert to double
    Ibarij_double = double(Ibarij);
    Ibarji_double = double(Ibarji);
    % Nij
    Nij = sum(sum(Nij_im));
    % Ibar ijs
    Ibarij = reshape(sum(sum(Ibarij_double)) ./ Nij, 1, 3);
    Ibarji = reshape(sum(sum(Ibarji_double)) ./ Nij, 1, 3);
    % Replace NaNs by zeros
    Ibarij(isnan(Ibarij)) = 0;
    Ibarji(isnan(Ibarji)) = 0;
end
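The line
[Ibarij, Ibarji, Nij] = getIbarNij(panoramasize, wim_4d(:,:,:,ii), wim_4d(:,:,:,d));
throws the warning: The entire array or structure 'wim_4d' is a broadcast variable. This might result in unnecessary communication overhead. I used ind2sub to get the subscripts because it is simple and it works, but expressions such as wim_4d(:,:,:,ii) cannot be sliced. Any other suggestions or help would be greatly appreciated!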
I would suggest using a thread pool for this, i.e. run parpool('Threads') before running your code.
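One advantage of a thread pool is that it can be more memory-efficient, especially for parfor broadcast data (which is the case for wim_4d). With a process pool, every worker needs its own separate copy of that roughly 1 GB array; with a thread pool, the workers automatically share a single copy in memory.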
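To put rough numbers on that, the sketch below estimates the size of the broadcast array from the question under each pool type and then opens a thread pool. The worker count numWorkers is a hypothetical value picked for illustration, and the estimate ignores the client's own copy and any per-worker temporaries. Thread pools can only run functions that support thread-based environments, so it is worth checking the documentation pages for imbinarize, rgb2gray and imfill in your release before relying on this.

```matlab
% Size of wim_4d from the question: 1654-by-6288-by-3-by-35, uint8 (1 byte per element)
dims = [1654 6288 3 35];
bytesPerCopy = prod(dims);            % bytes in one copy of the array
numWorkers = 8;                       % hypothetical worker count (assumption)

% Process pool: each worker holds its own copy of a broadcast variable
% (worker-side estimate only)
processPoolGB = numWorkers * bytesPerCopy / 1e9;
% Thread pool: the workers share one copy
threadPoolGB = bytesPerCopy / 1e9;
fprintf('Process pool: ~%.1f GB, thread pool: ~%.1f GB\n', processPoolGB, threadPoolGB);

% Open a thread pool, closing any pool that is already running
delete(gcp('nocreate'));
pool = parpool('Threads');
```

Any parfor loop that runs afterwards uses the thread pool; delete(pool) closes it again.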
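Try the code below. I simplified your getIbarNij function and generate the images inside the loop; you can replace the random matrices with an image-reading function.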
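clc; clear;
% Initialize
n = 35;
sigmaN = 10;
sigmag = 0.1;
Amat = cell(n);
IuppeIdx = nonzeros(triu(reshape(1:numel(Amat), size(Amat))));
Amat_temp = cell(1,length(IuppeIdx));
% Bvec is indexed by the loop variable i below, so it needs one slot per
% iteration; only the diagonal (ii == jj) iterations fill it.
Bvec = zeros(length(IuppeIdx),1);
matSize = size(Amat);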
% Get the Ibarijs and Nijs
parfor i = 1:length(IuppeIdx)
    % Index to subscripts
    [ii,jj] = ind2sub(matSize, IuppeIdx(i));
    imgii = uint8(randi([0,255], 1654, 6288, 3));
    if ii == jj
        diag_val_1 = 0;
        diag_val_2 = 0;
        Z = 1:n;
        Z(Z==ii) = [];
        for d = Z
            imgd = uint8(randi([0,255], 1654, 6288, 3));
            [Ibarij, Ibarji, Nij] = getIbarNij(imgii, imgd);
            diag_val_1 = diag_val_1 + ( (Nij + Nij) .* Ibarij.^2 );
            diag_val_2 = diag_val_2 + Nij;
        end
        diag_val = diag_val_1 + (sigmaN^2/sigmag^2) * diag_val_2;
        B_val = (sigmaN^2/sigmag^2) * diag_val_2;
        Amat_temp{i} = diag_val;
        Bvec(i) = B_val;
    else
        imgjj = uint8(randi([0,255], 1654, 6288, 3));
        [Ibarij,Ibarji,Nij] = getIbarNij(imgii, imgjj);
        Amat_temp{i} = -(Nij+Nij) .* (Ibarij .* Ibarji);
    end
end
function [Ibarij,Ibarji,Nij] = getIbarNij(Imij, Imji)
    % Overlay the warpedImage onto the panorama.
    maski = imbinarize(rgb2gray(255 * Imij));
    maskj = imbinarize(rgb2gray(255 * Imji));
    % Find the overlap mask
    Nij_im = maski & maskj;
    Nij_im = imfill(Nij_im, 'holes');
    % Get the overlapping region RGB values for two images
    Imij = Imij(:);
    Imij = Imij([Nij_im(:);Nij_im(:);Nij_im(:)]);
    Imji = Imji(:);
    Imji = Imji([Nij_im(:);Nij_im(:);Nij_im(:)]);
    % Nij
    Nij = sum(sum(Nij_im));
    % Ibar ijs
    Ibarij = sum(reshape(Imij,[], 3))/Nij;
    Ibarji = sum(reshape(Imji,[], 3))/Nij;
    % Replace NaNs by zeros
    Ibarij(isnan(Ibarij)) = 0;
    Ibarji(isnan(Ibarji)) = 0;
end
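As a rough illustration of swapping the random matrices for an image-reading function, the sketch below loads images from disk inside the loop, so that only a short list of file names is broadcast to the workers. Everything about the files is an assumption made for the example: the imageFiles list, the warped_NN.png names and the tiny 40-by-60 test images are hypothetical, and the overlap count is a simple stand-in for getIbarNij that needs no toolbox. If you run it on a thread pool, check that imread supports thread-based environments in your release.

```matlab
% Hypothetical setup: write three small test images to a temporary folder
outDir = tempname;
mkdir(outDir);
numImages = 3;
imageFiles = cell(1, numImages);      % hypothetical list of image paths
for k = 1:numImages
    imageFiles{k} = fullfile(outDir, sprintf('warped_%02d.png', k));
    imwrite(uint8(randi([0,255], 40, 60, 3)), imageFiles{k});
end

% Only the list of file names is broadcast; each iteration loads just the
% two images it needs.
pairs = nchoosek(1:numImages, 2);     % all (ii, jj) pairs with ii < jj
overlapCount = zeros(size(pairs,1), 1);
parfor p = 1:size(pairs,1)
    imgii = imread(imageFiles{pairs(p,1)});
    imgjj = imread(imageFiles{pairs(p,2)});
    % Stand-in for getIbarNij: count pixels that are nonzero in both images
    overlap = any(imgii > 0, 3) & any(imgjj > 0, 3);
    overlapCount(p) = nnz(overlap);
end

% Remove the temporary test images
rmdir(outDir, 's');
```

The trade-off is that each image is read from disk more than once, in exchange for never holding the whole image stack in memory at the same time.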