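I want to plot the average of some measurements that I have in a long file format, with the trial number in the first column and the measurement in the second, like so: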
A =
Trial Number Measurement
1 0.1
1 0.5
1 0.7
1 0.3
1 0.2
2 0.2
2 0.4
2 0.5
... ...
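I want to plot a single curve that averages over all trials, so I somehow need to group by trial number, take the mean, and plot it. The problem is that the number of measurements is not always the same for each trial, and some trials are missing, so the trial numbers are not consecutive. Any ideas on how to do this?
Edit: by "averaging over all trials" I mean that I want the average of the first measurement of every trial (here: 0.15), the average of the second measurement (0.45), and so on, and then plot a curve from those averages.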
Let's outline a few different approaches that may fit your needs:
1) Using findgroups and splitapply
data = readtable('data.txt','HeaderLines',1);
data.Properties.VariableNames = {'Trials' 'Measurements'};
[G,trials] = findgroups(data.Trials);
means = splitapply(@mean,data.Measurements,G);
result = table(trials,means);
result.Properties.VariableNames = {'Trial' 'AverageMeasurement'};
bar(result.Trial,result.AverageMeasurement);
set(gca,'XTick',min(data.Trials):max(data.Trials));
2) Using unique and arrayfun
data = readtable('data.txt','HeaderLines',1);
data.Properties.VariableNames = {'Trials' 'Measurements'};
data = sortrows(data);
trials_uni = unique(data.Trials);
result = cell2mat(arrayfun(@(x)[x mean(data.Measurements(data.Trials == x))],trials_uni,'UniformOutput',false));
bar(result(:,1),result(:,2));
set(gca,'XTick',min(trials_uni):max(trials_uni));
3) Using accumarray
data = readtable('data.txt','HeaderLines',1);
data.Properties.VariableNames = {'Trials' 'Measurements'};
data = sortrows(data);
[trials_uni,~,trials_idx] = unique(data.Trials);
result = accumarray(trials_idx,data.Measurements,[],@mean);
bar(trials_uni,result);
set(gca,'XTick',min(trials_uni):max(trials_uni));
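To make the accumarray approach easier to follow, here is a minimal sketch of what the third output of unique does with non-consecutive trial numbers. The inline vectors `trials` and `vals` are made-up illustration data, not the contents of data.txt.

```matlab
% Hypothetical sample: trial numbers with gaps (3, 5 and 6 are missing).
trials = [1 1 2 4 4 7].';
vals   = [0.1 0.5 0.2 0.2 0.1 0.8].';

% u holds the distinct trial numbers; idx maps every row to its position
% in u, so idx is always a dense 1..K index even when trials has gaps.
[u,~,idx] = unique(trials);

% accumarray collects the values that share an index and applies @mean.
m = accumarray(idx, vals, [], @mean);

disp(table(u, m, 'VariableNames', {'Trial' 'AverageMeasurement'}));
```

Because accumarray works on the dense index rather than on the raw trial numbers, missing trials do not produce empty bins in the result.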
Here is the content of the data.txt file I used for testing:
Trial Number Measurement
1 0.1
1 0.5
1 0.7
1 0.3
1 0.2
2 0.2
2 0.4
2 0.5
4 0.2
4 0.1
7 0.8
7 0.4
7 0.5
7 0.4
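Here is the final output: a bar chart of the average measurement per trial, with x-axis ticks running from trial 1 to trial 7.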
If you want the cumulative mean instead, here is how to compute it:
data = readtable('data.txt','HeaderLines',1);
data.Properties.VariableNames = {'Trials' 'Measurements'};
data = sortrows(data);
cm = cumsum(data.Measurements) ./ (1:height(data)).';
plot(data.Trials,cm);
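If you want to subset the cumulative mean by trial index, you can use one of the previous approaches.
If you want a cumulative mean computed within each group, split the data by index using one of the approaches above, and then compute the cumulative mean of each group.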
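The per-group cumulative mean described above could be sketched roughly as follows. This is only an illustration: the data is typed inline to mirror data.txt instead of being read from the file, and the names `mask` and `cm` are hypothetical.

```matlab
% Inline copy of the test data (trial number, measurement).
trials = [1 1 1 1 1 2 2 2 4 4 7 7 7 7].';
meas   = [0.1 0.5 0.7 0.3 0.2 0.2 0.4 0.5 0.2 0.1 0.8 0.4 0.5 0.4].';

% Dense group index per row, as in the accumarray approach.
[trials_uni,~,trials_idx] = unique(trials);

% Running mean inside each trial: cumulative sum divided by running count.
cm = zeros(size(meas));
for k = 1:numel(trials_uni)
    mask = (trials_idx == k);                        % rows of the k-th trial
    cm(mask) = cumsum(meas(mask)) ./ (1:nnz(mask)).';
end

plot(cm, '-o');   % one point per row; the running mean restarts at each trial
```

The logical mask keeps the rows in file order, so this also works when the rows of a trial are not stored contiguously.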
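Building on ViG's answer, here is similar logic using logical indexing. Note that this answer does not require the trials to be in order (i.e., it still works if a result from an earlier trial is recorded later in the sequence).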
trialData = importdata('stack.txt');
trials = trialData.data(:,1); % trial numbers
meas = trialData.data(:,2); % measurements
uniqueTrials = unique(trials); % unique list of trials
outputMeans = NaN(length(uniqueTrials), 1); % initialize output to NaN
% take mean for each unique trial
for ii=1:length(uniqueTrials)
outputMeans(ii) = mean(meas(trials == uniqueTrials(ii)));
end
plot(uniqueTrials, outputMeans); % plot
You can do it like this:
data = importdata('stack.txt'); % import data
trails = data.data(:,1); % trial numbers (assumes the rows of each trial are contiguous)
meas = data.data(:,2); % measurements
[~,idx] = ismember(trails, trails); % index of the first row of each row's trial
trails = unique(trails); % keep only the unique trial numbers
idx = unique(idx); % keep only the start index of each trial
meass = zeros(length(idx),1); % allocate memory
for i=1:length(idx)-1
meass(i) = mean(meas(idx(i):idx(i+1)-1)); % average of each trial
end
meass(end) = mean(meas(idx(end):end)); % last trial
plot(trails,meass) % plot
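The edit in the question asks for something slightly different from a per-trial mean: the average of the first measurement of every trial, then of the second, and so on. A minimal sketch of that, assuming the rows of each trial appear in measurement order, is shown below. It uses only the two trials from the question's excerpt, and the names `pos` and `posMeans` are hypothetical.

```matlab
% Excerpt from the question: trials 1 and 2 only.
trials = [1 1 1 1 1 2 2 2].';
meas   = [0.1 0.5 0.7 0.3 0.2 0.2 0.4 0.5].';

% Dense group index per row (handles non-consecutive trial numbers).
[~,~,g] = unique(trials);

% Position of each row within its own trial: 1 for the first measurement,
% 2 for the second, and so on, following the order of the rows in the file.
pos = zeros(size(meas));
for k = 1:max(g)
    mask = (g == k);
    pos(mask) = 1:nnz(mask);
end

% Average all measurements that share a position; trials with fewer
% measurements simply do not contribute to the later positions.
posMeans = accumarray(pos, meas, [], @mean);

plot(1:numel(posMeans), posMeans, '-o');
xlabel('Measurement index within trial');
ylabel('Average across trials');
```

For this excerpt the first two averages are 0.15 and 0.45, matching the values quoted in the question; the fourth and fifth positions are based on trial 1 alone because trial 2 has only three measurements.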