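In the code below, I map data obtained by downloading objects in parallel from the AzureDevOps Rest Api. When converting mapDictionary into an Excel worksheet in a different module of the codebase, I noticed a strange bottleneck. Because the values referenced by mapDictionary are lazily loaded, a simple foreach over a few thousand objects took a very long time.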
    private Dictionary<int, IEnumerable<CodeReviewResponse>> MapToCodeReviewRequests(
        IEnumerable<WorkItemBatchResponse> workItemBatchResponses,
        WiqlResponse? workItemRelations)
    {
        Dictionary<int, IEnumerable<CodeReviewResponse>> mapDictionary =
            new Dictionary<int, IEnumerable<CodeReviewResponse>>();
        IEnumerable<int>? codeReviewRequestIds = workItemRelations?.workItemRelations
            .Where(y => y.source is not null)
            .Select(x => x.source.id)
            .Distinct();
        foreach(int codeReviewId in codeReviewRequestIds)
        {
            var codeReviewResponseIds = workItemRelations?.workItemRelations
                .Where(x => x.source is not null && x.source.id.Equals(codeReviewId))
                .Select(x => x.target.id).ToList();
            if (codeReviewResponseIds is not null && codeReviewResponseIds.Count() > 0)
            {
                mapDictionary[codeReviewId] = workItemBatchResponses
                    .SelectMany(x => x.value)
                    .Where(x => codeReviewResponseIds.Contains(x.fields.SystemId));
            }
        }
        return mapDictionary;
    }
If codeReviewResponseIds is a list of ints (as in the example), my data is not lazily loaded and iterating over these references is fast. If I store it without .ToList(), computing them takes forever. Why?
I think you may not realize that this code is in fact enumerating (foreach-ing):
    mapDictionary[codeReviewId] = workItemBatchResponses
        .SelectMany(x => x.value)
        .Where(x => codeReviewResponseIds.Contains(x.fields.SystemId));
On every iteration, we call codeReviewResponseIds.Contains(x.fields.SystemId), which is itself an enumeration.
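The problem, as mentioned in the comments, is that the Enumerable carries Where and Select operations that are executed every time it is enumerated, which in our case happens on every iteration of the "outer enumeration".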
The best way to illustrate this is with a short example:
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        // Counts how many times the predicate and the projection below are invoked.
        static int _operationsExecuted = 0;

        public static bool IsEvenOperation(int number) {
            _operationsExecuted++;
            return number % 2 == 0;
        }

        public static int MultiPlyBy10Operation(int number) {
            _operationsExecuted++;
            return number * 10;
        }

        static void Main() {
            IEnumerable<int> itemsToSearch = Enumerable.Range(1, 1000);
            // Deferred query: nothing runs here, and it re-runs on every enumeration.
            IEnumerable<int> enumerableWithOperations = Enumerable.Range(1, 10)
                .Where(IsEvenOperation)
                .Select(MultiPlyBy10Operation);

            _operationsExecuted = 0;
            // this is really foreach
            List<int> found = itemsToSearch
                .Where(x => enumerableWithOperations.Contains(x))
                .ToList();
            Console.WriteLine(_operationsExecuted); // 14970
            Console.WriteLine(found.Count); // 5 [20,40,60,80,100]

            // The same work written as an explicit loop: identical operation count.
            _operationsExecuted = 0;
            found = new List<int>();
            foreach (var element in itemsToSearch) {
                if (enumerableWithOperations.Contains(element)) {
                    found.Add(element);
                }
            }
            Console.WriteLine(_operationsExecuted); // 14970
            Console.WriteLine(found.Count); // 5

            // Materializing runs the Where/Select chain exactly once.
            _operationsExecuted = 0;
            var materialized = enumerableWithOperations.ToList();
            Console.WriteLine(_operationsExecuted); // 15
            // 10 ops for IsEven
            // 5 ops for MultiplyBy10

            // Contains on the list only scans stored values; no operations re-run.
            _operationsExecuted = 0;
            found = itemsToSearch
                .Where(x => materialized.Contains(x))
                .ToList();
            Console.WriteLine(_operationsExecuted); // 0
            Console.WriteLine(found.Count); // 5
        }
    }
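Applied to the mapping from the question, the same idea might look roughly like the sketch below. It is only an illustration under assumptions: Relation, Item, and Map are hypothetical stand-ins for the question's WiqlResponse / WorkItemBatchResponse types (which are not shown in full), and the null checks on source are left out. The point is that each dictionary value is a List that has already been computed, and the id lookup goes through a HashSet, so a later foreach over the dictionary does not re-run any Where or Select.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MaterializeSketch
{
    // Hypothetical simplified types: a relation links a source id to a
    // target id, and an item carries only its id.
    public record Relation(int SourceId, int TargetId);
    public record Item(int Id);

    public static Dictionary<int, List<Item>> Map(
        IEnumerable<Relation> relations, IEnumerable<Item> items)
    {
        // Materialize the items once so the source is not re-enumerated per key.
        List<Item> allItems = items.ToList();

        var map = new Dictionary<int, List<Item>>();

        // GroupBy replaces Distinct plus a repeated Where scan over the relations.
        foreach (IGrouping<int, int> group in
                 relations.GroupBy(r => r.SourceId, r => r.TargetId))
        {
            // A HashSet holds the ids as data, so Contains does not re-run a query.
            var targetIds = new HashSet<int>(group);

            // ToList() executes the query now; the dictionary stores results,
            // not a deferred query.
            map[group.Key] = allItems
                .Where(i => targetIds.Contains(i.Id))
                .ToList();
        }
        return map;
    }

    public static void Main()
    {
        var relations = new[]
        {
            new Relation(1, 10), new Relation(1, 11), new Relation(2, 12)
        };
        var items = new[] { new Item(10), new Item(11), new Item(12), new Item(13) };

        Dictionary<int, List<Item>> map = Map(relations, items);
        Console.WriteLine(map[1].Count); // 2
        Console.WriteLine(map[2].Count); // 1
    }
}
```

The trade-off is memory: every value is held in full instead of being recomputed on demand, which is usually what you want when the result is iterated more than once, as in the Excel export.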