I'm trying to implement large file upload via streaming, following the ASP.NET documentation. The repo linked in the docs is a bit confusing because it contains sample apps for .NET 3.x and 5.x, while the code on the docs site is from the 3.x version. I used the 5.x version as a template for my own implementation. That sample app is a pure REST controller that omits some of the file checks and does not disable model binding, because it simply doesn't use parameters in the post method (code copied directly from the official sample repo):
/// <summary>
/// Action for upload large file
/// </summary>
/// <remarks>
/// Request to this action will not trigger any model binding or model validation,
/// because this is a no-argument action
/// </remarks>
/// <returns></returns>
[HttpPost]
[Route(nameof(UploadLargeFile))]
public async Task<IActionResult> UploadLargeFile()
{
    var request = HttpContext.Request;

    // validation of Content-Type
    // 1. first, it must be a form-data request
    // 2. a boundary should be found in the Content-Type
    if (!request.HasFormContentType ||
        !MediaTypeHeaderValue.TryParse(request.ContentType, out var mediaTypeHeader) ||
        string.IsNullOrEmpty(mediaTypeHeader.Boundary.Value))
    {
        return new UnsupportedMediaTypeResult();
    }

    var reader = new MultipartReader(mediaTypeHeader.Boundary.Value, request.Body);
    var section = await reader.ReadNextSectionAsync();

    // This sample try to get the first file from request and save it
    // Make changes according to your needs in actual use
    while (section != null)
    {
        var hasContentDispositionHeader = ContentDispositionHeaderValue.TryParse(section.ContentDisposition,
            out var contentDisposition);

        if (hasContentDispositionHeader && contentDisposition.DispositionType.Equals("form-data") &&
            !string.IsNullOrEmpty(contentDisposition.FileName.Value))
        {
            // Don't trust any file name, file extension, and file data from the request unless you trust them completely
            // Otherwise, it is very likely to cause problems such as virus uploading, disk filling, etc
            // In short, it is necessary to restrict and verify the upload
            // Here, we just use the temporary folder and a random file name

            // Get the temporary folder, and combine a random file name with it
            var fileName = Path.GetRandomFileName();
            var saveToPath = Path.Combine(Path.GetTempPath(), fileName);

            using (var targetStream = System.IO.File.Create(saveToPath))
            {
                await section.Body.CopyToAsync(targetStream);
            }

            return Ok();
        }

        section = await reader.ReadNextSectionAsync();
    }

    // If the code runs to this location, it means that no files have been saved
    return BadRequest("No files data in the request.");
}
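For endpoints that do take route parameters (like the one further below), the docs sample pairs the action with a resource filter that disables form model binding, so the framework never buffers the form body before the action runs. A sketch along the lines of the official sample (factory names vary slightly between versions; treat the exact list as an assumption):

```csharp
using System;
using Microsoft.AspNetCore.Mvc.Filters;
using Microsoft.AspNetCore.Mvc.ModelBinding;

// Removes the value provider factories that would otherwise read the whole
// form body during model binding, leaving the request stream untouched for
// the MultipartReader inside the action.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class DisableFormValueModelBindingAttribute : Attribute, IResourceFilter
{
    public void OnResourceExecuting(ResourceExecutingContext context)
    {
        context.ValueProviderFactories.RemoveType<FormValueProviderFactory>();
        context.ValueProviderFactories.RemoveType<FormFileValueProviderFactory>();
    }

    public void OnResourceExecuted(ResourceExecutedContext context)
    {
    }
}
```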
In most cases my implementation above works fine if I store the received stream to a file. However, 1. I want to do some file checks, and 2. I don't want to store to disk, but upload directly to MinIO S3 storage, which needs to know the object size before uploading.
The received upload stream `section.Body` appears to be a read-only stream whose Length is always 0. It is not seekable, so I don't know how to get more information about the content.
Looking at the 3.x sample, I see that they copy the received stream into a `MemoryStream` to get the file information (this is done in Utilities/FileHelper.cs), but IMHO that defeats the purpose of streaming, because we are now filling up server memory anyway. I'd like to avoid that if possible.
I suppose I could also request a presigned POST URL from MinIO, upload the stream directly and let MinIO handle file size limits and so on, but I'd like the REST server to control these things, and I need some of the information on the server immediately anyway.
The question is whether, and how, I can get the stream size without copying the contained data into memory.
I have a working solution, which I'll share below. The code contains some classes specific to my application; I won't add their code, but their purpose should be clear from the naming, and the code should show how the problem above is solved.
On the client (Blazor WebAssembly), we add the content length to the headers of the `StreamContent`. Here are the important parts of the code:
private async Task OnUploadFiles()
{
    List<Task> uploadTasks = new();
    using var content = new MultipartFormDataContent();

    // _filesToUpload is a List of wrappers around IBrowserFile
    foreach (var file in _filesToUpload)
    {
        if (file.Validity != "valid")
            continue;

        Stream fileStream = file.BrowserFile.OpenReadStream(file.BrowserFile.Size);
        var fileContent = new StreamContent(fileStream);
        fileContent.Headers.ContentType = new MediaTypeHeaderValue(file.BrowserFile.ContentType);
        // here we add the ContentLength header!
        fileContent.Headers.ContentLength = fileStream.Length;
        content.Add(fileContent, file.BrowserFile.Name);
        file.ProcessingState = "uploading";

        // This is not strictly needed: the stream position is used to report
        // upload progress. This runs asynchronously in separate tasks.
        uploadTasks.Add(Task.Run(async () =>
        {
            while (fileStream.Position < file.BrowserFile.Size)
            {
                file.UploadProgress = (int)(fileStream.Position * 100 / file.BrowserFile.Size);
                await Task.Delay(500);
                // marshal the re-render back onto the renderer's sync context
                await InvokeAsync(StateHasChanged);
            }
            file.ProcessingState = "finished";
            file.UploadProgress = 100;
        }));
    }

    var uploadTask = ApiClient.PostAsync($"File/{_selectedDataset.ID}", content);
    uploadTasks.Add(uploadTask);
    await Task.WhenAll(uploadTasks);
    var response = await uploadTask;

    // TODO some more checks if everything went smoothly
    if (response.IsSuccessStatusCode)
    {
        var addedFiles = await response.Content.ReadFromJsonAsync<IEnumerable<FileModel>>();
        if (addedFiles is not null)
        {
            foreach (var addedFile in addedFiles)
                _datasetFiles.Add(addedFile);
        }
    }

    // cleanup
    _filesToUpload.Clear();
    await _jsFileDropperObject.InvokeVoidAsync("clearDataTransferItems");
    _selectedDataset.FileCount = _datasetFiles.Count();
    _selectedDataset.Size = _datasetFiles.Sum(dsf => dsf.FileSize);
}
On the server side, we have this endpoint code:
[HttpPost("{datasetID}")]
[DisableFormValueModelBinding]
public async Task<ActionResult<List<FileModel>>> AddFiles(string datasetID)
{
    var claimID = _configuration.GetValue<string>("ClaimTypesObjectIdentifier");
    if (string.IsNullOrWhiteSpace(claimID))
        return Unauthorized("Unable to identify user!");
    var userID = User.FindFirstValue(claimID);
    var userName = User.Identity?.Name;
    if (string.IsNullOrWhiteSpace(userID) || string.IsNullOrWhiteSpace(userName))
    {
        return Unauthorized("Unable to identify user!");
    }

    var dbDataset = await _datasetAccess.GetDataset(datasetID);
    if (dbDataset is null)
        return BadRequest("Invalid DatasetID!");
    if (dbDataset.UserID != userID && !dbDataset.CanPublicEdit)
        return Unauthorized("User is not allowed to edit this dataset!");

    var request = HttpContext.Request;
    if (!request.HasFormContentType ||
        !MediaTypeHeaderValue.TryParse(request.ContentType, out var mediaTypeHeader) ||
        string.IsNullOrEmpty(mediaTypeHeader.Boundary.Value))
    {
        return new UnsupportedMediaTypeResult();
    }

    var reader = new MultipartReader(mediaTypeHeader.Boundary.Value, request.Body);
    var section = await reader.ReadNextSectionAsync();
    List<FileModel> files = new();

    while (section != null)
    {
        var hasContentDispositionHeader = ContentDispositionHeaderValue.TryParse(section.ContentDisposition,
            out var contentDisposition);
        if (hasContentDispositionHeader &&
            contentDisposition is not null &&
            contentDisposition.DispositionType.Equals("form-data") &&
            !string.IsNullOrEmpty(contentDisposition.Name.Value))
        {
            if (section.Headers is not null &&
                section.Headers.TryGetValue("Content-Length", out var contentLength) &&
                !string.IsNullOrEmpty(contentLength))
            {
                if (long.TryParse(contentLength, out long fileSize))
                {
                    string fileName = contentDisposition.Name.Value;
                    string fileExtension = Path.GetExtension(fileName);
                    string validity = FileUtilities.CheckUploadFileValidity(
                        _fileUploadChecksConfigMonitor.CurrentValue,
                        fileExtension,
                        fileSize,
                        dbDataset.DatasetType);
                    if (validity.Equals("valid") && !string.IsNullOrWhiteSpace(section.ContentType))
                    {
                        DbFileModel dbFile = new()
                        {
                            DatasetID = dbDataset.ID,
                            SecureObjectName = Path.GetRandomFileName(),
                            FileName = fileName,
                            FileExtension = fileExtension,
                            UploadDate = DateTime.UtcNow,
                            FileSize = fileSize
                        };
                        await _minioStorage.StoreObject(
                            section.Body,
                            dbDataset.UserID,
                            dbFile.DatasetID,
                            dbFile.SecureObjectName,
                            section.ContentType,
                            fileSize
                        );
                        await _fileAccess.InsertFile(dbFile);
                        files.Add(new FileModel()
                        {
                            Name = dbFile.FileName,
                            UserID = dbDataset.UserID,
                            DatasetID = dbDataset.ID,
                            FileExtension = fileExtension,
                            FileSize = dbFile.FileSize,
                            UploadDate = dbFile.UploadDate,
                            Url = await _minioStorage.GetObjectUrl(
                                dbDataset.UserID,
                                dbFile.DatasetID,
                                dbFile.SecureObjectName
                            )
                        });
                    }
                }
            }
        }
        section = await reader.ReadNextSectionAsync();
    }

    return Ok(files);
}
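For reference, the point of passing `fileSize` through to `_minioStorage.StoreObject` is that the MinIO .NET SDK can stream a non-seekable source as long as the object size is supplied up front. A minimal sketch of what such a wrapper might look like, assuming the official `Minio` NuGet package (the bucket/object naming and the wrapper signature here are illustrative, not the actual application code):

```csharp
using System.IO;
using System.Threading.Tasks;
using Minio;

public class MinioStorage
{
    private readonly IMinioClient _client; // older SDK versions use MinioClient directly

    public MinioStorage(IMinioClient client) => _client = client;

    // Streams the multipart section body straight to MinIO without buffering:
    // PutObjectAsync only needs the size up front, not a seekable stream.
    public async Task StoreObject(Stream data, string bucket, string objectName,
                                  string contentType, long size)
    {
        await _client.PutObjectAsync(new PutObjectArgs()
            .WithBucket(bucket)
            .WithObject(objectName)
            .WithStreamData(data)
            .WithObjectSize(size)
            .WithContentType(contentType));
    }
}
```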
The code could probably use some refactoring to make it more readable, and I still need to test it with multiple large files (which will probably require some Kestrel configuration), but as far as I've been able to test, this seems to work fine!
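On the Kestrel configuration mentioned above: the default request body limit is roughly 30 MB, so large uploads need it raised either globally or per endpoint. A sketch of the usual options (the 4 GB cap is an arbitrary example value):

```csharp
// Option 1: raise the global limit when configuring Kestrel
webBuilder.ConfigureKestrel(options =>
{
    // null disables the limit entirely; prefer an explicit cap in production
    options.Limits.MaxRequestBodySize = 4_000_000_000; // ~4 GB
});

// Option 2: override per action/controller with attributes instead:
// [RequestSizeLimit(4_000_000_000)]
// or [DisableRequestSizeLimit]
```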
Readers might also consider the comment by user eocron above, which suggests splitting files into smaller objects and storing them on the S3 server separately from a metadata object.