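I have the URL of a CSV file. The file is about 300 KB, with 2,700 rows and 15 columns.
I tried several approaches in Python and C#, but each one ends with the exception
"Remote end closed connection without response".
Things I have tried: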
Python:
import pandas as pd
import numpy as np
import os
# Download CSV with read_csv
df = pd.read_csv('https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv', low_memory=False)
Python again:
import urllib.request
url = 'https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv'
filename = 'large_file.csv'
def download_large_file(url, filename):
    with urllib.request.urlopen(url) as response, open(filename, 'wb') as out_file:
        while True:
            chunk = response.read(8192)  # Download in 8KB chunks
            if not chunk:
                break
            out_file.write(chunk)
download_large_file(url, filename)
print("File downloaded successfully!")
C#
using System.Net;
WebClient webClient = new WebClient();
// DownloadFile takes the address and a local file name
webClient.DownloadFile("URL", "large_file.csv");
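The problem is that the server expects cookies from an earlier request before it will serve the file. Here is a complete C# program that obtains them first and then downloads the file: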
using System.Diagnostics;
using System.Net;
using System.Net.Http.Headers;
var cookieContainer = new CookieContainer();
HttpClientHandler handler = new HttpClientHandler()
{
    AllowAutoRedirect = true,
    UseCookies = true,
    CookieContainer = cookieContainer,
    UseDefaultCredentials = true,
};
HttpClient client = new(handler);
client.DefaultRequestHeaders.Accept.Clear();
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/html"));
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/csv"));
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("*/*"));
client.DefaultRequestHeaders.AcceptEncoding.Add(new("gzip"));
// first visit the main page to obtain the required cookies
Console.Write("visiting the main page to obtain cookies...");
var MainPage = new Uri(@"https://www.nseindia.com");
var mainPageRes = await client.GetAsync(MainPage);
if (!mainPageRes.IsSuccessStatusCode)
{
    Console.WriteLine("Failed!");
    Console.WriteLine("can't obtain cookies from the main page");
    Console.WriteLine("status code: " + mainPageRes.StatusCode);
    return;
}
Console.WriteLine("done.");
Console.Write("start to download csv file ....");
var csvUri = new Uri(@"https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv");
var response = await client.GetAsync(csvUri);
if (!response.IsSuccessStatusCode)
{
    Console.WriteLine("Failed!");
    Console.WriteLine("Can't download the file");
    Console.WriteLine("status code: " + response.StatusCode);
    Console.WriteLine(response.Headers);
    return;
}
Console.WriteLine("done.");
var filename = "sec_bhavdata_full_17072024.csv";
using var contentStream = await response.Content.ReadAsStreamAsync();
using var stream = new FileStream(filename, FileMode.Create, FileAccess.Write);
Console.Write("start to save content to file....");
await contentStream.CopyToAsync(stream);
Console.WriteLine("done");
try
{
    // On Windows, open the current folder so the saved file is easy to find
    if (OperatingSystem.IsWindows())
    {
        Process.Start("explorer.exe", ".");
    }
}
finally
{
    Console.WriteLine($"file name is {filename}");
}
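Since the question also tried Python, the same idea (visit the main page first, then request the file with whatever cookies it set) can be sketched with only the standard library. This is a minimal sketch, not a tested fix for this particular site: the helper names make_opener and download_with_cookies are made up for illustration, and the User-Agent value is an assumption, since the server may also reject requests that do not look like they come from a browser.

```python
import http.cookiejar
import shutil
import urllib.request

# URLs taken from the question and the C# program above.
MAIN_PAGE = "https://www.nseindia.com"
CSV_URL = "https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv"


def make_opener():
    """Build an opener that stores cookies and replays them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Assumption: a browser-like User-Agent may be required; the value is illustrative.
    opener.addheaders = [
        ("User-Agent", "Mozilla/5.0"),
        ("Accept", "text/html,text/csv,*/*"),
    ]
    return opener, jar


def download_with_cookies(opener, main_page, file_url, filename, timeout=30):
    """Visit main_page first so the cookie jar is filled, then save file_url to filename."""
    with opener.open(main_page, timeout=timeout) as resp:
        resp.read()  # the body is not needed, only the Set-Cookie headers
    with opener.open(file_url, timeout=timeout) as resp, open(filename, "wb") as out_file:
        shutil.copyfileobj(resp, out_file)  # streams the response in chunks
    return filename
```

To try it, call make_opener() once and pass the opener to download_with_cookies(opener, MAIN_PAGE, CSV_URL, "sec_bhavdata_full_17072024.csv"). The saved file can then be loaded with pd.read_csv as in the question. Reusing one opener for both requests is the point of the design: a plain urllib.request.urlopen call starts with no cookies each time.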