Reading web content returns a "JavaScript is disabled" message

Question (votes: -1, answers: 1)

I wrote the following code to read the content of a web page:

using System.Net.Http;
using System.Windows.Forms;

string url = "https://hackerone.com/directory?asset_type=URL&order_direction=DESC&order_field=started_accepting_at";
HttpClient httpclient = new HttpClient();
var html = httpclient.GetStringAsync(url); // a Task<string>; reading .Result blocks until the download completes
MessageBox.Show(html.Result); // the response body contains "JavaScript is disabled in your browser"

The problem is that html.Result contains the text "JavaScript is disabled", so someone suggested changing the URL to the following:

http://service.prerender.io/https://hackerone.com/directory?asset_type=URL&order_direction=DESC&order_field=started_accepting_at

But that did not work. Any ideas?

Edit: it works with the code below, but it is very slow (about 6 seconds)!

        using System.IO;
        using System.Net;

        string html = string.Empty;
        string url = "https://hackerone.com/directory?asset_type=URL&order_direction=DESC&order_field=started_accepting_at";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        // Send the request synchronously and read the whole response body as text.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            html = reader.ReadToEnd();
        }
c# html http web-scraping
1 Answer
0 votes

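You have to use a headless browser (for example via Selenium or Splash), which can run the site's scripts and give you the fully rendered page. See this question and its answers about headless browsers for C#: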

Headless browser for C# (.NET)?

There is also a list on GitHub:

https://github.com/dhamaniasad/HeadlessBrowsers
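
As a rough illustration of the headless-browser approach, here is a minimal sketch using Selenium with headless Chrome. It assumes the Selenium.WebDriver NuGet package (a recent 4.x release that can locate a driver by itself) and a local Chrome installation; neither is mentioned in the question, and the class name RenderedPageFetcher is made up for this example.

```csharp
// Sketch only: assumes the Selenium.WebDriver NuGet package and a local Chrome install.
using System;
using OpenQA.Selenium.Chrome;

class RenderedPageFetcher // hypothetical name, for illustration
{
    static void Main()
    {
        string url = "https://hackerone.com/directory?asset_type=URL&order_direction=DESC&order_field=started_accepting_at";

        var options = new ChromeOptions();
        options.AddArgument("--headless=new"); // run Chrome without a visible window

        // Disposing the driver closes the browser process.
        using (var driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);  // returns once the initial document has loaded
            string html = driver.PageSource; // DOM serialized after the scripts that have run so far
            Console.WriteLine(html.Length);
        }
    }
}
```

If the page fills in its content through requests made after the initial load, PageSource may be read too early; in that case an explicit wait for a known element would probably be needed before reading it. Starting a browser is also unlikely to be faster than the plain HTTP request from the question.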
