如何使用VBA查询选择器从该网站提取数据

问题描述 投票:0回答:1

我需要从这个公开且无需任何登录的网站的html代码中提取“href”:

(要通过 xmlhttp 访问页面,需要一个 REFERER,即以 REFSTRING 形式发布)

公共网站

我的所有代码都失败了,所以我做了以下尝试:

Function CheckIfAccess()

Dim html As MSHTML.HTMLDocument, xhr As Object, Headers As Variant
URL As String, RefString As String, CountItems As Long

Set html = New MSHTML.HTMLDocument
Set xhr = CreateObject("MSXML2.ServerXMLHTTP.6.0")

URL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=5972&land_abk=br"
RefString = "https://www.zvg-portal.de/index.php?button=Suchen"
    
With xhr
    .Open "POST", URL, False
    .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    .send RefString
    html.body.innerHTML = .responseText
End With

CountItems = html.querySelectorAll("a").length

MsgBox (CountItems)

End Function

Messegabox 总是显示 0,所以代码甚至无法确定 'a' 的数量。 问题是这段代码在过去运行得非常完美:

GetHref = html.querySelectorAll("a").Item(x).innerText

...但现在突然不再了!

HTML代码是否有阻塞或者我的代码需要调整吗?

谢谢!

html vba queryselector referer msxml2
1个回答
0
投票

变化:

  • 使用
    Referer
    访问带有
    WinHttpRequest
    的网站。
  • 通过使用
    responseBody
    解码
    ADODB.Stream
    将字节数组转换为文本。

在VBE中添加两个引用

  • Microsoft HTML 对象库
  • Microsoft WinHTTP 服务,版本 5.1
Option Explicit
Sub CheckIfAccess()
    Dim htmlDoc As MSHTML.HTMLDocument, oWHR As Object, htmlText As String
    Dim sURL As String, RefString As String, CountItems As Long
    Set htmlDoc = New MSHTML.HTMLDocument
    Set oWHR = New WinHttp.WinHttpRequest
    sURL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=13592&land_abk=be"
    RefString = "https://www.zvg-portal.de/index.php?button=Suchen"
    With oWHR
        .Open "GET", sURL, False
'        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
'        .setRequestHeader "User-Agent", "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Mobile Safari/537.36"
        .setRequestHeader "Referer", RefString
        .send
        If .Status = 200 Then
            htmlText = BytesToString(.responseBody)
    '        Debug.Print htmlText
            htmlDoc.body.innerHTML = htmlText
        End If
    End With
    CountItems = htmlDoc.querySelectorAll("a").Length
    Debug.Print (CountItems)
End Sub
Function BytesToString(byteArray) As String
    Dim stream As Object
    Set stream = CreateObject("ADODB.Stream")
    With stream
        .Type = 1
        .Open
        .Write byteArray
        .Position = 0
        .Type = 2
        .Charset = "utf-8"
        BytesToString = .ReadText
        .Close
    End With
    Set stream = Nothing
End Function
© www.soinside.com 2019 - 2024. All rights reserved.