我需要从这个公开且无需任何登录的网站的html代码中提取“href”:
(要通过 xmlhttp 访问页面,需要一个 REFERER,即以 REFSTRING 形式发布)
我的所有代码都失败了,所以我做了以下尝试:
Function CheckIfAccess()
Dim html As MSHTML.HTMLDocument, xhr As Object, Headers As Variant
URL As String, RefString As String, CountItems As Long
Set html = New MSHTML.HTMLDocument
Set xhr = CreateObject("MSXML2.ServerXMLHTTP.6.0")
URL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=5972&land_abk=br"
RefString = "https://www.zvg-portal.de/index.php?button=Suchen"
With xhr
.Open "POST", URL, False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send RefString
html.body.innerHTML = .responseText
End With
CountItems = html.querySelectorAll("a").length
MsgBox (CountItems)
End Function
Messegabox 总是显示 0,所以代码甚至无法确定 'a' 的数量。 问题是这段代码在过去运行得非常完美:
GetHref = html.querySelectorAll("a").Item(x).innerText
...但现在突然不再了!
HTML代码是否有阻塞或者我的代码需要调整吗?
谢谢!
变化:
Referer
访问带有 WinHttpRequest
的网站。responseBody
解码 ADODB.Stream
将字节数组转换为文本。在VBE中添加两个引用
Option Explicit
Sub CheckIfAccess()
Dim htmlDoc As MSHTML.HTMLDocument, oWHR As Object, htmlText As String
Dim sURL As String, RefString As String, CountItems As Long
Set htmlDoc = New MSHTML.HTMLDocument
Set oWHR = New WinHttp.WinHttpRequest
sURL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=13592&land_abk=be"
RefString = "https://www.zvg-portal.de/index.php?button=Suchen"
With oWHR
.Open "GET", sURL, False
' .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
' .setRequestHeader "User-Agent", "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Mobile Safari/537.36"
.setRequestHeader "Referer", RefString
.send
If .Status = 200 Then
htmlText = BytesToString(.responseBody)
' Debug.Print htmlText
htmlDoc.body.innerHTML = htmlText
End If
End With
CountItems = htmlDoc.querySelectorAll("a").Length
Debug.Print (CountItems)
End Sub
Function BytesToString(byteArray) As String
Dim stream As Object
Set stream = CreateObject("ADODB.Stream")
With stream
.Type = 1
.Open
.Write byteArray
.Position = 0
.Type = 2
.Charset = "utf-8"
BytesToString = .ReadText
.Close
End With
Set stream = Nothing
End Function