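I am trying to scrape a website using XMLHTTP and VBA. The URL contains a hash (#) symbol, which seems to be misinterpreted by VBA. This is the same problem as the one explained here: Error in XMLHTTP Get request with Special character in URL. The code is as follows: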
Sub webscraping()
    Dim x As String
    Dim req As MSXML2.XMLHTTP60
    Dim doc As HTMLDocument
    x = "https://search.gleif.org/#/search/"
    Set req = New MSXML2.XMLHTTP60
    With req
        .Open "GET", x, False
        .send
        If .Status <> 200 Then
            MsgBox "Http Request Error"
            Exit Sub
        End If
        Set doc = New MSHTML.HTMLDocument
        doc.body.innerHTML = .responseText
    End With
End Sub
Thanks in advance.
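You can do this by calling the API directly with the search term, as shown below. Everything after the # in a URL is a fragment, which is never sent to the server, so an XMLHTTP request to the search page does not carry the search itself. I am using a JSON parser (the JsonConverter module) to handle the JSON response. After adding the .bas file from the project link to your project, also go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime.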
Note that you can change the query string parameters here: "&page%5Bnumber%5D=1&page%5Bsize%5D=15". Currently this requests page 1 with 15 records per page.
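To make the paging parameters easier to vary, the URL could be assembled by a small helper. This is only a sketch: the function name BuildLeiSearchUrl and its optional arguments are hypothetical and not part of the GLEIF API or of the answer's code. It reuses the endpoint and query string shown in this answer, and it assumes Application.EncodeURL is available (to my knowledge, Excel 2013 or later).

```vba
Option Explicit

' Hypothetical helper (not part of any library): builds the lei-records
' search URL for a given term, page number and page size.
Public Function BuildLeiSearchUrl(ByVal term As String, _
                                  Optional ByVal pageNumber As Long = 1, _
                                  Optional ByVal pageSize As Long = 15) As String
    Const BASE_URL As String = "https://api.gleif.org/api/v1/lei-records"

    ' %5B and %5D are the percent-encoded forms of [ and ],
    ' so the parameters read filter[fulltext], page[number], page[size].
    ' Assumption: Application.EncodeURL exists in the host Excel version.
    BuildLeiSearchUrl = BASE_URL & _
        "?filter%5Bfulltext%5D=" & Application.EncodeURL(term) & _
        "&page%5Bnumber%5D=" & CStr(pageNumber) & _
        "&page%5Bsize%5D=" & CStr(pageSize)
End Function
```

For example, Debug.Print BuildLeiSearchUrl("banana corp.", 2, 50) would print the URL for the second page with 50 records per page, assuming the API accepts that page size.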
VBA:
Option Explicit

' Requires references to Microsoft XML, v6.0 and Microsoft Scripting Runtime,
' plus the JsonConverter .bas module imported into the project.
Public Sub webscraping()
    Dim url As String
    Dim req As MSXML2.XMLHTTP60
    Dim json As Object
    Const TERM As String = "banana corp."

    ' Query the API directly; the search term is URL-encoded
    url = "https://api.gleif.org/api/v1/lei-records?filter%5Bfulltext%5D=" & Application.EncodeURL(TERM) & "&page%5Bnumber%5D=1&page%5Bsize%5D=15"

    Set req = New MSXML2.XMLHTTP60
    With req
        .Open "GET", url, False   ' False = synchronous request
        .send
        If .Status <> 200 Then
            MsgBox "Http Request Error"
            Exit Sub
        End If
        Set json = JsonConverter.ParseJson(.responseText)
    End With

    Stop   ' Pause here to inspect json in the Locals window
    'Do something with json
End Sub
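As one possible way to fill in the "Do something with json" step, the parsed result can be walked as nested dictionaries and collections. This is a sketch under assumptions: the sub name PrintLeiRecords is hypothetical, and the keys "data", "attributes", "lei", "entity", "legalName" and "name" reflect my understanding of the response layout, so check them against an actual response before relying on them.

```vba
Option Explicit

' Hypothetical helper: prints the LEI and legal name of each returned record.
' JsonConverter parses JSON objects into Scripting.Dictionary instances
' and JSON arrays into Collections.
Public Sub PrintLeiRecords(ByVal json As Object)
    Dim record As Object

    ' Assumption: the top-level "data" key holds an array of records
    For Each record In json("data")
        ' Assumption: each record has attributes.lei and
        ' attributes.entity.legalName.name
        Debug.Print record("attributes")("lei"), _
                    record("attributes")("entity")("legalName")("name")
    Next record
End Sub
```

It could be called as PrintLeiRecords json in place of the Stop line. If a key is missing from a record, the nested lookup will raise a runtime error, so a more defensive version would test each level with the dictionary's Exists method first.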
A lot of information is returned for each record. You can explore the API here:
https://www.gleif.org/en/lei-data/gleif-lei-look-up-api/access-the-api