Importing/scraping a website into Excel

Question

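I'm trying to scrape some data from a database, and I have most of it set up. I look for the IE tab that is logged in to the database and then, via VBA, navigate it to the query link. But how do I extract the data that the IE tab returns and put it into an Excel cell or an array?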

Here is my code for opening the query:

Sub import()
    Dim row As Integer
    Dim strTargetFile As String
    Dim ie As Object

    ' Fills Cultivar_Array (not shown here)
    Call Fill_Array_Cultivar

    For row = 3 To 4

        ' One sheet per cultivar name
        Sheets.Add.Name = Cultivar_Array(row, 1)
        strTargetFile = "https://www3.wipo.int/pluto/user/jsp/select.jsp?fl=app_date%2Cden_info%2Cden_final&hl=false&json.nl=map&wt=json&type=upov&start=0&qi=3-nNCXQ6etEVv184O9nnd5yg%3D%3D&q=cc%3AIT%20AND%20latin_name%3A(zea%20mays)%20AND%20den_info%3A" & Trim(Cultivar_Array(row, 1)) & "&facet=false"

        ' Reuse the IE tab that is already logged in (GetIE appends the trailing wildcard itself)
        Set ie = GetIE("https://www3.wipo.int")
        If Not ie Is Nothing Then
            ie.navigate strTargetFile
        Else
            MsgBox "IE not found!"
        End If
    Next row

End Sub

And here is the helper function it uses:

'Find an IE window with a matching (partial) URL
'Assumes no frames.
Function GetIE(sAddress As String) As Object

    Dim objShell As Object, objShellWindows As Object, o As Object
    Dim retVal As Object, sURL As String

    Set retVal = Nothing
    Set objShell = CreateObject("Shell.Application")
    Set objShellWindows = objShell.Windows

    'see if IE is already open
    For Each o In objShellWindows
        sURL = ""
        'non-browser shell windows may not expose a document, so ignore errors here
        On Error Resume Next
        sURL = o.document.Location
        On Error GoTo 0
        If sURL <> "" Then
            If sURL Like sAddress & "*" Then
                Set retVal = o
                Exit For
            End If
        End If
    Next o

    Set GetIE = retVal
End Function
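
Since navigation in the window returned by GetIE is asynchronous, the page text is only reliable once loading has finished. A minimal sketch of a wait helper with a timeout follows; WaitForPage and its timeoutSeconds parameter are hypothetical names, not part of the original code, and the sketch assumes the object exposes the Busy and ReadyState properties of the InternetExplorer automation object.

```vba
Option Explicit

' Hypothetical helper: waits until the browser reports it is idle and fully
' loaded, or until timeoutSeconds have elapsed.
' Returns True if the page finished loading, False on timeout.
Function WaitForPage(ByVal browser As Object, _
                     Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Dim startTime As Double
    Dim elapsed As Double

    startTime = Timer

    ' 4 = READYSTATE_COMPLETE
    Do While browser.Busy Or browser.ReadyState <> 4
        DoEvents

        elapsed = Timer - startTime
        ' Timer restarts at midnight, so correct a negative difference
        If elapsed < 0 Then elapsed = elapsed + 86400

        If elapsed > timeoutSeconds Then Exit Function  ' returns False
    Loop

    WaitForPage = True
End Function
```

It could be called right after ie.navigate, for example `If WaitForPage(ie, 20) Then ...`, so that a page that never finishes loading does not hang the macro.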

What the site returns is a white page with a single line of text. Here is an example:

 {"response":{"start":0,"docs":[{"den_final":"Abacus","app_date":"1998-01-13T22:59:59Z"}],"numFound":1},"qi":"3-nNCXQ6etEVv184O9nnd5yg==","sv":"bswa2.wipo.int","lastUpdated":1436333633993}

PS. I also tried the importxml function. It does import the site, but only an error page, because it does not recognize that I am logged in.

excel vba web-scraping internet-explorer
1 Answer

I found the solution; it is fairly simple but was hard to find. I can read ie.Document.body.innertext, which contains all the text I need. See my updated code below:

Sub import()
    Dim row As Integer
    Dim strTargetFile As String
    Dim ie As Object
    Dim pageText As String
    Dim ws As Worksheet

    ' Fills Cultivar_Array (not shown here)
    Call Fill_Array_Cultivar

    For row = 3 To 4

        ' Keep a reference to the new sheet instead of relying on ActiveSheet
        Set ws = Worksheets.Add
        ws.Name = Cultivar_Array(row, 1)
        strTargetFile = "https://www3.wipo.int/pluto/user/jsp/select.jsp?fl=app_date%2Cden_info%2Cden_final&hl=false&json.nl=map&wt=json&type=upov&start=0&qi=3-nNCXQ6etEVv184O9nnd5yg%3D%3D&q=cc%3AIT%20AND%20latin_name%3A(zea%20mays)%20AND%20den_info%3A" & Trim(Cultivar_Array(row, 1)) & "&facet=false"

        ' Reuse the IE tab that is already logged in (GetIE appends the trailing wildcard itself)
        Set ie = GetIE("https://www3.wipo.int")
        If Not ie Is Nothing Then
            ie.navigate strTargetFile

            ' Wait until loading has finished (4 = READYSTATE_COMPLETE).
            ' Checking Busy as well helps avoid reading the previous page.
            Do While ie.Busy Or ie.ReadyState <> 4
                DoEvents
            Loop

            ' The response is the entire body text of the page
            pageText = ie.Document.body.innerText
            ws.Cells(1, 1).Value = pageText
        Else
            MsgBox "IE not found!"
        End If
    Next row
End Sub
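
The code above writes the whole response text into one cell. If individual fields such as den_final or app_date are wanted in separate cells, one option is plain string searching. The sketch below is not a real JSON parser: ExtractJsonString is a hypothetical helper that assumes simple "key":"value" pairs with no escaped quotes, and it returns only the first match, so a response with several entries in docs would need a loop or a proper JSON library.

```vba
Option Explicit

' Hypothetical helper: returns the string value that follows "key":" in
' jsonText, or "" if the key is not found.
' Assumes simple "key":"value" pairs without escaped quotes.
Function ExtractJsonString(ByVal jsonText As String, ByVal key As String) As String
    Dim marker As String
    Dim startPos As Long
    Dim endPos As Long

    ' Build the search marker, e.g. "den_final":"
    marker = """" & key & """:"""
    startPos = InStr(1, jsonText, marker, vbBinaryCompare)
    If startPos = 0 Then Exit Function

    ' The value starts right after the marker and ends at the next quote
    startPos = startPos + Len(marker)
    endPos = InStr(startPos, jsonText, """", vbBinaryCompare)
    If endPos = 0 Then Exit Function

    ExtractJsonString = Mid$(jsonText, startPos, endPos - startPos)
End Function

Sub DemoExtract()
    Dim sample As String

    ' Shortened copy of the example response from the question
    sample = "{""response"":{""start"":0,""docs"":[{""den_final"":""Abacus""," & _
             """app_date"":""1998-01-13T22:59:59Z""}],""numFound"":1}}"

    Debug.Print ExtractJsonString(sample, "den_final")   ' prints Abacus
    Debug.Print ExtractJsonString(sample, "app_date")    ' prints 1998-01-13T22:59:59Z
End Sub
```

In the loop above, the same helper could be applied to pageText, for example writing ExtractJsonString(pageText, "den_final") to one cell and ExtractJsonString(pageText, "app_date") to the next.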