Using IE from Excel VBA to fetch data

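I need to fetch data from a website. The page contains the following tag, and I need to extract its href attribute.

<link rel='canonical' href='http://www.wingatecinci.com'>

To do this, I wrote the following code to extract the href attribute into Excel: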

    Option Explicit
    Sub Tester()
    Dim IE As New InternetExplorer
    Dim i As Long
    Dim Cano As String

    Range("A1").Value = "Cano"

    Set IE = New InternetExplorer
    URL = "http://www.wingatecinci.com/"
    IE.navigate Url
    IE.Visible = True

    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Cano = IE.document.getElementsByTagName("canonical")(i).innerHTML
    Range("A" & i + 2).Value = Cano
    End Sub

It does not work, though, and I get an error (see this screenshot: http://prntscr.com/kpy9dh). Could anyone take a look and help me?

excel vba internet-explorer web-scraping
2 Answers
0 votes

These two lines use i as a variable:

Cano = IE.document.getElementsByTagName("canonical")(i).innerHTML
Range("A" & i + 2).Value = Cano

But you never assign a value to i anywhere in your code, so it stays at its default of 0. Did you mean to put this in a loop?


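The "tag" here is <link>. "canonical" is the value of that tag's rel attribute, not a tag name, so your code needs to go one step further and test each <link> element for it: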

For Each el In IE.document.getElementsByTagName("link")
    If LCase$(el.rel) = "canonical" Then
        ' do something with this element, e.g. read el.href
    End If
Next el
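Putting the two points above together, a minimal sketch of the whole routine might look like the following. It assumes the Microsoft Internet Controls reference is set (for `InternetExplorer` and `READYSTATE_COMPLETE`) and that the results go to column A of the active sheet, as in the question. The names `GetCanonicalLinks`, `el` and `rowNum` are illustrative only.

```vba
Option Explicit

' Sketch only: writes the href of every <link rel="canonical"> to column A.
' Assumes a reference to Microsoft Internet Controls.
Public Sub GetCanonicalLinks()
    Dim IE As InternetExplorer
    Dim el As Object
    Dim rowNum As Long

    Range("A1").Value = "Cano"
    rowNum = 2                      ' first output row, below the header

    Set IE = New InternetExplorer
    IE.Visible = True
    IE.navigate "http://www.wingatecinci.com/"

    ' Wait until the page has finished loading.
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    ' Test each <link> element for rel="canonical".
    For Each el In IE.document.getElementsByTagName("link")
        If LCase$(el.rel) = "canonical" Then
            Range("A" & rowNum).Value = el.href
            rowNum = rowNum + 1
        End If
    Next el

    IE.Quit
End Sub
```

Using a separate row counter rather than the loop position means that only matching elements take up rows in the sheet.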

0 votes

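You can add a wait for the element and use a CSS attribute selector to target the attribute. As written, the loop keeps trying to find the element for up to 5 seconds.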

Option Explicit
Public Sub GetLink()
    Dim IE As New InternetExplorer, ele As Object, t As Single
    Const MAX_WAIT_SEC As Long = 5
    With IE
        .Visible = True
        .navigate "http://www.wingatecinci.com/"

        t = Timer
        Do While ele Is Nothing
            DoEvents
            On Error Resume Next
            Set ele = .document.querySelector("[rel='canonical']")
            On Error GoTo 0
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop
        If Not ele Is Nothing Then Debug.Print ele.href
        .Quit
    End With
End Sub

References (set via Tools > References in the VBA editor):

  1. Microsoft HTML Object Library
  2. Microsoft Internet Controls
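
If setting those references is not an option, a late-bound variant is one possibility. The sketch below is an assumption-laden adaptation rather than part of the original answer: `GetLinkLateBound` and `READYSTATE_COMPLETE_VALUE` are hypothetical names, and the constant 4 stands in for `READYSTATE_COMPLETE`, which is not available without the Microsoft Internet Controls reference.

```vba
Option Explicit

' Sketch only: late-bound version, so no project references are required.
Public Sub GetLinkLateBound()
    Const READYSTATE_COMPLETE_VALUE As Long = 4   ' numeric value of READYSTATE_COMPLETE
    Const MAX_WAIT_SEC As Long = 5
    Dim IE As Object, ele As Object, t As Single

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate "http://www.wingatecinci.com/"

    ' Wait for the page to load, but give up after MAX_WAIT_SEC seconds.
    t = Timer
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE_VALUE
        DoEvents
        If Timer - t > MAX_WAIT_SEC Then Exit Do
    Loop

    ' The document may be unavailable if the load timed out.
    On Error Resume Next
    Set ele = IE.document.querySelector("link[rel='canonical']")
    On Error GoTo 0

    If Not ele Is Nothing Then Debug.Print ele.href
    IE.Quit
End Sub
```

Late binding trades IntelliSense and compile-time checking for a workbook that runs without any reference setup on the target machine.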