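I need to scrape data from a website. The page contains the following tag, and I need to extract its href attribute:
<link rel='canonical' href='http://www.wingatecinci.com'>
To do this, I wrote the following code to pull the href attribute into Excel: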
Option Explicit
Sub Tester()
    Dim IE As New InternetExplorer
    Dim i As Long
    Dim Cano As String
    Range("A1").Value = "Cano"
    Set IE = New InternetExplorer
    URL = "http://www.wingatecinci.com/"
    IE.navigate Url
    IE.Visible = True
    Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Cano = IE.document.getElementsByTagName("canonical")(i).innerHTML
    Range("A" & i + 2).Value = Cano
End Sub
But it doesn't work and I get an error; see this screenshot: http://prntscr.com/kpy9dh. Could anyone take a look and help me?
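These two lines reference i as a variable:
Cano = IE.document.getElementsByTagName("canonical")(i).innerHTML
Range("A" & i + 2).Value = Cano
but you never set a value for i anywhere in your code, so it stays at its default of 0. Did you mean to put this inside a loop?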
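The "tag" here is <link>. The "canonical" part is the value of that tag's rel attribute, not a tag name, so getElementsByTagName("canonical") will not match anything. You need to be a bit more specific in your code and test for the attribute: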
For Each el In IE.Document.getElementsByTagName("link")
    ' Keep only <link> elements whose rel attribute is "canonical"
    If el.getAttribute("rel") = "canonical" Then
        ' do something with this element, e.g. read el.href
    End If
Next
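To show how that loop could fit into the original macro, here is a minimal sketch rather than a definitive fix. It assumes a reference to "Microsoft Internet Controls" is set in the VBA editor; the procedure name WriteCanonicalLinks and the nextRow counter are hypothetical names introduced for illustration. It replaces the unset i with an explicit row counter and writes each matching href into column A.

```vba
Option Explicit

' Sketch only: assumes a reference to "Microsoft Internet Controls".
' WriteCanonicalLinks and nextRow are illustrative names, not from the question.
Public Sub WriteCanonicalLinks()
    Dim IE As InternetExplorer
    Dim el As Object
    Dim nextRow As Long

    Set IE = New InternetExplorer
    IE.Visible = True
    IE.navigate "http://www.wingatecinci.com/"

    ' Wait until the page has finished loading
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Range("A1").Value = "Cano"
    nextRow = 2

    ' Look at every <link> element and keep those with rel="canonical"
    For Each el In IE.document.getElementsByTagName("link")
        If el.getAttribute("rel") = "canonical" Then
            Range("A" & nextRow).Value = el.href
            nextRow = nextRow + 1
        End If
    Next el

    IE.Quit
End Sub
```

The value is read from el.href rather than innerHTML, since a <link> element has no inner content.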
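You can add a wait for the element and use a CSS attribute selector to target the attribute. As written, this uses a timed loop that tries for up to 5 seconds to find the element.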
Option Explicit
Public Sub GetLink()
    ' Timer returns a Single (seconds since midnight), so t is declared As Single
    Dim IE As New InternetExplorer, ele As Object, t As Single
    Const MAX_WAIT_SEC As Long = 5
    With IE
        .Visible = True
        .navigate "http://www.wingatecinci.com/"
        t = Timer
        ' Keep polling until the element is found or the timeout is reached
        Do While ele Is Nothing
            DoEvents
            On Error Resume Next   ' the document may not be available yet
            Set ele = .document.querySelector("[rel='canonical']")
            On Error GoTo 0
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop
        If Not ele Is Nothing Then Debug.Print ele.href
        .Quit
    End With
End Sub
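Since the question asks for the value in Excel and not in the Immediate window, the same timed wait could be wrapped in a function whose result is written to a cell. This is only a sketch under assumptions: GetCanonicalHref, WriteCanonicalToSheet, pageUrl and maxWaitSec are hypothetical names, a reference to "Microsoft Internet Controls" is assumed, and the selector is narrowed to link[rel='canonical'].

```vba
Option Explicit

' Sketch only: assumes a reference to "Microsoft Internet Controls".
' GetCanonicalHref and its parameters are illustrative names.
Public Function GetCanonicalHref(ByVal pageUrl As String, _
                                 Optional ByVal maxWaitSec As Single = 5) As String
    Dim IE As InternetExplorer
    Dim ele As Object
    Dim t As Single

    Set IE = New InternetExplorer
    IE.Visible = False
    IE.navigate pageUrl

    t = Timer
    ' Poll for the element until it appears or the timeout elapses
    Do While ele Is Nothing
        DoEvents
        On Error Resume Next   ' the document may not be available yet
        Set ele = IE.document.querySelector("link[rel='canonical']")
        On Error GoTo 0
        If Timer - t > maxWaitSec Then Exit Do
    Loop

    ' Returns an empty string if nothing was found in time
    If Not ele Is Nothing Then GetCanonicalHref = ele.href
    IE.Quit
End Function

Public Sub WriteCanonicalToSheet()
    Range("A1").Value = "Cano"
    Range("A2").Value = GetCanonicalHref("http://www.wingatecinci.com/")
End Sub
```

Because Timer counts seconds since midnight, a wait that spans midnight would not time out correctly; that edge case is ignored here.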