How to skip a row when an HTML element returns no value?

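Here is a breakdown of what I am trying to accomplish:

1) Get a value from a list in a spreadsheet

2) Search a URL using that value

3) Get an ElementId from the HTML and add it to the spreadsheet (it does not need to go into the spreadsheet, but I need it for the next search and don't know how else to do that)

4) Search another URL using the ElementId

5) Pull the value of a ClassName from the HTML

My problem arises when one of the values in my search has no ElementId of IDValue in the HTML. I need to skip these values and GoTo NextName, but have not found a way that works.

I have tried adding a timer to the Do loop, but I still cannot advance to the next i in the list.

Here is my code: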

Option Explicit

Public Sub CheckNames()
    Dim ws As Worksheet, i As Long, IE As InternetExplorerMedium
    Dim doc As HTMLDocument, docElement As Object
    Dim docClass As Object

Set ws = ActiveSheet
Set IE = New InternetExplorerMedium

1)

For i = 2 To ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    IE.Visible = False

2)

    IE.navigate "Beginning of URL" & ws.Cells(i, 1).Value & "End of URL"
    Do
        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE

    IE.Refresh

    Set doc = IE.document

3)

    Do
        Set docElement = Nothing
        On Error Resume Next
        Set docElement = doc.getElementById("IDValue")
        On Error GoTo NextName
        DoEvents

    Loop Until Not docElement Is Nothing

    ws.Cells(i, 3).Value = Right(Trim(docElement.innerText), 5)

4)

    IE.navigate "Beginning of another URL" & ws.Cells(i, 3).Value & "End of URL"

5)

    Do
        Set docClass = doc.getElementsByClassName("CLass Name in HTML")(3)

        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE

    ws.Cells(i, 4).Value = Trim(docClass.innerText)

    Set doc = Nothing
    Set docElement = Nothing

NextName:
Next i
IE.Quit
End Sub

So in short, I am looking for a way to skip to the next i in the list when doc.getElementById("IDValue") returns nothing.

Thanks in advance!

html excel vba excel-vba web-scraping
1 Answer

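Some basic pointers:

  1. At the very least you need a timeout on the loop to prevent an infinite loop
  2. You can wrap the attempt to set the element in On Error Resume Next ..... On Error GoTo 0
  3. Then, before Next i, execute the rest of the code conditionally based on If Not docElement Is Nothing Then, i.e. only when the element was found and set
  4. Added some typed functions (Right$, Trim$) for efficiency

I tidied up some of the rest of the code, but have not given it a full review.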

Option Explicit
Public Sub CheckNames()
    Dim ws As Worksheet, i As Long, IE As InternetExplorerMedium
    Dim doc As HTMLDocument, docElement As Object
    Dim docClass As Object, t As Single ' Timer returns a Single, not a Date
    Const MAX_WAIT_SEC As Long = 5

    Set ws = ActiveSheet ' ws must be set before it is used in the loop below
    Set IE = New InternetExplorerMedium

    For i = 2 To ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

        IE.Visible = False
        IE.navigate "Beginning of URL" & ws.Cells(i, 1).Value & "End of URL"

        While IE.Busy Or IE.readyState < 4: DoEvents: Wend

        IE.Refresh

        Set doc = IE.document
        t = Timer

        Do
            DoEvents
            Set docElement = Nothing
            On Error Resume Next
            Set docElement = doc.getElementById("IDValue")
            On Error GoTo 0
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop Until Not docElement Is Nothing

        If Not docElement Is Nothing Then
            ws.Cells(i, 3).Value = Right$(Trim$(docElement.innerText), 5)

            IE.navigate "Beginning of another URL" & ws.Cells(i, 3).Value & "End of URL"

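            While IE.Busy Or IE.readyState < 4: DoEvents: Wend
            Set doc = IE.document ' re-read the document after the second navigation
            Set docClass = doc.getElementsByClassName("CLass Name in HTML")(3)

            ' Only write the value if a fourth matching element exists
            If Not docClass Is Nothing Then ws.Cells(i, 4).Value = Trim$(docClass.innerText)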

        End If
    Next i
    IE.Quit
End Sub
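
The timeout loop could also be moved into a small reusable function. The following is only a sketch, not part of the original answer: the name WaitForElementById and its parameters are hypothetical. It polls the document until the element appears or the time limit passes, and it assumes that returning Nothing on timeout is an acceptable signal for "skip this row". It also adjusts for Timer resetting to 0 at midnight, which the loop in the answer does not handle.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): polls doc for an element
' with the given id until it is found or maxWaitSec seconds have elapsed.
' Returns Nothing on timeout so the caller can skip the current row.
Public Function WaitForElementById(ByVal doc As Object, _
                                   ByVal elementId As String, _
                                   Optional ByVal maxWaitSec As Single = 5) As Object
    Dim startTime As Single, elapsed As Single
    Dim found As Object

    startTime = Timer
    Do
        DoEvents
        Set found = Nothing
        On Error Resume Next              ' the document may not be ready yet
        Set found = doc.getElementById(elementId)
        On Error GoTo 0

        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400 ' Timer resets at midnight
    Loop Until Not found Is Nothing Or elapsed > maxWaitSec

    Set WaitForElementById = found
End Function
```

Inside the For loop, the Do ... Loop block could then be replaced by Set docElement = WaitForElementById(IE.document, "IDValue", MAX_WAIT_SEC), followed by the same If Not docElement Is Nothing Then check.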