VBA: getting information from an HTML element whose ID changes

Question (votes: 2, answers: 1)

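I'm trying to create an Excel web scraper that logs in to my company's ticket tracking system and records certain information on a worksheet (the assigned lead, the project's desired date, etc.). It was going well until I had to pull a field from the site whose ID keeps changing.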

For example, on two different pages the same field has these IDs:

  • “Cq_widget_CqFilteringSelect_32”
  • “Cq_widget_CqFilteringSelect_9”

Could someone give me some guidance on how to find the "IT Lead" value and paste it into Excel?

[Image: HTML snippet of the div]

[Image: snippet of the actual website]

[Image: setup in Excel]

Here is what I have so far.

This is the line I'm stuck on:

lead = objCollection(i).Value

Sub CQscrub()

Dim i As Long
Dim objElement As Object
Dim objCollection As Object
Dim objCollection2 As Object
Dim ie As InternetExplorer
Dim html As HTMLDocument
Dim numbers() As String
Dim size As Integer
Dim row As Integer
Dim objLead As Object
Dim objLead2 As Object
Dim lead As String
Dim counter As Integer

size = WorksheetFunction.CountA(Worksheets(1).Columns(1)) - 4
ReDim numbers(size)

For row = 10 To (size + 10)
    numbers(row - 10) = Cells(row, 1).Value
    'Cells(row, 2) = numbers(row - 10)
Next row


Set ie = New InternetExplorer
ie.Height = 1000
ie.Width = 1000
ie.Visible = True
ie.navigate "http://clearquest/cqweb/"

Application.StatusBar = "Loading http://clearquest/cqweb"

Do While ie.Busy
    Application.Wait DateAdd("s", 1, Now)
Loop

Application.StatusBar = "Searching form. Please wait..."
'Had these below as comment
Dim WRnumber1 As String
WRnumber1 = Range("A10").Value
'Range("A6").Value = WRnumber1


Dim iLastRow As Integer
Dim Rng As Range
iLastRow = Cells(Rows.Count, "a").End(xlUp).row 'last row of A

'Set objCollection = ie.document.getElementsByTagName("input") originally here
For counter = 0 To size - 1
    Set objCollection = ie.document.getElementsByTagName("input")
    i = 0
    While i < objCollection.Length
        If objCollection(i).Name = "cqFindRecordString" Then
            objCollection(i).Value = numbers(counter)

        End If
        i = i + 1
    Wend
    '''''''''''''''''' Find Label ''''''''''''''''''''''''''''
    Set objCollection = ie.document.getElementsByTagName("label")
    i = 0
    While i < objCollection.Length
        If objCollection(i).innerText = "IT Lead/Assigned To" Then
            lead = objCollection(i).Value
            'Set objLead = objCollection(i)
        End If
        i = i + 1
    Wend
    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
            Range("B" & (iLastRow - (size - counter - 1))).Value = lead
    Set objElement = ie.document.getElementById("cqFindRecordButton")
    objElement.Click
    Do While ie.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
    Application.Wait (Now + TimeValue("0:00:02"))
Next counter

ie.Quit
Set ie = Nothing
Set objElement = Nothing
Set objCollection = Nothing

Application.StatusBar = ""
MsgBox "Done!"

End Sub

Note: the website is internal (intranet) only.

Goal: pick the name under the "IT Lead/Assigned To" field and paste it into Excel.

Thanks

html excel vba excel-vba web-scraping
1 Answer

Votes: 1

Regarding the code you provided: tl;dr.

However, if you want to scrape the part shown in your HTML snippet, the following may work (I can't test against something I don't have access to :D).

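There are many different ways to grab an element. This one takes the first element whose class list is dijitReset dijitInputField dijitInputContainer. Class names are not always unique, but this combination is specific enough that I think it is reasonably safe.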

You could do the Set yourObj... in a single line, but for demonstration purposes I decided to break it up. The one-liner for setting your object:

Set yourObj = doc.getElementsByClassName("dijitReset dijitInputField dijitInputContainer")(0).getElementsByTagName("input")(1)

代码片段:

Sub getElementFromIE()

    Dim ie As InternetExplorer
    Dim lead As String

    ' ... your code above opens the web page and sets ie ...

    '''''''''''''''''' Find Label ''''''''''''''''''''''''''''
    Dim doc As HTMLDocument, yourObj As Object
    Set doc = ie.document

    ' Take the first element with this class list (index 0); this assumes the
    ' class combination is specific enough to identify the field
    Set yourObj = doc.getElementsByClassName("dijitReset dijitInputField dijitInputContainer")(0)
    ' The second <input> inside that element (index 1) holds the value
    Set yourObj = yourObj.getElementsByTagName("input")(1)
    lead = yourObj.Value

End Sub

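The reason for the (1) in Set yourObj = yourObj.getElementsByTagName("input")(1) is that there are two input tags inside the element with your dijitReset... class. You want the second of these, which holds your value. As you may already know, the collection is zero-based, so the second instance is index 1.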
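If the class-name lookup turns out to be ambiguous, another option is to start from the label text, which stays the same even though the ID changes. The sketch below is untested against the real page and rests on an assumption the question does not confirm: that the label has a "for" attribute pointing at the input's current ID. FindValueByLabel is a hypothetical helper name, not something from the original code.

```vba
' Sketch only: assumes the <label> carries a "for" attribute that points at
' the input's (changing) ID. FindValueByLabel is a hypothetical helper name.
Function FindValueByLabel(ByVal doc As Object, ByVal labelText As String) As String
    Dim lbl As Object
    Dim target As Object

    For Each lbl In doc.getElementsByTagName("label")
        If Trim$(lbl.innerText) = labelText Then
            ' htmlFor exposes the label's "for" attribute, i.e. the current ID
            If Len(lbl.htmlFor) > 0 Then
                Set target = doc.getElementById(lbl.htmlFor)
                If Not target Is Nothing Then
                    FindValueByLabel = target.Value
                End If
            End If
            ' Stop at the first matching label; returns "" if nothing was resolved
            Exit Function
        End If
    Next lbl
End Function
```

It would be called from the loop in the question as lead = FindValueByLabel(ie.document, "IT Lead/Assigned To"), after the record page has finished loading. If the label has no "for" attribute, the function returns an empty string and the class-name approach above is the fallback.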
