如何在 Excel 中使用 VBA 和 PowerQuery 解析公共网站的 HTML 内容,并使用参数进行自动化?

问题描述 投票:0回答:1

在公共网站上,根据所选的联邦州,右侧会列出 30 个房产。如果该联邦州提供超过 30 处房产,则下一页会列出另外 30 处房产。

此代码运行完美,根据输入的 URL,它列出了一个 Excel 表中一页的所有属性。

Sub ZVGQueryNavigate(URL As String)
Const QueryName As String = "ZVGQuery"

Dim Query As WorkbookQuery

' Check if the query exists
On Error Resume Next
Set Query = ThisWorkbook.Queries(QueryName)
On Error GoTo 0

If Query Is Nothing Then
    ' If query doesn't exist, add it
    AddZVGQuery URL
Else
    ' Update the query formula
    Query.Formula = GetFVGQueryFormula(URL)
    Query.Refresh
    ThisWorkbook.RefreshAll
End If
End Sub

Sub AddZVGQuery(Optional URL As String = "https://zvgscout.com/brandenburg?page=3")
Const QueryName As String = "ZVGQuery"
Const TableName As String = "ZVGTable"

Dim Formula As String
Formula = GetFVGQueryFormula(URL)

Dim Destination As Range

' Ensure the query does not already exist
On Error Resume Next
ThisWorkbook.Queries(QueryName).Delete
On Error GoTo 0

' Set destination to the first cell of the active sheet
Set Destination = ActiveSheet.Range("A1")

' Add the query to the workbook
ActiveWorkbook.Queries.Add Name:=QueryName, Formula:=Formula

' Add a worksheet to display the query output
Dim NewSheet As Worksheet
Set NewSheet = ActiveWorkbook.Worksheets.Add

' Add the query to the table on the new sheet
With NewSheet.ListObjects.Add(SourceType:=0, Source:= _
    "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=" &     Chr(34) & QueryName & Chr(34) & ";Extended Properties=""""" _
    , Destination:=NewSheet.Range("A1")).QueryTable
    .CommandType = xlCmdSql
    .CommandText = Array("SELECT * FROM [" & QueryName & "]")
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .PreserveColumnInfo = True
    .ListObject.DisplayName = TableName
    .Refresh BackgroundQuery:=False
End With
End Sub


Function GetFVGQueryFormula(URL As String) As String
' Define the query formula as lines
Dim Lines(0 To 9) As String
Lines(0) = "let"
Lines(1) = "    Source = Web.BrowserContents(" & Chr(34) & URL & Chr(34) & "),"
Lines(2) = "    #""Extracted Links From Html"" = Html.Table("
Lines(3) = "        Source, "
Lines(4) = "        {{""Links"", ""a.group"", each [Attributes][href]}}, "
Lines(5) = "        [RowSelector="".group""]"
Lines(6) = "    ),"
Lines(7) = "    #""Removed Nulls"" = Table.SelectRows(#""Extracted Links From Html"", each [Links] <> null)"
Lines(8) = "in"
Lines(9) = "    #""Removed Nulls"""
GetFVGQueryFormula = Join(Lines, vbNewLine)
End Function

因此,如果我想启动 mainsub 函数,那么语法如下:

Sub Main()

ZVGQueryNavigate ("https://zvgscout.com/nordrheinwestpfalen?page=12")

End Sub

有些联邦州拥有 500 多个属性,这些属性分布在 20 个页面上,每个页面有 30 个对象。

我的问题: 我必须如何更改代码才能在子 Main() 中输入“https://zvgscout.com/”& PARAMETER01 & PARAMETER02),其中 URL 由 Parameter01 = 联邦州和 Parameter02 = 组成页数,函数开始循环,从 Page1 到 PageX(参数 02),并将联邦州的所有数据编译到一个 Excel 表中?

Sub Main() 代码将如下所示:

Sub Main()

Dim NumOfPages as Long
Dim FedState as String
Dim x as long

NumOfPages = 20
FedState = "nordrheinwestpfalen"

For x = 1 to NumberOfPages  
  ZVGQueryNavigate ("https://zvgscout.com/" & FedState & "?page=" & NumOfPages )
Next

End Sub

并且 ZVGQueryNavigate() 函数也必须修改,以便可以传递参数。

我将非常感谢任何支持

html excel vba powerquery
1个回答
0
投票

首先,让我们使用 VBA 通过 InternetExplorer 对象从特定 URL 获取 HTML 源代码。之后,我们可以将该 HTML 源代码直接引入 Power Query,并通过其转换功能(例如指定适当的标签和属性),它将有助于提取所需的数据。所有这些都在 VBA 和 Power Query 步骤中带有参数,以实现灵活性。这意味着你可以改变网页的方向,甚至数据提取的方法,而无需修改VBA代码

© www.soinside.com 2019 - 2024. All rights reserved.