Download the latest version of a file from a website to a specific location using Python

Question

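I am learning Selenium and web scraping in Python (v3.6.6, x64). I am trying to write a script that, when run, automatically downloads the latest win64 build of geckodriver (v0.22.0 at the time of posting) from https://github.com/mozilla/geckodriver/releases to a specific location on my Windows PC.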

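My problem is that when I view the page source in Mozilla Firefox, the id and class of the specific release I want to download are the same as those of every other available release. I cannot filter out that specific section and get the href so that I can download the file. I am surely missing something, but despite several internet searches I cannot figure out what I am doing wrong. I would appreciate it if the experts on Stack Overflow could guide or correct me on the next steps. Here is what I am trying to solve: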

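1) Download the win64 build of the latest geckodriver

2) The file should be downloaded to C:\Python

3) How can the program tell that the file has finished downloading, so that execution can continue?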

from urllib.request import urlopen, urlretrieve
from bs4 import BeautifulSoup

# Define page where geckodriver can be downloaded
url = "https://github.com/mozilla/geckodriver/releases"

try:
    # Query the website and return the HTML to the variable `page`
    page = urlopen(url)
except Exception:
    # Show a message for any unexpected behaviour when loading the page
    print("Unable to download geckodriver. Press Enter to exit program.")
    user_input = input()
    exit()

# Parse the html using beautifulsoup and store in variable `soup`
soup = BeautifulSoup(page, "html.parser")

# Trying to search and filter latest win64 version
result = soup.find_all('a', {'class': 'd-flex flex-items-center'})

Answer 1

First, find the latest release, then get the win64 link:

latest = soup.find('div', {'class': 'release-entry'})
results = latest.find_all('a', {'class': 'd-flex flex-items-center'})
for result in results:
    href = result.get('href')
    # Keep only the download link for the win64 archive
    if 'geckodriver/releases/download/' in href and 'win64.zip' in href:
        print(href)
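To cover points 2 and 3 of the question, the href found above can be turned into a full URL and saved with urlretrieve, which was already imported in the question's script. What follows is only a minimal sketch under a few assumptions: the helper names pick_win64_href and download_to and the BASE_URL constant are made up for illustration, and the hrefs on the releases page are assumed to be site-relative (starting with /mozilla/...). urlretrieve does not return until the transfer has finished, and it raises urllib.error.ContentTooShortError if fewer bytes arrive than the server announced, so the script can safely continue once download_to returns.

```python
import os
from urllib.parse import urljoin
from urllib.request import urlretrieve

# Assumption: hrefs scraped from the releases page are relative to this host.
BASE_URL = "https://github.com"


def pick_win64_href(hrefs):
    """Return the first href that looks like a win64 geckodriver archive, or None."""
    for href in hrefs:
        if href and "geckodriver/releases/download/" in href and href.endswith("win64.zip"):
            return href
    return None


def download_to(url, target_dir):
    """Download url into target_dir and return the path of the saved file.

    urlretrieve blocks until the download is complete (or raises an
    exception), so the file is fully written when this function returns.
    """
    os.makedirs(target_dir, exist_ok=True)   # create the folder if it is missing
    filename = url.rsplit("/", 1)[-1]        # keep the archive's own file name
    local_path = os.path.join(target_dir, filename)
    urlretrieve(url, local_path)
    return local_path


# Hypothetical usage with the links collected by the loop above:
#   href = pick_win64_href(r.get('href') for r in results)
#   if href is not None:
#       saved = download_to(urljoin(BASE_URL, href), r"C:\Python")
#       print("Saved to", saved)
```

Passing the hrefs in as plain strings keeps the selection logic independent of the page markup, so only the BeautifulSoup lookup needs to change if GitHub alters its class names.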
