Installing the latest xgboost nightly build


I want to install the latest xgboost nightly build. The documentation says the latest builds can be found here: https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/list.html?prefix=master/

Once you have the name of the latest build, you can install it with pip like this (example):

!pip install https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/master/xgboost-2.0.0.dev0%2B15ca12a77ebbaf76515291064c24d8c2268400fd-py3-none-manylinux2014_x86_64.whl

Is there a way to specify "the latest nightly build" without having to copy the commit hash?

python installation xgboost
1 Answer

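From the page source, it is clear that the list of nightly builds is populated dynamically with JavaScript. To scrape JavaScript-generated content, you can use Selenium or any other headless-browser approach that executes the JavaScript to produce the full DOM.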

First, install Selenium and a compatible WebDriver, such as ChromeDriver.

pip install selenium

Download the correct version of ChromeDriver (the one that matches your installed Chrome) and make it available on your system's PATH.

The code below then looks for the most recent .whl file built for manylinux2014_x86_64 and installs it.

import subprocess
import sys
import time

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

LIST_URL = "https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/list.html?prefix=master/"
WHEEL_TAG = "py3-none-manylinux2014_x86_64.whl"

# Start Chrome in headless mode
options = webdriver.ChromeOptions()
options.add_argument("--headless")
driver = webdriver.Chrome(options=options)

latest_build_url = ""
try:
    driver.get(LIST_URL)

    # Wait up to 10 seconds for the listing container to appear
    try:
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "listing"))
        )
    except TimeoutException:
        print("Timed out waiting for page to load")

    time.sleep(5)  # Extra delay so the JavaScript can finish filling in the links

    # Collect every link target; anchors without an href yield None
    hrefs = [a.get_attribute("href") for a in driver.find_elements(By.TAG_NAME, "a")]

    # Walk the links from the bottom up and take the first matching wheel.
    # This assumes the page lists the newest build last.
    for href in reversed(hrefs):
        if href and WHEEL_TAG in href:
            latest_build_url = href
            break
finally:
    # quit() ends the whole browser session; close() only closes the current window
    driver.quit()

# Install with the pip that belongs to the running interpreter
if latest_build_url:
    subprocess.run([sys.executable, "-m", "pip", "install", latest_build_url], check=True)
else:
    print("Latest build not found.")

Note: WebDriver waits up to 10 seconds for the list to load. Adjust this value as needed.

This approach should let you retrieve the list of URLs once JavaScript has populated it. Note that it also depends on the page's current structure, which may change.
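If depending on the rendered page is a concern, a browser-free variant might be worth trying. The sketch below is not part of the approach above. It assumes the bucket allows anonymous listing through the standard S3 REST API (the `list-type=2` query), which I have not verified for this bucket. The names `latest_wheel_key`, `wheel_url`, and `fetch_listing` are made up for illustration. It picks the wheel with the newest `LastModified` timestamp rather than relying on the order of links on the page. A single request returns at most 1000 keys, so a larger listing would need pagination, which is left out here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the bucket permits anonymous listing via the S3 REST API.
BUCKET_URL = "https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds"
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def latest_wheel_key(listing_xml, suffix="manylinux2014_x86_64.whl"):
    """Return the key of the most recently modified matching wheel, or None."""
    root = ET.fromstring(listing_xml)
    candidates = []
    for item in root.findall("s3:Contents", S3_NS):
        key = item.findtext("s3:Key", default="", namespaces=S3_NS)
        modified = item.findtext("s3:LastModified", default="", namespaces=S3_NS)
        if key.endswith(suffix):
            candidates.append((modified, key))
    # Timestamps share one ISO 8601 format, so string order is chronological
    return max(candidates)[1] if candidates else None


def wheel_url(key):
    """Build a download URL, percent-encoding characters such as '+'."""
    return f"{BUCKET_URL}/{urllib.parse.quote(key)}"


def fetch_listing(prefix="master/"):
    """Download one page of the bucket listing (at most 1000 keys)."""
    query = urllib.parse.urlencode({"list-type": "2", "prefix": prefix})
    with urllib.request.urlopen(f"{BUCKET_URL}/?{query}", timeout=30) as response:
        return response.read()
```

If the listing request works, `wheel_url(latest_wheel_key(fetch_listing()))` would give a URL that can be passed to `pip install` in the same way as in the question.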
