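I'm trying to package my main.py script into an executable with PyInstaller. The script contains a web scraper that uses Selenium and chromedriver.exe to navigate to a website and automatically download files (PDFs) into a specific directory named "Files", which sits in the same directory as main.py. For clarity, here is a screenshot of the expected file structure.
When I run main.py directly, everything works as expected and the downloads land in the "Files" directory. However, after packaging with PyInstaller using the following command: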
pyinstaller --onefile --add-data "chromedriver.exe;." --add-data "urls.txt;." main.py
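and running the generated .exe file (with chromedriver.exe and urls.txt in the same directory), I run into a problem: although the .exe successfully launches Chrome and downloads the files, it no longer creates or uses the "Files" directory next to the executable. Instead, the downloads are saved to a temporary directory such as C:\Users\{username}\AppData\Local\Temp\_MEI78762\Files, which is deleted when the program exits, so the downloaded files are lost.
Below is the code I use to set the download path. The logic tries to detect the executable's base path, but it doesn't work as expected: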
# Determine the base path
if getattr(sys, 'frozen', False):
    # If the application is run as a bundle, the PyInstaller bootloader
    # extends the sys module by a flag frozen=True and sets the app
    # path into the variable _MEIPASS.
    base_path = sys._MEIPASS
else:
    base_path = os.path.abspath(".")

# Create the Files directory if it doesn't exist
download_dir = os.path.join(base_path, "Files")
if not os.path.exists(download_dir):
    os.makedirs(download_dir)

# Extract all URLs from the URL list file and store them in a variable called urls
urls = []
with open("./test_urls.txt", "r") as file:
    urls = file.readlines()

# Configure Chrome options to set the download directory and disable the download prompt
chrome_options = webdriver.ChromeOptions()
prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "directory_upgrade": True,
    "safebrowsing.enabled": True,
    "safebrowsing.disable_download_protection": True,  # Disable download protection
    "profile.default_content_setting_values.automatic_downloads": 1,  # Allow automatic downloads
    "profile.default_content_settings.popups": 0,  # Disable popups
    "profile.content_settings.exceptions.automatic_downloads.*.setting": 1  # Allow multiple downloads
}
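When you package a script as a single-file executable (--onefile), the executable unpacks its bundled files into a temporary directory at startup, and sys._MEIPASS points to that directory. That is why the "Files" folder ends up under Temp and disappears when the program exits. To fix this, change base_path to point to the directory that contains the executable. When sys.frozen is true, sys.executable is the path of the .exe itself, so its parent directory is the one we want.
The implementation looks like this: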
import os
import sys

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Determine the base path
if getattr(sys, 'frozen', False):
    # Running as a PyInstaller bundle, use the directory of the executable
    base_path = os.path.dirname(sys.executable)
else:
    # Running as a script, use the current working directory
    base_path = os.path.abspath(".")

# Define the download directory for "Files" within base_path
download_dir = os.path.join(base_path, "Files")

# Create the "Files" directory if it doesn't exist
os.makedirs(download_dir, exist_ok=True)

# Define the path to urls.txt and check if it exists
urls_file = os.path.join(base_path, "urls.txt")
if not os.path.isfile(urls_file):
    raise FileNotFoundError(
        f"Expected 'urls.txt' in {base_path}. "
        "Please place 'urls.txt' in the same directory as the executable."
    )

# Read URLs from urls.txt
with open(urls_file, "r") as file:
    urls = file.readlines()

# Configure Chrome options for Selenium
chrome_options = webdriver.ChromeOptions()
prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "directory_upgrade": True,
    "safebrowsing.enabled": True,
    "safebrowsing.disable_download_protection": True,
    "profile.default_content_setting_values.automatic_downloads": 1,
    "profile.default_content_settings.popups": 0,
    "profile.content_settings.exceptions.automatic_downloads.*.setting": 1
}
chrome_options.add_experimental_option("prefs", prefs)

# Initialize the Chrome WebDriver.
# Selenium 4 takes the driver path through a Service object; the old
# executable_path keyword argument is no longer accepted in recent releases.
service = Service(os.path.join(base_path, "chromedriver.exe"))
driver = webdriver.Chrome(service=service, options=chrome_options)
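If it helps to keep the two locations apart, here is a minimal sketch that separates "where output should persist" from "where bundled files are unpacked". The helper names (persistent_dir, bundled_dir, ensure_download_dir) are hypothetical and only for illustration; the sketch assumes the same frozen check used above and that the non-frozen case should keep using the current working directory.

```python
import os
import sys


def persistent_dir():
    """Directory that survives after the program exits.

    Frozen (PyInstaller): the folder containing the .exe.
    Plain script: the current working directory, as in the code above.
    """
    if getattr(sys, "frozen", False):
        return os.path.dirname(os.path.abspath(sys.executable))
    return os.path.abspath(".")


def bundled_dir():
    """Directory holding files packed with --add-data.

    In --onefile mode this is the temporary unpack folder (sys._MEIPASS),
    so nothing written here should be expected to persist.
    """
    return getattr(sys, "_MEIPASS", os.path.abspath("."))


def ensure_download_dir(name="Files"):
    """Create the download folder under persistent_dir() and return its path."""
    path = os.path.join(persistent_dir(), name)
    os.makedirs(path, exist_ok=True)
    return path
```

With helpers like these, download_dir would come from ensure_download_dir(), while anything shipped inside the bundle via --add-data would be looked up under bundled_dir().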