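I am trying to download an APK file from apkmirror using selenium, requests, beautifulsoup, and curl. I get stuck at the download page: the URL redirects, and a download.php request carrying a direct link to the file appears in the Network tab of the browser DevTools. [Screenshots: before, after]
I tried downloading the file with curl and the -L flag, using the following URL: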
curl -L "https://www.apkmirror.com/apk/google-inc/youtube/youtube-19-16-39-release/youtube-19-16-39-android-apk-download/download/?key=15b5cb3061082b309a0c30f1d2e410704e909596&forcebaseapk=true"
However, curl only fetches an HTML page and does not follow the redirect to the URL shown in the screenshots above.
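This can be done with the Python requests library. Downloading and saving the app from apkmirror takes four steps: fetch the APK page and extract the download button link, fetch that download page and extract the id and key input values, request download.php with those parameters and a referer header, and write the response to a file.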
import requests
from bs4 import BeautifulSoup
from lxml import etree
headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'accept-language': 'en-GB,en;q=0.9',
    'cache-control': 'no-cache',
    'pragma': 'no-cache',
    'priority': 'u=0, i',
    'sec-ch-ua': '"Google Chrome";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'none',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
}
apk_url = "https://www.apkmirror.com/apk/google-inc/youtube/youtube-19-16-39-release/youtube-19-16-39-android-apk-download/"
# Fetch the main APK page to extract the download link
print(f"Fetching APK URL: {apk_url}")
main_page_response = requests.get(
    apk_url,
    headers=headers,
)
main_page_soup = BeautifulSoup(main_page_response.content, "html.parser")
main_page_dom = etree.HTML(str(main_page_soup))
link = main_page_dom.xpath("//a[contains(@class,'downloadButton')]/@href")[0]
download_link = f"https://www.apkmirror.com{link}"
# Fetching download link to get the additional parameters required to download the actual app
print(f"Fetching Download Link: {download_link}")
download_page_response = requests.get(
    download_link,
    headers=headers,
)
download_page_soup = BeautifulSoup(download_page_response.content, "html.parser")
download_page_dom = etree.HTML(str(download_page_soup))
id_value = download_page_dom.xpath("//input[contains(@name,'id')]/@value")[0]
key_value = download_page_dom.xpath("//input[contains(@name,'key')]/@value")[0]
params = {
    'id': id_value,
    'key': key_value,
    'forcebaseapk': 'true',
}
headers["referer"] = download_link
# Making another request with the extracted parameters and headers to download the file
final_url = f"https://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id={id_value}&key={key_value}&forcebaseapk=true"
print(f"Making final request to download the file: {final_url}")
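print("Please wait. The download time depends on the APK file size and your internet speed")
apk_response = requests.get(
    'https://www.apkmirror.com/wp-content/themes/APKMirror/download.php',
    params=params,
    headers=headers,
)
# Stop here on an HTTP error instead of saving an error page as the APK
apk_response.raise_for_status()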
# Write the response body to an APK file (replace youtube_final.apk with your own file path)
print("Saving response to an APK file")
with open('youtube_final.apk', 'wb') as f:
    f.write(apk_response.content)
print("File saved successfully")
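The script above holds the whole APK in memory (apk_response.content) and saves whatever the server returns, even if that is an HTML page like the one curl received. A minimal sketch of an alternative, assuming you pass stream=True to requests.get and feed response.iter_content() into a helper: the helper name save_apk_stream and the constant ZIP_MAGIC are hypothetical, not part of requests or the original answer. It relies on the fact that an APK is a ZIP archive and therefore starts with the bytes PK\x03\x04.

```python
import os
from typing import Iterable

# An APK is a ZIP archive; a ZIP file starts with this local file header signature.
ZIP_MAGIC = b"PK\x03\x04"


def save_apk_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write byte chunks to `path` and return the number of bytes written.

    Raises ValueError and removes the partial file when the data does not
    start with the ZIP signature (for example, when it is an HTML page).
    """
    head = b""
    written = 0
    try:
        with open(path, "wb") as f:
            for chunk in chunks:
                # Collect the first four bytes, which may span several chunks.
                if len(head) < len(ZIP_MAGIC):
                    head += chunk[: len(ZIP_MAGIC) - len(head)]
                    if len(head) == len(ZIP_MAGIC) and head != ZIP_MAGIC:
                        raise ValueError("data does not start with a ZIP signature")
                f.write(chunk)
                written += len(chunk)
        if head != ZIP_MAGIC:
            raise ValueError("data is too short to be an APK")
    except ValueError:
        # The file is already closed here, so it is safe to delete it.
        if os.path.exists(path):
            os.remove(path)
        raise
    return written


# Possible use with the variables from the script above (requires network access):
#
# with requests.get(
#     'https://www.apkmirror.com/wp-content/themes/APKMirror/download.php',
#     params=params, headers=headers, stream=True,
# ) as apk_response:
#     apk_response.raise_for_status()
#     save_apk_stream(apk_response.iter_content(chunk_size=8192), 'youtube_final.apk')
```

The signature check only tells you the response looks like a ZIP archive; it does not verify that the APK is complete or correctly signed.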