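I have been trying to write code that prints the Preview shown under the Fetch/XHR filter of the browser's Network tab. The image is shown below:
In this image, I want to print the captcha value to the console after clicking the audio button under the captcha.
The website link is https://tmrsearch.ipindia.gov.in/eregister/
On this website, first select the first button on the left, "Trade Mark Application/Registered Mark".
After that, select the National/IRDI Number option to go to the required page.
If you inspect the page and click the audio button below the captcha, the captcha value appears in the Preview tab of the request listed under the Fetch/XHR filter of the Network tab.
I have written Python code that navigates to the required page.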
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver=webdriver.Chrome()
driver.maximize_window()
driver.get("https://tmrsearch.ipindia.gov.in/eregister/")
wait = WebDriverWait(driver, 10)
# switch into the frame context
wait.until(EC.frame_to_be_available_and_switch_to_it((By.NAME, "eregoptions")))
# click on the targeted element
wait.until(EC.element_to_be_clickable((By.ID, "btnviewdetails"))).click()
# come out of frame
driver.switch_to.default_content()
time.sleep(10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.NAME, "showframe")))
# click on the targeted element
wait.until(EC.element_to_be_clickable((By.ID, "rdb_0"))).click()
# come out of frame
driver.switch_to.default_content()
time.sleep(10)
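I want to add a few lines of code that print the captcha shown in the Preview tab under Fetch/XHR in the Network tab.
We can get the captcha value in two ways:
Selenium Wire: read the response of the GetCaptcha call that the browser makes.
Requests library: call the GetCaptcha endpoint directly, reusing the browser's session ID.
I have included both methods in the code. Take a look and use whichever one you are comfortable with.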
import time
from seleniumwire import webdriver
#from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import json
import requests
# This method will capture captcha value from the network calls
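def get_captcha_details_using_selemiumwire(driver):
    # driver.requests holds every request Selenium Wire has captured so far
    for request in driver.requests:
        url = request.url
        # Skip GetCaptcha requests that have not received a response yet
        if url and 'GetCaptcha' in url and request.response:
            json_str = request.response.body.decode("utf-8")
            data = json.loads(json_str)
            captcha = data['d']
            print(f"Captcha using seleniumwire: {captcha}")
            return captcha
    # No GetCaptcha call with a response was captured
    return None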
# This method will make a request with session and get the captcha value
def get_captcha_details_using_requests(session_id):
    headers = {
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
        'Connection': 'keep-alive',
        'Content-Type': 'application/json; charset=UTF-8',
        'Cookie': f"ASP.NET_SessionId={session_id}",
        'Origin': 'https://tmrsearch.ipindia.gov.in',
        'Referer': 'https://tmrsearch.ipindia.gov.in/eregister/Application_View.aspx',
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Site': 'same-origin',
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua': '"Not/A)Brand";v="8", "Chromium";v="126"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Linux"',
    }
    json_data = {}
    response = requests.post(
        'https://tmrsearch.ipindia.gov.in/eregister/Viewdetails_Copyright.aspx/GetCaptcha',
        headers=headers,
        json=json_data,
        verify=False,
    )
    data_hash = response.json()
    captcha = data_hash["d"]
    print(f"Captcha using requests library: {captcha}")
    return captcha
# Adding seleniumwire options and initializing the driver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--disable-http2')
seleniumwire_options = {
    'mitm_http2': False
}
driver = webdriver.Chrome(options=chrome_options, seleniumwire_options=seleniumwire_options)
# Initializing selenium driver
#driver=webdriver.Chrome()
driver.maximize_window()
driver.get("https://tmrsearch.ipindia.gov.in/eregister/")
wait = WebDriverWait(driver, 10)
# Taking session_id from main page to use it in captcha request
session_id = None  # stays None if the cookie is not found
cookies_list = driver.get_cookies()
for cookie in cookies_list:
    name = cookie["name"]
    if name and "ASP.NET_SessionId" in name:
        session_id = cookie["value"]
        break
print(f"Session ID: {session_id}")
# switch into the frame context
wait.until(EC.frame_to_be_available_and_switch_to_it((By.NAME, "eregoptions")))
# click on the targeted element
wait.until(EC.element_to_be_clickable((By.ID, "btnviewdetails"))).click()
# come out of frame
driver.switch_to.default_content()
time.sleep(10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.NAME, "showframe")))
# click on the targeted element
wait.until(EC.element_to_be_clickable((By.ID, "rdb_0"))).click()
time.sleep(10)
# click on audio button to reveal captcha in network calls
btn_xpath = "//img[contains(@title,'Captcha Audio')]"
audio_btn = driver.find_element(By.XPATH, btn_xpath)
if audio_btn:
    audio_btn.click()
    print("Clicked on audio button")
time.sleep(10)
# come out of frame
driver.switch_to.default_content()
captcha = get_captcha_details_using_selemiumwire(driver)
if session_id:
    captcha = get_captcha_details_using_requests(session_id)
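Both methods end with the same step: decoding the GetCaptcha response body and reading the value under the "d" key. As a rough sketch, that step could be pulled into one small helper. The name `extract_captcha`, its `content_encoding` parameter, and the sample value `AB12CD` are hypothetical and only for illustration. The gzip branch is an assumption for the case where the captured body arrives compressed; I have not confirmed that this site compresses its responses.

```python
import gzip
import json


def extract_captcha(body: bytes, content_encoding: str = "identity") -> str:
    """Return the captcha text from a raw GetCaptcha response body.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Assumption: a captured body may be gzip-compressed, so undo that first.
    if content_encoding.lower() == "gzip":
        body = gzip.decompress(body)
    payload = json.loads(body.decode("utf-8"))
    # The code above reads the captcha from the top-level "d" key.
    return payload["d"]


# Made-up sample body, shaped like the response handled in the code above.
sample = json.dumps({"d": "AB12CD"}).encode("utf-8")
print(extract_captcha(sample))
print(extract_captcha(gzip.compress(sample), "gzip"))
```

With Selenium Wire, the bytes would come from `request.response.body`; with the requests library, `response.json()` already does the decoding, so the helper is mainly useful for the first method.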