In playwright-python, I know I can get an elementHandle using querySelector().
Example (sync):
from playwright import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch()
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
Starting from this elementHandle, how can I get the elements related to it, i.e. handles to its parent, grandparent, siblings, and children?
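Original answer:
Using querySelector() / querySelectorAll() with XPath (XML Path Language) lets you retrieve an elementHandle (or, respectively, a collection of handles). In general, XPath can be used to navigate through the elements and attributes of an XML document. The XPath is evaluated relative to the element the method is called on.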
from playwright import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
        parent = element.querySelector('xpath=..')
        grandparent = element.querySelector('xpath=../..')
        # following-sibling::* only returns the siblings that come after the element
        siblings = element.querySelectorAll('xpath=following-sibling::*')
        children = element.querySelectorAll('xpath=child::*')
        browser.close()
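Update (2022-07-22):
It seems that browser.newPage() has been deprecated, so in newer versions of playwright the function is called browser.new_page() (note the different function name).
Optionally, you can first create a browser context (and close it afterwards) and call new_page() on that context.
The way the children/parent/grandparent/siblings are accessed stays the same.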
from playwright import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
        parent = element.querySelector('xpath=..')
        grandparent = element.querySelector('xpath=../..')
        siblings = element.querySelectorAll('xpath=following-sibling::*')
        children = element.querySelectorAll('xpath=child::*')
        context.close()
        browser.close()
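The accepted answer is written for an older version of Playwright. For the current version, use the following form, with the import from playwright.sync_api and snake_case method names.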
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto('https://duckduckgo.com/')
        element = page.query_selector('input[id="search_form_input_homepage"]')
        parent = element.query_selector('xpath=..')
        grandparent = element.query_selector('xpath=../..')
        siblings = element.query_selector_all('xpath=following-sibling::*')
        children = element.query_selector_all('xpath=child::*')
        context.close()
        browser.close()
query_selector is no longer the recommended approach. Use locators instead:
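child_element = page.locator('input[id="search_form_input_homepage"]')
# Resolve relatives from child_element; a relative XPath on `page` starts at the document root.
# Note: page.locator('div').filter(has=child_element) matches every <div> ancestor, not only the parent.
parent = child_element.locator('xpath=..')
grandparent = child_element.locator('xpath=../..')
siblings = child_element.locator('xpath=following-sibling::*')
children = child_element.locator('xpath=child::*')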
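If you need these lookups in several places, the XPath strings from the answers above can be collected into one small helper. This is only a sketch: `RELATIVE_XPATH` and `relative()` are hypothetical names, not part of Playwright, and the only thing assumed about the object passed in is that it has a `locator(selector)` method, as `Locator` does. It also adds `preceding-sibling::*`, because `following-sibling::*` on its own skips the siblings that come before the element.

```python
# Hypothetical helper, not part of Playwright: maps a relation name to the
# XPath selector used in the answers above.
RELATIVE_XPATH = {
    "parent": "xpath=..",
    "grandparent": "xpath=../..",
    "children": "xpath=child::*",
    "preceding_siblings": "xpath=preceding-sibling::*",
    "following_siblings": "xpath=following-sibling::*",
}


def relative(locator, relation):
    """Return locator.locator(<selector>) for the given relation name.

    `locator` is assumed to be any object with a .locator(selector) method,
    for example a Playwright Locator.
    """
    try:
        selector = RELATIVE_XPATH[relation]
    except KeyError:
        known = ", ".join(sorted(RELATIVE_XPATH))
        raise ValueError(f"unknown relation {relation!r}; expected one of: {known}") from None
    return locator.locator(selector)
```

With the `child_element` locator from above, usage would look like `parent = relative(child_element, "parent")` or `before = relative(child_element, "preceding_siblings")`.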