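I am trying to scrape a website with Selenium (Python). The page has the structure shown below (I anonymized the HTML), but the part that gets the data does not work.
I want to retrieve the following information from the HTML:
Whenever I try the following code, which uses an absolute XPath, it does not work:
Code: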
INFO_1 = driver.find_elements(By.XPATH, "/html/body/div[1]/div/div/div[2]/section[2]/div[2]/div[1]/div/a/div[3]/div/span[1]")
HTML code:
<a href=" anonymized href " title=" **TITLE** " class="suggestion-link">
<div>
<a href="#" style="display: inherit;"><span class="fav"></span></a></div>
<div class="image-wrap">
<!----> <!----> <!----> <!----> <!----> <!---->
<div class="carousel">
<span id="carousel_prev_8iikolcpmwi" style="display: none;"></span>
<div id="carousel_o2u8ikkx7nl" class="owl-carousel owl-theme owl-loaded owl-drag">
<div class="owl-stage-outer">
<div class="owl-stage" style="transform: translate3d(0px, 0px, 0px); transition: all 0s ease 0s; width: 970px;">
<div class="owl-item active" style="width: 323.242px;">
<div class="item">
<div class="filigrane"><img src="/images/filigrane.png"></div>
<div class="loaded"><img src=" anonymised.jpg" alt=" **TITLE 2**" class="img"> </div>
</div>
</div>
<div class="owl-item" style="width: 323.242px;">
<div class="item">
<div class="filigrane"><img src="/images/filigrane.png"></div>
<div class="loaded"><img src=" anonymized.jpg" alt=" **TITLE** " class="img"> </div>
</div>
</div>
<div class="owl-item" style="width: 323.242px;">
<div class="item">
<div class="filigrane"><img src="/images/filigrane.png"></div>
<div class="loaded"><img src=" anonymised.jpg" alt=" **TITLE** " class="img"> </div>
</div>
</div>
</div>
</div>
<div class="owl-nav disabled">
<div class="owl-prev">next</div>
<div class="owl-next">prev</div>
</div>
<div class="owl-dots"><button role="button" class="owl-dot active"><span></span></button><button role="button" class="owl-dot"><span></span></button><button role="button" class="owl-dot"><span></span></button></div>
</div>
<span id="carousel_next_d6crxou5xy"></span>
</div>
</div>
<div class="content-wrap">
<div class="card-top">
<span class="card-left uppercase"> **INFO 1**</span>
<span class="card-right">
<!----> <!---->
**INFO 2**
<!---->
</span>
</div>
<h3 class="title-wrap">**INFO 3**</h3>
<p class=""><span class="moreup"></span> **TEXT 1**.</p>
</div>
</a>
Given the HTML you shared, try the following relative XPath expressions:
//span[text()=' **INFO 1**' and @class='card-left uppercase']
//span[@class='card-right']
//h3[@class='title-wrap' and contains(text(),'INFO 3')]
//p[contains(text(),'TEXT 1')]