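I'm using bs4 and selenium to scrape a land auction site (URL https://bid.hertz.ag/ui/auctions/112571/14320874), and I can't scrape the auction dates and attachment links in the "BIDS" and "DOCS" tabs. I think this is because the site detects that a machine is simulating the click on the tabs: my scraper finds the tab it needs to click, but always crashes before clicking it.
I've tried a headless browser, calling .scrollIntoView() on the tabs, and using time.sleep() to make it look like a person waiting before clicking the tab, but I still can't figure out how to get the information returned.
Any suggestions on how to overcome this?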
You can easily get the target data using their API endpoints, without clicking any tabs:
https://bid.hertz.ag/api/items/14320874/bids?bid_type=public&newer_than=
https://bid.hertz.ag/api/auctions/112571?page=active
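The IDs in these two endpoints appear to match the two numeric segments of the page URL from the question (112571 for the auction, 14320874 for the item). Assuming that pattern holds for other listings, which this single example cannot confirm, a small helper could derive the API URLs from an auction page URL. The function name `api_urls_from_page_url` is hypothetical and the path layout is an assumption:

```python
from urllib.parse import urlparse


def api_urls_from_page_url(page_url):
    """Build the bids and auction API URLs from an auction page URL.

    Assumes the page path looks like /ui/auctions/<auction_id>/<item_id>,
    inferred from the single URL in the question.
    """
    parts = urlparse(page_url)
    segments = [s for s in parts.path.split('/') if s]

    # Reject anything that does not match the assumed layout
    if len(segments) != 4 or segments[:2] != ['ui', 'auctions']:
        raise ValueError(f"Unexpected auction page path: {parts.path!r}")
    auction_id, item_id = segments[2], segments[3]
    if not (auction_id.isdigit() and item_id.isdigit()):
        raise ValueError(f"Non-numeric IDs in path: {parts.path!r}")

    base = f"{parts.scheme}://{parts.netloc}"
    bid_url = f"{base}/api/items/{item_id}/bids?bid_type=public&newer_than="
    docs_url = f"{base}/api/auctions/{auction_id}?page=active"
    return bid_url, docs_url


bid_url, docs_url = api_urls_from_page_url(
    'https://bid.hertz.ag/ui/auctions/112571/14320874'
)
print(bid_url)
print(docs_url)
```

For the URL in the question this reproduces the two endpoints listed above; for any other listing, check the result against the requests shown in the browser's network tab before relying on it.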
import requests
from bs4 import BeautifulSoup

# API endpoints for the bid history and the auction details
bid_url = 'https://bid.hertz.ag/api/items/14320874/bids?bid_type=public&newer_than='
docs_url = 'https://bid.hertz.ag/api/auctions/112571?page=active'

# Auction details: the terms are parsed with BeautifulSoup to get plain text
doc_resp = requests.get(docs_url).json()
soup = BeautifulSoup(doc_resp['terms']['legalese'], 'lxml')
print(f"Terms and Conditions: {soup.text.strip()}")

# Attachments (file name and download URL) for the first item
links = doc_resp['items'][0]['documents']
for i in links:
    links_details = [i['file_name'], i['url']]
    print(links_details)

# Bid history: bidder ID, amount, and timestamp of each bid
bid_resp = requests.get(bid_url).json()
for i in bid_resp:
    details = [f"BIDDER: {i['anonymized_bidder_id']}", f"BID: ${i['amount']}", f"{i['placed_at']}"]
    print(details)

Output:
Terms and Conditions: Please note a complete and final series of Bidding Instructions that includes sale-specific terms and conditions will be available prior to each auction sale, and will be read by the auctioneer, prior to the start of bidding on the day of the sale.
ANNOUNCEMENTS MADE BY THE AUCTIONEER AT THE TIME OF SALE WILL TAKE PRECEDENCE OVER ANY PREVIOUSLY PRINTED MATERIAL OR ANY OTHER ORAL STATEMENTS MADE. All decisions of the Auctioneer are final.
This auction is open and public. Bids may come from persons bidding online, over the phone, or in attendance on-site from where the online sale is being administered.
All bidders must be pre-approved by Hertz Farm Management/Hertz Real Estate Services (Hertz).
Hertz reserves the right to deny registration of any bidder if there is a question as to the person’s credentials, intent, or fitness for bidding.
Hertz represents the Seller(s) in this transaction.
Bidding is not contingent upon financing, nor is it contingent upon any Buyer or Lender inspection.
Minimum bid increments are at the discretion of the auctioneer throughout the auction.
The successful bidder will be required to execute a Sales Agreement at the conclusion of the auction and submit the appropriate down payment in a timely manner as instructed by the Auctioneer.
The subject property is being offered under the sale-specific terms. Preliminary terms are provided in the printed and electronic versions of the auction brochure. Final terms will be provided prior to the beginning of bidding on the day of sale and will be included in the Real Estate Sales Agreement.
The information in the auction brochure was obtained from sources deemed reliable, but Hertz, its licensees, and its employees make no guarantees as to its accuracy.
Prospective bidders are urged to fully inspect the property prior to the auction and to rely on their own conclusions; doing so is the complete responsibility of each prospective bidder. Bidding is based solely upon bidder inspection and due diligence.
All real estate and any improvements are being offered and sold on an “As-Is, Where-Is, With ALL Faults” basis, without any warranties or representations, express or implied, as to the fitness for a particular use, access, water quality or quantity, physical or environmental condition. Neither the Seller(s) nor the Auction Company are warranting any specific zoning classification, location of utilities, assurance of building permits, driveway permits, or water or septic permits.
Any and all acreage figures, diagrams, and/or dimensions included in the marketing materials are based on information currently available but are not guaranteed. All drawn property boundaries in the marketing materials are approximate. Any soil productivity rating reference that is included in the marketing materials is based on the information currently available in the USDA/NRCS soil survey database, and these numbers are subject to change upon periodic update by USDA/NRCS.
['Decatur_120_000-3296-01(1).pdf', 'https://s3.amazonaws.com/bwpaperclip-production/item_documents/assets/000/122/305/original/Decatur_120_000-3296-01%281%29.pdf']
['HOW-TO__Register_for_Online_Bidding.pdf', 'https://s3.amazonaws.com/bwpaperclip-production/item_documents/assets/000/122/306/original/HOW-TO__Register_for_Online_Bidding.pdf']
['HOW_TO_Access_and_Navigate_the_Bidding_Platform.pdf', 'https://s3.amazonaws.com/bwpaperclip-production/item_documents/assets/000/122/307/original/HOW_TO_Access_and_Navigate_the_Bidding_Platform.pdf']
['HOW-TO_View_an_Auction_Livestream_Feed.pdf', 'https://s3.amazonaws.com/bwpaperclip-production/item_documents/assets/000/122/308/original/HOW-TO_View_an_Auction_Livestream_Feed.pdf']
['1-Bidding_Intructions.pdf', 'https://s3.amazonaws.com/bwpaperclip-production/item_documents/assets/000/123/997/original/1-Bidding_Intructions.pdf']
['3-DRAFT-Sales_Agmt.pdf', 'https://s3.amazonaws.com/bwpaperclip-production/item_documents/assets/000/123/998/original/3-DRAFT-Sales_Agmt.pdf']
['BIDDER: 33220397', 'BID: $11400', '2024-01-25T16:18:38.252Z']
['BIDDER: 33186525', 'BID: $11300', '2024-01-25T16:18:28.351Z']
['BIDDER: 33220397', 'BID: $11200', '2024-01-25T16:17:19.164Z']
['BIDDER: 33186525', 'BID: $11100', '2024-01-25T16:17:11.435Z']
['BIDDER: 33220397', 'BID: $11000', '2024-01-25T16:15:54.132Z']
['BIDDER: 33186525', 'BID: $10900', '2024-01-25T16:15:46.855Z']
['BIDDER: 33220397', 'BID: $10800', '2024-01-25T16:15:21.668Z']
['BIDDER: 33186525', 'BID: $10700', '2024-01-25T16:15:14.698Z']
['BIDDER: 33220397', 'BID: $10600', '2024-01-25T16:14:56.478Z']
['BIDDER: 33186525', 'BID: $10500', '2024-01-25T16:14:52.104Z']
['BIDDER: 33220397', 'BID: $10400', '2024-01-25T16:14:30.765Z']
['BIDDER: 33221110', 'BID: $10300', '2024-01-25T16:14:26.182Z']
['BIDDER: 33220397', 'BID: $10200', '2024-01-25T16:14:11.909Z']
['BIDDER: 33221110', 'BID: $10100', '2024-01-25T16:14:02.676Z']
['BIDDER: 33220397', 'BID: $10000', '2024-01-25T16:13:16.654Z']
['BIDDER: 33186525', 'BID: $9500', '2024-01-25T16:13:10.912Z']
['BIDDER: 33220397', 'BID: $9000', '2024-01-25T16:12:59.201Z']
['BIDDER: 33186525', 'BID: $8500', '2024-01-25T16:12:51.643Z']
['BIDDER: 33220397', 'BID: $8000', '2024-01-25T16:12:39.527Z']
['BIDDER: 33186525', 'BID: $7500', '2024-01-25T16:12:31.689Z']
['BIDDER: 33223332', 'BID: $7000', '2024-01-25T16:12:25.297Z']
['BIDDER: 33220397', 'BID: $7000', '2024-01-25T16:12:26.251Z']
['BIDDER: 33186525', 'BID: $6500', '2024-01-25T16:12:14.986Z']
['BIDDER: 33226058', 'BID: $6200', '2024-01-25T16:11:20.657Z']
['BIDDER: 33224742', 'BID: $6100', '2024-01-25T16:11:20.643Z']
['BIDDER: 33186525', 'BID: $6100', '2024-01-25T16:11:39.994Z']
['BIDDER: None', 'BID: $6000', '2024-01-25T16:11:30.421Z']
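Since the question asks for the auction dates, the `placed_at` strings in the bid list can be parsed into timezone-aware datetimes and summarized. This is only a sketch: `parse_placed_at` and `summarize_bids` are hypothetical helper names, the sample records are copied from three lines of the output above, and it assumes `amount` arrives as a number and that timestamps always have the fractional-seconds `Z` form shown here.

```python
from datetime import datetime, timezone


def parse_placed_at(ts):
    """Parse a timestamp such as '2024-01-25T16:18:38.252Z' as UTC."""
    parsed = datetime.strptime(ts, '%Y-%m-%dT%H:%M:%S.%fZ')
    return parsed.replace(tzinfo=timezone.utc)


def summarize_bids(bids):
    """Return first/last bid times and the highest bid, or None if empty."""
    if not bids:
        return None
    # Sort by time rather than trusting the order of the API response
    ordered = sorted(bids, key=lambda b: parse_placed_at(b['placed_at']))
    top = max(bids, key=lambda b: b['amount'])
    return {
        'first_bid_at': parse_placed_at(ordered[0]['placed_at']),
        'last_bid_at': parse_placed_at(ordered[-1]['placed_at']),
        'highest_bid': top['amount'],
        'highest_bidder': top['anonymized_bidder_id'],
    }


# Sample records taken from the output above (value types are assumed)
sample = [
    {'anonymized_bidder_id': 33220397, 'amount': 11400,
     'placed_at': '2024-01-25T16:18:38.252Z'},
    {'anonymized_bidder_id': 33186525, 'amount': 11300,
     'placed_at': '2024-01-25T16:18:28.351Z'},
    {'anonymized_bidder_id': None, 'amount': 6000,
     'placed_at': '2024-01-25T16:11:30.421Z'},
]

summary = summarize_bids(sample)
print(summary['first_bid_at'].isoformat(), summary['highest_bid'])
```

With the real response, `summarize_bids(bid_resp)` would be called in place of the sample list.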