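I'm new to scraping with Python. After working through a lot of helpful resources, I was able to scrape the contents of a page. However, I can't save this data to a .csv file.
Python: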
import mechanize
import time
import requests
import csv
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Firefox(executable_path=r'C:\Users\geckodriver.exe')
driver.get("myUrl.jsp")
username = driver.find_element_by_name('USER')
password = driver.find_element_by_name('PASSWORD')
username.send_keys("U")
password.send_keys("P")
main_frame = driver.find_element_by_xpath('//*[@id="Frame"]')
src = driver.switch_to_frame(main_frame)
table = driver.find_element_by_xpath("/html/body/div/div[2]/div[5]/form/div[7]/div[3]/table")
rows = table.find_elements(By.TAG_NAME, "tr")
for tr in rows:
    outfile = open("C:/Users/Scripts/myfile.csv", "w")
    with outfile:
        writers = csv.writer(outfile)
        writers.writerows(tr.text)
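Question:
Only one row gets written to the Excel file. However, when I print tr.text to the console, all the required rows are displayed. How can I write all the text in the tr elements to the Excel file?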
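Right now your code opens the file, writes one row, and closes it; on the next row it opens the file again in "w" mode, which overwrites what was written before. Consider the following snippet: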
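# Open the file once, before the loop; 'with' closes it automatically when done.
# newline='' keeps the csv module from adding blank lines between rows on Windows.
with open('C:/Users/Scripts/myfile.csv', 'w', newline='') as outfile:
    writers = csv.writer(outfile)
    # Loop through each row and write it into the already-open file
    for tr in rows:
        # writerow() expects a sequence of fields; passing the string tr.text
        # to writerows() would write one character per line instead
        cells = tr.find_elements(By.CSS_SELECTOR, "th, td")
        writers.writerow([cell.text for cell in cells])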
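To see the difference between writerow() and writerows() without a browser, here is a minimal sketch that writes into an in-memory buffer. The `write_rows` helper and the `sample_rows` data are hypothetical stand-ins for the cell texts you would collect from each tr element; they are not part of your page or of Selenium.

```python
import csv
import io


def write_rows(outfile, rows):
    """Write each row (a list of cell strings) as one CSV line."""
    writer = csv.writer(outfile)
    for cells in rows:
        writer.writerow(cells)


# Hypothetical stand-in for the cell texts scraped from each <tr>
sample_rows = [["Name", "Qty"], ["Widget", "3"], ["Gadget", "7"]]

# The "file" is opened once and every row goes into it
buffer = io.StringIO()
write_rows(buffer, sample_rows)
csv_text = buffer.getvalue()

# For contrast: writerows() iterates over its argument, so a plain
# string is treated as a sequence of one-character rows
char_buffer = io.StringIO()
csv.writer(char_buffer).writerows("ab")
char_text = char_buffer.getvalue()

print(csv_text)
print(char_text)
```

With a real file you would swap the buffer for `open(path, 'w', newline='')`, as in the snippet above.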