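I read my URLs from a CSV file, and I want to export the results to a new CSV file at the end. I am using the following code with about 60 URLs: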
import csv
from bs4 import BeautifulSoup
import requests
from time import sleep
from multiprocessing import Pool

contents = []
with open('websupplies2.csv') as csvf:
    reader = csv.reader(csvf, delimiter=";")
    for row in reader:
        contents.append(row)  # Add each url to list contents

price_text = '-'
availability_text = '-'

def parse(contents):
    info = []
    with open('output_websupplies.csv', mode='w') as f:
        f_writer = csv.writer(f, delimiter=';', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        f_writer.writerow(['SKU', 'Price', 'Availability'])
        for row in contents:  # Parse through each url in the list.
            sleep(3)
            page = requests.get(row[1]).content
            soup = BeautifulSoup(page, "html.parser")
            price = soup.find('div', attrs={'class': 'product-price'})
            if price is not None:
                price_text = price.text.strip()
                print(price_text)
            else:
                price_text = "0,00"
                print(price_text)
            availability = soup.find('div', attrs={'class': 'available-text'})
            if availability is not None:
                availability_text = availability.text.strip()
                print(availability_text)
            else:
                availability_text = "Μη Διαθέσιμο"
                print(availability_text)
            info.append(row[0])
            info.append(price_text)
            info.append(availability_text)
    return ';'.join(info)

if __name__ == "__main__":
    with Pool(10) as p:
        records = p.map(parse, contents)

if len(records) > 0:
    with open('output_websupplies.csv', 'a+') as f:
        f.write('\n'.join(records))

But I get an error message: NameError: name 'records' is not defined. What should I change to make the script work?

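First, double-check your indentation. What you pasted here looks inconsistent, and if your if len(records) > 0: line is not indented under the if __name__ == "__main__": guard, a NameError is the expected result. On platforms where multiprocessing starts its workers by re-importing the script (Windows, for example), each worker skips the guarded block, so records is never assigned before that line runs.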
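For a statement to be inside a block, it must be indented to the same level as the other statements in that block, and further than the line that opens the block. In other words, everything directly inside the if statement should line up in the same column. For example:
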
if __name__ == "__main__":
    with Pool(10) as p:
        records = p.map(parse, contents)
    if len(records) > 0:
        with open('output_websupplies.csv', 'a+') as f:
            f.write('\n'.join(records))
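
Separately from the indentation, Pool.map calls the worker function once per element of the list you pass in, so parse receives a single row rather than the whole of contents. The sketch below shows one way the script could be laid out with that in mind. It is only an illustration under stated assumptions: the two sample rows and the write_records helper are hypothetical names that are not in your script, and the price and availability values are placeholders standing in for your requests/BeautifulSoup lookup.

```python
import csv
from multiprocessing import Pool


def parse(row):
    """Handle ONE row; Pool.map calls this once per element of the input list."""
    sku = row[0]
    # Placeholder values (assumption): the real script would fetch row[1]
    # with requests and read these two fields with BeautifulSoup.
    price_text = "0,00"
    availability_text = "Μη Διαθέσιμο"
    return [sku, price_text, availability_text]


def write_records(records, path):
    """Hypothetical helper: write the header and all rows from the parent process."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=";", quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
        writer.writerow(["SKU", "Price", "Availability"])
        writer.writerows(records)


if __name__ == "__main__":
    # Sample rows (assumption) in place of the rows read from websupplies2.csv.
    contents = [["A1", "https://example.com/a"],
                ["B2", "https://example.com/b"]]
    with Pool(2) as p:
        records = p.map(parse, contents)
    # Still inside the guard, so records is always defined when this runs.
    if len(records) > 0:
        write_records(records, "output_websupplies.csv")
```

Writing the file once, in the parent process, also avoids having several workers open output_websupplies.csv in 'w' mode at the same time and overwrite each other's output.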