Python: Turning a list of URLs into a list of text in Python (Excel)

Question

I have a list of tweet URLs in Excel. Is it possible to extract the text of these tweets (URLs) in Python and then save it in an Excel sheet?

I saw someone use the code below, but it only works for one URL.

from lxml import html
import requests

page = requests.get('https://twitter.com/realDonaldTrump/status/1237448419284783105')
tree = html.fromstring(page.content)
tree.xpath('//div[contains(@class, "permalink-tweet-container")]//p[contains(@class, "tweet-text")]//text()')

The Excel file contains the following columns: Author and URL. The Excel file ('twitter.xlsx') looks like this:

Author              URL
realDon..      https://twitter.com/realDon..
.                         .
.                         .
.                         .

I tried this code:

import pandas as pd
from lxml import html
import requests

input_data = pd.read_excel('twitter.xlsx')

input_data1 = input_data[['URL']]

tweets = []
for url in input_data1.values:
    x = requests.get(url)
    tree = html.fromstring(x.content)
    i = tree.xpath('//div[contains(@class, "permalink-tweet container")]//p[contains(@class, "tweet-text")]//text()')
    tweets.append(i)

Error: InvalidSchema: No connection adapters were found for "['https://twitter.com/realDonaldTrump/status/1237448419284783105']"

Answer 1

Short answer: yes.

Long answer: yes, it is possible. I suggest doing some reading on the topic.

https://automatetheboringstuff.com/chapter12/ explains how to manage and manipulate Excel files. The openpyxl library is your friend; see its documentation.
requests is a great library for accessing websites; see its documentation as well.
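As a rough illustration of the openpyxl side, the sketch below builds a small in-memory workbook with the same Author and URL columns as the question's file and reads the URL column back as plain strings. The sample row and the in-memory buffer are assumptions for demonstration only; with a real file you would likely pass 'twitter.xlsx' to load_workbook instead.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a tiny workbook that mimics the layout of twitter.xlsx
# (the single sample row is illustrative, not real data from the file).
wb = Workbook()
ws = wb.active
ws.append(["Author", "URL"])
ws.append(["realDonaldTrump",
           "https://twitter.com/realDonaldTrump/status/1237448419284783105"])

# Save to an in-memory buffer so the example does not touch the disk.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read the second column back, skipping the header row.
sheet = load_workbook(buffer).active
urls = [row[1] for row in sheet.iter_rows(min_row=2, values_only=True)]
```

Each entry of urls is an ordinary string, which is the form requests.get expects.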

Here is pseudocode that models the program logic:

input_data = read(excel_file)
tweets = []
for url in input_data:
    x = get(url)
    tweets.append(x)
for tweet in tweets:
    write(tweet, excel_file)
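The pseudocode above could be fleshed out roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (extract_tweet_text, fetch_tweets, main), the "Tweet" column, and the output file name are hypothetical, and the XPath is copied from the question, so it may no longer match the markup Twitter serves today. It also assumes pandas can read .xlsx files, which requires openpyxl to be installed. The sketch iterates over input_data["URL"], which yields plain strings; the question's input_data[['URL']].values yields one-element arrays instead, which appears to be what triggers the InvalidSchema error.

```python
import pandas as pd
import requests
from lxml import html

# XPath copied from the question; whether it still matches is an assumption.
TWEET_XPATH = ('//div[contains(@class, "permalink-tweet-container")]'
               '//p[contains(@class, "tweet-text")]//text()')


def extract_tweet_text(page_content):
    """Return the tweet text found in one page's HTML ('' if nothing matches)."""
    tree = html.fromstring(page_content)
    return "".join(tree.xpath(TWEET_XPATH)).strip()


def fetch_tweets(urls, get=requests.get):
    """Fetch every URL (each a plain string) and collect the extracted text.

    The `get` parameter is a hypothetical hook so the function can be
    exercised without network access.
    """
    tweets = []
    for url in urls:
        response = get(url)
        tweets.append(extract_tweet_text(response.content))
    return tweets


def main(in_path="twitter.xlsx", out_path="twitter_with_text.xlsx"):
    """Read the URL column, fetch each tweet, and write a new workbook."""
    input_data = pd.read_excel(in_path)
    # Single brackets select a Series of strings, not a 2-D array of rows.
    input_data["Tweet"] = fetch_tweets(input_data["URL"])
    input_data.to_excel(out_path, index=False)
```

Calling main() would read twitter.xlsx from the current directory and write the result to a separate file, leaving the original workbook untouched.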
