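So I'm doing some Twitter scraping. The code below gives me the tweet text, username, and follower count, but I can't figure out how to return hashtag TOTALS. Essentially, I want to know how many times a hashtag has been used in a given time period, or the overall total for that hashtag. I've searched everywhere and can't find it. This isn't my original code; I got it from here. If anyone can help me, I'd really appreciate it. Thanks.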
import csv
import re

import tweepy


def search_for_hashtags(consumer_key, consumer_secret, access_token, access_token_secret, hashtag_phrase):
    # Authenticate against the Twitter API
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)

    # Name the output file after the hashtag(s) being searched
    fname = '_'.join(re.findall(r"#(\w+)", hashtag_phrase))

    # Text mode with newline='' is what the csv module expects on Python 3
    with open('%s.csv' % fname, 'w', newline='', encoding='utf-8') as file:
        w = csv.writer(file)
        w.writerow(['timestamp', 'tweet_text', 'username', 'all_hashtags', 'followers_count'])
        # api.search is the Tweepy 3.x name; Tweepy 4.x renamed it to api.search_tweets
        for tweet in tweepy.Cursor(api.search, q=hashtag_phrase + ' -filter:retweets',
                                   lang="en", tweet_mode='extended').items(100):
            w.writerow([tweet.created_at,
                        tweet.full_text.replace('\n', ' '),
                        tweet.user.screen_name,
                        [e['text'] for e in tweet._json['entities']['hashtags']],
                        tweet.user.followers_count])


# I'm running this in a Jupyter [In] cell
consumer_key = input('Consumer Key ')
consumer_secret = input('Consumer Secret ')
access_token = input('Access Token ')
access_token_secret = input('Access Token Secret ')
hashtag_phrase = input('Hashtag Phrase ')

if __name__ == '__main__':
    search_for_hashtags(consumer_key, consumer_secret, access_token, access_token_secret, hashtag_phrase)
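To be clear, do you want the total number of times a particular hashtag has been used on Twitter? Or the number of times a particular user has used it?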
I see your code already grabs the hashtag entities, which should give you the hashtag dictionaries along with their indices in the tweet.
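Building on those hashtag entities, one way to get totals is to tally the `text` field of each entity with a `collections.Counter`. This is only a rough sketch: `count_hashtags` and `sample_tweets` are made-up names for illustration, and the sample dicts just mimic the `entities` part of `tweet._json`. Keep in mind it only counts hashtags in the tweets you actually retrieved, not every use across Twitter.

```python
from collections import Counter


def count_hashtags(tweets):
    """Tally hashtag usage across an iterable of tweet JSON dicts.

    Hypothetical helper: each item is expected to look like tweet._json,
    i.e. to carry an 'entities' dict with a 'hashtags' list.
    """
    counts = Counter()
    for tweet in tweets:
        for tag in tweet.get('entities', {}).get('hashtags', []):
            # Lowercase so '#Python' and '#python' count as the same hashtag
            counts[tag['text'].lower()] += 1
    return counts


# Stand-in data shaped like the 'entities' section of tweet._json
sample_tweets = [
    {'entities': {'hashtags': [{'text': 'Python', 'indices': [0, 7]},
                               {'text': 'data', 'indices': [8, 13]}]}},
    {'entities': {'hashtags': [{'text': 'python', 'indices': [5, 12]}]}},
    {'entities': {'hashtags': []}},
]

totals = count_hashtags(sample_tweets)
print(totals.most_common())  # [('python', 2), ('data', 1)]
```

With the search loop from the question, you could pass in something like `(t._json for t in tweepy.Cursor(...).items(100))` instead of `sample_tweets`. To restrict the count to a time period, you would filter on `tweet.created_at` before tallying.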