How do I retrieve full-text tweets with tweepy?

Question (votes: 2, answers: 1)

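I am trying to scrape tweets from user-defined Twitter profiles. After reading earlier posts, I understand that the Twitter JSON has an extended tweet section. I have added tweet_mode='extended' to my api.user_timeline call and changed .text to .full_text.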

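However, I still get truncated tweets. I know that retweets have a full_text attribute, but I am grabbing the whole timeline rather than separating tweets from retweets.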

Is there a way to query tweets in general and retrieve the full_text version? My code is below.

screen_name_list = ['@x']

for name in screen_name_list:
    user = api.get_user(name)

    #initialize a list to hold all the tweepy Tweets
    alltweets = []  

    #make initial request for most recent tweets (200 is the maximum allowed count)
    new_tweets = api.user_timeline(screen_name = name, count = 200,tweet_mode='extended', include_rts=True)

    #save most recent tweets
    alltweets.extend(new_tweets)

    #save the id of the oldest tweet less one
    oldest = alltweets[-1].id - 1

    #keep grabbing tweets until there are no tweets left to grab
    while len(new_tweets) > 0:
        print('getting tweets before %s' % (oldest))

        #all subsequent requests use the max_id param to prevent duplicates
        new_tweets = api.user_timeline(screen_name = name, count=200, max_id=oldest, tweet_mode='extended')

        #save most recent tweets
        alltweets.extend(new_tweets)

        #update the id of the oldest tweet less one
        oldest = alltweets[-1].id - 1

        print("...%s tweets downloaded so far" % (len(alltweets)))

    #transform the tweepy tweets into a 2D array that will populate the csv 
    outtweets = [[tweet.id_str, tweet.created_at, tweet.full_text.encode('utf-8')] for tweet in alltweets]
    tweet_time = [index[1] for index in outtweets]
    tweet_list = [index[2] for index in outtweets]
Tags: python twitter tweepy tweets
1 answer
0 votes

If you replace

tweet.full_text

with

tweet.retweeted_status.full_text if tweet.full_text.startswith("RT @") else tweet.full_text

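you will get the full text of retweets, although without the leading "RT", so you may also want to add another column to the CSV that flags retweets, for example: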

[1 if tweet.full_text.startswith("RT @") else 0 for tweet in alltweets]
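
Putting the two pieces together, here is a minimal sketch of how the rows could be built. It assumes the tweets were fetched with tweet_mode='extended', and it checks for the retweeted_status attribute instead of the "RT @" prefix, which is a variation on the answer above rather than what it literally proposes. The helper names full_text_of and to_rows are hypothetical, and the SimpleNamespace objects are stand-ins that only mimic the attributes used here, not real tweepy objects.

```python
from types import SimpleNamespace


def full_text_of(tweet):
    """Return the untruncated text of a tweet fetched with tweet_mode='extended'."""
    # A retweet carries the original tweet in `retweeted_status`; that nested
    # object's full_text is not cut off, unlike the "RT @user: ..." text.
    original = getattr(tweet, "retweeted_status", None)
    return original.full_text if original is not None else tweet.full_text


def to_rows(tweets):
    """Build CSV rows: id, timestamp, full text, and a 0/1 retweet flag."""
    return [
        [t.id_str, t.created_at, full_text_of(t), int(hasattr(t, "retweeted_status"))]
        for t in tweets
    ]


# Stand-in objects for illustration only (assumed shapes, not real API data).
plain = SimpleNamespace(id_str="1", created_at="2019-01-01", full_text="hello world")
retweet = SimpleNamespace(
    id_str="2",
    created_at="2019-01-02",
    full_text="RT @someone: a long tweet that gets cut off...",
    retweeted_status=SimpleNamespace(
        full_text="a long tweet that gets cut off here, in full"
    ),
)

rows = to_rows([plain, retweet])
```

In the question's code, to_rows(alltweets) would take the place of the outtweets list comprehension, with the extra fourth column marking retweets.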