How do I get tweets from certain dates using Tweepy?

Question  Votes: 1  Answers: 1

How can I get tweets from specific dates using Tweepy?

This is the code I wrote (Jupyter):

import tweepy as tw
import xlsxwriter
import datetime
import pandas as pd
from time import sleep

consumer_key = "#"
consumer_secret = "#"
access_key = "#"
access_secret = "#"
try:
    auth = tw.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    auth.get_authorization_url()
    api = tw.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True,
                 compression=True, retry_count=3, retry_delay=10, timeout=15)
except tw.TweepError:
    print('Error')

name = "mahfiegilmez"

startDate = datetime.datetime(2018, 6, 24, 0, 0, 0)
endDate = datetime.datetime(2018, 12, 31, 23, 59, 59)

say = 0
tweets = []
tmpTweets = api.user_timeline(name, count=200, tweet_mode="extended", lang="tr")

for tweet in tmpTweets:
    if startDate < tweet.created_at < endDate:
        tweets.append(tweet)

lastTweet = tmpTweets[-1].id
while tmpTweets[-1].created_at > startDate:
    print("Sonraki Tweet @", tmpTweets[-1].created_at, say)

    tmpTweets = api.user_timeline(name, max_id=tmpTweets[-1].id, tweet_mode="extended")
    if lastTweet == tmpTweets[-1].id:
        print("lastTweet")
        sleep(15)
    else:
        for tweet in tmpTweets:
            if startDate < tweet.created_at < endDate:
                tweets.append(tweet)
    lastTweet = tmpTweets[-1].id
    say += 1

Next section:

tweets2 = []
tweets.reverse()
for x in tweets:
    if x.in_reply_to_status_id is None or x.in_reply_to_screen_name == name:
        if not x.retweeted and "RT @" not in x.full_text:
            tweets2.append(x)

The output looks like this:

  • Sonraki Tweet @ 2019-02-15 13:33:26 1095106703098605568 157
  • Sonraki Tweet @ 2019-02-11 23:45:58 1094442196500209666 158
  • Sonraki Tweet @ 2019-02-10 03:45:28 1094441678889463809 159
  • Sonraki Tweet @ 2019-02-10 03:43:24 1094441678889463809 160
  • Sonraki Tweet @ 2019-02-10 03:43:24 1094441678889463809 161
  • Sonraki Tweet @ 2019-02-10 03:43:24 1094441678889463809 162
  • Sonraki Tweet @ 2019-02-10 03:43:24 1094441678889463809 163.....

How can I fix this?

Eventually it fails with this error:

> IndexError                                Traceback (most recent call last)
> <ipython-input-9-46264abdd8ef> in <module>
>       9         tweets.append(tweet)
>      10 
> ---> 11 while (tmpTweets[-1].created_at > startDate):
>      12     print("Last Tweet @", tmpTweets[-1].created_at, " - fetching some more")
>      13     tmpTweets = api.user_timeline(username, max_id = tmpTweets[-1].id)
> 
> IndexError: list index out of range
python twitter tweepy tweets twitterapi-python
1 Answer
Votes: 1

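A better approach is to use the since_id / max_id parameters of the API.user_timeline method (the GET statuses/user_timeline endpoint), rather than making many unnecessary requests to iterate over a large number of tweets outside your time range. You should also consider using GET statuses/user_timeline instead.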
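
To make the since_id / max_id suggestion concrete, here is a minimal sketch, not a definitive implementation. It relies on two things that are not in the question: tweet IDs issued since late 2010 are "snowflake" IDs that encode their creation time in milliseconds, so a date can be turned into an ID bound; and since_id is exclusive while max_id is inclusive. The names id_to_datetime, datetime_to_id, tweets_between and fetch_page are hypothetical helpers made up for this illustration, and the dates must be timezone-aware UTC datetimes (the naive ones in the question would need tzinfo added).

```python
from datetime import datetime, timedelta, timezone

# Twitter's snowflake epoch in milliseconds (applies to IDs issued since late 2010).
TWITTER_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def id_to_datetime(tweet_id):
    """Return the UTC creation time encoded in a snowflake tweet ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=ms)


def datetime_to_id(moment):
    """Return the smallest snowflake ID that could be issued at `moment` (aware UTC)."""
    ms = (moment - UNIX_EPOCH) // timedelta(milliseconds=1)
    return (ms - TWITTER_EPOCH_MS) << 22


def tweets_between(fetch_page, start, end):
    """Collect tweets created in [start, end), newest first.

    `fetch_page(since_id=..., max_id=...)` is a hypothetical callable that
    returns one page of tweet objects, each with an `id` attribute.
    """
    since_id = datetime_to_id(start) - 1  # since_id is exclusive
    max_id = datetime_to_id(end) - 1      # max_id is inclusive
    collected = []
    while True:
        page = fetch_page(since_id=since_id, max_id=max_id)
        if not page:
            # An empty page means nothing older is left in the range.
            # Indexing page[-1] here is what raises the IndexError.
            break
        collected.extend(page)
        # Step below the oldest ID seen; reusing it as max_id would return
        # the same tweet again and the loop would never advance.
        max_id = min(tweet.id for tweet in page) - 1
    return collected


# Possible wiring for the question's Tweepy 3.x `api` and `name` (untested sketch):
# fetch = lambda since_id, max_id: api.user_timeline(
#     screen_name=name, count=200, tweet_mode="extended",
#     since_id=since_id, max_id=max_id)
# tweets = tweets_between(fetch,
#                         datetime(2018, 6, 24, tzinfo=timezone.utc),
#                         datetime(2019, 1, 1, tzinfo=timezone.utc))
```

The two comments in the loop correspond to the symptoms in the question: the repeated "2019-02-10 03:43:24" lines come from passing the last tweet's own ID as max_id, and the IndexError comes from indexing an empty page. Keep in mind that this timeline endpoint only reaches back through roughly a user's most recent 3,200 tweets, so older date ranges may still come back empty. Tweepy's Cursor can handle the max_id bookkeeping for you if you prefer not to write the loop by hand.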
