Searching for specific keywords using the Twitter API and Spark


I am trying the code below, in which I replaced "#" with "#Apple".

val ssc = new StreamingContext("local[*]", "PopularHashtags", Seconds(1))
val tweets = TwitterUtils.createStream(ssc, None)
val statuses = tweets.map(status => status.getText())
val tweetwords = statuses.flatMap(tweetText => tweetText.split(" "))
val hashtags = tweetwords.filter(word => word.startsWith("#"))
val hashtagKeyValues = hashtags.map(hashtag => (hashtag, 1))
val hashtagCounts = hashtagKeyValues.reduceByKeyAndWindow( (x,y) => x + y, (x,y) => x - y, Seconds(1000), Seconds(1))
val sortedResults = hashtagCounts.transform(rdd => rdd.sortBy(x => x._2, false))
sortedResults.print

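But I did not get any results.

Does this stream have limits on how many tweets it returns, and on which region the tweets come from? I also tried searching for #OPPO, since it was trending on my Twitter account, but I still got no results.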

scala apache-spark twitter twitter-streaming-api
1 Answer
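import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

val ssc = new StreamingContext("local[*]", "PopularHashtags", Seconds(1))
// The keywords you want to look for can be passed to createStream as a sequence of filters
val seq: Seq[String] = Seq("#Rajasthan", "#Apple")
val tweets = TwitterUtils.createStream(ssc, None, seq)
val statuses = tweets.map(status => status.getText())
val tweetwords = statuses.flatMap(tweetText => tweetText.split(" "))
val hashtags = tweetwords.filter(word => word.contains("#"))
val hashtagKeyValues = hashtags.map(hashtag => (hashtag, 1))
val hashtagCounts = hashtagKeyValues.reduceByKeyAndWindow((x, y) => x + y, (x, y) => x - y, Seconds(1000), Seconds(1))
val sortedResults = hashtagCounts.transform(rdd => rdd.sortBy(x => x._2, false))
sortedResults.print
// reduceByKeyAndWindow with an inverse function needs a checkpoint directory (this path is a placeholder)
ssc.checkpoint("checkpoint")
// Nothing is computed until the streaming context is started
ssc.start()
ssc.awaitTermination()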
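
Note that the sequence passed to createStream only limits which tweets arrive. The later steps still count every word that passes the hashtag filter, not just the keywords you asked for. To check that part of the logic without Twitter credentials, here is a minimal sketch using plain Scala collections instead of a DStream. The object name, the sample texts, and the helper functions are all made up for illustration; they are not part of Spark or the Twitter API.

```scala
// Plain-Scala sketch of the tokenize / filter / count steps; no Spark or Twitter access needed.
object HashtagCountSketch {
  // Hypothetical sample texts standing in for the results of status.getText().
  val sampleTweets: Seq[String] = Seq(
    "New phone from #Apple today",
    "Comparing #apple and #OPPO cameras",
    "Visiting #Rajasthan next week",
    "Email me at user#apple about this"
  )

  // Split on runs of whitespace so tabs and newlines inside a tweet also separate words.
  def words(text: String): Seq[String] =
    text.split("\\s+").toSeq.filter(_.nonEmpty)

  // Count every word that starts with '#', most frequent first, ties broken by tag name.
  def countHashtags(tweets: Seq[String]): Seq[(String, Int)] =
    tweets.flatMap(words)
      .filter(_.startsWith("#"))
      .groupBy(identity)
      .map { case (tag, hits) => (tag, hits.size) }
      .toSeq
      .sortBy { case (tag, n) => (-n, tag) }

  // Count occurrences of a single hashtag, ignoring case.
  def countOne(tweets: Seq[String], tag: String): Int =
    tweets.flatMap(words).count(_.equalsIgnoreCase(tag))

  def main(args: Array[String]): Unit = {
    countHashtags(sampleTweets).foreach(println)
    println(countOne(sampleTweets, "#Apple"))
  }
}
```

With startsWith("#"), the word "user#apple" is skipped, whereas contains("#") would count it as a hashtag. If you only care about one keyword, the same equalsIgnoreCase comparison can be used in the filter step of the streaming code, so that "#Apple" and "#apple" are counted together.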