I have a string that looks like this:
{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I've collected 72,455 gold coins! http:\/\/t.co\/eTEbfxpAr0 #iphone"}
I would like the result to be:
"Tue May 12 09:45:33 +0000 2015"
598061439090196480
"598061439090196480"
"I've collected 72,455 gold coins! http:\/\/t.co\/eTEbfxpAr0 #iphone"
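Splitting on a delimiter works, but for some strings it breaks a single value apart and starts a new line. Could you suggest a function that lets me give the start and end patterns of the substring? A different approach would also be very helpful. Thanks.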
Since your content is in JSON format, use one of the JSON parsers.
Example:
string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I\'ve collected 72,455 gold coins! http://example.com/eTEbfxpAr0 #iphone"}'
library(jsonlite)
fromJSON(string)
# $created_at
# [1] "Tue May 12 09:45:33 +0000 2015"
#
# $id
# [1] 5.980614e+17
#
# $id_str
# [1] "598061439090196480"
#
# $text
# [1] "I've collected 72,455 gold coins! http://example.com/eTEbfxpAr0 #iphone"
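If only the values are needed, the parsed result can be indexed like any named list. The following is a minimal sketch, assuming `jsonlite` is installed; the variable names (`tweet`, `created`, `values`) are illustrative and not part of the original answer. It uses the string exactly as posted in the question, with the escaped slashes (`\/`), to show that the parser unescapes them.

```r
library(jsonlite)

# The string from the question, including the JSON-escaped slashes (\/).
# Backslashes are doubled because this is an R string literal.
string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I\'ve collected 72,455 gold coins! http:\\/\\/t.co\\/eTEbfxpAr0 #iphone"}'

tweet <- fromJSON(string)

# Individual fields are ordinary named list elements.
created <- tweet$created_at
text <- tweet$text  # the parser turns \/ back into /

# The numeric id is read as a double, which may not hold every digit
# of such a large integer; id_str carries the same value as a string.
id <- tweet$id_str

# The string-valued fields as a plain character vector.
values <- unname(unlist(tweet[c("created_at", "id_str", "text")]))
print(values)
```

Preferring `id_str` over `id` sidesteps the `5.980614e+17` display shown above and any rounding of the large integer.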
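You could also use the regmatches function. That said, it is better to go with Ananda's answer, because using a parser created specifically for JSON is the way to go.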
> string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I\'ve collected 72,455 gold coins! http://t.co/eTEbfxpAr0 #iphone"}'
> regmatches(string, gregexpr("(?<=:)(?:\"[^\"]*\"|[^,}]*)", string, perl=TRUE))[[1]]
[1] "\"Tue May 12 09:45:33 +0000 2015\""
[2] "598061439090196480"
[3] "\"598061439090196480\""
[4] "\"I've collected 72,455 gold coins! http://t.co/eTEbfxpAr0 #iphone\""
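To illustrate why the parser is the safer choice, here is a small sketch comparing the two approaches on a hypothetical input (`tricky`, not from the question) whose text value contains escaped double quotes. The regular expression treats the first `"` it meets as the end of the value, so it stops at the escaped quote, while the JSON parser decodes the value correctly. It assumes `jsonlite` is installed.

```r
library(jsonlite)

# Hypothetical input: the "text" value contains escaped double quotes.
# As JSON text this is {"id_str":"1","text":"she said \"hi\" twice"}
tricky <- '{"id_str":"1","text":"she said \\"hi\\" twice"}'

# Same pattern as in the regmatches call above.
pattern <- "(?<=:)(?:\"[^\"]*\"|[^,}]*)"
regex_values <- regmatches(tricky, gregexpr(pattern, tricky, perl = TRUE))[[1]]

# The second match is cut off at the first escaped quote.
print(regex_values)

# The JSON parser understands the escape and returns the full value.
parsed <- fromJSON(tricky)
print(parsed$text)
```

The pattern works for the string in the question only because none of its values contain an escaped quote.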