Substring extraction in R

Question

I have a string that looks like this:

{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I've collected 72,455 gold coins! http:\/\/t.co\/eTEbfxpAr0 #iphone"}

I want the result to be:

"Tue May 12 09:45:33 +0000 2015"  

598061439090196480

"598061439090196480"

"I've collected 72,455 gold coins! http:\/\/t.co\/eTEbfxpAr0 #iphone"

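Splitting on a delimiter works, but for some strings it breaks a line apart and starts a new one. Please suggest a function to which I can give the start and end patterns of the substring; a different approach would also be very helpful. Thanks.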

Answer 1

Since your content is in JSON format, use one of the JSON parsers.

Example:

string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I\'ve collected 72,455 gold coins! http://example.com/eTEbfxpAr0 #iphone"}'
library(jsonlite)
fromJSON(string)
# $created_at
# [1] "Tue May 12 09:45:33 +0000 2015"
# 
# $id
# [1] 5.980614e+17
# 
# $id_str
# [1] "598061439090196480"
# 
# $text
# [1] "I've collected 72,455 gold coins! http://example.com/eTEbfxpAr0 #iphone"
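Because `fromJSON` returns an ordinary named list here, each value asked for in the question can be pulled out by name. The snippet below is a minimal sketch along those lines; the variable name `parsed` is only illustrative. IDs of this size can in general exceed the exact integer range of a double, so `id_str` is the safer field when the exact digits matter.

```r
library(jsonlite)

string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I\'ve collected 72,455 gold coins! http://example.com/eTEbfxpAr0 #iphone"}'

# Parse once, then reuse the resulting named list.
parsed <- fromJSON(string)

# Individual fields are reached by name, like any R list.
parsed$created_at
parsed$text

# The numeric id is stored as a double and printed in scientific notation
# by default; the string copy keeps every digit as written.
parsed$id_str
```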

Answer 2

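You could also use the regmatches function. It is better to go with Ananda's answer, though, since using a parser created specifically for parsing JSON is the way to go.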

> string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480","text":"I\'ve collected 72,455 gold coins! http://t.co/eTEbfxpAr0 #iphone"}'
> regmatches(string, gregexpr("(?<=:)(?:\"[^\"]*\"|[^,}]*)", string, perl = TRUE))[[1]]
[1] "\"Tue May 12 09:45:33 +0000 2015\""                                  
[2] "598061439090196480"                                                  
[3] "\"598061439090196480\""                                              
[4] "\"I've collected 72,455 gold coins! http://t.co/eTEbfxpAr0 #iphone\""
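The question also asks for a function that takes a start and an end pattern. Base R has no single function for that, so the following is only a sketch built on `regexec` and `regmatches`. The helper name `extract_between` is hypothetical, and it assumes `start` and `end` are valid regular expressions that contain no capture groups of their own. For real JSON, a parser remains the more robust choice.

```r
# Hypothetical helper (not part of base R): return the text found between
# the first occurrence of `start` and the nearest following `end`.
# Both arguments are treated as regular expressions.
extract_between <- function(x, start, end) {
  pattern <- paste0(start, "(.*?)", end)
  hits <- regmatches(x, regexec(pattern, x, perl = TRUE))
  # Each element holds the full match followed by the captured group;
  # strings with no match yield NA.
  vapply(hits,
         function(hit) if (length(hit) >= 2) hit[2] else NA_character_,
         character(1))
}

string <- '{"created_at":"Tue May 12 09:45:33 +0000 2015","id":598061439090196480,"id_str":"598061439090196480"}'

extract_between(string, '"created_at":"', '"')
extract_between(string, '"id":', ',')
```

The lazy quantifier `.*?` stops at the first `end` after `start`, which is why the timestamp is not cut off at one of its internal colons.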
