我有2个与瑞典字符有关的问题。我直接从MS SQL数据库中获取数据。 1.任何人都可以给我一个提示,我怎样才能改变R中的瑞典字符?
我使用write.csv将数据写入csv,然后在这里复制并粘贴这些字符串以使df如下所示
library(tidyverse)
library(ggplot2)
library(scales)
c <- c("c","u","m","j","c","u","m","j","c","u","m","j")
city <- c("G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping")
priority <- c(1,1,1,1,0,0,0,0,2,3,3,2)
n_cust <- sample(50:1000, 12, replace=T)
df <- data.frame(c,city,priority,n_cust)
应该是ö并且是å
dpri %>% group_by(kommun, artikel_prioritet) %>% summarise(n_cust=n_distinct(kund_id), sum_sales=sum(p_sum_adj_sale), avg_margin=mean(pp_avg_margin), avg_pec_sales=mean(p_pec_sales)) %>% arrange(desc(sum_sales)) %>% head(20)%>% ggplot(aes(x=reorder(kommun, sum_sales), y=sum_sales, fill=factor(artikel_prioritet))) + geom_bar(stat='identity')+ coord_flip()+ scale_y_continuous(labels = comma)+ facet_grid(.~ factor(artikel_prioritet), scales = "free")+ theme(legend.position="none")
我收到此错误:grid.Call中的错误(C_textBounds,as.graphicsAnnot(x $ label),x $ x,x $ y,:'utf8towcs'中输入'Göteborg'无效
如果我先将这个头(20)放入变量ci中。然后使用ggplot绘制ci
ggplot(ci,aes(x = reorder(kommun,sum_sales),y = sum_sales,fill = factor(artikel_prioritet)))+ geom_bar(stat ='identity')+ coord_flip()+ scale_y_continuous(labels =逗号)+ facet_grid(.~factor(artikel_prioritet),scales =“free”)+ 主题(legend.position = “无”)
我有条形图没有任何城市传说。然后我打印出ci,我得到如下照片:
然后,我把头(20)写到csv'nityname.csv'然后read.csv回到R使用相同的代码来做条形图
ci < - read.csv(“cityname.csv”)
ggplot(ci,aes(x = reorder(kommun,sum_sales),y = sum_sales,fill = factor(artikel_prioritet)))+ geom_bar(stat ='identity')+ coord_flip()+ scale_y_continuous(labels =逗号)+ facet_grid(.~factor(artikel_prioritet),scales =“free”)+ 主题(legend.position = “无”)
这次我们可以看到传说,但这次看到了。希望得到一些建议我如何修复瑞典语中的字符串,并想知道没有write.csv有没有其他方法,然后再次阅读仍然可以得到条形图固定?
谢谢!
我相信你的问题是R不知道如何解释你的字符编码。尝试用\u
表示法代替<>
,它表示R中的UTF-8编码
> city <- c("G\u00f6teborg", "Ume\u00e5", "Malm\u00f6", "J\u00f6nk\u00f6ping","G\u00f6teborg", "Ume\u00e5", "Malm\u00f6", "J\u00f6nk\u00f6ping","G\u00f6teborg", "Ume\u00f6", "Malm\u00f6", "J\u00f6nk\u00f6ping")
> Encoding(city)
[1] "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8" "UTF-8"
> head(city)
[1] "Göteborg" "Umeå" "Malmö" "Jönköping" "Göteborg" "Umeå"
编辑:您问了一个关于如何以编程方式进行此替换的良好后续问题。我已经为此提供了一个解决方案,使用tidyverse
包dplyr
和stringr
> city <- c("G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<e5>", "Malm<f6>", "J<f6>nk<f6>ping","G<f6>teborg", "Ume<f6>", "Malm<f6>", "J<f6>nk<f6>ping")
> city_df <- as.data.frame(city)
> special_character_replacements <- c("<f6>" = "\\u00f6", "<e5>" = "\\u00e5")
> city_df %>%
dplyr::mutate(city_fixed =
stringr::str_replace_all(city, special_character_replacements))
city city_fixed
1 G<f6>teborg Göteborg
2 Ume<e5> Umeå
3 Malm<f6> Malmö
4 J<f6>nk<f6>ping Jönköping
5 G<f6>teborg Göteborg
6 Ume<e5> Umeå
7 Malm<f6> Malmö
8 J<f6>nk<f6>ping Jönköping
9 G<f6>teborg Göteborg
10 Ume<f6> Umeö
11 Malm<f6> Malmö
12 J<f6>nk<f6>ping Jönköping