How do I support Chinese in an HTTP request body?

Question

URL = "http://example.com",
Header = [],
Type = "application/json",
Content = "我是中文",

Body = lists:concat(["{\"type\":\"0\",\"result\":[{\"url\":\"test.cn\",\"content\":\"", unicode:characters_to_list(Content), "\"}]}"]),
lager:debug("URL:~p, Body:~p~n", [URL, Body]),
HTTPOptions = [],
Options = [],
Response = httpc:request(post, {URL, Header, Type, Body}, HTTPOptions, Options),

The request body received by the HTTP server is not 我是中文. How can I fix this?

Answer 1

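Luck of the Encoding

You must take special care to ensure that your input is what you think it is, because it may differ from what you expect.

This answer applies to the Erlang release I am running, R16B03-1. I'll try to include all the details here so that you can test and verify against your own installation.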

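Unless you take specific action to change it, a string is interpreted as follows.

In the terminal (OS X 10.9.2)

TerminalContent = "我是中文",
TerminalContent = [25105,26159,20013,25991].

In the terminal, the string is interpreted as a list of Unicode characters.

In a module

BytewiseContent = "我是中文",
BytewiseContent = [230,136,145,230,152,175,228,184,173,230,150,135].

In a module, the default encoding is latin1, and strings containing Unicode characters are interpreted as bytewise lists (of UTF-8 bytes).

If you use data encoded like BytewiseContent, unicode:characters_to_list/1 will double-encode the Chinese characters, and ææ¯ä will be sent to the server where you expected 我是中文.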
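To make the difference between the two interpretations concrete, here is a minimal sketch that works only with integer lists, so it does not depend on the encoding of the source file. The module name encoding_demo and its function names are hypothetical and chosen purely for illustration. It uses unicode:characters_to_binary/1 as the encoding step, because that is where a byte list gets encoded a second time.

```erlang
-module(encoding_demo).
-export([code_points/0, utf8_bytes/0, correct/0, double_encoded/0]).

%% The four characters from the question as Unicode code points
%% (the terminal interpretation).
code_points() -> [25105,26159,20013,25991].

%% The same text as a list of UTF-8 bytes (the bytewise interpretation).
utf8_bytes() -> [230,136,145,230,152,175,228,184,173,230,150,135].

%% Encoding the code points yields the expected 12 UTF-8 bytes.
correct() ->
    unicode:characters_to_binary(code_points()).

%% Encoding the byte list treats every byte as a Latin-1 character and
%% encodes it again, so each byte above 127 becomes two bytes.
double_encoded() ->
    unicode:characters_to_binary(utf8_bytes()).
```

Comparing byte_size(encoding_demo:correct()) with byte_size(encoding_demo:double_encoded()) in the shell shows the body doubling in size, which is the symptom the server sees.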
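Solution

Specify the encoding for each source file and term file.
If you run an erl command line, make sure it is set up to use Unicode.
If you read data from files, translate the bytes from the bytewise encoding to Unicode before processing (this also applies to binary data acquired using httpc:request/N).

If you embed Unicode characters in your module, indicate as much with a comment within the first two lines of the module:

%% -*- coding: utf-8 -*-

This changes the way the module interprets the string, so that:

UnicodeContent = "我是中文",
UnicodeContent = [25105,26159,20013,25991].

Once you have ensured that you are concatenating characters and not bytes, the concatenation is safe. Don't use unicode:characters_to_list/1 to convert your string/list until the whole thing has been built up.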
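The advice to translate bytes into Unicode characters before processing can be sketched as follows. This assumes the incoming data is UTF-8; the module name to_chars_demo and the function decode_utf8/1 are hypothetical names used only for this example.

```erlang
-module(to_chars_demo).
-export([decode_utf8/1]).

%% Convert a UTF-8 binary (for example a file's contents or an HTTP
%% response body) into a list of Unicode code points.
decode_utf8(Bin) when is_binary(Bin) ->
    case unicode:characters_to_list(Bin, utf8) of
        Chars when is_list(Chars) ->
            {ok, Chars};
        {error, _Decoded, _Rest} ->
            %% The binary contains bytes that are not valid UTF-8.
            {error, invalid_utf8};
        {incomplete, _Decoded, _Rest} ->
            %% The binary ends in the middle of a multi-byte character.
            {error, truncated_utf8}
    end.
```

Returning a tagged tuple rather than the raw result makes it harder to mistake a failed conversion for a character list further down the line.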
Example code

The following function works as expected when given a Url and a list of Unicode characters, Content:

http_post_content(Url, Content) ->
    ContentType = "application/json",
    %% Concat the list of (character) lists
    Body = lists:concat(["{\"content\":\"", Content, "\"}"]),
    %% Explicitly encode to UTF8 before sending
    UnicodeBin = unicode:characters_to_binary(Body),
    httpc:request(post,
        {
            Url,
            [],          % HTTP headers
            ContentType, % content-type
            UnicodeBin   % the body as binary (UTF8)
        },
        [],                    % HTTP Options
        [{body_format,binary}] % indicate the body is already binary
    ).
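One thing http_post_content/2 does not do is escape Content before splicing it into the JSON text, so a double quote or backslash in the content would produce an invalid body. The following is a minimal sketch of one way to handle that, assuming Content is a list of Unicode code points. The module json_escape_demo and its functions are hypothetical and not part of the original answer; a JSON library would normally do this work.

```erlang
-module(json_escape_demo).
-export([escape/1, body/1]).

%% Escape the characters that must not appear raw inside a JSON string.
%% Input and output are lists of Unicode code points.
escape(Chars) ->
    lists:flatmap(fun escape_char/1, Chars).

escape_char($")  -> "\\\"";
escape_char($\\) -> "\\\\";
escape_char($\n) -> "\\n";
escape_char($\r) -> "\\r";
escape_char($\t) -> "\\t";
escape_char(C) when C < 32 ->
    %% Remaining control characters use the \u00XX form.
    lists:flatten(io_lib:format("\\u~4.16.0b", [C]));
escape_char(C) -> [C].

%% Build the same body as http_post_content/2, but with Content escaped,
%% and encode to UTF-8 only once the whole body has been assembled.
body(Content) ->
    Chars = lists:concat(["{\"content\":\"", escape(Content), "\"}"]),
    unicode:characters_to_binary(Chars).
```

Characters outside the ASCII range pass through unchanged, since JSON allows them as long as the body as a whole is sent as UTF-8.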
To verify the results, I wrote the following HTTP server in node.js using express. The sole purpose of this dead-simple server is to sanity-check the problem and the solution.

var express = require('express'),
    bodyParser = require('body-parser'),
    util = require('util');

var app = express();

// Parse application/json request bodies.
app.use(bodyParser.json());

app.get('/', function(req, res){
    res.send('You probably want to perform an HTTP POST');
});

// Log the parsed body and echo it back as JSON.
app.post('/', function(req, res){
    console.log("body: " + util.inspect(req.body, false, 99));
    res.json(req.body);
});

app.listen(3000);
Gist
Verifying

Again in Erlang, the following function checks that the HTTP response contains the echoed JSON and that the exact Unicode characters were returned.

verify_response({ok, {{_, 200, _}, _, Response}}, SentContent) ->
    %% use jiffy to decode the JSON response
    {Props} = jiffy:decode(Response),
    %% pull out the "content" property value
    ContentBin = proplists:get_value(<<"content">>, Props),
    %% convert the binary value to unicode characters,
    %% it should equal what we sent.
    case unicode:characters_to_list(ContentBin) of
        SentContent -> ok;
        Other ->
            {error, [
                {expected, SentContent},
                {received, Other}
            ]}
    end;
verify_response(Unexpected, _) ->
    {error, {http_request_failed, Unexpected}}.
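The complete example.erl module is posted in a Gist.

Once you have the example module compiled and an echo server running, run something like this in an Erlang shell:

inets:start().
Url = example:url().
Content = example:content().
Response = example:http_post_content(Url, Content).

If you have jiffy set up, you can also verify that the content made the round trip:

example:verify_response(Response, Content).

You should now be able to confirm round-trip encoding of any Unicode content.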
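Translating between encodings

While I explained the encodings above, you will have noticed that TerminalContent, BytewiseContent, and UnicodeContent are all lists of integers. You should endeavor to code in a way that lets you be certain what you have in hand.

The oddball encoding is bytewise, which may turn up when working with modules that are not "Unicode aware". Erlang's guidance on working with unicode mentions this near the bottom, under the heading "Lists of UTF-8 Bytes". To translate bytewise lists, use:

%% from http://www.erlang.org/doc/apps/stdlib/unicode_usage.html
utf8_list_to_string(StrangeList) ->
    unicode:characters_to_list(list_to_binary(StrangeList)).

My setup

As far as I know, I have no local settings that modify Erlang's behavior. My Erlang is R16B03-1, built and distributed by a third party, and my machine runs OS X 10.9.2.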
