Is it possible to send UTF-8 characters from an okhttp3 client?
For the following string:
String fileName = "3$ Mù F'RANçé_33902_Country_5_202105";
String contentDisposition = "attachment;filename=" + "\"" + fileName + "\"";
I have already tried this (for the contentDisposition header):
Headers headers = new Headers.Builder()
.addUnsafeNonAscii("Content-Disposition", contentDisposition)
.add("Authorization", bearer)
.add("Content-type", "application/octet-stream")
.build();
Request request = new Request.Builder()
.headers(headers)
.post(requestBody)
.url(urlAddress)
.build();
But the server receives:
3$ Mù F'RANçé_33902_Country_5_202105
This request is sent to a fixed partner, so I have no access to the backend.
application/octet-stream is what the backend requires.
The body is created like this:
byte[] data = FileUtils.readFileToByteArray(file);
RequestBody requestBody = RequestBody.create(data);
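It works perfectly with Postman.
Full MCVE (I can't include the file or the backend details, but the code fails before the request is sent anyway, so you can run this exact code and it should throw the error):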
public class App
{
public static void main( String[] args ) throws IOException
{
OkHttpClient client = new OkHttpClient().newBuilder()
.build();
MediaType mediaType = MediaType.parse("application/octet-stream");
RequestBody body = RequestBody.create(mediaType, "");
Request request = new Request.Builder()
.url("xxxx")
.method("POST", body)
.addHeader("Content-Type", "application/octet-stream")
.addHeader("content-disposition", "attachment;filename=\"3$ Mù F'RANçé_33902_Country_5_202105.csv\"")
.addHeader("Authorization", "Bearer xxxxx")
.addHeader("Cookie", "xxxxxx")
.build();
Response response = client.newCall(request).execute();
}
}
Error received:
java.lang.IllegalArgumentException: Unexpected char 0xf9 at 25 in content-disposition value: attachment;filename="3$ Mù F'RANçé_33902_Country_5_202105.csv"
OkHttp version:
5.0.0-alpha.2
Am I missing something?
Thanks
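The default charset for HTTP headers is ISO-8859-1. However, there is RFC 6266, which describes how to encode file names in the Content-Disposition header (using the extended parameter syntax from RFC 5987). Basically, you specify the charset name and then percent-encode the UTF-8 bytes of the file name. Instead of filename="my-simple-filename", you use a parameter starting with filename*=utf-8'', e.g.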
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
// ...
String fileName = "3$ Mù F'RANçé_33902_Country_5_202105";
String contentDisposition = "attachment;filename*=utf-8''" + encodeFileName(fileName);
// ...
private static String encodeFileName(String fileName) throws UnsupportedEncodingException {
return URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
}
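Using the URL encoder and then fixing up the "+" in its result is a cheap trick I came across. It is useful if you want to avoid Guava, Spring's ContentDisposition class or any other library, and rely on JRE classes only.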
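RFC 6266 also suggests sending a plain filename parameter next to filename*, so that receivers which do not understand the extended syntax still get a usable name, while receivers that do understand it prefer filename*. The following is only a sketch of that idea: the class name and the asciiFallback helper are made up for illustration, and replacing every non-ASCII character with an underscore is just one possible fallback policy.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ContentDispositionFallback {

    // Percent-encode the UTF-8 bytes; URLEncoder emits "+" for spaces, so swap in "%20".
    static String encodeFileName(String fileName) {
        return URLEncoder.encode(fileName, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Hypothetical helper: crude ASCII-only fallback for receivers that ignore filename*.
    // Anything outside printable ASCII, plus '"' and '\', becomes '_'.
    static String asciiFallback(String fileName) {
        StringBuilder sb = new StringBuilder();
        for (char c : fileName.toCharArray()) {
            boolean safe = c >= 0x20 && c < 0x7F && c != '"' && c != '\\';
            sb.append(safe ? c : '_');
        }
        return sb.toString();
    }

    // Builds a header value carrying both the fallback and the extended parameter.
    static String build(String fileName) {
        return "attachment; filename=\"" + asciiFallback(fileName) + "\""
                + "; filename*=utf-8''" + encodeFileName(fileName);
    }

    public static void main(String[] args) {
        // The file name from the question, written with Unicode escapes.
        String fileName = "3$ M\u00F9 F'RAN\u00E7\u00E9_33902_Country_5_202105.csv";
        System.out.println(build(fileName));
    }
}
```

Because the resulting value is pure ASCII, it can be passed to OkHttp's regular Headers.Builder.add(...) without triggering the "Unexpected char" check. Whether a given backend actually honours filename* is something only a test against that backend can confirm.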
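Update: Here is a full MCVE showing how to send a UTF-8 string both as the POST body and as the Content-Disposition file name. The demo server shows how to decode that header manually; normally the HTTP server should do that automatically.
Maven POM showing the dependencies used: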
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.example</groupId>
<artifactId>SO_Java_OkHttp3SendUtf8_70804280</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<maven.compiler.source>11</maven.compiler.source>
<maven.compiler.target>11</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.9.3</version>
</dependency>
<dependency>
<groupId>org.nanohttpd</groupId>
<artifactId>nanohttpd</artifactId>
<version>2.3.1</version>
</dependency>
</dependencies>
</project>
OkHttp demo client:
import okhttp3.Headers;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;
import java.io.IOException;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
public class Client {
public static void main(String[] args) throws IOException {
String fileName = "3$ Mù F'RANçé_33902_Country_5_202105";
String contentDisposition = "attachment;filename*=utf-8''" + encodeFileName(fileName);
RequestBody requestBody = RequestBody.create(fileName.getBytes(StandardCharsets.UTF_8));
Headers headers = new Headers.Builder()
.add("Content-Disposition", contentDisposition)
.add("Content-type", "application/octet-stream; charset=utf-8")
.build();
Request request = new Request.Builder()
.headers(headers)
.post(requestBody)
.url(new URL("http://localhost:8080/"))
.build();
OkHttpClient client = new OkHttpClient();
Response response = client.newCall(request).execute();
System.out.println(Objects.requireNonNull(response.body()).string());
}
private static String encodeFileName(String fileName) {
return URLEncoder.encode(fileName, StandardCharsets.UTF_8).replace("+", "%20");
}
}
NanoHTTPD demo server:
import fi.iki.elonen.NanoHTTPD;
import java.io.IOException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
public class Server extends NanoHTTPD {
public Server() throws IOException {
super(8080);
start(NanoHTTPD.SOCKET_READ_TIMEOUT, false);
System.out.println("\nRunning! Point your browsers to http://localhost:8080/ \n");
}
public static void main(String[] args) throws IOException {
new Server();
}
private static final String UTF_8_FILE_NAME_PREFIX = ";filename*=utf-8''";
private static final int UTF_8_FILE_NAME_PREFIX_LENGTH = UTF_8_FILE_NAME_PREFIX.length();
@Override
public Response serve(IHTTPSession session) {
try {
Map<String, String> files = new HashMap<>();
session.parseBody(files);
String postBody = files.get("postData");
String contentDisposition = session.getHeaders().get("content-disposition");
String fileName = decodeFileName(
contentDisposition.substring(
contentDisposition.indexOf(UTF_8_FILE_NAME_PREFIX) + UTF_8_FILE_NAME_PREFIX_LENGTH
)
);
System.out.println("POST body: " + postBody);
System.out.println("Content disposition: " + contentDisposition);
System.out.println("UTF-8 file name: " + fileName);
return newFixedLengthResponse(postBody + "\n" + fileName);
}
catch (IOException | ResponseException e) {
e.printStackTrace();
return newFixedLengthResponse(e.toString());
}
}
private static String decodeFileName(String fileName) {
return URLDecoder.decode(fileName.replace("%20", "+"), StandardCharsets.UTF_8);
}
}
If you run the server first and then the client, you will see the following on the server console:
Running! Point your browsers to http://localhost:8080/
POST body: 3$ Mù F'RANçé_33902_Country_5_202105
Content disposition: attachment;filename*=utf-8''3%24%20M%C3%B9%20F%27RAN%C3%A7%C3%A9_33902_Country_5_202105
UTF-8 file name: 3$ Mù F'RANçé_33902_Country_5_202105
On the client console, you will see:
3$ Mù F'RANçé_33902_Country_5_202105
3$ Mù F'RANçé_33902_Country_5_202105
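The demo server only looks for the exact prefix ;filename*=utf-8''. As a rough sketch of a slightly more tolerant manual decoder, the class below reads the charset from the header itself and accepts an optional language tag between the two apostrophes. The class name, the method name and the regular expression are assumptions made for this example, not part of NanoHTTPD or OkHttp, and the pattern is not a complete RFC 6266 parser.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtendedFileNameParser {

    // Matches: filename*=<charset>'<optional language>'<percent-encoded value>
    private static final Pattern EXT_FILE_NAME = Pattern.compile(
            "(?i)filename\\*\\s*=\\s*([^']+)'([^']*)'([^;\\s]+)");

    // Returns the decoded file name, or empty if the header has no filename* parameter.
    // Charset.forName and URLDecoder.decode throw unchecked exceptions on an
    // unknown charset or a malformed percent sequence; callers may want to catch those.
    static Optional<String> parse(String contentDisposition) {
        Matcher m = EXT_FILE_NAME.matcher(contentDisposition);
        if (!m.find()) {
            return Optional.empty();
        }
        Charset charset = Charset.forName(m.group(1));
        // URLDecoder turns "+" into a space, so protect any literal "+" first.
        String value = m.group(3).replace("+", "%2B");
        return Optional.of(URLDecoder.decode(value, charset));
    }

    public static void main(String[] args) {
        String header = "attachment;filename*=utf-8''3%24%20M%C3%B9%20F%27RAN%C3%A7%C3%A9_33902_Country_5_202105";
        System.out.println(parse(header).orElse("<no extended file name>"));
    }
}
```

In the demo server, a call such as parse(contentDisposition) could stand in for the substring and indexOf logic, with the empty case handled explicitly instead of failing on a missing prefix.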
People using the Spring framework who run into this error can do the following:
HttpHeaders httpHeaders = new HttpHeaders();
ContentDisposition contentDisposition = ContentDisposition.builder("attachment")
.filename("filename with unicode chars.csv", StandardCharsets.UTF_8).build();
httpHeaders.setContentDisposition(contentDisposition);
return ResponseEntity.ok().headers(httpHeaders)
.contentType(MediaType.TEXT_PLAIN).body(byteArrayResourceObj);
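For anyone landing on this thread, here is some more information on the topic, along with a generic solution that follows RFC 2396.
First, here is the comment from the URLEncoder.java class on the current Java main branch: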
/* The list of characters that are not encoded has been
* determined as follows:
*
* RFC 2396 states:
* -----
* Data characters that are allowed in a URI but do not have a
* reserved purpose are called unreserved. These include upper
* and lower case letters, decimal digits, and a limited set of
* punctuation marks and symbols.
*
* unreserved = alphanum | mark
*
* mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
*
* Unreserved characters can be escaped without changing the
* semantics of the URI, but this should not be done unless the
* URI is being used in a context that does not allow the
* unescaped character to appear.
* -----
*
* It appears that both Netscape and Internet Explorer escape
* all special characters from this list with the exception
* of "-", "_", ".", "*". While it is not clear why they are
* escaping the other characters, perhaps it is safest to
* assume that there might be contexts in which the others
* are unsafe if not escaped. Therefore, we will use the same
* list. It is also noteworthy that this is consistent with
* O'Reilly's "HTML: The Definitive Guide" (page 164).
*
* As a last note, Internet Explorer does not encode the "@"
* character which is clearly not unreserved according to the
* RFC. We are being consistent with the RFC in this matter,
* as is Netscape.
*
*/
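This means the current Java codebase is very opinionated about which characters to encode, and it deviates from the RFC specification for the sake of compatibility with Internet Explorer and Netscape. That seems very outdated to me. So, to get a valid RFC 2396 encoding, we simply replace the over-encoded sequences with the original characters: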
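fun encodeRFC2396(string: String): String =
java.net.URLEncoder.encode(string, Charsets.UTF_8)
// URLEncoder writes spaces as "+"; a space is not an unreserved character, so emit "%20"
.replace("+", "%20")
.replace("%7E", "~")
.replace("%21", "!")
.replace("%2A", "*")
.replace("%27", "'")
.replace("%28", "(")
.replace("%29", ")")
Then the Content-Disposition header can be built easily. Note that the filename* syntax (RFC 5987) allows a narrower set of unescaped characters than RFC 2396: it does not include *, ', ( or ), so a strict receiver may expect those to stay percent-encoded.
val filename = "täöüst-_.!~*'( 1 ).txt"
val contentDispositionHeader =
"Content-Disposition: attachment;filename*=utf-8''${encodeRFC2396(filename)}"
println(contentDispositionHeader)
// Content-Disposition: attachment;filename*=utf-8''t%C3%A4%C3%B6%C3%BCst-_.!~*'(%201%20).txt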
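The filename* parameter is defined in terms of RFC 5987, which lists its own set of characters (attr-char) that may appear unescaped: letters, digits and ! # $ & + - . ^ _ ` | ~. As a sketch in Java, the encoder below works on the UTF-8 bytes directly instead of post-processing URLEncoder output. The class and method names are invented for this example.

```java
import java.nio.charset.StandardCharsets;

public class Rfc5987Encoder {

    // Punctuation allowed unescaped by the attr-char rule of RFC 5987.
    private static final String ATTR_CHAR_PUNCTUATION = "!#$&+-.^_`|~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Percent-encodes every UTF-8 byte that is not an ASCII letter, digit or attr-char punctuation.
    static String encode(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alphaNum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphaNum || ATTR_CHAR_PUNCTUATION.indexOf(c) >= 0) {
                sb.append((char) c);
            } else {
                sb.append('%').append(HEX[c >> 4]).append(HEX[c & 0x0F]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same test name as in the Kotlin example, written with Unicode escapes.
        String fileName = "t\u00E4\u00F6\u00FCst-_.!~*'( 1 ).txt";
        System.out.println("attachment;filename*=utf-8''" + encode(fileName));
    }
}
```

Compared with the RFC 2396 variant, this keeps *, ', ( and ) percent-encoded and leaves $ and & as they are. Which of the two a particular server accepts still has to be checked against that server.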