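I run into a UTF-8 encoding problem when running my jar from cmd. When I run the project from IntelliJ IDEA there is no problem: the characters are displayed and written out correctly.
Here is an example. I have already tried the solutions suggested in similar Stack Overflow questions, which I list further down.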
public static String convertToUTF8(String s) {
    String out = null;
    try {
        out = new String(s.getBytes("UTF-8"), "ISO-8859-1");
    } catch (java.io.UnsupportedEncodingException e) {
        return null;
    }
    return out;
}
System.out.println(Charset.defaultCharset());
String test = "written in utf-8İİiiŞŞŞĞĞÜĞĞğ";
System.out.println(test);
System.out.println(convertToUTF8(test));
PrintWriter out = new PrintWriter(new File("hello.txt"), "UTF-8");
out.print("written in utf-8İİiiŞŞŞĞĞÜĞĞğ");
out.close();
The output is as follows.
Running from IntelliJ IDEA:
UTF-8
written in utf-8İİiiŞŞŞĞĞÜĞĞğ
written in utf-8Ä°Ä°iiÅÅÅÄÄÃÄÄÄ
Running the jar from cmd:
windows-1252
written in utf-8Ä°Ä°iiÅ?Å?Å?Ä?Ä???Ä?Ä?Ä?
written in utf-8???°???°ii??Å???Å???Å???Å???Å???Å???Å???Å???Å?
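Other solutions I have checked:
java -Dfile.encoding=utf-8 -jar file.jar
Passing -Dfile.encoding=utf-8 this way does nothing.
chcp 65001
Changing the code page to UTF-8 fixes the visible console output, but hello.txt stays the same.
What I want to achieve is correct encoding handling when the jar is run from cmd, because I plan to use the jar as a background worker that will process a lot of UTF-8 encoded characters.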
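Update: after replacing the non-ASCII characters with their Unicode escape equivalents, for example ğ with \u011f, the code runs as expected. What puzzles me is this: although I can do string replacements such as replace("ğ", "\u011f") and the jar then runs smoothly, why doesn't the system handle the characters correctly on output? If the system understands the char ğ at runtime, why do I get Ä characters in the output?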
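One possible explanation, stated here as an assumption and not confirmed anywhere in this thread: a Unicode escape consists only of ASCII characters, so it reaches the compiler unchanged whatever encoding javac assumes for the source file, while a literal ğ can be misread if the file is compiled with a different encoding than it was saved in. A small sketch for checking which code points actually ended up in a string follows; CodePointDump and dump are hypothetical names made up for this illustration.

```java
public class CodePointDump {

    // Format every code point of the string as U+XXXX, separated by spaces.
    public static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A correctly stored Turkish small g with breve is a single code point.
        System.out.println(dump("\u011f"));
        // The same letter after a UTF-8 / ISO-8859-1 mix-up is two code points.
        System.out.println(dump("\u00c4\u009f"));
    }
}
```

If dump(test) on the literal from the example shows two code points where a single Turkish letter was expected, the text was already garbled before anything was printed or written to the file.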
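There are a few minor issues that should be easy to fix (at least on Windows).
First, do not use the charset ISO-8859-1 when creating the String in convertToUTF8(). It does not support the Turkish characters, which explains why convertToUTF8() does not return the expected String value. To fix it, just use UTF-8 instead:
out = new String(s.getBytes("UTF-8"), StandardCharsets.UTF_8.displayName());
Second, judging by the output in the question, Charset.defaultCharset() returned windows-1252 when you ran from cmd, so it looks like your utf-8 value was ignored. The following code works for me; the only significant change is the assignment to out in the method convertToUTF8():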
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Main {

    public static void main(String[] args) throws FileNotFoundException, UnsupportedEncodingException {
        System.out.println(Charset.defaultCharset());
        String test = "written in utf-8İİiiŞŞŞĞĞÜĞĞğ";
        System.out.println(test);
        System.out.println(convertToUTF8(test));
        PrintWriter out = new PrintWriter(new File("hello.txt"), StandardCharsets.UTF_8.displayName());
        out.print("written in utf-8İİiiŞŞŞĞĞÜĞĞğ");
        out.close();
    }

    public static String convertToUTF8(String s) {
        String out = null;
        try {
            out = new String(s.getBytes("UTF-8"), StandardCharsets.UTF_8.displayName());
        } catch (java.io.UnsupportedEncodingException e) {
            return null;
        }
        return out;
    }
}
This is the output when running in IntelliJ IDEA:
C:\Java\openJDK\jdk-11.0.2\bin\java.exe -javaagent:C:\Users\johndoe\AppData\Local\JetBrains\Toolbox\apps\IDEA-U\ch-0\191.5109.14\lib\idea_rt.jar=60362:C:\Users\johndoe\AppData\Local\JetBrains\Toolbox\apps\IDEA-U\ch-0\191.5109.14\bin -Dfile.encoding=UTF-8 -classpath C:\Users\johndoe\IdeaProjects\TurkishChars\out\production\TurkishChars Main
UTF-8
written in utf-8İİiiŞŞŞĞĞÜĞĞğ
written in utf-8İİiiŞŞŞĞĞÜĞĞğ
Process finished with exit code 0
This is the console output when running from the command line:
Microsoft Windows [Version 10.0.17763.316]
(c) 2018 Microsoft Corporation. All rights reserved.
C:\Users\johndoe>chcp 65001
Active code page: 65001
C:\Users\johndoe>C:\Java\openJDK\jdk-11.0.2\bin\java.exe -javaagent:C:\Users\johndoe\AppData\Local\JetBrains\Toolbox\apps\IDEA-U\ch-0\191.5109.14\lib\idea_rt.jar=60362:C:\Users\johndoe\AppData\Local\JetBrains\Toolbox\apps\IDEA-U\ch-0\191.5109.14\bin -Dfile.encoding=UTF-8 -classpath C:\Users\johndoe\IdeaProjects\TurkishChars\out\production\TurkishChars Main
UTF-8
written in utf-8İİiiŞŞŞĞĞÜĞĞğ
written in utf-8İİiiŞŞŞĞĞÜĞĞğ
C:\Users\johndoe>cd C:\Users\johndoe\IdeaProjects\TurkishChars
C:\Users\johndoe\IdeaProjects\TurkishChars>type hello.txt
written in utf-8İİiiŞŞŞĞĞÜĞĞğ
C:\Users\johndoe\IdeaProjects\TurkishChars>
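To see why decoding with ISO-8859-1 garbles the text while decoding with UTF-8 does not, here is a minimal sketch. The class name MojibakeDemo and the helper reDecode are made up for this illustration and are not part of the code above. The sample character is written as a Unicode escape so the result does not depend on how the source file is compiled.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode the text as UTF-8 bytes, then decode those bytes with the given charset.
    public static String reDecode(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        String sample = "\u011f"; // Turkish small g with breve

        // UTF-8 stores this letter as two bytes (0xC4 0x9F). ISO-8859-1 maps
        // every byte to its own character, so the single letter becomes two.
        String wrong = reDecode(sample, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length()); // 2

        // Decoding with the charset that was used for encoding is a lossless round trip.
        String right = reDecode(sample, StandardCharsets.UTF_8);
        System.out.println(right.equals(sample)); // true
    }
}
```

The first character of the wrongly decoded string is Ä (U+00C4), which matches the Ä characters in the question's output. It also shows that encoding to UTF-8 and decoding as UTF-8 simply returns the original String, so the round trip is not needed for the output to be correct.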
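Update:
@Noodles' suggestion to check your font is a good one. Try pasting the test text directly into the Command Prompt window to verify that it renders properly (using echo written in utf-8İİiiŞŞŞĞĞÜĞĞğ). If it does not, change the font. Courier New works for me.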
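If the jar has to produce the same bytes no matter what the default charset of the machine is, one option is to wrap standard output in a PrintStream with an explicit UTF-8 encoding instead of relying on file.encoding. This is only a sketch under that assumption; Utf8Console and utf8Stream are hypothetical names, and the console still needs a UTF-8 code page (chcp 65001) and a suitable font to display the result.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {

    // Wrap any OutputStream in an auto-flushing PrintStream that always encodes as UTF-8.
    public static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Write to the process's standard output, bypassing the default charset of System.out.
        PrintStream out = utf8Stream(new FileOutputStream(FileDescriptor.out));
        // The Turkish letters are written as escapes so the source encoding does not matter.
        out.println("written in utf-8\u0130\u015e\u011e\u00dc\u011f");
    }
}
```

Calling System.setOut(Utf8Console.utf8Stream(new FileOutputStream(FileDescriptor.out))) once at startup would apply the same encoding to every later System.out.println call.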