How to make Python 3 print() output UTF-8

Question

How can I make Python 3 (3.1) print("Some text") write UTF-8 to stdout, or how can I output raw bytes?

Test.py

import sys

TestText = "Test - āĀēĒčČ..šŠūŪžŽ" # this is UTF-8
TestText2 = b"Test2 - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd" # just bytes
print(sys.getdefaultencoding())
print(sys.stdout.encoding)
print(TestText)
print(TestText.encode("utf8"))
print(TestText.encode("cp1252","replace"))
print(TestText2)

Output (in CP1257; I have replaced the characters with their byte values, written as [x00]):

utf-8
cp1257
Test - [xE2][xC2][xE7][xC7][xE8][xC8]..[xF0][xD0][xFB][xDB][xFE][xDE]
b'Test - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd'
b'Test - ??????..\x9a\x8a??\x9e\x8e'
b'Test2 - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd'

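print() is just too smart... :D There is no point in passing encoded text to print() (it always shows the representation of the bytes, not the real bytes), and there seems to be no way to output bytes at all, because print() encodes everything to sys.stdout.encoding anyway.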

For example:

print(chr(255))

throws an error:

Traceback (most recent call last):
  File "Test.py", line 1, in <module>
    print(chr(255));
  File "H:\Python31\lib\encodings\cp1257.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xff' in position 0: character maps to <undefined>

By the way, print(TestText == TestText2.decode("utf8")) returns False, even though the printed output looks the same.

How does Python 3 determine sys.stdout.encoding, and how can I change it?

I wrote a printRAW() function that works fine (it actually encodes the output as UTF-8, so it is not really raw...):

 def printRAW(*Text):
     RAWOut = open(1, 'w', encoding='utf8', closefd=False)
     print(*Text, file=RAWOut)
     RAWOut.flush()
     RAWOut.close()

 printRAW("Cool", TestText)

Output (now printed in UTF-8):

Cool Test - āĀēĒčČ..šŠūŪžŽ

printRAW(chr(252)) also prints ü correctly (in UTF-8, [xC3][xBC]) and without errors :)

Now I am looking for a better solution, if there is one...

Answer 1

Clarification:

TestText = "Test - āĀēĒčČ..šŠūŪžŽ" # this is not UTF-8; it is a Unicode string in Python 3.x.
TestText2 = TestText.encode('utf8') # this is a UTF-8-encoded byte string.

To send UTF-8 to stdout regardless of the console's encoding, use its buffer interface, which accepts bytes:

import sys
sys.stdout.buffer.write(TestText2)
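
The same idea can be wrapped in a small helper. This is only a sketch: the name write_utf8 and its binary_stream parameter are made up for illustration and are not part of any library. It encodes the text itself and writes the bytes, flushing the text layer first so that earlier print() output is not reordered.

```python
import io
import sys


def write_utf8(text, binary_stream=None):
    """Encode text as UTF-8 and write the bytes to a binary stream.

    binary_stream is a hypothetical parameter; it defaults to
    sys.stdout.buffer, the bytes interface underneath sys.stdout.
    """
    if binary_stream is None:
        sys.stdout.flush()  # keep ordering with earlier print() calls
        binary_stream = sys.stdout.buffer
    data = text.encode("utf-8")
    binary_stream.write(data)
    binary_stream.flush()
    return len(data)


if __name__ == "__main__":
    # Demonstrate on an in-memory stream so the bytes are visible.
    demo = io.BytesIO()
    write_utf8("ü", demo)
    print(demo.getvalue())  # b'\xc3\xbc'
```

Calling write_utf8(TestText + "\n") with no second argument would send the UTF-8 bytes to stdout, whatever sys.stdout.encoding happens to be.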

Answer 2

This is the best I could get out of the manual, and it is a bit dirty:

utf8stdout = open(1, 'w', encoding='utf-8', closefd=False) # fd 1 is stdout
print(whatever, file=utf8stdout)

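It seems that file objects ought to have a method for changing their encoding, but as far as I can tell there is none.

If you write to utf8stdout and then to sys.stdout without first calling utf8stdout.flush(), or vice versa, bad things may happen.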
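
One way to keep the two writers from interleaving badly is to flush on both sides of every call. The following is a minimal sketch under that assumption; open_utf8_writer and print_utf8 are hypothetical names, and the fd parameter exists only so the behaviour can be checked on a pipe instead of the real stdout.

```python
import os
import sys


def open_utf8_writer(fd=1):
    """Open a second, UTF-8 text writer on an existing file descriptor.

    closefd=False leaves the descriptor open when the writer is closed.
    """
    return open(fd, "w", encoding="utf-8", closefd=False)


def print_utf8(*args, fd=1, **kwargs):
    """print() through a temporary UTF-8 writer, flushing on both sides."""
    if fd == 1:
        sys.stdout.flush()  # push out anything print() buffered earlier
    with open_utf8_writer(fd) as out:
        print(*args, file=out, **kwargs)
    # Leaving the with-block flushes and closes the writer,
    # but not the underlying descriptor.


if __name__ == "__main__":
    # Use a pipe to show the exact bytes that reach the descriptor.
    read_fd, write_fd = os.pipe()
    print_utf8("ü", fd=write_fd, end="")
    print(os.read(read_fd, 16))  # b'\xc3\xbc'
    os.close(read_fd)
    os.close(write_fd)
```

Opening a fresh writer per call is wasteful, but it guarantees that nothing stays buffered in the UTF-8 writer when sys.stdout is used next.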

Answer 3

As described in this answer, starting with Python 3.7 you can manually reconfigure the encoding of stdout:

import sys
sys.stdout.reconfigure(encoding='utf-8')
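
Since reconfigure() only exists on io.TextIOWrapper from Python 3.7 onward, a script that may run on older versions, or with sys.stdout replaced by some other object, could guard the call. A minimal sketch, where force_utf8 is a hypothetical helper name:

```python
import io
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 in place if it supports reconfigure().

    Returns True when the stream was reconfigured, False otherwise.
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False  # e.g. Python < 3.7 or a replaced stdout object
    reconfigure(encoding="utf-8")
    return True


if __name__ == "__main__":
    # Demonstrate on an in-memory stream that starts out as ASCII.
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii")
    force_utf8(stream)
    stream.write("ü")
    stream.flush()
    print(raw.getvalue())  # b'\xc3\xbc'
```

In a real script the call would be force_utf8(sys.stdout) near the top, before anything is printed.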

Answer 4

I tried zwol's solution in Python 3.6, but it did not work for me. For some strings, nothing was printed to the console.

But iljau's solution works: reopen stdout with a different encoding.

import sys
sys.stdout = open(1, 'w', encoding='utf-8', closefd=False)

Answer 5

You can make stdout use UTF-8 by reopening it:

import sys
sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf8', buffering=1)

Answer 6

This worked for me in Python 3.8:

import sys
sys.stdout.reconfigure(encoding='utf-8')
