Converting Python `getpass.getuser()` output to UTF-8 on Windows

Question

We are working on Windows computers. User names are based on real names, so they can contain accented characters (in my case, for example, "MichaëlHooreman").

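The Windows machines use the CP1252 encoding. I use `getpass.getuser()` to get the user names and then report them in an HTML document. Unfortunately, the result appears to be CP1252 rather than UTF-8, so I run into encoding problems...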

I don't want to encode the HTML in CP1252.

I tried the following Python code:

import getpass
import locale
import sys

f = locale.getpreferredencoding()
t = sys.getdefaultencoding()
r = getpass.getuser()
if f != t:
    print(f"converting from {f} to {t}")
    print(f"{r=}")
    b = r.encode(f)
    print(f"{b=}")
    r = b.decode(t)
    print(f"{r=}")
print(r)

...and it doesn't work:

converting from cp1252 to utf-8
r='MichaëlHooreman'
b=b'Micha\xeblHooreman'
Traceback (most recent call last):
  File "C:\Python\projects\xlvalrep\testuser.py", line 13, in <module>
    r = b.decode(t)
        ^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xeb in position 5: invalid continuation byte

I have spent a lot of time searching online and have not found a solution so far.

How can I solve this?

Thanks!

Answer 1

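The result of `getpass.getuser()` is a `str` (a Unicode string). It consists of Unicode code points and has no encoding of its own. Simply encode the string to whatever encoding you want.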

Example:

r = 'MichaëlHooreman'
print('Unicode:', r)
print('CP1252: ', r.encode('cp1252'))
print('UTF-8:  ', r.encode())  # encode() defaults to 'utf-8'

Output:

Unicode: MichaëlHooreman
CP1252:  b'Micha\xeblHooreman'
UTF-8:   b'Micha\xc3\xablHooreman'
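The attempt in the question fails because it encodes the name to CP1252 bytes and then decodes those bytes as UTF-8, and the byte `0xeb` is not valid UTF-8 on its own. For the HTML report, one possible approach is to keep the name as a `str` throughout and pick the encoding only when the file is written. The sketch below assumes the report is written to a file. `build_report` and `write_report` are hypothetical helper names, and the page layout is only illustrative.

```python
import html
from pathlib import Path


def build_report(user: str) -> str:
    """Return a small HTML page (as a str) that declares UTF-8 and shows the user name."""
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Report</title></head>\n'
        # html.escape guards against characters such as < or & in the name.
        f"<body><p>User: {html.escape(user)}</p></body></html>\n"
    )


def write_report(path: Path, user: str) -> None:
    """Write the report; the str is encoded to UTF-8 bytes only at this point."""
    path.write_text(build_report(user), encoding="utf-8")
```

It could be called as `write_report(Path("report.html"), getpass.getuser())`. Passing `encoding="utf-8"` explicitly matters on Windows, where the default file encoding may otherwise follow the locale (CP1252 here).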
