Check that a file is not encoded twice

Question (votes: 1, answers: 1)

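I used the answer to the question "Using PowerShell to write a file in UTF-8 without the BOM" to convert a file from UCS-2 to UTF-8. The problem is that if I run the conversion twice (or more), the Cyrillic text gets corrupted. How can I skip the conversion if the file is already UTF-8?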

The code is:

$MyFile = Get-Content $MyPath
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
[System.IO.File]::WriteAllLines($MyPath, $MyFile, $Utf8NoBomEncoding)

powershell character-encoding

1 Answer

Votes: 1

Use:

$MyFile = Get-Content -Encoding UTF8 $MyPath
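Combined with the write step from the question, the whole conversion might look like the sketch below. The function name `Convert-ToUtf8NoBom` and the path handling are illustrative assumptions, not part of the original answer. The aim is that running it repeatedly on the same file leaves the text unchanged.

```powershell
# Hypothetical helper name; it wraps the question's write step around the corrected read.
function Convert-ToUtf8NoBom {
    param([Parameter(Mandatory)] [string] $Path)

    # Resolve to a full path, because .NET methods do not follow PowerShell's current location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # A BOM, if present, identifies the encoding; otherwise the bytes are decoded as UTF-8.
    # @(...) keeps a single-line file as an array of lines.
    $lines = @(Get-Content -LiteralPath $fullPath -Encoding UTF8)

    # $false means "do not emit a BOM".
    $utf8NoBom = New-Object System.Text.UTF8Encoding $false
    [System.IO.File]::WriteAllLines($fullPath, [string[]] $lines, $utf8NoBom)
}
```

Calling `Convert-ToUtf8NoBom -Path $MyPath` once converts the UTF-16LE file; calling it again should rewrite the same UTF-8 content rather than re-decode it as "ANSI".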

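Initially, while $MyPath is still UTF-16LE-encoded ("Unicode" encoding, which I assume is what you meant), PowerShell ignores the -Encoding parameter, because the BOM in the file unambiguously identifies the encoding. If your original file has no BOM, more work is needed.
Once you have saved $MyPath as UTF-8 without a BOM, you must tell Windows PowerShell [1] to expect UTF-8 by passing -Encoding UTF8. Otherwise it interprets the file as "ANSI"-encoded, that is, encoded according to the (typically single-byte) code page associated with the legacy system locale.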
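If you would rather skip the rewrite altogether once the file no longer starts with a UTF-16LE BOM, a byte-level check along the following lines may help. This is only a sketch: the function names `Test-Utf16LeBom` and `Convert-IfUtf16Le` are hypothetical, and it assumes the source file always starts with the bytes FF FE, which is also how a UTF-32LE BOM begins.

```powershell
# Hypothetical helper: $true when the file starts with the UTF-16LE BOM (FF FE).
# Caveat: a UTF-32LE BOM (FF FE 00 00) starts with the same two bytes.
function Test-Utf16LeBom {
    param([Parameter(Mandatory)] [string] $Path)

    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $buffer = New-Object byte[] 2
        $read = $stream.Read($buffer, 0, 2)
    }
    finally {
        $stream.Dispose()
    }
    return ($read -eq 2 -and $buffer[0] -eq 0xFF -and $buffer[1] -eq 0xFE)
}

# Hypothetical wrapper: converts only when the BOM is present; returns whether it converted.
function Convert-IfUtf16Le {
    param([Parameter(Mandatory)] [string] $Path)

    if (-not (Test-Utf16LeBom -Path $Path)) { return $false }   # leave the file untouched

    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $lines = @(Get-Content -LiteralPath $fullPath)               # the BOM identifies UTF-16LE
    $utf8NoBom = New-Object System.Text.UTF8Encoding $false
    [System.IO.File]::WriteAllLines($fullPath, [string[]] $lines, $utf8NoBom)
    return $true
}
```

The trade-off is that this check recognizes only BOM-marked input; a BOM-less UTF-16 file would be skipped and would still need the extra work mentioned above.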

[1] Note that the cross-platform PowerShell Core edition defaults to BOM-less UTF-8.
