C++: cannot get "wcout" to print Unicode while keeping "cout" working

Question. Votes: 0. Answers: 4

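I cannot get "wcout" to print Unicode strings across multiple code pages while keeping "cout" working.

Please help me get these 3 lines of code to work together.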

std::wcout<<"abc "<<L'\u240d'<<" defg "<<L'א'<<" hijk"<<std::endl;
std::cout<<"hello world from cout! \n";
std::wcout<<"hello world from wcout! \n";

Output:

abc hello world from cout!

I tried:

#include <io.h> 
#include <fcntl.h>
_setmode(_fileno(stdout), _O_U8TEXT);

Problem: "wcout" fails.

I also tried:

std::locale mylocale("");
std::wcout.imbue(mylocale);

and:

SetConsoleOutputCP(1251);

setlocale(LC_ALL, "");

SetConsoleCP(CP_UTF8);

None of these had any effect.

c++ windows unicode utf-8 cout
4 Answers
13
votes

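Microsoft's runtime needs some non-standard setup with _setmode() before wcout and wcin will work. This example carries a fair amount of boilerplate, so it is not as clear as it could be, but it runs with clang++, g++ and MSVC++: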

#include <iostream>
#include <locale>
#include <locale.h>
#include <stdlib.h>

#ifndef MS_STDLIB_BUGS // Allow overriding the autodetection.
/* The Microsoft C and C++ runtime libraries that ship with Visual Studio, as
 * of 2017, have a bug that neither stdio, iostreams or wide iostreams can
 * handle Unicode input or output.  Windows needs some non-standard magic to
 * work around that.  This includes programs compiled with MinGW and Clang
 * for the win32 and win64 targets.
 */
#  if ( _MSC_VER || __MINGW32__ || __MSVCRT__ )
    /* This code is being compiled either on MS Visual C++, or MinGW, or
     * clang++ in compatibility mode for either, or is being linked to the
     * msvcrt.dll runtime.
     */
#    define MS_STDLIB_BUGS 1
#  else
#    define MS_STDLIB_BUGS 0
#  endif
#endif

#if MS_STDLIB_BUGS
#  include <io.h>
#  include <fcntl.h>
#endif

#if !HAS_CPP17_FILESYSTEM && !HAS_TS_FILESYSTEM && __has_include(<filesystem>)
#  include <filesystem> /* MSVC has this header, but not the standard API. */
#  if __cpp_lib_filesystem >= 201703
#    define HAS_CPP17_FILESYSTEM 1
#  endif
#endif

#if !HAS_CPP17_FILESYSTEM && __has_include(<experimental/filesystem>)
#  include <experimental/filesystem>
/* Microsoft screws this one up, too, by not defining the feature-test
 * macro specified by the standard.
 */
#  if __cpp_lib_experimental_filesystem >= 201406 || MS_STDLIB_BUGS
#    define HAS_TS_FILESYSTEM 1
/* With g++6, this requires -lstdc++fs, AFTER this source file on the
 * command line.
 */
#  endif
#endif

#if HAS_CPP17_FILESYSTEM
  using std::filesystem::absolute;
  using std::filesystem::current_path;
  using std::filesystem::directory_entry;
  using std::filesystem::directory_iterator;
  using std::filesystem::is_directory;
  using std::filesystem::exists;
  using std::filesystem::path;
#elif HAS_TS_FILESYSTEM
  using std::experimental::filesystem::absolute;
  using std::experimental::filesystem::current_path;
  using std::experimental::filesystem::directory_entry;
  using std::experimental::filesystem::directory_iterator;
  using std::experimental::filesystem::is_directory;
  using std::experimental::filesystem::exists;
  using std::experimental::filesystem::path;
#else
#  error "This library has neither <filesystem> nor <experimental/filesystem>."
#endif

void init_locale(void)
// Does magic so that wcout can work.
{
#if MS_STDLIB_BUGS
  // Windows needs a little non-standard magic.
  constexpr char cp_utf16le[] = ".1200"; // UTF-16 little-endian locale.
  setlocale( LC_ALL, cp_utf16le );
  _setmode( _fileno(stdout), _O_WTEXT );
  /* Repeat for _fileno(stdin), if needed. */
#else
  // The correct locale name may vary by OS, e.g., "en_US.utf8".
  constexpr char locale_name[] = "";
  setlocale( LC_ALL, locale_name );
  std::locale::global(std::locale(locale_name));
  std::wcin.imbue(std::locale());
  std::wcout.imbue(std::locale());
#endif
}

using std::endl;

int main( const int argc, const char * const argv[] )
{
  init_locale();

  const path cwd = (argc > 1) ? absolute(path( argv[1], std::locale() ))
                              : absolute(current_path());

  if (exists(cwd)) {
    std::wcout << cwd.wstring() << endl;
  } else {
    std::wcerr << "Path does not exist.\n";
    return EXIT_FAILURE;
  }

  if (is_directory(cwd)) {
    for ( const directory_entry &f : directory_iterator(cwd) )
      std::wcout << f.path().filename().wstring() << endl;
  }

  return EXIT_SUCCESS;
}

This is probably much more complicated than it needs to be: as of 2018, std::filesystem was not yet widely supported, while <experimental/filesystem> was expected to stay available.

Here is a simplified version that contains only the boilerplate needed to make wcout work:

#include <iostream>
#include <locale>
#include <locale.h>

#ifndef MS_STDLIB_BUGS
#  if ( _MSC_VER || __MINGW32__ || __MSVCRT__ )
#    define MS_STDLIB_BUGS 1
#  else
#    define MS_STDLIB_BUGS 0
#  endif
#endif

#if MS_STDLIB_BUGS
#  include <io.h>
#  include <fcntl.h>
#endif

void init_locale(void)
{
#if MS_STDLIB_BUGS
  constexpr char cp_utf16le[] = ".1200";
  setlocale( LC_ALL, cp_utf16le );
  _setmode( _fileno(stdout), _O_WTEXT );
#else
  // The correct locale name may vary by OS, e.g., "en_US.utf8".
  constexpr char locale_name[] = "";
  setlocale( LC_ALL, locale_name );
  std::locale::global(std::locale(locale_name));
  std::wcin.imbue(std::locale());
  std::wcout.imbue(std::locale());
#endif
}

10
votes

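The C++ standard says:

[C++11: 27.4.1/3]:
Mixing operations on corresponding wide- and narrow-character streams follows the same semantics as mixing such operations on FILEs, as specified in Amendment 1 of the ISO C standard.

And the referenced document says: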

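The definition of a stream was changed to include the concept of an orientation for both text and binary streams. After a stream is associated with a file, but before any operations are performed on the stream, the stream is without orientation. If a wide-character input or output function is applied to a stream without orientation, the stream becomes wide-oriented. Likewise, if a byte input or output operation is applied to a stream without orientation, the stream becomes byte-oriented. Thereafter, only the fwide() or freopen() functions can alter the orientation of a stream.

Byte input/output functions shall not be applied to a wide-oriented stream, and wide-character input/output functions shall not be applied to a byte-oriented stream.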

By my interpretation, in short: do not mix std::cout and std::wcout on the same output.

2
votes

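This happens because the Unicode characters cannot be represented in the current code page, which puts wcout into a failed state. Until that state is cleared, every further write to wcout is ignored.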

std::wcout<<"abc "<<L'\u240d'<<" defg "<<L'א'<<" hijk"<<std::endl;
if(std::wcout.fail()){
    std::cout<<"\nConversion didn't succeed\n";
    std::wcout << "This statement has no effect on the console";
    std::wcout.clear();
    std::wcout<<"hello world from wcout! \n";
}
std::cout<<"hello world from cout! \n";
std::wcout<<"hello world from wcout again! \n";
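
The check-and-clear pattern above can be folded into a small helper. This is a minimal sketch; write_or_recover is a hypothetical name, not a library function, and it only restores the stream to a usable state. It does not recover the text that failed to convert.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (illustrative name). Writes `text` to `out`.
// If the stream is in a failed state afterwards, the error flags are
// cleared so that later writes can succeed, and false is returned to
// report that this text was not fully written.
inline bool write_or_recover(std::wostream& out, const std::wstring& text) {
    out << text;
    if (out.fail()) {
        out.clear();
        return false;
    }
    return true;
}
```

Calling write_or_recover(std::wcout, L"...") for each line keeps one unrepresentable character from silencing all later wide output.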

0
votes

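Today this problem can be solved with functions from the Microsoft C runtime library (note that _setmode() is not part of standard C++).

At each switch, the required steps are to flush the most recently used stream and then call _setmode(_fileno(stdout), _O_XXX) to prepare stdout for the right kind of data. Switching from wcout to binary data takes two calls to _setmode().

One reason to pair flush with _setmode is to avoid a crash caused by an odd number of bytes sitting in the buffer when the library expects two-byte values. (Of course, as long as you keep using the same stream, you do not need to do anything until you switch.)

You can tell I am on Windows because "\n" is translated to "\r\n", except in binary output.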

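#include <stdint.h>   // uint8_t
#include <stdio.h>    // _fileno, stdout
#include <iostream>
#include <io.h>       // _setmode
#include <fcntl.h>    // _O_TEXT, _O_BINARY, _O_U16TEXT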

int main( )
{

  // To start sending to the console in UNICODE/wcout
  _setmode(_fileno(stdout), _O_U16TEXT);
  std::wcout << "abc " << L'\u240d' << " defg " << L'א' << " hijk" << std::endl;

  //then to switch to text/cout
  std::wcout.flush();
  _setmode(_fileno(stdout), _O_TEXT);
  std::cout << "hello world from cout! \n";

  // to switch back to wcout
  std::cout.flush();
  _setmode(_fileno(stdout), _O_U16TEXT);
  std::wcout << "hello world from wcout! \n";

  // To switch from wcout to binary output: (One reason to use cout 
  // is that wcout will not accept an odd number of bytes)
  std::wcout.flush();
  _setmode(_fileno(stdout), _O_TEXT);
  _setmode(_fileno(stdout), _O_BINARY);
  uint8_t bytes[]{ 8, 0xa0, 0xff, 0x7f, '\n', 0x00, 0x31, 0x5a };
  std::cout.write((char*)bytes, sizeof(bytes));

  // to switch back to wcout:
  std::cout.flush();
  _setmode(_fileno(stdout), _O_TEXT);
  _setmode(_fileno(stdout), _O_U16TEXT);
  std::wcout << "Done" << std::endl;

  return 0;
}

Piping the output through od -tcx1 gives:

0000000   a  \0   b  \0   c  \0      \0  \r   $      \0   d  \0   e  \0
         61  00  62  00  63  00  20  00  0d  24  20  00  64  00  65  00
0000020   f  \0   g  \0      \0 320 005      \0   h  \0   i  \0   j  \0
         66  00  67  00  20  00  d0  05  20  00  68  00  69  00  6a  00
0000040   k  \0  \r  \0  \n  \0   h   e   l   l   o       w   o   r   l
         6b  00  0d  00  0a  00  68  65  6c  6c  6f  20  77  6f  72  6c
0000060   d       f   r   o   m       c   o   u   t   !      \r  \n   h
         64  20  66  72  6f  6d  20  63  6f  75  74  21  20  0d  0a  68
0000100  \0   e  \0   l  \0   l  \0   o  \0      \0   w  \0   o  \0   r
         00  65  00  6c  00  6c  00  6f  00  20  00  77  00  6f  00  72
0000120  \0   l  \0   d  \0      \0   f  \0   r  \0   o  \0   m  \0    
         00  6c  00  64  00  20  00  66  00  72  00  6f  00  6d  00  20
0000140  \0   w  \0   c  \0   o  \0   u  \0   t  \0   !  \0      \0  \r
         00  77  00  63  00  6f  00  75  00  74  00  21  00  20  00  0d
0000160  \0  \n  \0  \b 240 377 177  \n  \0   1   Z   D  \0   o  \0   n
         00  0a  00  08  a0  ff  7f  0a  00  31  5a  44  00  6f  00  6e
0000200  \0   e  \0  \r  \0  \n  \0
         00  65  00  0d  00  0a  00
0000207
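
The flush-then-_setmode step can be wrapped in small helpers so that neither half gets forgotten. This is a minimal sketch, assuming a Windows build with the Microsoft C runtime. The helper names set_stdout_mode, use_wide_stdout and use_narrow_stdout are hypothetical, and on other platforms they compile to no-ops that return false.

```cpp
#include <cstdio>
#include <iostream>

#ifdef _WIN32
#  include <fcntl.h>
#  include <io.h>
#endif

// Hypothetical helpers (illustrative names, not a library API).
// Flushes both standard output streams, then switches the translation
// mode of stdout. Returns true if the mode was changed, false if
// _setmode failed or the platform has no _setmode (non-Windows builds).
inline bool set_stdout_mode(int mode) {
    std::cout.flush();
    std::wcout.flush();
#ifdef _WIN32
    return _setmode(_fileno(stdout), mode) != -1;
#else
    (void)mode;  // Assumption: nothing to switch outside the Microsoft runtime.
    return false;
#endif
}

// Call before writing to std::wcout.
inline bool use_wide_stdout() {
#ifdef _WIN32
    return set_stdout_mode(_O_U16TEXT);
#else
    return set_stdout_mode(0);
#endif
}

// Call before writing to std::cout.
inline bool use_narrow_stdout() {
#ifdef _WIN32
    return set_stdout_mode(_O_TEXT);
#else
    return set_stdout_mode(0);
#endif
}
```

With these, the three lines from the question become use_wide_stdout() before the first wcout line, use_narrow_stdout() before the cout line, and use_wide_stdout() again before the last wcout line.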