我正在尝试将Bryan Henderson的Perl接口连接到ncurses库:Curses
为了简单练习,我尝试获取在屏幕上键入的单个字符。这是直接基于NCURSES Programming HOWTO进行修改的。
[当我调用Perl库的getchar()
时,我希望收到一个字符,可能是多字节的字符(如this part of the library manpage中所述,这有点复杂,因为必须处理功能键的特殊情况,而无需输入,但这是只是通常的冰壶)。
这是下面代码中的子例程read1ch()
。
这对ASCII字符有效,但对0x7F以上的字符无效。例如,当击中è
(Unicode 0x00E8,UTF-8:0xC3,0xA8)时,实际上我得到的是代码0xE8,而不是UTF-8编码的代码。将其打印到LANG=en_GB.UTF-8
无法正常工作的终端上,无论如何我都期待0xC3A8。
我需要进行哪些更改才能使其正常工作,即以适当的字符或Perl字符串的形式获取è
?
getchar()
的C代码为here btw。也许只是没有使用C_GET_WCH
设置进行编译?如何找出?
尝试使用binmode设置
binmode STDERR, ':encoding(UTF-8)';
binmode STDOUT, ':encoding(UTF-8)';
这应该解决任何编码问题,因为终端需要并发送UTF-8,但这没有帮助。
也尝试用use open设置流编码(对此和上面的方法之间的区别不太清楚,但这也无济于事]
use open qw(:std :encoding(UTF-8));
Perl Curses shim的手册页上说:
如果
wget_wch()
不可用(即Curses库不理解宽字符),这会调用wgetch()
[获取1个字节的字符从curses窗口],但返回尽管如此,上述值。这可能是个问题,因为多字节字符编码(如UTF-8),您将收到两个两字节字符的一个字符字符串(例如,“Ô和“¤”表示“一个”)。
这里可能是这种情况,但此系统上确实存在wget_wch()
。
[试图查看C代码的作用,然后将fprintf
直接添加到curses/Curses-1.36/CursesFunWide.c
的多字节处理代码中,重新编译后,没有设法通过Curses.so
用我自己的系统覆盖LD_LIBRARY_PATH
(为什么不?为什么一切都只能在一半时间内工作?),因此直接替换了系统库(使用THAT!)。
#ifdef C_GET_WCH
wint_t wch;
int ret = wget_wch(win, &wch);
if (ret == OK) {
ST(0) = sv_newmortal();
c_wchar2sv(ST(0), wch);
fprintf(stderr,"Obtained win_t 0x%04lx\n", wch);
XSRETURN(1);
} else if (ret == KEY_CODE_YES) {
XST_mUNDEF(0);
ST(1) = sv_newmortal();
sv_setiv(ST(1), (IV)wch);
XSRETURN(2);
} else {
XSRETURN_UNDEF;
}
#else
这只是一个胖的NOPE,按下ü
时会看到:
Obtained win_t 0x00fc
因此,将运行[[正确的代码],但是数据是ISO-8859-1,而不是UTF-8。因此,wget_wch
的行为不佳。因此,这是一个curses配置问题。呵呵。
ncurses
可能采用的是默认语言环境,即C
。为了使ncurses
使用宽字符,必须“初始化语言环境”,这可能意味着将状态从“未设置”(从而使ncurses
退回到C
)移动到“设置为系统所用指示”(应该是LANG
环境变量中的内容)。 ncurses
的手册页说:该库使用调用程序已初始化的语言环境。通常用setlocale完成:setlocale(LC_ALL,“”);
如果未初始化语言环境,则库将假定字符可以按照ISO-8859-1进行打印,以与某些旧版程序配合使用。您应该初始化语言环境,而不要依赖于未设置语言环境的库。
这也没有用,但我认为解决方案正朝那条路走。
使用perl-Curses的代码
如果尝试,请按BACKSPACE退出循环,因为不再解释CTRL-C。
下面有很多代码,但是关键区域用----- Testing
标记:#!/usr/bin/perl
# pmap -p PID
# shows the per process using
# /usr/lib64/libncursesw.so.6.1
# /usr/lib64/perl5/vendor_perl/auto/Curses/Curses.so
# Trying https://metacpan.org/release/Curses
use warnings;
use strict;
use utf8; # Meaning "This lexical scope (i.e. file) contains utf8"
use Curses; # On Fedora: dnf install perl-Curses
# This didn't fix it
# https://perldoc.perl.org/open.html
use open qw(:std :encoding(UTF-8));
# https://perldoc.perl.org/perllocale.html#The-setlocale-function
use POSIX ();
my $loc = POSIX::setlocale(&POSIX::LC_ALL, "");
# ---
# Surrounds the actual program
# ---
sub setup() {
initscr();
raw();
keypad(1);
noecho();
}
sub teardown {
endwin();
}
# ---
# Mainly for prettyprinting
# ---
my $special_keys = setup_special_keys();
# ---
# Error printing
# ---
sub mt {
return sprintf("%i: ",time());
}
sub ae {
my ($x,$fname) = @_;
if ($x == ERR) {
printw mt();
printw "Got error code from '$fname': $x\n"
}
}
# ---
# Where the action is
# ---
sub announce {
my $res = printw "Type any character to see it in bold! (or backspace to exit)\n";
ae($res, "printw");
return { refresh => 1 }
}
sub read1ch {
# Read a next character, waiting until it is there.
# Use the wide-character aware functions unless you want to deal with
# collating individual bytes yourself!
# Readings:
# https://metacpan.org/pod/Curses#Wide-Character-Aware-Functions
# https://perldoc.perl.org/perlunicode.html#Unicode-Character-Properties
# https://www.ahinea.com/en/tech/perl-unicode-struggle.html
# https://hexdump.wordpress.com/2009/06/19/character-encoding-issues-part-ii-perl/
my ($ch, $key) = getchar();
if (defined $key) {
# it's a function key
printw "Function key pressed: $key";
printw " with known alias '" . $$special_keys{$key} . "'" if (exists $$special_keys{$key});
printw "\n";
# done if backspace was hit
return { done => ($key == KEY_BACKSPACE()) }
}
elsif (defined $ch) {
# "$ch" should be a String of 1 character
# ----- Testing
printw "Locale: $loc\n";
printw "Multibyte output test: öüäéèà периоду\n";
printw sprintf("Received string '%s' of length %i with ordinal 0x%x\n", $ch, length($ch), ord($ch));
{
# https://perldoc.perl.org/bytes.html
use bytes;
printw sprintf("... length is %i\n" , length($ch));
printw sprintf("... contents are %vd\n" , $ch);
}
# ----- Testing
return { ch => $ch }
}
else {
# it's an error
printw "getchar() failed\n";
return {}
}
}
sub feedback {
my ($ch) = @_;
printw "The pressed key is: ";
attron(A_BOLD);
printw("%s\n","$ch"); # do not print $txt directly to make sure escape sequences are not interpreted!
attroff(A_BOLD);
return { refresh => 1 } # should refresh
}
sub do_curses_run {
setup;
my $done = 0;
while (!$done) {
my $bubl;
$bubl = announce();
refresh() if $$bubl{refresh};
$bubl = read1ch();
$done = $$bubl{done};
if (defined $$bubl{ch}) {
$bubl = feedback($$bubl{ch});
refresh() if $$bubl{refresh};
}
}
teardown;
}
# ---
# main
# ---
do_curses_run();
sub setup_special_keys {
# the key codes on the left must be called once to resolve to a numeric constant!
my $res = {
KEY_BREAK() => "Break key",
KEY_DOWN() => "Arrow down",
KEY_UP() => "Arrow up",
KEY_LEFT() => "Arrow left",
KEY_RIGHT() => "Arrow right",
KEY_HOME() => "Home key",
KEY_BACKSPACE() => "Backspace",
KEY_DL() => "Delete line",
KEY_IL() => "Insert line",
KEY_DC() => "Delete character",
KEY_IC() => "Insert char or enter insert mode",
KEY_EIC() => "Exit insert char mode",
KEY_CLEAR() => "Clear screen",
KEY_EOS() => "Clear to end of screen",
KEY_EOL() => "Clear to end of line",
KEY_SF() => "Scroll 1 line forward",
KEY_SR() => "Scroll 1 line backward (reverse)",
KEY_NPAGE() => "Next page",
KEY_PPAGE() => "Previous page",
KEY_STAB() => "Set tab",
KEY_CTAB() => "Clear tab",
KEY_CATAB() => "Clear all tabs",
KEY_ENTER() => "Enter or send",
KEY_SRESET() => "Soft (partial) reset",
KEY_RESET() => "Reset or hard reset",
KEY_PRINT() => "Print or copy",
KEY_LL() => "Home down or bottom (lower left)",
KEY_A1() => "Upper left of keypad",
KEY_A3() => "Upper right of keypad",
KEY_B2() => "Center of keypad",
KEY_C1() => "Lower left of keypad",
KEY_C3 () => "Lower right of keypad",
KEY_BTAB() => "Back tab key",
KEY_BEG() => "Beg(inning) key",
KEY_CANCEL() => "Cancel key",
KEY_CLOSE() => "Close key",
KEY_COMMAND() => "Cmd (command) key",
KEY_COPY() => "Copy key",
KEY_CREATE() => "Create key",
KEY_END() => "End key",
KEY_EXIT() => "Exit key",
KEY_FIND() => "Find key",
KEY_HELP() => "Help key",
KEY_MARK() => "Mark key",
KEY_MESSAGE() => "Message key",
KEY_MOUSE() => "Mouse event read",
KEY_MOVE() => "Move key",
KEY_NEXT() => "Next object key",
KEY_OPEN() => "Open key",
KEY_OPTIONS() => "Options key",
KEY_PREVIOUS() => "Previous object key",
KEY_REDO() => "Redo key",
KEY_REFERENCE() => "Ref(erence) key",
KEY_REFRESH() => "Refresh key",
KEY_REPLACE() => "Replace key",
KEY_RESIZE() => "Screen resized",
KEY_RESTART() => "Restart key",
KEY_RESUME() => "Resume key",
KEY_SAVE() => "Save key",
KEY_SBEG() => "Shifted beginning key",
KEY_SCANCEL() => "Shifted cancel key",
KEY_SCOMMAND() => "Shifted command key",
KEY_SCOPY() => "Shifted copy key",
KEY_SCREATE() => "Shifted create key",
KEY_SDC() => "Shifted delete char key",
KEY_SDL() => "Shifted delete line key",
KEY_SELECT() => "Select key",
KEY_SEND() => "Shifted end key",
KEY_SEOL() => "Shifted clear line key",
KEY_SEXIT() => "Shifted exit key",
KEY_SFIND() => "Shifted find key",
KEY_SHELP() => "Shifted help key",
KEY_SHOME() => "Shifted home key",
KEY_SIC() => "Shifted input key",
KEY_SLEFT() => "Shifted left arrow key",
KEY_SMESSAGE() => "Shifted message key",
KEY_SMOVE() => "Shifted move key",
KEY_SNEXT() => "Shifted next key",
KEY_SOPTIONS() => "Shifted options key",
KEY_SPREVIOUS() => "Shifted prev key",
KEY_SPRINT() => "Shifted print key",
KEY_SREDO() => "Shifted redo key",
KEY_SREPLACE() => "Shifted replace key",
KEY_SRIGHT() => "Shifted right arrow",
KEY_SRSUME() => "Shifted resume key",
KEY_SSAVE() => "Shifted save key",
KEY_SSUSPEND() => "Shifted suspend key",
KEY_SUNDO() => "Shifted undo key",
KEY_SUSPEND() => "Suspend key",
KEY_UNDO() => "Undo key"
};
for (my $f = 1; $f <= 64; $f++) {
$$res{KEY_F($f)} = "KEY_F($f)"
}
return $res
}
使用
strace运行脚本可以帮助...我这样做是为了查看系统调用:
strace -fo strace.out -s 1024 ./foo
并且可以看到读取内容,消息等。可以使用调试库对ncurses进行类似的跟踪,尽管打包程序对于提供启用跟踪的方法并不一致。UTF-8中的。[
ü
为\303\274
(八进制),其Unicode值为252
(十进制)或0xfc
(十六进制)。问题的这一部分似乎错过了这一点:这只是一个胖的NOPE,按ü会看到:
Obtained win_t 0x00fc
因此将运行正确的代码,但是数据是ISO-8859-1,而不是UTF-8。所以是wget_wch表现不佳。因此,这是一个curses配置问题。呵呵。
wget_wch
返回(出于实际目的)一个Unicode值(不是UTF-8字节序列)。 ISO-8859-1代码160-255碰巧(并非巧合地)匹配Unicode代码点,尽管后者在中肯定是不同的[[encoded
[wget_wch
将返回UTF-8字节,但是Perl脚本仅将其用作后备(因为这将导致Perl脚本将UTF-8字符串转换为Unicode值)。