Bash 5.2 crashes with a failed assertion in malloc, but only when run under Valgrind and only when LC_CTYPE is set. Here is some sample output:
$ path/to/env - foo=bar LC_CTYPE=C.UTF-8 path/to/valgrind path/to/bash -c 'echo ${foo#spam}'
...
malloc: subst.c:5331: assertion botched
free: called with unallocated block argument
Aborting...==2753214==
==2753214== Process terminating with default action of signal 6 (SIGABRT): dumping core
==2753214== at 0x48DFA8C: __pthread_kill_implementation (in /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6)
==2753214== by 0x4890C85: raise (in /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6)
==2753214== by 0x487A8B9: abort (in /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6)
==2753214== by 0x443AF9: programming_error (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x4ACAC4: internal_free.constprop.0 (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x450A5E: remove_pattern (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x465D2B: parameter_brace_remove_pattern (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x46023A: param_expand (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x460CD9: expand_word_internal (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x466C0D: shell_expand_word_list.constprop.0 (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x467479: expand_words (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
==2753214== by 0x4361CE: execute_command_internal (in /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash)
...
==2753214== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
/nix/store/a683qmhmrrzrwn8fmqh53yyylm7yn2hq-test.sh: line 2: 2753214 Aborted (core dumped) /nix/store/v45j2p2izb3pa2fxdw978bahhkb2ghza-toybox-0.8.10/bin/env - LC_CTYPE=C.UTF-8 /nix/store/14fg82n6grqhrd2algx31sv1kmgvz0gl-valgrind-3.21.0/bin/valgrind /nix/store/vqvj60h076bhqj6977caz0pfxs6543nb-bash-5.2-p15/bin/bash -c 'echo ${PATH#":"}'
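(Full output here.)
${parameter#word} is one of the parameter expansions described here.
The indicated source line points here, but is the failing assertion in free or in malloc?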
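Trying some variations:
Leaving foo unset or setting it to the empty string makes Bash succeed (no crash), but any non-empty value of foo seems to trigger the crash.
Replacing the pattern with one that does occur in foo makes Bash crash at subst.c:5336 instead of subst.c:5331. So the crash happens both when the pattern matches and when it doesn't, just in slightly different places.
With LC_CTYPE unset, or set to any other locale (including a nonexistent one), Bash does not crash (although Valgrind still reports a non-fatal invalid free()).
How should I go about debugging this?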
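A note on reproducibility: if you download flake.nix and flake.lock into an empty directory, you should be able to type nix run and (hopefully) see the crash too.

I would create a special build of Bash in which Bash's malloc wrapper is disabled, and try to reproduce the problem under Valgrind as before.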
You are running into a situation where Bash itself is diagnosing the malloc problem. Its diagnosis will not be as good as Valgrind's own.
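Bash's diagnostic says that free was called on an unallocated block. Valgrind's equivalent diagnostic is more informative: if an allocated object previously existed at that address, Valgrind will show that object, along with a backtrace of where it was freed.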
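Looking at the code, I immediately see something suspicious. The remove_wpattern(wparam, ....) call may return wparam itself, yet wparam is also passed to free.
I'm referring to this block of code: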
      oret = ret = remove_wpattern (wparam, n, wpattern, op);
      /* Don't bother to convert wparam back to multibyte string if nothing
         matched; just return copy of original string */
      if (ret == wparam)
        {
          free (wparam);
          free (wpattern);
          return (savestring (param));
        }
      free (wparam);
      free (wpattern);
      n = strlen (param);
      xret = (char *)xmalloc (n + 1);
      memset (&ps, '\0', sizeof (mbstate_t));
      n = wcsrtombs (xret, (const wchar_t **)&ret, n, &ps);
      xret[n] = '\0'; /* just to make sure */
      free (oret);
      return xret;
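Note that we call oret = ... remove_wpattern (wparam. Then there is a code path on which we do both free (wparam) and free (oret). But remove_wpattern could simply return wparam. If that happened on that path, we would have a double free.
The logic has to take steps to keep that from happening. Maybe it does: the if (ret == wparam) check above returns before free (oret) is reached. In that case this is a red herring.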
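
To make the ownership concern concrete, here is a minimal sketch of the same "may return its argument" pattern and one way a caller can avoid freeing a block twice. This is not Bash's code: strip_wprefix is a hypothetical stand-in for remove_wpattern that only handles a literal prefix, and strip_and_release and wdup are likewise names invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical stand-in for remove_wpattern (NOT Bash's implementation):
   strips the literal PREFIX from the front of S.  Returns S itself when
   nothing is removed, otherwise a freshly allocated copy of the rest. */
static wchar_t *
strip_wprefix (wchar_t *s, const wchar_t *prefix)
{
  size_t plen = wcslen (prefix);
  wchar_t *copy;

  if (plen == 0 || wcsncmp (s, prefix, plen) != 0)
    return s;                   /* aliases the caller's buffer */

  copy = malloc ((wcslen (s + plen) + 1) * sizeof (wchar_t));
  assert (copy != NULL);
  wcscpy (copy, s + plen);
  return copy;
}

/* Hypothetical caller that takes ownership of WS and returns a malloc'd
   result.  WS is released only when the helper handed back a different
   block, so no pointer is ever passed to free twice. */
static wchar_t *
strip_and_release (wchar_t *ws, const wchar_t *prefix)
{
  wchar_t *ret = strip_wprefix (ws, prefix);

  if (ret != ws)
    free (ws);                  /* ret is a separate allocation */
  return ret;                   /* the caller frees this exactly once */
}

/* Hypothetical helper: duplicate a wide string on the heap. */
static wchar_t *
wdup (const wchar_t *s)
{
  wchar_t *p = malloc ((wcslen (s) + 1) * sizeof (wchar_t));

  assert (p != NULL);
  return wcscpy (p, s);
}
```

Calling strip_and_release (wdup (L"bar"), L"spam") exercises the aliasing path and strip_and_release (wdup (L"spameggs"), L"spam") the copying path; in both cases the caller owns exactly one block afterwards. Running such a reduced test under Valgrind is one way to check whether a given free pattern is sound before digging into the real subst.c.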