Problem when trying to call a user function with ptrace - nanosleep causes a crash

Question

I am working on a project where I need to make a running program execute a function on demand. I use ptrace for this. I know it is possible, because GDB does it.

For now I am using an adapted version of the code at https://github.com/eklitzke/ptrace-call-userspace. That program shows how to call fprintf in the target program.

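The problem appears when the called function uses nanosleep(). If nanosleep() is called inside a function invoked by the tracer, the tracee crashes with SIGSEGV, but only after the sleep has finished. If the tracee calls the same function normally by itself, everything works.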

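I have concluded that the problem is related to the way the function is called, probably to the tracee's stack or register values. For example, I have already checked that the stack is 16-byte aligned on entry to the function.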

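The tracer code is the one in the GitHub repository above (the differences are the function being called, and I also removed the arguments).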

The tracee is a simple dummy process that prints its PID once per second.

Code of the called function:

#include <stdio.h>
#include <time.h>

void hello()
{
    struct timespec tim1;
    tim1.tv_sec = 1;
    tim1.tv_nsec = 0;
    struct timespec tim2;
    nanosleep(&tim1, &tim2);    
    puts("Hello World!!!");
}

When the traced program crashes, the backtrace is:

#0  0xfffffffffffffff7 in ?? ()
#1  0x00007effb0e6e6e0 in hello () at hello.c:10
#2  0x00007effb195c005 in ?? ()
#3  0x00007effb1435cc4 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#4  0x00000000004005de in main ()

Register values from the core dump:

rax            0xfffffffffffffff7       -9
rbx            0x7ffc858a0e40   140722548903488
rcx            0x7effb1435e12   139636655742482
rdx            0x7ffc858a0df8   140722548903416
rsi            0x7ffc858a0df8   140722548903416
rdi            0x7ffc858a0e08   140722548903432
rbp            0x7ffc858a0e18   0x7ffc858a0e18
rsp            0x7ffc858a0df0   0x7ffc858a0df0
r8             0xffffffffffffffff       -1
r9             0x0      0
r10            0x7ffc858a0860   140722548901984
r11            0x246    582
r12            0x7ffc858a0ec0   140722548903616
r13            0x7ffc858a1100   140722548904192
r14            0x0      0
r15            0x0      0
rip            0xfffffffffffffff7       0xfffffffffffffff7
eflags         0x10246  [ PF ZF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

Tracer output:

./call_hello -p 17611
their %rip           0x7effb1435e10
allocated memory at  0x7effb195c000
executing jump to mmap region
successfully jumped to mmap area
their lib            0x7effb0e6e000
their func           0x7effb0e6e000
Adding rel32 to new_text[0]Adding func_delta to new_text[1-4]Adding TRAP to new_text[5]inserting code/data into the mmap area at 0x7effb195c000
setting the registers of the remote process
continuing execution
PTRACE_CONT unexpectedly got status Unknown signal 2943
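A note on the last line of that output: 2943 is not a signal number. Read as a raw waitpid() status it is 0x0B7F, which the standard macros decode as "stopped by signal 11", and signal 11 is SIGSEGV on Linux x86-64. The sketch below shows that decoding; `stop_signal_of` is a hypothetical helper name and is not part of the tracer in the repository.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical helper (not from the original tracer): returns the stop
 * signal encoded in a raw waitpid() status, or -1 if the status does not
 * describe a stopped process. */
int stop_signal_of(int status)
{
    if (!WIFSTOPPED(status))
        return -1;
    return WSTOPSIG(status);
}
```

Calling `stop_signal_of(2943)` yields the value of SIGSEGV, which matches the crash seen in the core dump.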

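If I remove the call to nanosleep, everything works as expected and "Hello World!!!" is printed. As I said before, the segmentation fault only occurs after the requested 1 second sleep. I don't understand how nanosleep causes the instruction pointer to end up holding 0xfffffffffffffff7. Any suggestions or ideas about what I should look into to solve this? Thanks in advance!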

I am testing this on CentOS Linux release 7.6.1810.

Answer 1

The problem is the following:

Your call_hello program writes the two instructions

syscall
jmp %rax

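to the memory that the current value of the %rip register (the instruction pointer) points to. Because your target program makes an (implicit) call to nanosleep() in its main loop, %rip almost always points to the return address of that system call (somewhere in libc). The injected syscall instruction then performs mmap(), and execution jumps to the return value (the newly mapped region).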

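But later, in your hello() function, you call nanosleep() again. When that system call returns, the injected code is still sitting at the return address! Some arbitrary system call is executed (depending on the contents of %rax); it fails with error code -9 (EBADF), which is the 0xfffffffffffffff7 now in %rax. Then jmp %rax jumps there and kills your process.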
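To see why 0xfffffffffffffff7 is an error code and not an address: a raw Linux system call reports failure by returning a small negative number, the negated errno, in %rax. Values from -4095 to -1 are treated as errors. The following is a minimal sketch of that interpretation; `errno_from_raw_return` is a hypothetical name used only for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: interprets a raw Linux syscall return value as it
 * appears in %rax. Returns the errno value if the register holds an error
 * (-4095..-1 when read as a signed number), otherwise 0. */
int errno_from_raw_return(uint64_t raw)
{
    int64_t value = (int64_t)raw;
    if (value < 0 && value >= -4095)
        return (int)-value;
    return 0;
}
```

Applied to the %rax value from the register dump, this gives 9, which is EBADF. A successful mmap() result such as 0x7effb195c000 falls outside the error range and gives 0.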

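So the best solution is to find a location where you can inject and execute 4 bytes of code without overwriting other code. Alternatively, you can restore the original code before continuing into hello() and put the injected code back once hello() has finished (after the trap), for example: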

// update the mmap area
printf("inserting code/data into the mmap area at %p\n", mmap_memory);
if (poke_text(pid, mmap_memory, new_text, NULL, sizeof(new_text))) {
  goto fail;
}

- if (poke_text(pid, rip, new_word, NULL, sizeof(new_word))) {
+ if (poke_text(pid, rip, old_word, NULL, sizeof(old_word))) {
  goto fail;
}

Later, however, you have to reinstall the syscall code so that the munmap() call can happen, for example:

if (ptrace(PTRACE_SETREGS, pid, NULL, &newregs)) {
  perror("PTRACE_SETREGS");
  goto fail;
}

+ if (poke_text(pid, rip, new_word, NULL, sizeof(new_word))) {
+   goto fail;
+ }

new_word[0] = 0xff; // JMP %rax
new_word[1] = 0xe0; // JMP %rax

Now it should work as you expect.
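The swap between injected and original code works on whole machine words, because PTRACE_PEEKTEXT and PTRACE_POKETEXT transfer one word at a time. A rough sketch of the byte-level part is shown here, assuming the 4 injected bytes are 0x0f 0x05 (syscall) followed by 0xff 0xe0 (jmp %rax). `patch_word` is a hypothetical helper, not the repository's poke_text(), and it leaves the actual ptrace calls to the caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns a copy of `original` (a word read with
 * PTRACE_PEEKTEXT) whose first `len` bytes in memory order are replaced
 * by `code`. The remaining bytes of the word are left untouched, so the
 * result can be written back with PTRACE_POKETEXT. */
long patch_word(long original, const unsigned char *code, size_t len)
{
    long patched = original;
    assert(len <= sizeof(patched)); /* the patch must fit in one word */
    memcpy(&patched, code, len);
    return patched;
}
```

Keeping the word returned by PTRACE_PEEKTEXT around is what makes the restore step possible: writing that saved word back undoes the patch exactly.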
