Assembly code to return the smallest integer in an array instead randomly returns either the last or the second-to-last number

Question

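I'm trying to create a function in NASM which, given an array of integers and the length of that array, returns the smallest integer. This is based on the CodeWars problem "Find the smallest integer in the array". I'm doing this on 64-bit BlackArch Linux. My function looks like this: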

SECTION .text
global find_smallest_int

find_smallest_int:
  ; [rdi] is the first value in the array.
  ; We'll store the smallest value so far found
  ; in rax. The first value in the array is the
  ; smallest so far found, therefore we store it
  ; in rax.
  mov rax, [rdi]

  ; rsi is the second argument to int find_smallest_int(int *, int)
  ; which represents the length of the array.
  ; Store it in rbx to be explicit.
  mov rbx, rsi

  loop:
    ; Check to see if we've reached the end of the array.
    ; If we have, we jump to the end of the function and 
    ; return the smallest value (which should be whatever
    ; is in rax at the moment).
    cmp rbx, 0
    je end

    ; Subtract one from our counter. This started as 
    ; the number of elements in the array - when it
    ; gets to 0, we'll have looped through the entire thing.
    sub rbx, 1

    ; If rax is smaller than [rdi], we'll jump down to the
    ; rest of the loop. Only if rax is bigger than [rdi] will
    ; we reassign rax to be the new smallest-yet value.
    cmp rax, [rdi]
    jl postassign

    assign:
      ; If we execute this code, it means rax was not less
      ; than [rdi]. Therefore, we can safely reassign
      ; rax to [rdi].
      mov rax, [rdi]


    postassign:
    ; Set rdi to point to the next value in the array
    add rdi, 4

    ; If we get here, then we haven't finished looping yet
    ; because rbx (the counter) hasn't reached 0 yet.
    jmp loop

  end:
    ret

I then call this function from the following C code:

extern int find_smallest_int(int *array, int size);

int main(void)
{
    int nums[4] = {800, 300, 100, 11};
    int ret = find_smallest_int(nums, 4);

    return ret;
}

Finally, I compile and run the whole thing with the following commands:

#!/bin/bash

# Make an object file from my assembly code with nasm
nasm -f elf64 -o sum.o call_sum.s

# make an object file from my C code
gcc -O0 -m64 -c -o call_sum.o call_sum.c -g

# compile my two object files into an executable
gcc -O0 -m64 -o run sum.o call_sum.o -g

# Run the executable and get the output in the
# form of the exit code.
./run
echo $?

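Instead of getting the smallest integer, I get either 100 or 11 (the second-to-last and last members, respectively, of the integer array I pass to my assembly function). Which result I get seems to be completely random. I can run the program a few times and get 11, then run it a few more times and start getting 100.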

If anyone could help me understand this strange behaviour, I'd appreciate it very much. Thanks!

Update: I implemented the changes from Jester's comment (using 32-bit registers to hold the integers) and it works, but I don't really understand why.

Answer 1

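The first part of this answer is based on Jester's comments. It merely expands on them and explains the changes in more detail. I also made some additional changes, two of which fix further errors in your source.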

First, this part:

int is 4 bytes, but you are using 8 bytes throughout your code. Use eax instead of rax.

These instructions in your example each access 8 bytes of the array:

    mov rax, [rdi]

    cmp rax, [rdi]

    mov rax, [rdi]

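This is because rax is a 64-bit register, so a full load of rax, or a comparison of rax with a memory operand, accesses 8 bytes of memory. NASM syntax allows you to specify the size of the memory operand explicitly, for example by writing: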

    mov rax, qword [rdi]

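Had you done so, you might have noticed earlier that you were accessing memory in 8-byte units (quadwords). Trying to access a doubleword explicitly while using rax as the destination register fails. The following line causes the error "mismatch in operand sizes" at assembly time: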

    mov rax, dword [rdi]

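The following two lines are fine, and both load into rax from a doubleword memory operand. The first uses zero-extension (implicit in the AMD64 instruction set when writing to a 32-bit register part), the second uses (explicit) sign-extension: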

    mov eax, dword [rdi]
    movsx rax, dword [rdi]

(A movzx instruction from a doubleword memory operand to rax does not exist, because it would be redundant with the mov to eax.)

Later in your example, you use rdi as the address of a 4-byte wide type, advancing the array entry pointer by adding 4 to it:

    add rdi, 4

This is correct for the int type, but it conflicts with your use of quadwords as the memory operand size.
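To make the effect of that mismatch concrete, here is a small C model of the question's loop. It is only a sketch under stated assumptions: a 4-byte int, a little-endian machine such as x86-64, and the helper names load_qword and buggy_smallest, which are made up for this illustration. The buffer carries one extra int that stands in for whatever happens to lie in memory right after the array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads 8 bytes starting at p, the way
   `mov rax, [rdi]` does, and returns them as a signed 64-bit value. */
static int64_t load_qword(const int *p)
{
    int64_t value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical model of the question's loop: 8-byte loads and compares,
   but a 4-byte stride. buf must hold count + 1 ints so that the last
   8-byte read stays inside the buffer. */
static int buggy_smallest(const int *buf, unsigned int count)
{
    int64_t smallest = load_qword(buf);
    unsigned int i;

    for (i = 0; i < count; i++) {
        int64_t entry = load_qword(buf + i);

        if (!(smallest < entry))      /* `jl postassign` not taken */
            smallest = entry;
    }
    /* The C caller only looks at eax, the low 32 bits of rax. */
    return (int)(uint32_t)(uint64_t)smallest;
}
```

With the question's values followed by a small trailing int the model returns 11, and with a large trailing int it returns 100. The last comparison is decided by the 4 bytes after the array, which would be consistent with the result changing from run to run.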

Jester's comment pointed out two further problems:

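Also do not use rbx, because that is a callee-saved register, and it's pointless to copy from rsi anyway. As before, you had better use esi, because that's another int.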

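The rsi problem is that the high 32 bits of the 64-bit rsi may hold nonzero values, depending on the ABI. If you're not sure whether nonzero values are allowed there, you should assume that they are and use only the 32-bit value in esi.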

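The rbx (or ebx) problem is that rbx must be preserved across function calls under the AMD64 psABI that Linux uses; for documentation of that ABI, see Where is the x86-64 System V ABI documented?. In a simple test program, changing rbx may not cause any failures, but in non-trivial contexts it easily can.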

The next problem I found is with your initialisation of eax. You wrote it like this:

  ; [rdi] is the first value in the array.
  ; We'll store the smallest value so far found
  ; in rax. The first value in the array is the
  ; smallest so far found, therefore we store it
  ; in rax.
  mov rax, [rdi]

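However, as your loop flow control logic shows, you allow the caller to pass zero for the size argument. In that case you shouldn't access the array at all, because "the first value in the array" may not even exist or be initialised to anything. Logically, you should initialise the smallest-yet value with INT_MAX rather than with the first array entry.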

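There is one more problem: you are using rsi or esi as an unsigned number, counting down to zero. However, in your function declaration you specify the type of the size argument as int, which is signed. I fixed that by changing the declaration to unsigned int.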
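As a reference for the intended behaviour, the two fixes just described (starting from INT_MAX and taking an unsigned size) can be sketched in C. This is not the assembly from the answer, only a model of what it should compute; the const qualifier on the pointer and the assert are additions made for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Reference behaviour only: same name as in the question, but with the
   size parameter changed to unsigned int as described above. */
int find_smallest_int(const int *array, unsigned int size)
{
    int smallest = INT_MAX;           /* valid even when size == 0 */

    assert(array != NULL || size == 0);
    while (size != 0) {
        if (*array <= smallest)       /* 4-byte compare, like eax vs dword [rdi] */
            smallest = *array;
        array++;                      /* advances by sizeof(int), i.e. 4 bytes */
        size--;
    }
    return smallest;
}
```

With a size of zero the array is never dereferenced and the function returns INT_MAX.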

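I made some more, optional changes to your program. I used NASM local labels for the "sub" labels of your function. This is useful because you can re-use, for example, .loop or .end in other functions if you add any to the same source file.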

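I also corrected one of the comments to note that we jump if eax is less than the array entry, and do not jump if eax is greater than or equal to it. You could change this conditional jump to jle so that it also jumps when the two compare equal. Arguably one or the other is preferable for clarity or performance, but I can't say much about which.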

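I also used dec esi instead of sub esi, 1, which isn't much better but sits better with me. In 32-bit mode, dec esi is a single-byte instruction. That is not so in 64-bit mode, though: there dec esi is 2 bytes, as opposed to 3 bytes for sub esi, 1.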

Additionally, I changed the initial check of esi from cmp to test, which is slightly better; refer to Test whether a register is zero with CMP reg,0 vs OR reg,reg?

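Finally, I moved the actual loop condition to the end of the loop body, which means the loop uses one jump instruction less. The unconditional jump back to the start of the loop body is replaced by the conditional jump that checks the while condition. The test at the start of the function is still needed to handle the possibility of a zero-length array. Also, instead of using cmp or test again to check for zero in esi, I use the Zero Flag already set by the dec instruction to check whether esi was decremented to zero.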
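The shape of that restructured loop can be sketched in C as an up-front zero check followed by a do-while. The function name find_smallest_int_rotated is made up for this sketch, and the instruction names in the comments are an assumption about how such a loop typically maps to assembly, not a quote from the uploaded source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical C model of the loop with its condition at the bottom. */
int find_smallest_int_rotated(const int *array, unsigned int size)
{
    int smallest = INT_MAX;

    assert(array != NULL || size == 0);
    if (size == 0)                    /* the one-time check before the loop */
        return smallest;
    do {
        if (*array <= smallest)
            smallest = *array;
        array++;
    } while (--size != 0);            /* decrement, then branch on the zero flag */
    return smallest;
}
```

Each iteration now ends in a single conditional branch, and the separate zero check runs only once per call.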

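You could use ecx or rcx as the loop counter, but that probably wouldn't be much of an advantage on modern CPUs. It would allow for slightly more compact code if you used the jrcxz, jecxz, or loop instructions. They are not recommended, though, because of their slower performance.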

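Instead of comparing against dword [rdi] and then, if it is less than or equal to eax, loading from the same memory dword, you could load the value of the array entry into a register first and then use that register as the source for both the cmp and the mov. This may be faster, but it does result in more opcode bytes.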

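An alternative trick I use for advancing the destination index (rdi in 64-bit mode) by 4 is a single scasd instruction, which modifies only the flags and the index register. This is a single-byte instruction instead of the 4-byte add rdi, 4, but it is likely slower to run.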

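I uploaded a repo with your original source and my improvements (under the CC BY-SA terms of use for Stack Overflow content). I also modified the C part and the test script, but those changes are trivial and mostly irrelevant to your question. The assembly source is in the repo: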

https://hg.ulukai.org/ecm/testsmal/file/2b8637ca416a/
