Interpreting bit-packed data in Python with struct and ctypes: results are misaligned

Question

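I have some data stored in a bit-packed format that I am trying to extract with Python. Specifically, these are old bitmap fonts, where:

Each row of a glyph is a group of n bits (where n is the character width).
Only the first row is guaranteed to start on a byte boundary.
The bitmap data of each glyph is packed into a big-endian sequence of bytes.

For example, here is the big-endian hex data for one glyph that is 9 bits wide: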

1F 1F DB 79 3D BF FE F7 86 CE 3E

If we break it down by hand, we can see that this bit pattern represents a small smiley face:

8-bit bytes:        9-bit chunks:   character (0='.', 1='$'):
                                         
0x1F = 00011111     000111110       ...$$$$$. 
0x1F = 00011111     001111111       ..$$$$$$$ 
0xDB = 11011011     011011011       .$$.$$.$$ 
0x79 = 01111001     110010011       $$..$..$$ 
0x3D = 00111101     110110111       $$.$$.$$$ 
0xBF = 10111111     111111111       $$$$$$$$$ 
0xFE = 11111110     101111011       $.$$$$.$$ 
0xF7 = 11110111     110000110       $$....$$. 
0x86 = 10000110     110011100       $$..$$$.. 
0xCE = 11001110     011111000       .$$$$$... 
0x3E = 00111110

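So I tried to do the same thing in Python, but I can't get the data to line up correctly when it is packed.

This is what I have so far. It is only test code to see whether I can get the expected result from this particular data. Note the choice of field_type, as this seems to be the root of my problem: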

import struct
import ctypes

field_type = ctypes.c_ulonglong

class PackedBitmap(ctypes.BigEndianStructure):
    _fields_ = [ ('line00', field_type, 9),
                 ('line01', field_type, 9),
                 ('line02', field_type, 9),
                 ('line03', field_type, 9),
                 ('line04', field_type, 9),
                 ('line05', field_type, 9),
                 ('line06', field_type, 9),
                 ('line07', field_type, 9),
                 ('line08', field_type, 9),
                 ('line09', field_type, 9) ]

bm = PackedBitmap()

struct.pack_into('>11s', bm, 0, 
                 b'\x1F\x1F\xDB\x79\x3D\xBF\xFE\xF7\x86\xCE\x3E')

for field in bm._fields_:
    bin_str = f'{getattr(bm, field[0]):09b}'
    print(bin_str + '     ' + bin_str.replace('0','.').replace('1','$'))

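But no matter which C type I choose for the fields of PackedBitmap, I can't get the correct output. There is always an error, and the size of the data type seems to determine where the first error occurs.

With field_type = ctypes.c_ulonglong: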

000111110     ...$$$$$.
001111111     ..$$$$$$$
011011011     .$$.$$.$$
110010011     $$..$..$$
110110111     $$.$$.$$$
111111111     $$$$$$$$$
101111011     $.$$$$.$$
100001101     $....$$.$   # <- first error
100111000     $..$$$...
111110000     $$$$$....

With field_type = ctypes.c_uint32:

000111110     ...$$$$$.
001111111     ..$$$$$$$
011011011     .$$.$$.$$
001111011     ..$$$$.$$   # <- first error
011111111     .$$$$$$$$
111110111     $$$$$.$$$
100001101     $....$$.$
100111000     $..$$$...
111110000     $$$$$....
000000000     .........

With field_type = ctypes.c_uint16:

000111110     ...$$$$$.
110110110     $$.$$.$$.   # <- first error
001111011     ..$$$$.$$
111111101     $$$$$$$.$
100001101     $....$$.$
001111100     ..$$$$$..
000000000     .........
000000000     .........
000000000     .........
000000000     .........

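I'm not sure what is going on here: a length of 9 bits fits comfortably within each of these field types (64, 32 and 16 bits respectively), so shouldn't the fields be packed as expected? What am I missing, and how can I fix this?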

It looks like ctypes.Structure …

Answer 1

We can print the offsets used by the fields:

print(PackedBitmap.line00)
...
print(PackedBitmap.line09)

This gives us:

<Field type=c_ulonglong_be, ofs=0:55, bits=9> # 1
<Field type=c_ulonglong_be, ofs=0:46, bits=9> # 2
<Field type=c_ulonglong_be, ofs=0:37, bits=9> # 3
<Field type=c_ulonglong_be, ofs=0:28, bits=9> # 4
<Field type=c_ulonglong_be, ofs=0:19, bits=9> # 5
<Field type=c_ulonglong_be, ofs=0:10, bits=9> # 6
<Field type=c_ulonglong_be, ofs=0:1, bits=9>  # 7
<Field type=c_ulonglong_be, ofs=8:55, bits=9> # 8
<Field type=c_ulonglong_be, ofs=8:46, bits=9> # 9
<Field type=c_ulonglong_be, ofs=8:37, bits=9> # 10

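The first field spans bits 55 to 63 of the first 8-byte unit. That is fine.
The seventh field spans bits 1 to 9. That is fine too.
The eighth field sits in the second 8-byte unit and spans bits 55 to 63. So bit 0 of the first unit (the last bit of byte 7 in the data) is never assigned to a field, and we lose it.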

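This may be a bug, or at least it is unexpected. When you switch to the other data types, you will find that the first error appears exactly where the fields move on to the next storage unit: a 32-bit unit holds only three whole 9-bit fields and a 16-bit unit only one, which matches the first error on the fourth and second line respectively.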
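To make that explanation concrete, here is a small sketch that imitates the layout implied by the printed offsets without calling ctypes at all. The function name unit_aligned_rows is made up for illustration, and the rule it encodes (no field may cross a storage-unit boundary, leftover bits in each unit are skipped) is an assumption inferred from the offsets above rather than a statement about how ctypes is implemented.

```python
from collections.abc import Iterator


def unit_aligned_rows(buffer: bytes, bits: int, unit_bytes: int) -> Iterator[str]:
    """Split `buffer` into `bits`-wide fields that never cross a unit boundary.

    Hypothetical helper: it imitates the observed layout, it is not ctypes.
    """
    per_unit = (unit_bytes * 8) // bits  # whole fields that fit in one unit
    for start in range(0, len(buffer), unit_bytes):
        # Zero-pad the final unit, like the zero-initialised structure.
        unit = buffer[start:start + unit_bytes].ljust(unit_bytes, b'\x00')
        binary = ''.join(f'{byte:08b}' for byte in unit)
        for i in range(per_unit):
            yield binary[i * bits:(i + 1) * bits]
        # The trailing (unit_bytes * 8) % bits bits of the unit are skipped.


raw = bytes.fromhex('1F1FDB793DBFFEF786CE3E')

for unit_bytes in (8, 4, 2):
    print(f'unit size: {unit_bytes} bytes')
    for row in unit_aligned_rows(raw, 9, unit_bytes):
        print(row + '     ' + row.replace('0', '.').replace('1', '$'))
```

For unit sizes of 8, 4 and 2 bytes, the leading rows agree with the outputs shown in the question for c_ulonglong, c_uint32 and c_uint16, which supports the idea that the skipped bits at the end of each unit cause the misalignment.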

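However, here is an implementation that uses neither ctypes nor struct. It concatenates the bytes into one long string of 1s and 0s, then splits that string into parts of length 9 (itertools.batched requires Python 3.12 or later):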

from collections.abc import Iterator
from itertools import batched

raw = b'\x1F\x1F\xDB\x79\x3D\xBF\xFE\xF7\x86\xCE\x3E'


def regroup_bits(buffer: bytes, bits: int) -> Iterator[str]:
    binary = ''.join(f'{byte:08b}' for byte in buffer)

    return (''.join(batch) for batch in batched(binary, bits))


for i in regroup_bits(raw, 9):
    print(i.replace('0', '.').replace('1', '$'))

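This could be optimized with yield, building one 9-bit value at a time, which would be more memory-friendly.

Also note that this implementation does not pad the last value.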
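One possible shape for that yield-based variant is sketched below. It is only an illustration: the name iter_rows and the pad parameter are not part of the answer above, and yielding integers rather than strings is a design choice made here. It keeps a small integer accumulator instead of building the whole bit string, and it can optionally zero-pad the final partial row.

```python
from collections.abc import Iterator


def iter_rows(buffer: bytes, bits: int, pad: bool = False) -> Iterator[int]:
    """Yield successive `bits`-wide big-endian values from `buffer` as ints.

    Hypothetical helper; `pad` controls whether a trailing partial row is
    emitted (left-aligned and zero-filled) or dropped.
    """
    acc = 0      # accumulator holding bits that have not been emitted yet
    pending = 0  # number of valid bits currently in acc
    for byte in buffer:
        acc = (acc << 8) | byte
        pending += 8
        while pending >= bits:
            pending -= bits
            yield acc >> pending          # the top `bits` bits of acc
            acc &= (1 << pending) - 1     # keep only the leftover bits
    if pending and pad:
        # Left-align the leftover bits and fill the rest with zeros.
        yield acc << (bits - pending)


raw = bytes.fromhex('1F1FDB793DBFFEF786CE3E')

for row in iter_rows(raw, 9, pad=True):
    print(f'{row:09b}'.replace('0', '.').replace('1', '$'))
```

With pad=True the last row is padded in the same way as in the manual breakdown in the question; with the default pad=False the trailing 7 bits are dropped instead.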
