Find consecutive ones and zeros

Question

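I am looking for the fastest way to convert a stream of integers into a list that counts the consecutive ones and zeros.

For example, the integers [4294967295, 4194303, 3758096384]

At the bit level: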

11111111111111111111111111111111
11111111111111111111110000000000
00000000000000000000000000000111

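(each bit string is in little-endian order, i.e. least significant bit first)

So the program should output three values, [54 39 3]: there are 54 ones, followed by 39 zeros, and finally 3 ones.

I have been looking at these algorithms: http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightLinear

I probably need to write something along these lines: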

i=(the first bit of the first integer)
repeat till the end
    find the number of consecutive i's in this integer
    if we reach the end of the integer, continue with the next
    else i = (not)i

But I wonder whether someone can come up with a better way to do this.

At the moment the function is implemented in Matlab as follows:

%get all bits in a long vector
data = uint32([4294967295,4194303,3758096384]);
logi = false([1,length(data)*32]);
for ct = 1:length(data)
    logi(1+32*(ct-1):ct*32)=bitget(data(1+(ct-1)),1:32);
end
%count consecutive 1s and 0s
Lct=1;
L=1;i = logi(1);
for ct = 2:length(logi)
    if logi(ct)==i
        L(Lct)=L(Lct)+1;
    else
        i=logi(ct);
        Lct=Lct+1;
        L(Lct)=1;
    end
end

>> L = 54    39     3

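Note: it took me some time to pin down the problem, hence the comments about the language and the exact nature of the question. Hopefully (after many edits) this question is now in a form where it can be found and the answers can be useful to others as well.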

Answer 1

Earlier I misunderstood the question. Now I know what you are asking. This should work; I tested it:

#include <cstdint>   // uint8_t, uint32_t
#include <deque>
#include <iostream>

using namespace std;

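// Old version, for a whole collection.
// Counts are uint8_t, so a run longer than 255 bits would wrap around.
void ConsecutiveOnesAndZeros(const deque<uint32_t> &values, deque<uint8_t> &outCount)
{
    if (values.empty()) return;
    uint8_t count = 0, lastBit = (values[0] & 1);
    for (uint32_t value : values)
    {
        int i;
        // Scan from the LSB and stop early once the remaining high bits are all zero.
        for (i = 0; (i < 32) && ((value >> i) != 0); i++)
        {
            if (lastBit != uint8_t((value >> i) & 1))
            {
                outCount.push_back(count);
                count = 0;
                lastBit = !lastBit;
            }
            count++;
        }
        if (i < 32)
        {
            // The remaining 32 - i bits are zeros; close a pending run of ones first.
            if (lastBit != 0)
            {
                outCount.push_back(count);
                count = 0;
                lastBit = 0;
            }
            count += (32 - i);
        }
    }
    outCount.push_back(count);
}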

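// Stream version: receives one integer at a time.
// count and lastBit carry the state of the current run between calls.
void ConsecutiveOnesAndZeros(uint32_t value, uint8_t &count, uint8_t &lastBit, deque<uint8_t> &outCount)
{
    int i;
    // Scan from the LSB and stop early once the remaining high bits are all zero.
    for (i = 0; (i < 32) && ((value >> i) != 0); i++)
    {
        if (lastBit != uint8_t((value >> i) & 1))
        {
            if (count) outCount.push_back(count);
            count = 0;
            lastBit = !lastBit;
        }
        count++;
    }
    if (i < 32)
    {
        // The remaining 32 - i bits are zeros; close a pending run of ones first.
        if (lastBit != 0)
        {
            if (count) outCount.push_back(count);
            count = 0;
            lastBit = 0;
        }
        count += (32 - i);
    }
}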

int main()
{
    deque<uint8_t> outCount;
    deque<uint32_t> stream = { 4294967295u, 4194303u, 3758096384u };

    // Whole-collection version.
    ConsecutiveOnesAndZeros(stream, outCount);
    for (auto res : outCount) {
        cout << int(res) << ",";
    }
    cout << "\n";

    // Stream version: feed one value at a time, then flush the last run.
    uint8_t count = 0, bit = 0;
    outCount.clear();
    for (auto val : stream)
        ConsecutiveOnesAndZeros(val, count, bit, outCount);
    if (count) outCount.push_back(count);

    for (auto res : outCount) {
        cout << int(res) << ",";
    }
    cout << "\n";
}

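Update - I have added a small optimization that checks for value != 0. I have also split ConsecutiveOnesAndZeros into two functions; the second one takes the next integer from an incoming stream.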
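One limitation worth keeping in mind: both functions above store run lengths in uint8_t, so a run longer than 255 bits wraps around. Below is a minimal sketch of the same LSB-first scan with 64-bit counts. RunLengthEncoder and its members are hypothetical names chosen for illustration, not part of the code above, and the early exit on zero is left out to keep it short.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: bit-by-bit scan, least significant bit first,
// with run lengths kept in 64 bits so long runs cannot wrap around.
struct RunLengthEncoder {
    std::vector<std::uint64_t> runs; // completed runs, in stream order
    std::uint64_t count = 0;         // length of the run in progress
    bool lastBit = false;            // bit value of the run in progress
    bool firstBit = false;           // bit value of the very first run
    bool started = false;            // false until the first bit is seen

    // Feed one 32-bit word.
    void push(std::uint32_t value) {
        for (int i = 0; i < 32; ++i) {
            bool bit = ((value >> i) & 1u) != 0;
            if (!started) {
                started = true;
                firstBit = lastBit = bit;
            }
            if (bit != lastBit) {
                runs.push_back(count); // the run in progress just ended
                count = 0;
                lastBit = bit;
            }
            ++count;
        }
    }

    // Close the run in progress; call once after the last word.
    void finish() {
        if (count != 0) {
            runs.push_back(count);
            count = 0;
        }
    }
};
```

Calling push() for each incoming word and finish() at the end leaves the run lengths in runs, and firstBit tells whether the first run consists of ones or zeros.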

Answer 2

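Well, you can try to speed this up by splitting the first part across threads.

For example, if you have a function like the one you describe, you can run several instances of it as std::thread or std::future, depending on how you want to approach it. Once they have finished, you compare the two boundary bits (the one at the end of the previous chunk and the one at the start of the next). If they are equal, you add the first count of the next result to the last count of the previous result; otherwise you simply append it. All other parts of each result are appended without any comparison.

This is of course overkill if your input is short.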
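The chunk-and-merge idea could be sketched roughly as follows, using std::async. The names Run, encodeChunk and encodeParallel are made up for this illustration, and the per-chunk encoder is a plain bit-by-bit loop; any faster single-threaded routine could take its place.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <vector>

// One run: a bit value and how many times it repeats.
struct Run {
    bool bit;
    std::uint64_t length;
};

// Hypothetical helper: run-length encode words[begin, end), LSB first.
std::vector<Run> encodeChunk(const std::vector<std::uint32_t> &words,
                             std::size_t begin, std::size_t end) {
    std::vector<Run> runs;
    for (std::size_t w = begin; w < end; ++w) {
        for (int i = 0; i < 32; ++i) {
            bool bit = ((words[w] >> i) & 1u) != 0;
            if (!runs.empty() && runs.back().bit == bit)
                ++runs.back().length;
            else
                runs.push_back(Run{bit, 1});
        }
    }
    return runs;
}

// Encode the chunks concurrently, then stitch the results together in order.
std::vector<Run> encodeParallel(const std::vector<std::uint32_t> &words,
                                std::size_t chunks) {
    if (chunks == 0) chunks = 1;
    std::size_t per = (words.size() + chunks - 1) / chunks; // words per chunk
    std::vector<std::future<std::vector<Run>>> jobs;
    for (std::size_t begin = 0; begin < words.size(); begin += per) {
        std::size_t end = begin + per < words.size() ? begin + per : words.size();
        jobs.push_back(std::async(std::launch::async, encodeChunk,
                                  std::cref(words), begin, end));
    }
    std::vector<Run> result;
    for (auto &job : jobs) {
        std::vector<Run> part = job.get();
        std::size_t first = 0;
        // Only the boundary needs a comparison: merge if the bit values match.
        if (!result.empty() && !part.empty() && result.back().bit == part[0].bit) {
            result.back().length += part[0].length;
            first = 1;
        }
        result.insert(result.end(), part.begin() + first, part.end());
    }
    return result;
}
```

For the three sample values, encodeParallel(words, 3) encodes one word per task and the merge step joins the runs that cross word boundaries. Whether the threads pay off depends on the input size, as noted above.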

Answer 3

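First, reading your bit strings with the most significant bit on the left, your sample numbers look wrong to me: the second one has its most significant bit set, so it should be at least 2147483648 (2^31), yet it is only 4194303, and the third one should be 7. So I guess you reversed the bit positions when converting them to decimal. See the comment at the beginning of main() in the complete code at the end for how the numbers were derived. The numbers that correspond to your bit patterns are (hex/decimal):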

[0xffffffff/4294967295][0xfffffc00/4294966272][0x00000007/7]

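(If we put the more significant digits on the left, why not do the same in binary?)

To solve your problem, consider that when a number has n consecutive ones at its LSB end and you increment it by one, all those consecutive ones are switched to zeros (by carry propagation), and the bit just above them is switched to one. Likewise, if it has n consecutive zeros and you decrement it, all those zeros are turned into ones, plus one more bit, because the borrow cascades one position further. The idea is to check which bit we have in the LSB and, depending on that, increment or decrement the value and XOR it with the original value. The result is a number that has as many one bits at its LSB end as the original value has equal bits there, plus one. For example: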

 1100100011111111

As the LSB is 1, we increment it:

 1100100100000000
        ^^^^^^^^^ changed bits.

If we now XOR this value with the previous one:

 0000000111111111  => 9 "1" bits, that indicate that 8 "1" consecutive bits were present

If we prepare a switch statement with all the possible values this XOR can produce, we get a very efficient way to obtain the result:

 int get_consecutive_bits(unsigned value)
 {
     unsigned next = value;
     switch (value) {
     case 0: case ~0u: return 32; /* these are special cases, see below */
     }
     switch (value & 1) { /* get the lower bit */
     case 0: next--; break; /* decrement */
     case 1: next++; break; /* increment */
     }
     switch (value ^ next) { /* make the xor */
     case 0x00000003: return 1;
     case 0x00000007: return 2;
     case 0x0000000f: return 3;
     case 0x0000001f: return 4;
     case 0x0000003f: return 5;
     case 0x0000007f: return 6;
     /* ... */
     case 0xffffffff: return 31;
     } /* switch */
     return 0; /* not reached once all the cases are listed */
 }
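The switch above maps each possible XOR mask to a run length. As a rough sketch of the same increment/decrement-and-XOR idea, the variant below counts the set bits of the mask instead of listing every case. lowRunLength is a name made up here, a 32-bit word is assumed, and the bit-counting loop is a swapped-in replacement for the switch, not the method of this answer.

```cpp
#include <cstdint>

// Number of equal bits at the least significant end of value (1..32).
int lowRunLength(std::uint32_t value) {
    // All zeros or all ones: the carry would leave the word, so handle these directly.
    if (value == 0u || value == 0xffffffffu) return 32;
    // LSB is 1: increment so the carry clears the run. LSB is 0: decrement.
    std::uint32_t next = (value & 1u) ? value + 1u : value - 1u;
    std::uint32_t changed = value ^ next; // (run length + 1) low bits are set
    int bits = 0;
    while (changed != 0u) {
        changed &= changed - 1u; // clear the lowest set bit
        ++bits;
    }
    return bits - 1;
}
```

For the 16-bit example above, lowRunLength(0xC8FF) yields 8: the mask 0x01FF has 9 set bits, one more than the run of ones.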

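Now you have to accumulate that value, in case the next array cell starts with the same bit value the previous one ended with. The reason there is never a case 0x00000001 is that we always force a carry into the second bit, so at least two bits change (...0000001 => ...0000010 => ...0000011 and ...11111110 => ...11111101 => ...00000011). It also means that for the values 0000...0000 and 1111...1111 we would get one bit more than the word length, because the carry goes into the bit next to the MSB, the 33rd bit. That makes these two values special, so we check for them first.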
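To see why 0 and ~0 need their own case, the sketch below computes the XOR mask once in 32 bits and once in 64 bits, where the carry into the 33rd bit stays visible. The helper names changedMask32 and changedMaskWide are made up for illustration, and a 32-bit word is assumed.

```cpp
#include <cstdint>

// XOR mask as computed in 32-bit arithmetic: a carry out of bit 31 is lost.
std::uint32_t changedMask32(std::uint32_t value) {
    std::uint32_t next = (value & 1u) ? value + 1u : value - 1u;
    return value ^ next;
}

// Same step in 64-bit arithmetic, keeping 33 bits: the 32-bit word plus
// the one extra bit the carry (or borrow) can spill into.
std::uint64_t changedMaskWide(std::uint32_t value) {
    std::uint64_t wide = value;
    std::uint64_t next = (value & 1u) ? wide + 1 : wide - 1;
    return (wide ^ next) & 0x1ffffffffull;
}
```

In 32 bits, 0xffffffff (32 equal bits) and 0x7fffffff (31 equal bits) both produce the mask 0xffffffff, so the switch could not tell them apart. In the wide version the first produces 0x1ffffffff and the second 0xffffffff.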

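This is a very efficient way to do the task in chunks of one array cell. Whenever the run you get includes the MSB, you have to accumulate, because the next word may start with the same bit value you just ended with.

The following code should illustrate the algorithm: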

pru_49297910.c

/* pru_49297910.c -- answer to https://stackoverflow.com/questions/49297910/
 * Author: Luis Colorado
 * Date: Wed Apr 24 11:12:21 EEST 2019
 * Copyright: (C) Luis Colorado.  All rights reserved.
 * License: BSD.  Open source.
 */

#include <cassert>
#include <iostream>

#define BITS_PER_ELEMENT    32

int get_consecutive_bits(unsigned value)
{
    switch (value) {
    case 0: case ~0u: /* these are special cases, see below */
            return BITS_PER_ELEMENT;
    }
    unsigned next = value;
    switch (value & 1) { /* get the lower bit */
    case 0: next--; break; /* decrement */
    case 1: next++; break; /* increment */
    }
    switch (value ^ next) { /* make the xor */
    case 0x00000003: return 1;      case 0x00000007: return 2;
    case 0x0000000f: return 3;      case 0x0000001f: return 4;
    case 0x0000003f: return 5;      case 0x0000007f: return 6;
    case 0x000000ff: return 7;      case 0x000001ff: return 8;
    case 0x000003ff: return 9;      case 0x000007ff: return 10;
    case 0x00000fff: return 11;     case 0x00001fff: return 12;
    case 0x00003fff: return 13;     case 0x00007fff: return 14;
    case 0x0000ffff: return 15;     case 0x0001ffff: return 16;
    case 0x0003ffff: return 17;     case 0x0007ffff: return 18;
    case 0x000fffff: return 19; case 0x001fffff: return 20;
    case 0x003fffff: return 21; case 0x007fffff: return 22;
    case 0x00ffffff: return 23; case 0x01ffffff: return 24;
    case 0x03ffffff: return 25; case 0x07ffffff: return 26;
    case 0x0fffffff: return 27; case 0x1fffffff: return 28;
    case 0x3fffffff: return 29; case 0x7fffffff: return 30;
    case 0xffffffff: return 31;
    } /* switch */
    assert(!"Impossible");
    return 0;
}

#define FLUSH() do {            \
        runlen(accum, state);   \
        state ^= 1;             \
        accum = 0;              \
    } while (0)

void run_runlen_encoding(unsigned array[], int n, void (*runlen)(int, unsigned))
{
    int state = 0; /* always begin in 0 */
    int accum = 0; /* accumulated bits */
    while (n--) {
        /* see if we have to change */
        if (state ^ (array[n] & 1)) /* we changed state */
            FLUSH();
        int nb = BITS_PER_ELEMENT; /* number of bits to check */
        unsigned w = array[n]; /* unsigned, so that >> shifts in zeros */
        while (nb > 0) {
            int b = get_consecutive_bits(w);
            if (b < nb) {
                accum += b;
                FLUSH();
                w >>= b;
                nb -= b;
            } else {  /* b >= nb, we only accumulate nb */
                accum += nb;
                nb = 0;
            }
        }
    }
    if (accum)
        FLUSH();
} /* run_runlen_encoding */

void output_runlen(int n, unsigned kind)
{
    if (n) { /* don't print for n == 0 */
        static int i = 0;
        std::cout << "[" << n << "/" << kind << "]";
        if (!(++i % 10))
            std::cout << std::endl;
    }
} /* output_runlen */

int main()
{
     /* 0b1111_1111_1111_1111_1111_1111_1111_1111, 0b1111_1111_1111_1111_1111_1100_0000_0000, 0b0000_0000_0000_0000_0000_0000_0000_0111 */
     /*    0xf____f____f____f____f____f____f____f,    0xf____f____f____f____f____c____0____0,    0x0____0____0____0____0____0____0____7 */
     /*                                0xffffffff,                                0xfffffc00,                                0x00000007 */
    unsigned int array[] =
#if 1
        { 0xffffffff, 0xfffffc00, 0x00000007 }; /* correct values for your example */
#else
            { 4294967295, 4194303, 3758096384 }; /* original values, only first matches. */
#endif
    size_t array_n = sizeof array / sizeof array[0];

    run_runlen_encoding(array, array_n, output_runlen);
    std::cout << std::endl;
} /* main */

Note:

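Since we need to count how far the carry propagates in an increment, we have to go from the least significant bit to the most significant one. This makes the output come out in the reverse order of what you asked for, but I am sure you will be able to change it to show the order you stated in the question.

The program output shows: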

$ pru_49297910
[3/1][39/0][54/1]
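If the order from the question is wanted, one option is to collect the runs first and reverse them at the end. A minimal sketch, assuming the runs have been gathered as (length, bit value) pairs in the order printed above; Runs and reversedRuns are hypothetical names, not part of the program:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// (length, bit value) pairs, in the order the program reports them.
using Runs = std::vector<std::pair<int, unsigned>>;

// Hypothetical helper: drop empty runs and return the rest in reverse order.
Runs reversedRuns(Runs runs) {
    runs.erase(std::remove_if(runs.begin(), runs.end(),
                              [](const std::pair<int, unsigned> &r) {
                                  return r.first == 0;
                              }),
               runs.end());
    std::reverse(runs.begin(), runs.end());
    return runs;
}
```

Applied to the runs [3/1][39/0][54/1], this gives 54 ones, 39 zeros and 3 ones, which matches the [54 39 3] from the question.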
