std::vector 扩展时会消耗多少内存？

Question

在大多数 Linux 编译器中，

std::vector

将以 2 倍扩展。我已经在 g++11 上进行了测试。

所以，我假设当 std::vector 从 N 扩展到 2N 时会发生以下情况：

向量在 N 处达到极限。它消耗 X 字节的内存。
向量将为
```
2*X
```
元素数组分配
```
2*N
```
字节内存。因此向量总共可能需要
```
3*X
```
字节。
向量将元素从原始
```
N
```
大小的数组复制或移动到
```
2*N
```
大小的数组中。
向量破坏了原始数组。所以现在需要
```
2*X
```
字节的内存。

作为结论，向量的内存使用峰值应该是

3*X

。因此，我编写了以下代码来测试此行为。

在代码中，我们将 2M 个

Writes

实例放入向量中。并将使用

getPeakRSS

记录此程序中的最大内存占用。

我会不时打印

Current memory usage

。

usage

信息来自

getPeakRSS

。

expected

是通过将

Writes

的大小乘以向量

ww

的当前计数来计算的。

amp

字段为

usage / expected

。如果我是正确的，我可以发现

amp

的最大值约为 3。

#include <fstream>
#include <sstream>
#include <string>
#include <iostream>
#include <vector>

struct Writes16 {
    uint64_t w1;
    uint64_t w2;
};

struct Writes32 {
    Writes16 w1;
    Writes16 w2;
};

struct Writes128 {
    Writes32 w1;
    Writes32 w2;
    Writes32 w3;
    Writes32 w4;
};

struct Writes {
    Writes128 w1;
    Writes16 w2;
    uint64_t w3;
};

/*
 * Author:  David Robert Nadeau
 * Site:    http://NadeauSoftware.com/
 * License: Creative Commons Attribution 3.0 Unported License
 *          http://creativecommons.org/licenses/by/3.0/deed.en_US
 */

#if defined(_WIN32)
#include <windows.h>
#include <psapi.h>

#elif defined(__unix__) || defined(__unix) || defined(unix) || (defined(__APPLE__) && defined(__MACH__))
#include <unistd.h>
#include <sys/resource.h>

#if defined(__APPLE__) && defined(__MACH__)
#include <mach/mach.h>

#elif (defined(_AIX) || defined(__TOS__AIX__)) || (defined(__sun__) || defined(__sun) || defined(sun) && (defined(__SVR4) || defined(__svr4__)))
#include <fcntl.h>
#include <procfs.h>

#elif defined(__linux__) || defined(__linux) || defined(linux) || defined(__gnu_linux__)
#include <stdio.h>

#endif

#else
#error "Cannot define getPeakRSS( ) or getCurrentRSS( ) for an unknown OS."
#endif

/**
 * Returns the peak (maximum so far) resident set size (physical
 * memory use) measured in bytes, or zero if the value cannot be
 * determined on this OS.
 */
size_t getPeakRSS( )
{
#if defined(_WIN32)
    /* Windows -------------------------------------------------- */
    PROCESS_MEMORY_COUNTERS info;
    GetProcessMemoryInfo( GetCurrentProcess( ), &info, sizeof(info) );
    return (size_t)info.PeakWorkingSetSize;

#elif (defined(_AIX) || defined(__TOS__AIX__)) || (defined(__sun__) || defined(__sun) || defined(sun) && (defined(__SVR4) || defined(__svr4__)))
    /* AIX and Solaris ------------------------------------------ */
    struct psinfo psinfo;
    int fd = -1;
    if ( (fd = open( "/proc/self/psinfo", O_RDONLY )) == -1 )
        return (size_t)0L;      /* Can't open? */
    if ( read( fd, &psinfo, sizeof(psinfo) ) != sizeof(psinfo) )
    {
        close( fd );
        return (size_t)0L;      /* Can't read? */
    }
    close( fd );
    return (size_t)(psinfo.pr_rssize * 1024L);

#elif defined(__unix__) || defined(__unix) || defined(unix) || (defined(__APPLE__) && defined(__MACH__))
    /* BSD, Linux, and OSX -------------------------------------- */
    struct rusage rusage;
    getrusage( RUSAGE_SELF, &rusage );
#if defined(__APPLE__) && defined(__MACH__)
    return (size_t)rusage.ru_maxrss;
#else
    return (size_t)(rusage.ru_maxrss * 1024L);
#endif

#else
    /* Unknown OS ----------------------------------------------- */
    return (size_t)0L;          /* Unsupported. */
#endif
}

std::vector<Writes> ww;

int main() {
    static_assert(sizeof(Writes) == 152);
    const size_t K = 1024;
    const size_t M = 1024 * K;
    size_t location = 0;
    for (int m = 0; m < 2; m++) {
        for (int j = 0; j < M; j++) {
            ww.emplace_back(Writes());
            if (j % 4096 == 0) {
                if (j == 0) {
                    location = (size_t)&ww[0];
                }
                size_t usage = getPeakRSS();
                size_t expected = ww.size() * 152;
                double rate = usage * 1.0 / expected;
                printf("Current memory usage %d: %u %.4fMB expected %u %.4fMB amp %.4f cap %u\n", m, usage, usage / 1024.0 / 1024.0, expected, expected / 1024.0 / 1024.0, rate, ww.capacity());
            }
        }
    }
    // To ensure the elements are copied. There is no realloc happening.
    printf("Location prev %u end %u\n", location, (size_t)&ww[0]);
    return 0;
}

但是，我发现了以下内容。看来在扩展过程中，内存使用峰值只有

2*X

字节，而不是

3*X

。

...
Current memory usage 0: 163250176 155.6875MB expeted 158761112 151.4064MB amp 1.0283 cap 1048576
Current memory usage 1: 322584576 307.6406MB expeted 159383704 152.0001MB amp 2.0239 cap 2097152
...

我想知道为什么会保存额外的

字节？

=====

我知道 std::vector 的扩展方式完全取决于编译器。所以我想知道 g++ 中会发生什么优化，从而使一些内存分配“消除”。

Answer 1

区别在于残余内存和虚拟内存。

让我们通过从一个较大的分配（假设为 1M）到下一个更高的分配（2M）来简化所发生的情况：

向量分配一个大小。通过新分配器，它最终解析为新的内存映射。这会分配虚拟内存空间，但尚未填充物理内存，因此它不会消耗 RSS。在各种实用程序中，此虚拟内存显示为
```
AS
```
（例如在
```
prlimit
```
中）或
```
VIRT
```
（例如在
```
top
```
中）
向量从前面慢慢地填充这个内存。当第一次写入每个新内存页时，会发生页面错误，此时操作系统会分配新的物理内存页。这会慢慢增加 RSS。当然，操作系统可能会对此进行优化并一次分配多个页面，但它不会一次分配所有页面
向量容量不足。它分配新内存，大小是原始大小的两倍。与步骤 1 一样，该内存是新分配的虚拟内存，没有附加物理内存，因此 RSS 不会增长
向量现在将现有元素复制到新映射。与步骤 2 中一样，这会增加 RSS，但只会增加第一个映射的大小
旧的映射现已被释放。由于这是一个巨大的分配，新分配器会将其释放回操作系统。这会将 RSS 缩小回 50% 左右
向量现在继续增长到剩余的映射内存中

RSS 使用峰值出现在步骤 4 结束时。此时，1M 元素的旧映射已完全映射。然而，新映射仅填充了现有 1M 元素的一半。分配的额外容量尚未被触及，因此未分配物理内存页，因此不考虑 RSS。这是您观察到的放大器系数 2。

std::vector 扩展时会消耗多少内存？

问题描述投票：0回答：1

1个回答

最新问题

std::vector 扩展时会消耗多少内存？

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1