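Something as simple as clearing an array of 32-bit RGBA pixels to a single color should be blazing fast. But that is not what happens in my case.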
typedef struct PXLimage
{
int width;
int height;
unsigned char* data;
} PXLimage;
typedef struct PXLcolor
{
unsigned char rgba[4];
} PXLcolor;
void pxlImageClearColor(PXLimage* image, PXLcolor color)
{
for(uint32_t i = 0; i < image->width * image->height; i++)
{
memcpy(image->data + (i * 4), color.rgba, 4);
}
}
I profiled this software-rendering C library with gprof and got the following results:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls us/call us/call name
49.85 45.70 45.70 200000 228.50 228.50 pxlImageClearColor
23.42 67.17 21.47 _mcount_private
11.52 77.73 10.56 1137265408 0.01 0.01 pxlImageSetPixelColor
9.75 86.67 8.94 __fentry__
4.51 90.80 4.13 200000 20.65 206.13 pxlRendererDrawTriangle
0.76 91.50 0.70 200000 3.50 66.64 pxlRendererDrawRect
0.14 91.63 0.13 200000 0.65 4.23 pxlRendererDrawLine
0.01 91.64 0.01 200000 0.05 0.05 pxlGetKey
0.01 91.65 0.01 200000 0.05 228.55 pxlRendererClearColor
0.01 91.66 0.01 _pxlOutOfImageRange
0.01 91.67 0.01 main
0.00 91.67 0.00 2000000 0.00 0.00 pxlRendererSetDrawColor
0.00 91.67 0.00 200000 0.00 0.00 pxlWindowPresent
0.00 91.67 0.00 100000 0.00 0.00 pxlWindowPollEvents
0.00 91.67 0.00 10 0.00 0.00 _pxlFree
0.00 91.67 0.00 10 0.00 0.00 _pxlMalloc
Nearly 50%? Why? So I tried optimizing it further:
void pxlImageClearColor(PXLimage* image, PXLcolor color)
{
for(uint32_t i = 0; i < image->width * image->height; i++)
{
((uint32_t*)image->data)[i] = *((uint32_t*)color.rgba);
}
}
Profiler results:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls us/call us/call name
48.45 39.80 39.80 200000 199.00 199.00 pxlImageClearColor
24.31 59.77 19.97 _mcount_private
11.43 69.16 9.39 1137265408 0.01 0.01 pxlImageSetPixelColor
10.24 77.57 8.41 __fentry__
4.59 81.34 3.77 200000 18.85 183.78 pxlRendererDrawTriangle
0.73 81.94 0.60 200000 3.00 59.15 pxlRendererDrawRect
0.15 82.06 0.12 pxlImageGetPixelColor
0.09 82.13 0.07 200000 0.35 3.54 pxlRendererDrawLine
0.01 82.14 0.01 200000 0.05 0.05 pxlGetKey
0.00 82.14 0.00 2000000 0.00 0.00 pxlRendererSetDrawColor
0.00 82.14 0.00 200000 0.00 199.00 pxlRendererClearColor
0.00 82.14 0.00 200000 0.00 0.00 pxlWindowPresent
0.00 82.14 0.00 100000 0.00 0.00 pxlWindowPollEvents
0.00 82.14 0.00 10 0.00 0.00 _pxlFree
0.00 82.14 0.00 10 0.00 0.00 _pxlMalloc
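It is faster, but still slow. What am I missing? How can I optimize this? All of this runs on my CPU (i5-12450H), on a single core, with one pixel array being rendered to a Win32 window every frame. We clear the image to a single color, draw some things on top, and then present it to the window.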
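When all four channel bytes of the color are equal, a plain memset is enough:
memset(image->data, color.rgba[0], image->width * image->height * 4);
Otherwise, your fallback should be your own custom memset-style implementation that accepts a value wider than 8 bits, in this case a 32-bit value. Your memset32 would look like: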
void memset32(void* dest, uint32_t value, size_t count)
{
uint32_t* dest32 = (uint32_t*)dest;
for (size_t i = 0; i < count; i++)
{
dest32[i] = value;
}
}
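As a rough sketch of how the two paths could be combined inside pxlImageClearColor: the structs are copied from the question, and memset32 is the loop shown above, repeated only so the block compiles on its own. The equal-bytes check, the memcpy used to pack the color, and the assumption that image->data is 4-byte aligned (true for a buffer returned by malloc) are illustrative choices, not something the original library is known to do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Types as declared in the question. */
typedef struct PXLimage { int width; int height; unsigned char* data; } PXLimage;
typedef struct PXLcolor { unsigned char rgba[4]; } PXLcolor;

/* Same loop as memset32 above, repeated so this sketch is self-contained. */
static void memset32(void* dest, uint32_t value, size_t count)
{
    uint32_t* dest32 = (uint32_t*)dest;
    for (size_t i = 0; i < count; i++)
    {
        dest32[i] = value;
    }
}

void pxlImageClearColor(PXLimage* image, PXLcolor color)
{
    /* Compute the pixel count once, in size_t, so the product cannot
       overflow int and is not re-evaluated on every iteration. */
    size_t count = (size_t)image->width * (size_t)image->height;
    const unsigned char* c = color.rgba;

    /* Fast path: all four bytes equal, so a byte-wise memset is correct. */
    if (c[0] == c[1] && c[1] == c[2] && c[2] == c[3])
    {
        memset(image->data, c[0], count * 4);
        return;
    }

    /* Pack the four bytes into one 32-bit value. memcpy keeps the in-memory
       byte order and avoids casting the unsigned char array to uint32_t*. */
    uint32_t packed;
    memcpy(&packed, c, sizeof packed);

    /* Assumption: image->data is 4-byte aligned. */
    memset32(image->data, packed, count);
}
```

Whether this ends up faster than the original loop depends largely on the build: with optimizations enabled, compilers can often vectorize a simple store loop like memset32, so it is worth timing an optimized build before tuning further.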