Fastest way to clear an RGBA image to a color in C


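Something as simple as clearing an array of 32-bit RGBA pixels to a single color should be blazing fast. But that is not what happens in my case: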

typedef struct PXLimage
{
    int width;

    int height;

    unsigned char* data;

} PXLimage;


typedef struct PXLcolor
{
    unsigned char rgba[4];

} PXLcolor;


void pxlImageClearColor(PXLimage* image, PXLcolor color)
{
    for(uint32_t i = 0; i < image->width * image->height; i++)
    {
        memcpy(image->data + (i * 4), color.rgba, 4);
    }
}

I profiled this software-rendering C library with gprof and got the following results:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 49.85     45.70    45.70   200000   228.50   228.50  pxlImageClearColor
 23.42     67.17    21.47                             _mcount_private
 11.52     77.73    10.56 1137265408     0.01     0.01  pxlImageSetPixelColor
  9.75     86.67     8.94                             __fentry__
  4.51     90.80     4.13   200000    20.65   206.13  pxlRendererDrawTriangle
  0.76     91.50     0.70   200000     3.50    66.64  pxlRendererDrawRect
  0.14     91.63     0.13   200000     0.65     4.23  pxlRendererDrawLine
  0.01     91.64     0.01   200000     0.05     0.05  pxlGetKey
  0.01     91.65     0.01   200000     0.05   228.55  pxlRendererClearColor
  0.01     91.66     0.01                             _pxlOutOfImageRange
  0.01     91.67     0.01                             main
  0.00     91.67     0.00  2000000     0.00     0.00  pxlRendererSetDrawColor
  0.00     91.67     0.00   200000     0.00     0.00  pxlWindowPresent
  0.00     91.67     0.00   100000     0.00     0.00  pxlWindowPollEvents
  0.00     91.67     0.00       10     0.00     0.00  _pxlFree
  0.00     91.67     0.00       10     0.00     0.00  _pxlMalloc

48%? Why? So I tried to optimize it further:

void pxlImageClearColor(PXLimage* image, PXLcolor color)
{
    for(uint32_t i = 0; i < image->width * image->height; i++)
    {
        ((uint32_t*)image->data)[i] = *((uint32_t*)color.rgba);
    }
}
Profiler results:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 48.45     39.80    39.80   200000   199.00   199.00  pxlImageClearColor
 24.31     59.77    19.97                             _mcount_private
 11.43     69.16     9.39 1137265408     0.01     0.01  pxlImageSetPixelColor
 10.24     77.57     8.41                             __fentry__
  4.59     81.34     3.77   200000    18.85   183.78  pxlRendererDrawTriangle
  0.73     81.94     0.60   200000     3.00    59.15  pxlRendererDrawRect
  0.15     82.06     0.12                             pxlImageGetPixelColor
  0.09     82.13     0.07   200000     0.35     3.54  pxlRendererDrawLine
  0.01     82.14     0.01   200000     0.05     0.05  pxlGetKey
  0.00     82.14     0.00  2000000     0.00     0.00  pxlRendererSetDrawColor
  0.00     82.14     0.00   200000     0.00   199.00  pxlRendererClearColor
  0.00     82.14     0.00   200000     0.00     0.00  pxlWindowPresent
  0.00     82.14     0.00   100000     0.00     0.00  pxlWindowPollEvents
  0.00     82.14     0.00       10     0.00     0.00  _pxlFree
  0.00     82.14     0.00       10     0.00     0.00  _pxlMalloc
A bit faster, but still slow. What am I missing? How can I optimize this?

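All of this runs on my CPU (i5-12450H), on a single core, and the pixel array is rendered to a Win32 window every frame. We clear the image to a single color, draw some things on top, and then present it to the window.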

c optimization rendering
1 Answer
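memset(image->data, color.rgba[0], image->width * image->height * 4);

is enough, but only when all four bytes of the color are identical (for example all 0x00 or all 0xFF), because memset fills memory with a single byte value. Your fallback should be your own custom memset-style implementation that accepts a value wider than 8 bits, in this case a 32-bit parameter. Your memset32 would look like: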

void memset32(void* dest, uint32_t value, size_t count)
{
    uint32_t* dest32 = (uint32_t*)dest; 
    for (size_t i = 0; i < count; i++)
    {
        dest32[i] = value;
    }
}
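
To show how the two paths above could fit together, here is a minimal sketch of pxlImageClearColor that uses memset when all four color bytes match and falls back to memset32 otherwise. It assumes image->data is at least 4-byte aligned (which holds for memory returned by malloc) and that the pixel count fits in size_t. Packing the color with memcpy instead of casting color.rgba to uint32_t* is my own choice to sidestep alignment and aliasing concerns; it is not part of the original question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct PXLimage
{
    int width;
    int height;
    unsigned char* data;
} PXLimage;

typedef struct PXLcolor
{
    unsigned char rgba[4];
} PXLcolor;

/* Fill `count` 32-bit words starting at `dest` with `value`. */
static void memset32(void* dest, uint32_t value, size_t count)
{
    uint32_t* dest32 = (uint32_t*)dest;
    for (size_t i = 0; i < count; i++)
    {
        dest32[i] = value;
    }
}

void pxlImageClearColor(PXLimage* image, PXLcolor color)
{
    assert(image != NULL && image->data != NULL);

    /* Compute the pixel count once, in size_t, outside the loop. */
    size_t count = (size_t)image->width * (size_t)image->height;
    const unsigned char* c = color.rgba;

    /* Fast path: every byte is the same, so a plain byte fill is correct. */
    if (c[0] == c[1] && c[1] == c[2] && c[2] == c[3])
    {
        memset(image->data, c[0], count * 4);
        return;
    }

    /* General path: pack the four bytes in memory order, then fill words. */
    uint32_t packed;
    memcpy(&packed, c, sizeof packed);
    memset32(image->data, packed, count);
}
```

Because the color is packed with memcpy, the bytes land in memory in the same R, G, B, A order regardless of endianness. Whether this is measurably faster than the loop in the question depends on the compiler and flags; building with optimizations enabled typically lets the compiler vectorize the memset32 loop.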


    
