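I wrote a video recording demo similar to grafika's ContinuousCaptureActivity (see the ContinuousCaptureActivity.java source code).
The difference is that grafika uses hardware encoding, while I use software encoding. For software encoding, I read each video frame back from the GPU through a PBO, which is very fast, copy the image data to ffmpeg, and then encode it as H.264.
On most devices the performance is acceptable: glMapBufferRange() takes less than 5 ms and memcpy() takes less than 10 ms.
On the Huawei Mate 7, however, performance is poor: glMapBufferRange() takes 15~30 ms and memcpy() takes 25~35 ms.
I also tested a plain memcpy() on the Mate 7, and copying ordinary memory is much faster.
This is really strange. Can anyone help me?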
Device info:
Chipset: HiSilicon Kirin 925
CPU: quad-core 1.8 GHz Cortex-A15 & quad-core 1.3 GHz Cortex-A7
Full specs: Huawei Mate 7
The PBO code is as follows:
// Needs android.opengl.GLES30, android.util.Log and java.nio.ByteBuffer.
final int buffer_num = 1;
final int pbo_id[] = new int[buffer_num];

private void getPixelFromPBO(int width, int height, boolean isDefaultFb) {
    try {
        long start = System.currentTimeMillis();
        final int pbo_size = width * height * 4;

        // Create the PBO once, on the first frame.
        if (mFrameNum == 0) {
            GLES30.glGenBuffers(buffer_num, pbo_id, 0);
            Log.d(TAG, "glGenBuffers pbo_id[0]:" + pbo_id[0]);
            GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[0]);
            // glBufferData creates a new data store for the buffer object currently bound to target
            GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, pbo_size, null, GLES30.GL_DYNAMIC_READ);
            GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
        }

        GLES30.glPixelStorei(GLES30.GL_PACK_ALIGNMENT, 1);
        checkGlError("glPixelStorei");

        // We need to read GL_BACK when the default framebuffer is bound.
        // glReadBuffer specifies a color buffer as the source for subsequent glReadPixels,
        // glCopyTexImage2D, glCopyTexSubImage2D, and glCopyTexSubImage3D commands.
        if (isDefaultFb) {
            GLES30.glReadBuffer(GLES30.GL_BACK);
        } else {
            GLES30.glReadBuffer(GLES30.GL_COLOR_ATTACHMENT0);
        }
        checkGlError("glReadBuffer");

        GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[0]);
        checkGlError("glBindBuffer 1 ");

        // With a PBO bound, the last argument is a byte offset into the buffer, not a pointer.
        long ts = System.currentTimeMillis();
        glReadPixelsPBOJNI(0, 0, width, height, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, 0);
        Log.d(TAG, "glReadPixelsPBOJNI took " + (System.currentTimeMillis() - ts) + "ms\n\n\n");
        //GLES30.glReadPixels(0, 0, width, height, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, null);
        //glReadPixelsPBOJNI(0, 0, height, width, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, 0);
        checkGlError("glReadPixels");

        // Map the same PBO that was just written by glReadPixels.
        ts = System.currentTimeMillis();
        ByteBuffer buf = (ByteBuffer) GLES30.glMapBufferRange(
                GLES30.GL_PIXEL_PACK_BUFFER, 0, pbo_size, GLES30.GL_MAP_READ_BIT);
        checkGlError("glMapBufferRange");
        Log.d(TAG, "*****glMapBufferRange took " + (System.currentTimeMillis() - ts) + "ms");

        ts = System.currentTimeMillis();
        cpoyDataToFFmpeg(buf, 1, 1);
        Log.d(TAG, "####cpoyDataToFFmpeg took " + (System.currentTimeMillis() - ts) + "ms\n\n\n");

        GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
        checkGlError("glUnmapBuffer");
        GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
        checkGlError("glBindBuffer 0 ");
    } catch (Exception e) {
        Log.e(TAG, "DO PBO exp", e);
    }
}
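In the end, I realized that I should use two PBOs to improve the data transfer, and that we should pay attention to data alignment.
A single PBO cannot improve the transfer, because mapping the buffer right after glReadPixels has to wait for that read to finish.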
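
The double-PBO idea can be sketched as plain index bookkeeping. This is a minimal illustration, not the code I ended up shipping: the class DoublePboSchedule and its methods writeIndex and readIndex are hypothetical names, and the GL calls are left out so the snippet runs on any JVM. The assumption is that each frame issues glReadPixels into one buffer and maps the other buffer, which was filled on an earlier frame and has had time to finish.

```java
/**
 * Hypothetical helper (not part of grafika or Android): decides which
 * pixel-pack buffer to fill and which one to map on a given frame.
 */
final class DoublePboSchedule {
    private final int bufferCount;

    DoublePboSchedule(int bufferCount) {
        if (bufferCount < 2) {
            throw new IllegalArgumentException("need at least 2 buffers");
        }
        this.bufferCount = bufferCount;
    }

    /** Index of the PBO that this frame's glReadPixels should write into. */
    int writeIndex(long frameNum) {
        return (int) (frameNum % bufferCount);
    }

    /**
     * Index of the PBO to map on this frame: the oldest filled buffer.
     * Returns -1 while no buffer has been filled long enough to be read.
     */
    int readIndex(long frameNum) {
        if (frameNum < bufferCount - 1) {
            return -1;
        }
        return (int) ((frameNum + 1) % bufferCount);
    }

    public static void main(String[] args) {
        DoublePboSchedule schedule = new DoublePboSchedule(2);
        for (long frame = 0; frame < 4; frame++) {
            System.out.println("frame " + frame
                    + ": write PBO " + schedule.writeIndex(frame)
                    + ", map PBO " + schedule.readIndex(frame));
        }
    }
}
```

With buffer_num set to 2, each frame would bind pbo_id[writeIndex] before the glReadPixels call and pbo_id[readIndex] before glMapBufferRange, skipping the map step while readIndex is -1. The cost is that the encoder receives every frame one frame late.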