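I'm looking for a way to seek to a desired position in a track without loading the whole file into memory, and without using vorbisfile, because the file is stored on a remote server. I read the paragraph about seeking in the documentation, but I couldn't understand it.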
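If the remote server lets you use HTTP GET with a Range header, you can "fake" file access by sending a series of requests for different parts of the file, much as you would read from a local file...
Assumption: the file is Ogg-encapsulated and contains only a Vorbis stream...
Done correctly, a seek should transfer less than 100KB in most cases.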
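To make the Range idea concrete, here is a minimal sketch of building the header value for one chunk. The function name `range_header_value` is made up for illustration; only the `bytes=first-last` syntax (inclusive on both ends) comes from HTTP itself. A server that honors the range normally answers with 206 Partial Content.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: builds the value of an HTTP Range header that covers
// `length` bytes starting at byte `offset`.
// HTTP byte ranges are inclusive on both ends, so the last byte requested is
// offset + length - 1. An empty string is returned for a zero-length request.
std::string range_header_value(std::uint64_t offset, std::uint64_t length) {
    if (length == 0) {
        return "";
    }
    return "bytes=" + std::to_string(offset) + "-" +
           std::to_string(offset + length - 1);
}
```

Fetching the third 4K chunk of the file would then mean sending `Range: bytes=8192-12287` with the GET request.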
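Update:
Bisection search is a bit unintuitive... The idea is to jump around in the file looking for the right page, with each jump "informed" by the previous jumps and by the page you just landed on. An example is probably best:
To seek to sample 300,000 in a file with 1,000,000 samples (I'm assuming we're on step 4 above):
There are probably better algorithms, but that's the basic idea.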
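As a rough sketch of the jumping, here is a plain bisection over byte offsets (not the "informed" guessing described above, just the simplest variant). Everything here is an assumption for illustration: `PageProbe`, `ProbeFn`, `bisect_for_sample` and `make_uniform_probe` are invented names, and the probe callback stands in for "fetch some bytes at this offset and report the first Ogg page found there". In the remote case each probe would be one or more Range requests.

```cpp
#include <cstdint>
#include <functional>

// Result of probing the file at some byte offset (hypothetical type).
struct PageProbe {
    bool found;                // false if no page starts at or after the offset
    std::int64_t page_offset;  // byte offset where that page starts
    std::int64_t granulepos;   // sample count at the end of that page
};
using ProbeFn = std::function<PageProbe(std::int64_t)>;

// Returns the byte offset of the first page whose granule position is greater
// than target_sample, or -1 if no such page exists.
std::int64_t bisect_for_sample(const ProbeFn& probe, std::int64_t file_size,
                               std::int64_t target_sample,
                               int* probes_used = nullptr) {
    std::int64_t lo = 0;
    std::int64_t hi = file_size;
    std::int64_t best = -1;
    int count = 0;
    while (lo < hi) {
        std::int64_t mid = lo + (hi - lo) / 2;
        PageProbe p = probe(mid);
        ++count;
        if (!p.found || p.granulepos > target_sample) {
            // Landed on or past the wanted page: remember it and look earlier.
            if (p.found) best = p.page_offset;
            hi = mid;
        } else {
            // This page ends before the target: continue after its start.
            lo = p.page_offset + 1;
        }
    }
    if (probes_used) *probes_used = count;
    return best;
}

// Synthetic stand-in for a real file: page_count pages of page_bytes bytes,
// each holding samples_per_page samples. Only meant for trying the search.
ProbeFn make_uniform_probe(std::int64_t page_bytes, std::int64_t page_count,
                           std::int64_t samples_per_page) {
    return [=](std::int64_t offset) -> PageProbe {
        std::int64_t idx = (offset + page_bytes - 1) / page_bytes;
        if (idx >= page_count) return {false, -1, -1};
        return {true, idx * page_bytes, (idx + 1) * samples_per_page};
    };
}
```

With the synthetic probe set up as 1,000 pages of 1,000 samples each, searching for sample 300,000 ends on the page that starts at byte 300 * 4096, and the number of probes grows only with the logarithm of the file size.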
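Keep in mind that the granule position is the sample count at the end of the page, so when you find the correct page, its granule position will be slightly greater than your target.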
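If you are not using libogg for this part, the granule position can be read straight out of the page header. The sketch below assumes the standard Ogg page layout (the "OggS" capture pattern, a version byte, a header-type byte, then an 8-byte little-endian granule position, a 4-byte serial number and a 4-byte page sequence number). The names `OggPageHeader` and `find_page_header` are made up here. It does not verify the page CRC, so a stray "OggS" inside packet data could fool it; libogg's sync layer does check the checksum.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fields of interest from an Ogg page header (hypothetical struct).
struct OggPageHeader {
    std::size_t offset;        // where the "OggS" capture pattern starts
    std::uint8_t header_type;  // continuation / BOS / EOS flags
    std::int64_t granulepos;   // sample count at the end of the page (Vorbis)
    std::uint32_t serialno;    // logical stream serial number
    std::uint32_t pageno;      // page sequence number
};

// Reads an nbytes-wide little-endian unsigned integer.
static std::uint64_t read_le(const std::uint8_t* p, int nbytes) {
    std::uint64_t v = 0;
    for (int i = nbytes - 1; i >= 0; --i) {
        v = (v << 8) | p[i];
    }
    return v;
}

// Scans data for the first page header and fills `out`.
// Returns false if no complete 27-byte fixed header is found.
// NOTE: no CRC check is done; this is only a sketch.
bool find_page_header(const std::uint8_t* data, std::size_t len,
                      OggPageHeader& out) {
    const std::size_t kFixedHeaderSize = 27;
    for (std::size_t i = 0; i + kFixedHeaderSize <= len; ++i) {
        if (std::memcmp(data + i, "OggS", 4) != 0) continue;
        if (data[i + 4] != 0) continue;  // stream structure version must be 0
        out.offset = i;
        out.header_type = data[i + 5];
        out.granulepos = static_cast<std::int64_t>(read_le(data + i + 6, 8));
        out.serialno = static_cast<std::uint32_t>(read_le(data + i + 14, 4));
        out.pageno = static_cast<std::uint32_t>(read_le(data + i + 18, 4));
        return true;
    }
    return false;
}
```

A granule position of -1 means no packet finishes on that page, so such a page cannot be used as a seek reference and the next page should be checked instead.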
Seeking in Ogg files is quite hard.
A list of things to understand
You will need this function
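int buffer_data() {
    // oy is an ogg_sync_state: https://xiph.org/ogg/doc/libogg/ogg_sync_state.html
    // in is the already opened input file (assumed here to be a FILE*)
    char *buffer = ogg_sync_buffer(&oy, 4096);
    int bytes = (int)fread(buffer, 1, 4096, in);
    // Tell the sync layer how many bytes were actually written into its buffer
    ogg_sync_wrote(&oy, bytes);
    return bytes;
}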
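In the code below, besides Ogg pages and Ogg packets, I add another layer, which is essentially a file buffer. In effect, my code bisects only on the first page that finishes syncing in each file buffer.
When I can't find an Ogg page sync, my code uses a second block cursor to load the next 4k file buffer until a page sync is found or the search goes out of bounds.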
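The "second block cursor" idea can be shown without libogg. This is only a sketch under assumptions: `BlockReader` and `find_sync_from_block` are invented names, the reader callback stands in for a file read or a Range request, every block before the end of the file is assumed to be exactly `block_size` bytes, and only the "OggS" capture pattern is matched (no CRC check, which `ogg_sync_pageout` would normally do).

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical block reader: fills `out` with the bytes of block `index`
// and returns false when that block is past the end of the file.
using BlockReader =
    std::function<bool(std::size_t index, std::vector<std::uint8_t>& out)>;

// Starting at start_block, keeps loading blocks (up to and including
// last_block) until the "OggS" capture pattern shows up.
// Returns the absolute byte offset of the pattern, or -1 if it is not found.
std::int64_t find_sync_from_block(const BlockReader& read_block,
                                  std::size_t start_block,
                                  std::size_t last_block,
                                  std::size_t block_size) {
    std::vector<std::uint8_t> window;  // all bytes loaded since start_block
    std::vector<std::uint8_t> chunk;
    std::size_t scanned = 0;           // positions in window already checked
    for (std::size_t cursor = start_block; cursor <= last_block; ++cursor) {
        if (!read_block(cursor, chunk)) break;  // ran off the end of the file
        window.insert(window.end(), chunk.begin(), chunk.end());
        // Because the window accumulates, a pattern that straddles two
        // blocks is still found once the second block has been appended.
        for (; scanned + 4 <= window.size(); ++scanned) {
            if (window[scanned] == 'O' && window[scanned + 1] == 'g' &&
                window[scanned + 2] == 'g' && window[scanned + 3] == 'S') {
                return static_cast<std::int64_t>(start_block * block_size +
                                                 scanned);
            }
        }
    }
    return -1;
}
```

Since an Ogg page can be far larger than 4k, the cursor may have to advance over several blocks before the next page start appears, which is why the loop is bounded by `last_block` rather than by a fixed number of reads.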
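#include <cfloat>         // DBL_MAX
#include <cmath>          // std::fabs
#include <cstddef>        // size_t
#include <unordered_map>
#include <ogg/ogg.h>
#include <vorbis/codec.h>
struct _page_info {
    size_t block_number;
    double time;
    ogg_int64_t granulepos;
};
// In C++, designated initializers must follow the declaration order of the members
struct _page_info left_page = { .block_number = 0, .time = 0, .granulepos = 0 };
struct _page_info mid_page = { .block_number = 0, .time = 0, .granulepos = 0 };
struct _page_info right_page = { .block_number = 0x7FFFFFFFFFFFFFFF, .time = DBL_MAX, .granulepos = 0x7FFFFFFFFFFFFFFF };
// Time found for each visited block, and page info keyed by page number
std::unordered_map<int, double> block_time;
std::unordered_map<ogg_int64_t, _page_info> page_info_table;
ogg_page og;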
while (left <= right) {
    // Seek to the block in the middle of the current range
    size_t mid_block = left + (right - left) / 2;
    int block = mid_block;
    if (block_time.count(block)) {
        // This block has already been visited, so the search has converged
        break;
    }
    // Clear the sync state
    ogg_sync_reset(&oy);
    file.seek(block * buffer_size);
    buffer_data();
    bool next_midpoint = true;
    while (true) {
        // Keep syncing until a page is found. The buffer is only 4k, while Ogg pages can be up to 65k in size
        int ogg_page_sync_state = ogg_sync_pageout(&oy, &og);
        if (ogg_page_sync_state == -1) {
            // Not synced yet: give up when no more data can be read
            if (buffer_data() == 0) {
                right = mid_block;
                break;
            } else {
                // Advance the block index, since we buffered the next block
                block++;
            }
        } else {
            if (ogg_page_sync_state == 0) {
                // More data is needed: check whether the end of the file was reached
                if (buffer_data() == 0) {
                    right = mid_block;
                    break;
                } else {
                    block++;
                }
            } else {
                // Only pages on which a packet ends carry a granulepos. Also check the stream
                if (ogg_page_packets(&og) > 0 && ogg_page_serialno(&og) == vo.serialno) {
                    next_midpoint = false;
                    break;
                }
            }
        }
    }
    if (next_midpoint)
        continue;
    ogg_int64_t granulepos = ogg_page_granulepos(&og);
    ogg_int64_t page_number = ogg_page_pageno(&og);
    // vorbis_granule_time converts the granule position to seconds
    struct _page_info pg_info = { .block_number = mid_block, .time = vorbis_granule_time(&vd, granulepos), .granulepos = granulepos };
    page_info_table[page_number] = pg_info;
    block_time[mid_block] = pg_info.time;
    mid_page = pg_info;
    // I can finally implement the binary search comparisons
    if (std::fabs(p_time - pg_info.time) < .001) {
        // The page time matches the target time to within 1 ms
        right_page = pg_info;
        break;
    }
    if (pg_info.time > p_time) {
        if (pg_info.granulepos < right_page.granulepos)
            right_page = pg_info;
        right = mid_block;
    } else {
        if (pg_info.granulepos > left_page.granulepos)
            left_page = pg_info;
        left = mid_block;
    }
}
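Once that's done, you basically step back through the ogg_pages until you find the ogg_packet you want.
Here is a trick for computing timestamps from the sequentially incrementing packet numbers: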
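// Pull every complete page out of the sync layer and submit it to the stream (vo is an ogg_stream_state)
while (ogg_sync_pageout(&oy, &og) > 0)
    ogg_stream_pagein(&vo, &og);
// Granule position at the end of the last page, and the number of packets that end on it
ogg_int64_t last_granule = ogg_page_granulepos(&og);
ogg_int64_t total_granule = ogg_page_packets(&og);
while (ogg_stream_packetout(&vo, &op) > 0) {
    double time = vorbis_granule_time(&vd, last_granule + op.packetno - total_granule--);
}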
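For reference, the conversion that `vorbis_granule_time` performs for Vorbis is, as far as I know, just the granule position divided by the sample rate. The two helpers below are a sketch of that and of the reverse mapping you need to turn a seek time into a target sample. `granule_to_seconds` and `seconds_to_sample` are made-up names, and the sample rate is assumed to have been read from the Vorbis identification header.

```cpp
#include <cstdint>

// Converts a granule position (sample count) to seconds.
// Returns -1.0 for a negative granule position (a page on which no packet
// ends) or for a non-positive sample rate.
double granule_to_seconds(std::int64_t granulepos, long sample_rate) {
    if (granulepos < 0 || sample_rate <= 0) {
        return -1.0;
    }
    return static_cast<double>(granulepos) / static_cast<double>(sample_rate);
}

// Converts a seek time in seconds to the sample index to search for,
// truncating toward zero. Returns -1 for invalid input.
std::int64_t seconds_to_sample(double seconds, long sample_rate) {
    if (seconds < 0.0 || sample_rate <= 0) {
        return -1;
    }
    return static_cast<std::int64_t>(seconds * static_cast<double>(sample_rate));
}
```

So at 44,100 Hz, a page whose granule position is 441,000 ends exactly 10 seconds into the stream.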
https://xiph.org/ogg/doc/libogg/reference.html
https://github.com/xiph/theora/blob/master/examples/player_example.c