Multimedia Knowledge System: Audio Formats
2016-07-28 10:04:59
AAC
Common formats
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">LATM format</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">Low-overhead MPEG-4 Audio Transport Multiplex</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">Consists mainly of an AudioSpecificConfig (audio-specific configuration element) and the audio payload</span>
In-band transmission
Every LATM frame carries an AudioSpecificConfig
Out-of-band transmission
No LATM frame carries an AudioSpecificConfig;<br>it is delivered to the decoder by some other means.<br>Because the AudioSpecificConfig generally does not change,<br>it only needs to be sent once
<span lang="EN-US" style="word-wrap: break-word; color: rgb(102, 102, 102); font-size: 12pt; line-height: 24px; font-family: 'Times New Roman', serif;">When AAC is the audio compression standard in CMMB, LATM encapsulation is used by default</span>
<span lang="EN-US" style="word-wrap: break-word; color: rgb(102, 102, 102); font-size: 12pt; line-height: 24px; font-family: 'Times New Roman', serif;">Suitable for RTP transport</span>
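To make the AudioSpecificConfig concrete, here is a minimal sketch in C that reads its first two bytes, whether they arrive in-band or out-of-band. It assumes the common short form: a 5-bit audio object type, a 4-bit sampling frequency index and a 4-bit channel configuration. The escape values (object type 31, frequency index 15) are rejected rather than parsed. The names `AscBasic` and `parse_asc_basic` are hypothetical and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading fields of an AudioSpecificConfig (hypothetical struct name). */
typedef struct {
    unsigned audio_object_type;        /* 5 bits, e.g. 2 = AAC-LC */
    unsigned sampling_frequency_index; /* 4 bits */
    unsigned channel_configuration;    /* 4 bits */
} AscBasic;

/* Returns 0 on success, -1 if the buffer is shorter than 2 bytes or uses an
 * escape value (object type 31, frequency index 15) that this sketch skips. */
static int parse_asc_basic(const uint8_t *buf, size_t len, AscBasic *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 2)
        return -1;

    out->audio_object_type        = (unsigned)(buf[0] >> 3);
    out->sampling_frequency_index = (unsigned)(((buf[0] & 0x07) << 1) | (buf[1] >> 7));
    out->channel_configuration    = (unsigned)((buf[1] >> 3) & 0x0F);

    if (out->audio_object_type == 31 || out->sampling_frequency_index == 15)
        return -1;
    return 0;
}
```

For example, the two bytes `0x12 0x10` decode to object type 2 (AAC-LC), frequency index 4 and channel configuration 2.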
ADTS
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">Audio Data Transport Stream</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">ADTS header</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">syncword</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">ID: MPEG version identifier; 1 for MPEG-2 AAC, 0 for MPEG-4 AAC</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">layer: indicates which layer is used; always set to ‘00’</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">protection_absent: indicates whether CRC error checking is absent (1 = no CRC, 0 = CRC present)</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">profile: indicates which AAC profile is used, e.g. 01 = Low Complexity (LC), i.e. AAC-LC</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">sampling_frequency_index: index of the sampling rate in use</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">The index is looked up in a table that maps sampling_frequency_index to sampling frequency [Hz]</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">channel_configuration: indicates the channel configuration (number of channels)</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">frame_length: length of one ADTS frame in bytes, including the ADTS header and the raw data block(s)</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">adts_buffer_fullness: the value 0x7FF indicates a variable-bitrate stream</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">number_of_raw_data_blocks_in_frame</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">An ADTS frame contains number_of_raw_data_blocks_in_frame + 1 raw AAC frames</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">number_of_raw_data_blocks_in_frame == 0 means the ADTS frame contains one AAC data block, not zero</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">One raw AAC frame holds 1024 samples covering a span of time, plus the associated data</span>
<p style="font-family: Arial; font-size: 14px; line-height: 26px; color: rgb(51, 51, 51); margin-top: 0px; margin-bottom: 0px; padding-top: 0px; padding-bottom: 0px;"><strong>adts_fixed_header();</strong><br></p>
<div> adts_fixed_header()</div><div> {</div><div> syncword: 12 bslbf</div><div> ID: 1 bslbf</div><div> layer: 2 uimsbf</div><div> protection_absent: 1 bslbf</div><div> profile: 2 uimsbf</div><div> sampling_frequency_index: 4 uimsbf</div><div> private_bit: 1 bslbf</div><div> channel_configuration: 3 uimsbf</div><div> original/copy: 1 bslbf</div><div> home: 1 bslbf</div><div> }</div>
<strong style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">adts_variable_header();</strong>
<div>adts_variable_header()</div><div>{</div><div> copyright_identification_bit: 1 bslbf</div><div> copyright_identification_start: 1 bslbf</div><div> frame_length: 13 bslbf</div><div> adts_buffer_fullness: 11 bslbf</div><div> number_of_raw_data_blocks_in_frame: 2 uimsbf</div><div>}</div>
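The two syntax blocks above add up to 28 + 28 = 56 bits, i.e. a 7-byte header when no CRC follows. As an illustration, the sketch below unpacks those fields from a byte buffer. The struct and function names (`AdtsHeader`, `parse_adts_header`) are hypothetical. The sampling-rate table lists the standard values for indices 0 to 12; higher indices are treated as unsupported here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sampling rates in Hz for sampling_frequency_index 0..12. */
static const unsigned kAdtsSampleRates[13] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350
};

typedef struct {
    unsigned id, layer, protection_absent, profile;
    unsigned sampling_frequency_index, channel_configuration;
    unsigned frame_length;     /* bytes, header included */
    unsigned buffer_fullness;  /* 0x7FF signals variable bitrate */
    unsigned raw_data_blocks;  /* already incremented by one */
    unsigned sample_rate_hz;
} AdtsHeader;

/* Returns 0 on success, -1 on short input, bad syncword or unsupported index. */
static int parse_adts_header(const uint8_t *b, size_t len, AdtsHeader *h)
{
    assert(b != NULL && h != NULL);
    if (len < 7)
        return -1;
    if (b[0] != 0xFF || (b[1] & 0xF0) != 0xF0)   /* 12-bit syncword 0xFFF */
        return -1;

    h->id                       = (unsigned)((b[1] >> 3) & 0x01);
    h->layer                    = (unsigned)((b[1] >> 1) & 0x03);
    h->protection_absent        = (unsigned)(b[1] & 0x01);
    h->profile                  = (unsigned)(b[2] >> 6);
    h->sampling_frequency_index = (unsigned)((b[2] >> 2) & 0x0F);
    /* private_bit is bit 1 of b[2]; it is skipped here. */
    h->channel_configuration    = (unsigned)(((b[2] & 0x01) << 2) | (b[3] >> 6));
    /* original/copy, home and the two copyright bits are bits 5..2 of b[3]. */
    h->frame_length    = (unsigned)(((b[3] & 0x03) << 11) | (b[4] << 3) | (b[5] >> 5));
    h->buffer_fullness = (unsigned)(((b[5] & 0x1F) << 6) | (b[6] >> 2));
    h->raw_data_blocks = (unsigned)(b[6] & 0x03) + 1u;

    if (h->sampling_frequency_index > 12)
        return -1;
    h->sample_rate_hz = kAdtsSampleRates[h->sampling_frequency_index];
    return 0;
}
```

With the sample rate known, the duration of one raw AAC frame follows as 1024 / sample_rate_hz seconds.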
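<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">ADTS frame</span>
<span style="color: rgb(51, 51, 51); font-family: Verdana, Arial, Tahoma; font-size: 14px; line-height: 25px;">A raw frame is wrapped with an ADTS header to form an ADTS frame</span>
By default the encoding parameters are: stereo, 24 kHz sampling rate, variable frame length, variable-bitrate stream;<br>the AAC profile normally used is AAC-LC.<br>
In MPEG-2 TS, AAC is mostly carried in ADTS format
Open-source decoding libraries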
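Wrapping a raw frame can be sketched as filling in a 7-byte header and placing it in front of the payload. The helper below is a hypothetical illustration, not taken from any library. It assumes one raw data block per frame, no CRC (protection_absent = 1), and adts_buffer_fullness fixed at 0x7FF; the ID bit is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADTS_HEADER_SIZE 7u

/* Writes a 7-byte ADTS header for one raw AAC frame of payload_len bytes.
 * Returns 0 on success, -1 if a field does not fit its bit width. */
static int write_adts_header(uint8_t out[7], unsigned id, unsigned profile,
                             unsigned sampling_frequency_index,
                             unsigned channel_configuration,
                             size_t payload_len)
{
    assert(out != NULL);
    if (id > 1 || profile > 3 || sampling_frequency_index > 12 ||
        channel_configuration > 7 || payload_len > 0x1FFFu - ADTS_HEADER_SIZE)
        return -1;

    /* frame_length counts the header as well as the payload. */
    unsigned frame_length = (unsigned)payload_len + ADTS_HEADER_SIZE;

    out[0] = 0xFF;                                  /* syncword, high 8 bits */
    out[1] = (uint8_t)(0xF0 | (id << 3) | 0x01);    /* sync low 4, ID, layer 00, no CRC */
    out[2] = (uint8_t)((profile << 6) | (sampling_frequency_index << 2) |
                       (channel_configuration >> 2));   /* private_bit = 0 */
    out[3] = (uint8_t)(((channel_configuration & 0x03) << 6) |
                       (frame_length >> 11));       /* copy/home/copyright bits = 0 */
    out[4] = (uint8_t)((frame_length >> 3) & 0xFF);
    out[5] = (uint8_t)(((frame_length & 0x07) << 5) | 0x1F); /* fullness high 5 bits */
    out[6] = 0xFC;                                  /* fullness low 6 bits, 0 extra blocks */
    return 0;
}
```

A caller would write this header and then append the `payload_len` bytes of the raw frame directly after it.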
FAAD2
<span style="color: rgb(64, 64, 64); font-family: verdana, tahoma, arial; font-size: 10.6667px; line-height: 21.3333px; background-color: rgb(248, 248, 248);">an open source MPEG-4 and MPEG-2 AAC decoder</span>
<div>NeAACDecHandle NEAACAPI NeAACDecOpen(void);</div><div>Creates a decoder instance and returns a handle to it</div><div>void NEAACAPI NeAACDecClose(NeAACDecHandle hDecoder);</div><div>Closes the decoder instance</div><div>NeAACDecConfigurationPtr NEAACAPI NeAACDecGetCurrentConfiguration(NeAACDecHandle hDecoder);</div><div>Returns the decoder library's current configuration</div><div>unsigned char NEAACAPI NeAACDecSetConfiguration(NeAACDecHandle hDecoder, NeAACDecConfigurationPtr config);</div><div>Applies a configuration structure to the decoder library</div><div>long NEAACAPI NeAACDecInit(NeAACDecHandle hDecoder, unsigned char *buffer, unsigned long buffer_size, unsigned long *samplerate, unsigned char *channels);</div><div>Initializes the decoder library</div><div>void* NEAACAPI NeAACDecDecode(NeAACDecHandle hDecoder, NeAACDecFrameInfo *hInfo, unsigned char *buffer, unsigned long buffer_size);</div><div>Decodes AAC data</div>
MP3
PCM
<span style="color: rgb(51, 51, 51); font-family: 'Courier New'; font-size: 16px; line-height: 27.7778px;">Pulse Code Modulation, abbreviated PCM</span>
<span style="color: rgb(51, 51, 51); font-family: 'Courier New'; font-size: 16px; line-height: 27.7778px;">The basic storage unit is a BYTE (8 bits) or a WORD (16 bits)</span>
<span style="color: rgb(51, 51, 51); font-family: 'Courier New'; font-size: 16px; line-height: 27.7778px;">In general, one PCM frame is made up of 2048 samples</span>
Most common PCM formats
<span style="color: rgb(51, 51, 51); font-family: 'Microsoft Yahei'; font-size: 16px; line-height: 30px;">WAV, APE, FLAC</span>
<span style="color: rgb(51, 51, 51); font-family: 'Microsoft Yahei'; font-size: 16px; line-height: 30px;">The term "lossless audio" most often refers to file formats at the traditional CD format's 16-bit / 44.1 kHz sampling rate. They are called lossless because they cover 20 Hz to 22.05 kHz (22.05 kHz being half the sampling rate), a frequency range that fully spans what the human ear can hear</span>
Data layout
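The CD figures above can be checked with a little arithmetic. The sketch below is illustrative only, with hypothetical helper names. It computes the uncompressed PCM bit rate and the highest representable frequency (half the sampling rate), and assumes two channels, as on an audio CD.

```c
#include <assert.h>
#include <stdint.h>

/* Uncompressed PCM bit rate in bits per second. */
static uint32_t pcm_bit_rate(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                             uint32_t channels)
{
    assert(channels > 0 && bits_per_sample > 0);
    return sample_rate_hz * bits_per_sample * channels;
}

/* Highest frequency representable at a given sampling rate (Nyquist limit). */
static double nyquist_hz(uint32_t sample_rate_hz)
{
    return sample_rate_hz / 2.0;
}
```

For 44.1 kHz, 16-bit stereo, `pcm_bit_rate` gives 1,411,200 bit/s and `nyquist_hz` gives 22,050 Hz, matching the 22.05 kHz upper bound quoted above.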
<img src="http://hi.csdn.net/attachment/201107/25/0_1311585049gQlJ.gif">
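In FFmpeg
Audio data is stored in the extended_data array of AVFrame. In packed mode only extended_data[0] is used;<br>in planar mode each channel is stored separately in extended_data[i].<br>For audio only linesize[0] is valid: in packed mode it holds the buffer size of the whole audio frame, in planar mode the buffer size of each channel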
<img src="http://static.oschina.net/uploads/space/2013/1009/210700_BdlQ_589963.jpg">
<img src="http://static.oschina.net/uploads/space/2013/1009/210802_UeK3_589963.jpg">
<div>short *sample_buffer_L = (short *)pFrame->extended_data[0];//left-channel samples (planar, 16-bit)</div><div>short *sample_buffer_R = (short *)pFrame->extended_data[1];//right-channel samples (planar, 16-bit)</div><div><br></div><div>Both are 16-bit. A raw PCM file stores samples interleaved as LRLRLRLR, so the 16-bit samples must be written out in that order, low byte first:</div><div><br></div><div>//Left channel</div><div>data[i] = (char)(sample_buffer_L[j] & 0xff);//low 8 bits of the left sample</div><div>data[i+1] = (char)((sample_buffer_L[j]>>8) & 0xff);//high 8 bits of the left sample</div><div>//Right channel</div><div>data[i+2] = (char)(sample_buffer_R[j] & 0xff);//low 8 bits of the right sample</div><div>data[i+3] = (char)((sample_buffer_R[j]>>8) & 0xff);//high 8 bits of the right sample</div>
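The fragment above handles a single sample pair. A complete loop might look like the following sketch, which is independent of FFmpeg: it takes two planar 16-bit channel buffers and writes interleaved little-endian bytes. The function name and parameters are hypothetical. In an FFmpeg program the two input pointers would presumably be the cast `extended_data[0]` and `extended_data[1]` of a planar 16-bit frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleaves planar 16-bit stereo samples into little-endian LRLR bytes.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t interleave_s16_stereo_le(const int16_t *left, const int16_t *right,
                                       size_t nb_samples,
                                       uint8_t *out, size_t out_size)
{
    assert(left != NULL && right != NULL && out != NULL);
    size_t needed = nb_samples * 4;   /* 2 channels x 2 bytes per sample */
    if (out_size < needed)
        return 0;

    for (size_t j = 0; j < nb_samples; j++) {
        uint16_t l = (uint16_t)left[j];
        uint16_t r = (uint16_t)right[j];
        out[4 * j + 0] = (uint8_t)(l & 0xFF);  /* left, low byte   */
        out[4 * j + 1] = (uint8_t)(l >> 8);    /* left, high byte  */
        out[4 * j + 2] = (uint8_t)(r & 0xFF);  /* right, low byte  */
        out[4 * j + 3] = (uint8_t)(r >> 8);    /* right, high byte */
    }
    return needed;
}
```

Going through `uint16_t` before shifting avoids right-shifting a negative value, whose result is implementation-defined in C.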