Based on: https://blog.csdn.net/u012323667/article/details/79214336
https://blog.csdn.net/szfhy/article/details/52448906
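G.711, also known as PCM (pulse code modulation) of voice frequencies, is a speech coding standard defined by the International Telecommunication Union (ITU-T) and used mainly in telephony. G.711 audio is clear and natural-sounding, but the compression ratio is low and the data rate is correspondingly high: the audio is sampled at 8 kHz and each sample is stored in 8 bits, so a G.711 stream occupies one 64 kbps channel. Measured against 16-bit linear PCM the compression ratio is 2:1, that is, each 16-bit sample is compressed to 8 bits. G.711 is the mainstream waveform speech codec.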
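The G.711 standard defines two companding algorithms. One is the µ-law algorithm (often written u-law, ulaw, or mu-law), used mainly in North America and Japan; the other is the A-law algorithm, used in Europe and the rest of the world. A-law was specifically designed to be easy for a computer to process. Both algorithms take input sampled at 8 kHz and produce a 64 kbps digital output.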
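
The 64 kbps figure is just the sample rate multiplied by the number of bits per sample. A minimal sketch of that arithmetic follows; the function name pcm_bit_rate is made up for illustration and is not part of any G.711 library.

```c
#include <assert.h>

/* Bit rate in bit/s of a mono stream with a fixed sample size.
 * pcm_bit_rate is a hypothetical helper used only for this illustration. */
long pcm_bit_rate(long sample_rate_hz, int bits_per_sample)
{
    assert(sample_rate_hz > 0 && bits_per_sample > 0);
    return sample_rate_hz * bits_per_sample;
}
```

Calling pcm_bit_rate(8000, 8) gives the G.711 rate, while pcm_bit_rate(8000, 16) gives the rate of the 16-bit linear PCM it is compared against.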
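1. A-law
A-law, also called g711a, takes 13-bit input (in practice the upper 13 bits of a signed 16-bit sample) and is used in Europe and other regions. The format was designed so that digital equipment can process it quickly. In WAV files it is identified by the format tag WAVE_FORMAT_ALAW.
Its main job is to convert 13-bit data into 8-bit data. The most significant of the 13 bits is the sign bit. The resulting 8 bits are not a linear value; instead, each group of bits has its own meaning.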
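- Bit 7: the sign bit; 1 means positive and 0 means negative, the opposite of the usual convention on computers
- Bits 4-6: the exponent (segment number), which selects a power-of-two range
- Bits 0-3: the quantization value within that segment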
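
As a rough illustration of this layout, the sketch below splits an A-law byte into its three fields. It assumes the byte is in its transmitted form, so the even-bit inversion (XOR with 0x55) is undone first. The type alaw_fields and the function alaw_unpack are hypothetical names introduced here.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of an A-law byte
 * after the even-bit inversion has been undone. */
typedef struct {
    int positive;   /* bit 7: 1 = positive, 0 = negative */
    int segment;    /* bits 4-6: exponent / segment number, 0..7 */
    int mantissa;   /* bits 0-3: quantization value within the segment */
} alaw_fields;

alaw_fields alaw_unpack(uint8_t code)
{
    alaw_fields f;
    code ^= 0x55;                    /* undo the even-bit inversion */
    f.positive = (code >> 7) & 1;
    f.segment  = (code >> 4) & 7;
    f.mantissa = code & 0x0F;
    return f;
}
```

For instance, the byte 10011100 (0x9C) unpacks to positive=1, segment=4, mantissa=9.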
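The encoding procedure is:
(1) Take the sign bit and invert it to get s.
(2) Determine the exponent bits eee from the position of the leading 1 in the magnitude (the table in the linear2alaw() comment in g711.c below lists the mapping).
(3) Take the four sample bits wxyz that follow the leading 1.
(4) Combine them as seeewxyz and invert the even-numbered bits (XOR with 0x55). Encoding is complete.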
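Example:
The input PCM value is 3210, which in binary is (0000 1100 1000 1010).
Regrouped, the bits are (0 0001 1001 0001010).
(1) The sign bit (the most significant bit) is 0; inverted, s=1.
(2) The exponent bits are 0001; looking this up in the table gives eee=100.
(3) The sample bits are wxyz=1001.
(4) Combined, this is 11001001; inverting the even-numbered bits gives 10011100 (0x9C).
Encoding is complete.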
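
The four steps can be written out almost literally in C. The following is only a sketch under the assumption of a signed 16-bit input sample; alaw_encode_steps is a hypothetical name and the function is not taken from g711.c.

```c
#include <stdint.h>

/* Sketch of steps (1)-(4) for one 16-bit sample.
 * alaw_encode_steps is a hypothetical name used for illustration. */
uint8_t alaw_encode_steps(int16_t sample)
{
    int magnitude = sample;          /* widen so that -32768 negates safely */
    int s = 1;                       /* step 1: inverted sign, 1 = positive */
    if (magnitude < 0) {
        s = 0;
        magnitude = -magnitude;
    }
    if (magnitude > 32767)
        magnitude = 32767;

    /* step 2: eee is 7..1 when the leading 1 sits in bit 14..8, else 0 */
    int eee = 7;
    for (int mask = 0x4000; eee > 0 && (magnitude & mask) == 0; mask >>= 1)
        eee--;

    /* step 3: the four bits below the leading 1 (bits 7..4 when eee is 0) */
    int wxyz = (magnitude >> (eee == 0 ? 4 : eee + 3)) & 0x0F;

    /* step 4: assemble seeewxyz and invert the even bits */
    uint8_t code = (uint8_t)((s << 7) | (eee << 4) | wxyz);
    return (uint8_t)(code ^ 0x55);
}
```

With the input 3210 from the example this returns 0x9C, and a silent sample of 0 becomes 0xD5.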
The encoder code is as follows:
#include <stddef.h>

#define MAX (32635)
void encode(unsigned char *dst, short *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        int pcm = *src++;                /* widen to int so that -32768 can be negated */
        int sign = (pcm & 0x8000) >> 8;  /* 0x80 for negative samples, 0 otherwise */
        if (sign != 0)
            pcm = -pcm;
        if (pcm > MAX) pcm = MAX;
        int exponent = 7;
        int expMask;
        /* find the leading 1 among bits 14..8; exponent 0 means none was found */
        for (expMask = 0x4000; (pcm & expMask) == 0 && exponent > 0; exponent--, expMask >>= 1) {}
        int mantissa = (pcm >> ((exponent == 0) ? 4 : (exponent + 3))) & 0x0f;
        unsigned char alaw = (unsigned char)(sign | exponent << 4 | mantissa);
        *dst++ = (unsigned char)(alaw ^ 0xD5); /* invert the sign bit and the even bits */
    }
}
The decoder code is as follows:
void decode(short *dst, unsigned char *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char alaw = *src++;
        alaw ^= 0xD5;
        int sign = alaw & 0x80;
        int exponent = (alaw & 0x70) >> 4;
        int data = alaw & 0x0f;
        data <<= 4;
        data += 8; // set the discarded bit a to 1
        if (exponent != 0) // restore the leading 1 in front of wxyz
            data += 0x100;
        if (exponent > 1)
            data <<= (exponent - 1);
        *dst++ = (short)(sign == 0 ? data : -data);
    }
}
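The encoding and decoding steps make the idea behind this scheme easy to see.
First, the goal: the original samples are 13 bits, or 12 bits without the sign bit; they must be converted to 8 bits, or 7 bits without the sign bit.
The basic idea is to record the sign bit, record the 4 bits that follow the most significant data bit (the first 1 after the sign bit), and record how many positions those 4 bits must be shifted to reach the lowest position (stored as the exponent in bits 4-6). The decoder restores the value from this information, so the top 5 bits (the leading 1 plus wxyz) are exact; a 1 is appended right after them to compensate for the discarded bits, and this is where the error comes from.
In practice, however, the lowest 4 bits are discarded first and only then are data bits kept. If the significant data is shorter than 4 bits, all of it is discarded and the decoder returns the compensation value 8. If it is longer than 4 bits but shorter than 8 bits, fewer than 5 bits may end up exact.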
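
To make the source of the error concrete, here is a small round-trip sketch. It assumes 16-bit samples and re-implements the same arithmetic as the encoder and decoder above in compact form; the names alaw_enc, alaw_dec and alaw_roundtrip_error are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Compact A-law encoder, same arithmetic as the encoder shown above. */
uint8_t alaw_enc(int16_t sample)
{
    int mag = sample;
    int sign = 0x80;                       /* bit 7 set means positive */
    if (mag < 0) { sign = 0; mag = -mag; }
    if (mag > 32767) mag = 32767;
    int eee = 7;
    for (int mask = 0x4000; eee > 0 && (mag & mask) == 0; mask >>= 1)
        eee--;
    int wxyz = (mag >> (eee == 0 ? 4 : eee + 3)) & 0x0F;
    return (uint8_t)((sign | (eee << 4) | wxyz) ^ 0x55);
}

/* Compact A-law decoder: rebuild "1 wxyz 1" and shift it back into place. */
int alaw_dec(uint8_t code)
{
    code ^= 0x55;
    int eee  = (code >> 4) & 7;
    int data = ((code & 0x0F) << 4) + 8;   /* wxyz followed by the compensating 1 */
    if (eee != 0)
        data += 0x100;                     /* restore the implicit leading 1 */
    if (eee > 1)
        data <<= (eee - 1);
    return (code & 0x80) ? data : -data;
}

/* Absolute difference between a sample and its decoded reconstruction. */
int alaw_roundtrip_error(int16_t sample)
{
    return abs(alaw_dec(alaw_enc(sample)) - sample);
}
```

With these definitions 3210 decodes back as 3264, an error of 54, while a tiny input such as 5 loses all of its bits and comes back as the compensation value 8.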
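2. µ-law
µ-law, also called g711µ, is used in North America and Japan and takes 14-bit input. The encoding can be done purely by table lookup with no complicated arithmetic: the output is a base value plus an interval offset. A worked example: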
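pcm=2345 (a 14-bit value)
(1) Find the range that contains it:
+4062 to +2015 in 16 intervals of 128
(2) The base value for this range is 0x90.
(3) The interval size is 128.
(4) The top of the range is 4062.
(5) The difference between the top of the range and the current value 2345 is 4062-2345=1717.
(6) Offset = 1717 / interval size = 1717/128, truncated to 13.
(7) The output is 0x90+13=0x9D.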
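
The lookup can be sketched as a small table walk. The range boundaries below are an assumption of this sketch: they are derived from the bias of 33 and the eight segments described in this section, and only non-negative 14-bit input is handled. The names ulaw_range, ulaw_ranges and ulaw_lookup_positive are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One row per range: top of the range, interval size, base code.
 * Values are for positive samples on the 14-bit scale. */
struct ulaw_range { int top; int step; uint8_t base; };

static const struct ulaw_range ulaw_ranges[8] = {
    {8158, 256, 0x80}, {4062, 128, 0x90}, {2014, 64, 0xA0}, {990, 32, 0xB0},
    { 478,  16, 0xC0}, { 222,   8, 0xD0}, {  94,  4, 0xE0}, { 30,  2, 0xF0},
};

/* Encode a non-negative 14-bit magnitude by range lookup. */
uint8_t ulaw_lookup_positive(int sample14)
{
    assert(sample14 >= 0);
    if (sample14 > 8158)
        sample14 = 8158;                 /* clip to the largest codable value */
    if (sample14 == 0)
        return 0xFF;                     /* zero has its own code */
    for (int i = 0; i < 8; i++) {
        /* a range ends just above the top of the next smaller range */
        int bottom = (i < 7) ? ulaw_ranges[i + 1].top + 1 : 1;
        if (sample14 >= bottom) {
            int offset = (ulaw_ranges[i].top - sample14) / ulaw_ranges[i].step;
            return (uint8_t)(ulaw_ranges[i].base + offset);
        }
    }
    return 0xFF;                         /* not reached for valid input */
}
```

For the value 2345 the loop stops at the second row, computes the offset 13 and returns 0x9D, matching the steps above.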
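To simplify encoding, the original linear magnitude is biased by adding 33, which shifts the encoding range from (0 - 8158) to (33 - 8191). The result is shown in the following table: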
Biased Linear Input Code | Compressed Code |
---|---|
00000001wxyza | 000wxyz |
0000001wxyzab | 001wxyz |
000001wxyzabc | 010wxyz |
00001wxyzabcd | 011wxyz |
0001wxyzabcde | 100wxyz |
001wxyzabcdef | 101wxyz |
01wxyzabcdefg | 110wxyz |
1wxyzabcdefgh | 111wxyz |
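Each biased linear input code has a leading 1 that identifies the segment number. The segment number equals 7 minus the number of leading 0s. The quantization interval is directly available as the four bits wxyz. The trailing bits (a - h) are ignored.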
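
The table translates into a short routine: add the bias, find the leading 1, take the four bits after it, and complement the result. The sketch below works on the 14-bit scale used by the table, whereas g711.c works on 16-bit samples; ulaw_encode14 is a hypothetical name.

```c
#include <stdint.h>

/* Sketch of the biased-table encoder on the 14-bit scale.
 * ulaw_encode14 is a hypothetical name used for illustration. */
uint8_t ulaw_encode14(int sample14)
{
    int sign = 0;                        /* sign bit before the final complement */
    if (sample14 < 0) {
        sign = 0x80;
        sample14 = -sample14;
    }
    if (sample14 > 8158)
        sample14 = 8158;                 /* clip so that the biased value fits in 13 bits */
    int biased = sample14 + 33;          /* 33..8191: a leading 1 exists in bits 5..12 */

    /* segment = 7 minus the number of leading zeros in the 13-bit value */
    int seg = 7;
    for (int mask = 0x1000; (biased & mask) == 0; mask >>= 1)
        seg--;

    int wxyz = (biased >> (seg + 1)) & 0x0F;   /* the four bits after the leading 1 */
    return (uint8_t)~(sign | (seg << 4) | wxyz);
}
```

For 2345 the biased value is 2378, the leading 1 is in bit 11 (segment 6), wxyz is 0010, and the complemented byte is 0x9D, the same result as the table lookup example.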
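As this shows, the scheme is very similar to A-law: 14-bit data is encoded into 8 bits (the most significant bit is the sign bit in both), and each of the 8 bits has the same meaning as in A-law. Unlike A-law, however, the input is 14 bits wide, and the compressed byte does not have only its even bits inverted; every bit is inverted.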
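Also note that every row of this table, including the first, starts with an explicit leading 1. Compared with A-law the input is one bit wider, yet the segment field is still 3 bits, so the first rows of the two tables differ visibly. Adding 33 to every value guarantees that the biased magnitude always has a leading 1 in one of the eight positions, which removes the special case for the lowest segment when shifting; the decoder subtracts the bias again.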
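Note: in the g711.c code, BIAS is 0x84, which is 33<<2, because the code works on 16-bit samples rather than 14-bit ones.
The g711.c code is as follows: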
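
A small sketch can show why the two bias values are equivalent. It decodes the same µ-law byte once on the 14-bit scale with a bias of 33 and once on the 16-bit scale with a bias of 0x84; the second function mirrors the arithmetic of ulaw2linear(). The names BIAS14, BIAS16, ulaw_decode14 and ulaw_decode16 are hypothetical.

```c
#include <stdint.h>

enum { BIAS14 = 33, BIAS16 = BIAS14 << 2 };   /* 33 and 0x84 */

/* Decode on the 14-bit scale: rebuild the pattern "1 wxyz 1",
 * shift it up by the segment number, then remove the bias. */
int ulaw_decode14(uint8_t code)
{
    code = (uint8_t)~code;
    int seg = (code >> 4) & 7;
    int t = (((code & 0x0F) << 1) + BIAS14) << seg;
    return (code & 0x80) ? (BIAS14 - t) : (t - BIAS14);
}

/* The same steps on the 16-bit scale, with everything shifted left by 2. */
int ulaw_decode16(uint8_t code)
{
    code = (uint8_t)~code;
    int seg = (code >> 4) & 7;
    int t = (((code & 0x0F) << 3) + BIAS16) << seg;
    return (code & 0x80) ? (BIAS16 - t) : (t - BIAS16);
}
```

For any code byte the 16-bit result is exactly four times the 14-bit one; 0x9D, for example, decodes to 2335 and 9340.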
/*
* This source code is a product of Sun Microsystems, Inc. and is provided
* for unrestricted use. Users may copy or modify this source code without
* charge.
*
* SUN SOURCE CODE IS PROVIDED AS IS WITH NO WARRANTIES OF ANY KIND INCLUDING
* THE WARRANTIES OF DESIGN, MERCHANTIBILITY AND FITNESS FOR A PARTICULAR
* PURPOSE, OR ARISING FROM A COURSE OF DEALING, USAGE OR TRADE PRACTICE.
*
* Sun source code is provided with no support and without any obligation on
* the part of Sun Microsystems, Inc. to assist in its use, correction,
* modification or enhancement.
*
* SUN MICROSYSTEMS, INC. SHALL HAVE NO LIABILITY WITH RESPECT TO THE
* INFRINGEMENT OF COPYRIGHTS, TRADE SECRETS OR ANY PATENTS BY THIS SOFTWARE
* OR ANY PART THEREOF.
*
* In no event will Sun Microsystems, Inc. be liable for any lost revenue
* or profits or other special, indirect and consequential damages, even if
* Sun has been advised of the possibility of such damages.
*
* Sun Microsystems, Inc.
* 2550 Garcia Avenue
* Mountain View, California 94043
*/
/*
* g711.c
*
* u-law, A-law and linear PCM conversions.
*/
#define SIGN_BIT (0x80) /* Sign bit for a A-law byte. */
#define QUANT_MASK (0xf) /* Quantization field mask. */
#define NSEGS (8) /* Number of A-law segments. */
#define SEG_SHIFT (4) /* Left shift for segment number. */
#define SEG_MASK (0x70) /* Segment field mask. */
static short seg_end[8] = {0xFF, 0x1FF, 0x3FF, 0x7FF,
0xFFF, 0x1FFF, 0x3FFF, 0x7FFF};
/* copy from CCITT G.711 specifications */
unsigned char _u2a[128] = { /* u- to A-law conversions */
1, 1, 2, 2, 3, 3, 4, 4,
5, 5, 6, 6, 7, 7, 8, 8,
9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 27, 29, 31, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44,
46, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62,
64, 65, 66, 67, 68, 69, 70, 71,
72, 73, 74, 75, 76, 77, 78, 79,
81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99, 100, 101, 102, 103, 104,
105, 106, 107, 108, 109, 110, 111, 112,
113, 114, 115, 116, 117, 118, 119, 120,
121, 122, 123, 124, 125, 126, 127, 128};
unsigned char _a2u[128] = { /* A- to u-law conversions */
1, 3, 5, 7, 9, 11, 13, 15,
16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31,
32, 32, 33, 33, 34, 34, 35, 35,
36, 37, 38, 39, 40, 41, 42, 43,
44, 45, 46, 47, 48, 48, 49, 49,
50, 51, 52, 53, 54, 55, 56, 57,
58, 59, 60, 61, 62, 63, 64, 64,
65, 66, 67, 68, 69, 70, 71, 72,
73, 74, 75, 76, 77, 78, 79, 79,
80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99, 100, 101, 102, 103,
104, 105, 106, 107, 108, 109, 110, 111,
112, 113, 114, 115, 116, 117, 118, 119,
120, 121, 122, 123, 124, 125, 126, 127};
static int
search(
    int val,
    short *table,
    int size)
{
    int i;
    for (i = 0; i < size; i++) {
        if (val <= *table++)
            return (i);
    }
    return (size);
}
/*
* linear2alaw() - Convert a 16-bit linear PCM value to 8-bit A-law
*
* linear2alaw() accepts an 16-bit integer and encodes it as A-law data.
*
* Linear Input Code Compressed Code
* ------------------------ ---------------
* 0000000wxyza 000wxyz
* 0000001wxyza 001wxyz
* 000001wxyzab 010wxyz
* 00001wxyzabc 011wxyz
* 0001wxyzabcd 100wxyz
* 001wxyzabcde 101wxyz
* 01wxyzabcdef 110wxyz
* 1wxyzabcdefg 111wxyz
*
* For further information see John C. Bellamy's Digital Telephony, 1982,
* John Wiley & Sons, pps 98-111 and 472-476.
*/
unsigned char
linear2alaw(
    int pcm_val) /* 2's complement (16-bit range) */
{
    int mask;
    int seg;
    unsigned char aval;
    if (pcm_val >= 0) {
        mask = 0xD5; /* sign (7th) bit = 1 */
    } else {
        mask = 0x55; /* sign bit = 0 */
        pcm_val = -pcm_val - 8;
        if (pcm_val < 0) /* inputs -1..-7: clamp, otherwise they encode as a large magnitude */
            pcm_val = 0;
    }
    /* Convert the scaled magnitude to segment number. */
    seg = search(pcm_val, seg_end, 8);
    /* Combine the sign, segment, and quantization bits. */
    if (seg >= 8) /* out of range, return maximum value. */
        return (0x7F ^ mask);
    else {
        aval = seg << SEG_SHIFT;
        if (seg < 2)
            aval |= (pcm_val >> 4) & QUANT_MASK;
        else
            aval |= (pcm_val >> (seg + 3)) & QUANT_MASK;
        return (aval ^ mask);
    }
}
/*
* alaw2linear() - Convert an A-law value to 16-bit linear PCM
*
*/
int
alaw2linear(
    unsigned char a_val)
{
    int t;
    int seg;
    a_val ^= 0x55;
    t = (a_val & QUANT_MASK) << 4;
    seg = ((unsigned)a_val & SEG_MASK) >> SEG_SHIFT;
    switch (seg) {
    case 0:
        t += 8;
        break;
    case 1:
        t += 0x108;
        break;
    default:
        t += 0x108;
        t <<= seg - 1;
    }
    return ((a_val & SIGN_BIT) ? t : -t);
}
#define BIAS (0x84) /* Bias for linear code. */
/*
* linear2ulaw() - Convert a linear PCM value to u-law
*
* In order to simplify the encoding process, the original linear magnitude
* is biased by adding 33 which shifts the encoding range from (0 - 8158) to
* (33 - 8191). The result can be seen in the following encoding table:
*
* Biased Linear Input Code Compressed Code
* ------------------------ ---------------
* 00000001wxyza 000wxyz
* 0000001wxyzab 001wxyz
* 000001wxyzabc 010wxyz
* 00001wxyzabcd 011wxyz
* 0001wxyzabcde 100wxyz
* 001wxyzabcdef 101wxyz
* 01wxyzabcdefg 110wxyz
* 1wxyzabcdefgh 111wxyz
*
* Each biased linear code has a leading 1 which identifies the segment
* number. The value of the segment number is equal to 7 minus the number
* of leading 0's. The quantization interval is directly available as the
* four bits wxyz. * The trailing bits (a - h) are ignored.
*
* Ordinarily the complement of the resulting code word is used for
* transmission, and so the code word is complemented before it is returned.
*
* For further information see John C. Bellamy's Digital Telephony, 1982,
* John Wiley & Sons, pps 98-111 and 472-476.
*/
unsigned char
linear2ulaw(
    int pcm_val) /* 2's complement (16-bit range) */
{
    int mask;
    int seg;
    unsigned char uval;
    /* Get the sign and the magnitude of the value. */
    if (pcm_val < 0) {
        pcm_val = BIAS - pcm_val;
        mask = 0x7F;
    } else {
        pcm_val += BIAS;
        mask = 0xFF;
    }
    /* Convert the scaled magnitude to segment number. */
    seg = search(pcm_val, seg_end, 8);
    /*
     * Combine the sign, segment, quantization bits;
     * and complement the code word.
     */
    if (seg >= 8) /* out of range, return maximum value. */
        return (0x7F ^ mask);
    else {
        uval = (seg << 4) | ((pcm_val >> (seg + 3)) & 0xF);
        return (uval ^ mask);
    }
}
/*
* ulaw2linear() - Convert a u-law value to 16-bit linear PCM
*
* First, a biased linear code is derived from the code word. An unbiased
* output can then be obtained by subtracting 33 from the biased code.
*
* Note that this function expects to be passed the complement of the
* original code word. This is in keeping with ISDN conventions.
*/
int
ulaw2linear(
    unsigned char u_val)
{
    int t;
    /* Complement to obtain normal u-law value. */
    u_val = ~u_val;
    /*
     * Extract and bias the quantization bits. Then
     * shift up by the segment number and subtract out the bias.
     */
    t = ((u_val & QUANT_MASK) << 3) + BIAS;
    t <<= ((unsigned)u_val & SEG_MASK) >> SEG_SHIFT;
    return ((u_val & SIGN_BIT) ? (BIAS - t) : (t - BIAS));
}
/* A-law to u-law conversion */
unsigned char
alaw2ulaw(
    unsigned char aval)
{
    aval &= 0xff;
    return ((aval & 0x80) ? (0xFF ^ _a2u[aval ^ 0xD5]) :
        (0x7F ^ _a2u[aval ^ 0x55]));
}
/* u-law to A-law conversion */
unsigned char
ulaw2alaw(
    unsigned char uval)
{
    uval &= 0xff;
    return ((uval & 0x80) ? (0xD5 ^ (_u2a[0xFF ^ uval] - 1)) :
        (0x55 ^ (_u2a[0x7F ^ uval] - 1)));
}