FFmpeg Audio Decoding Logic Explained

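This post walks through decoding an audio file in a container format into raw PCM data and then playing the result with ffplay. ffplay must be on your PATH (environment variable), otherwise the playback command will not be found. Because raw PCM has no header, ffplay also has to be told the sample format, sample rate, and channel count when it plays the file.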

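I. Demuxing

Demuxing parses a container file such as an MP3 into FFmpeg's corresponding structure, AVFormatContext.
AVFormatContext is the FFmpeg structure that holds the demuxed container data, including the media streams (audio and video), the number of streams in the file, and so on.
Demuxing takes three calls: avformat_alloc_context(), avformat_open_input(&avFormatContext, src_url, NULL, &avDictionary), and avformat_find_stream_info(avFormatContext, NULL).
avformat_alloc_context: allocates an empty AVFormatContext.
avformat_open_input: opens the media file and reads its header, which also verifies that the file can be opened.
avformat_find_stream_info: probes the file for stream information and stores it in the structure.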

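    AVFormatContext *avFormatContext = avformat_alloc_context();

    avformat_network_init();
    // Must start as NULL: av_dict_set() allocates the dictionary on first use.
    AVDictionary *avDictionary = NULL;
    // Timeout for network inputs (typically interpreted as microseconds).
    av_dict_set(&avDictionary, "timeout", "20000000", 0);

    // Returns 0 on success. On failure FFmpeg frees avFormatContext itself.
    if (avformat_open_input(&avFormatContext, src_url, NULL, &avDictionary) != 0) {
        av_dict_free(&avDictionary);
        return;
    }

    // Options that were not consumed stay in the dictionary; release them.
    av_dict_free(&avDictionary);

    if (avformat_find_stream_info(avFormatContext, NULL) < 0) {
        avformat_close_input(&avFormatContext);
        return;
    }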

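II. Decoding

Decoding involves four structures: AVStream, AVPacket, AVFrame, and AVCodecContext. An AVPacket holds a chunk of encoded (compressed) data, usually one frame, and an AVFrame holds one frame of decoded data. Decoding is the process of reading an AVPacket from the AVFormatContext and handing it to the AVCodecContext, which decodes it into an AVFrame.

Step 1: Obtain the `AVCodecContext`

To get the codec context we first need the stream: the stream carries the codec ID, the codec ID lets us find the decoder (AVCodec), and the decoder lets us create the codec context (AVCodecContext).

avFormatContext->streams[i]: streams is a pointer to pointers (an array of pointers) holding the streams in the media file, such as audio, video, and subtitle streams.
avStream->codecpar: the AVCodecParameters structure, which holds the codec-related information for the stream.
avcodec_find_decoder(avCodecParameters->codec_id): looks up the decoder by the codec ID stored in the stream's codec parameters.
avcodec_alloc_context3(avCodec): allocates a codec context for the decoder.
avcodec_parameters_to_context(avCodecContext, avCodecParameters): copies the codec parameters into the codec context.
avcodec_open2(avCodecContext, avCodec, NULL): opens the decoder; only after this call is the codec context usable.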

    int audio_stream_index = -1;
    AVStream *avStream = NULL;
    // Pick the first audio stream in the file.
    for (unsigned int i = 0; i < avFormatContext->nb_streams; ++i) {
        if (avFormatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
            avStream = avFormatContext->streams[i];
            audio_stream_index = (int) i;
            break;
        }
    }

    if (audio_stream_index == -1) {
        return;
    }
    AVCodecParameters *avCodecParameters = avStream->codecpar;
    // NULL if this FFmpeg build has no decoder for the codec.
    const AVCodec *avCodec = avcodec_find_decoder(avCodecParameters->codec_id);
    if (avCodec == NULL) {
        return;
    }

    AVCodecContext *avCodecContext = avcodec_alloc_context3(avCodec);
    if (avCodecContext == NULL) {
        return;
    }
    if (avcodec_parameters_to_context(avCodecContext, avCodecParameters) < 0) {
        return;
    }
    if (avcodec_open2(avCodecContext, avCodec, NULL) < 0) {
        return;
    }

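Step 2: Convert AVPacket to AVFrame

The main job here is to read an AVPacket from the AVFormatContext and hand it to the AVCodecContext, which turns it into an AVFrame.

av_packet_alloc(): allocates an empty AVPacket.
av_read_frame(avFormatContext, avPacket): reads the next packet of encoded data from the AVFormatContext into the AVPacket.
av_frame_alloc(): allocates an empty AVFrame.
avcodec_send_packet(avCodecContext, avPacket): feeds the packet to the decoder. This is the less intuitive part: the call only queues the input and does not return decoded data.
avcodec_receive_frame(avCodecContext, avFrame): fetches a decoded frame from the decoder into the AVFrame. It returns AVERROR(EAGAIN) when the decoder needs more input before it can output a frame.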

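    AVPacket *avPacket = av_packet_alloc();
    // One frame is reused for every call; avcodec_receive_frame() resets it first.
    AVFrame *avFrame = av_frame_alloc();

    while (av_read_frame(avFormatContext, avPacket) >= 0) {
        // Skip packets that belong to other streams (video, subtitles, ...).
        if (avPacket->stream_index != audio_stream_index) {
            av_packet_unref(avPacket);
            continue;
        }

        int ret = avcodec_send_packet(avCodecContext, avPacket);
        av_packet_unref(avPacket);  // the decoder holds its own reference now
        if (ret < 0) {
            break;
        }

        // One packet may yield zero, one, or several decoded frames.
        while ((ret = avcodec_receive_frame(avCodecContext, avFrame)) >= 0) {
            // At this point avFrame holds one fully decoded frame.
        }
        if (ret != AVERROR(EAGAIN) && ret != AVERROR_EOF) {
            break;  // a real decoding error
        }
    }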
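
The send/receive pairing is easier to see with FFmpeg taken out of the picture. The sketch below is a toy model, not FFmpeg code: `MockDecoder`, `send_packet`, `receive_frame`, and the `kAgain` code are hypothetical stand-ins for AVCodecContext, avcodec_send_packet(), avcodec_receive_frame(), and AVERROR(EAGAIN). It assumes a decoder that buffers its input and may emit zero or several frames per packet, and shows why the receive call belongs in its own loop.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical status codes standing in for 0 and AVERROR(EAGAIN).
constexpr int kOk = 0;
constexpr int kAgain = -1;

// Toy decoder: each "packet" is just the number of frames it will produce.
class MockDecoder {
public:
    // Queue the input; no decoded data is returned to the caller here.
    int send_packet(int frames_in_packet) {
        assert(frames_in_packet >= 0);
        for (int i = 0; i < frames_in_packet; ++i) {
            pending_.push(next_frame_id_++);
        }
        return kOk;
    }

    // Hand out one decoded frame, or kAgain when more input is needed.
    int receive_frame(int *frame_id) {
        if (pending_.empty()) {
            return kAgain;
        }
        *frame_id = pending_.front();
        pending_.pop();
        return kOk;
    }

private:
    std::queue<int> pending_;
    int next_frame_id_ = 0;
};

// Feed every packet, then drain the decoder until it asks for more input.
std::vector<int> decode_all(const std::vector<int> &packets) {
    MockDecoder decoder;
    std::vector<int> frames;
    for (int packet : packets) {
        if (decoder.send_packet(packet) != kOk) {
            break;
        }
        int frame_id = 0;
        while (decoder.receive_frame(&frame_id) == kOk) {
            frames.push_back(frame_id);
        }
    }
    return frames;
}
```

With the packets {1, 0, 2}, the second packet yields no frame and the third yields two, so calling receive exactly once per packet would either fail or leave a frame behind.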

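III. Resampling

Resampling converts the raw PCM data to a target sample rate, sample format, and channel count.

Step 1: Set up the resampler

swr_alloc(): creates a SwrContext, the audio conversion context.
swr_alloc_set_opts(swrContext, out_channel_layout, out_sample_fmt, out_sample_rate, in_channel_layout, in_sample_fmt, in_sample_rate, 0, NULL): sets the output and input formats. The sample rate is best read from the decoder at run time, because it can differ from one audio file to the next. (Newer FFmpeg releases deprecate this function and the channel_layout field in favor of swr_alloc_set_opts2() and AVChannelLayout; the code here uses the older API.)
swr_init(swrContext): initializes the context.
Finally, compute the size of the output buffer.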
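
As a rough picture of what the format part of this conversion involves: many decoders output planar floating-point samples (one array per channel), while the PCM file written later holds interleaved signed 16-bit samples. The sketch below does that one conversion by hand for two channels. It is an illustration only, not how libswresample is implemented, and `planar_float_to_interleaved_s16` is a made-up name; real code should use swr_convert(), which also handles sample rate and channel layout changes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert two planar float channels (nominal range [-1.0, 1.0]) into one
// interleaved 16-bit buffer laid out as L0 R0 L1 R1 ...
std::vector<int16_t> planar_float_to_interleaved_s16(const std::vector<float> &left,
                                                     const std::vector<float> &right) {
    assert(left.size() == right.size());
    std::vector<int16_t> out;
    out.reserve(left.size() * 2);
    for (std::size_t i = 0; i < left.size(); ++i) {
        for (float sample : {left[i], right[i]}) {
            // Clamp first so out-of-range input cannot overflow int16_t.
            float clamped = std::max(-1.0f, std::min(1.0f, sample));
            out.push_back(static_cast<int16_t>(std::lround(clamped * 32767.0f)));
        }
    }
    return out;
}
```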

    // 1. Settings needed for resampling
    // Input format, taken from the decoder
    AVSampleFormat in_sample_fmt = avCodecContext->sample_fmt;
    uint64_t in_channel_layout = avCodecContext->channel_layout;
    if (in_channel_layout == 0) {
        // Some files carry only a channel count; fall back to the default layout.
        in_channel_layout = (uint64_t) av_get_default_channel_layout(avCodecContext->channels);
    }
    int in_sample_rate = avCodecContext->sample_rate;
    LOGD("in_sample_rate = %d", in_sample_rate);

    // Output format
    int64_t out_channel_layout = AV_CH_LAYOUT_STEREO;
    int out_sample_rate = in_sample_rate;
    AVSampleFormat out_sample_fmt = AV_SAMPLE_FMT_S16;

    SwrContext *swrContext = swr_alloc();
    swr_alloc_set_opts(swrContext,
                       out_channel_layout, out_sample_fmt, out_sample_rate,
                       in_channel_layout, in_sample_fmt, in_sample_rate,
                       0, NULL);

    if (swr_init(swrContext) < 0) {
        swr_free(&swrContext);
        return;
    }

    // Output buffer: one second of audio, far more than a single frame needs.
    int out_channels = av_get_channel_layout_nb_channels(out_channel_layout);
    int bytes_per_sample = av_get_bytes_per_sample(out_sample_fmt);
    LOGD("out_channels = %d", out_channels);
    int out_buffer_size = out_channels * out_sample_rate * bytes_per_sample;
    LOGD("out_buffer_size = %d", out_buffer_size);
    uint8_t *out_buffers = static_cast<uint8_t *>(av_malloc(out_buffer_size));

    FILE *out_pcm = fopen(dst_file_path, "wb");
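
To make the buffer arithmetic concrete, here is the same calculation as a stand-alone function. `one_second_buffer_bytes` is a hypothetical helper, and the 44100 Hz figure used in the note after it is only an example rate, not a value taken from the code above.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to hold one second of packed (interleaved) PCM:
// channels * samples per second * bytes per sample.
std::size_t one_second_buffer_bytes(int channels, int sample_rate, int bytes_per_sample) {
    assert(channels > 0 && sample_rate > 0 && bytes_per_sample > 0);
    return static_cast<std::size_t>(channels) * sample_rate * bytes_per_sample;
}
```

For stereo S16 at 44100 Hz this gives 2 * 44100 * 2 = 176400 bytes. A single decoded MP3 frame of 1152 samples needs only 1152 * 2 * 2 = 4608 bytes, so a one-second buffer leaves plenty of headroom.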

Step 2: Run the conversion

1. Compute the number of output samples dynamically

int64_t dst_nb_samples = av_rescale_rnd(swr_get_delay(swrContext,avFrame->sample_rate) + avFrame->nb_samples,
                                                out_sample_rate,avFrame->sample_rate,AV_ROUND_UP);

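2. swr_convert() is the API that does the actual conversion: it reads the samples in avFrame->data and writes the converted samples into the output buffer.
3. Compute the size in bytes of the converted samples, then write that many bytes from the buffer to the file.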
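
The output-count formula above is a rate conversion rounded up: (delay + nb_samples) * out_rate / in_rate. The helper below reproduces that arithmetic with plain integers for non-negative inputs. It is a simplified stand-in for av_rescale_rnd(..., AV_ROUND_UP) without that function's overflow handling, and `rescale_round_up` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// ceil(samples * out_rate / in_rate) using integer arithmetic only.
int64_t rescale_round_up(int64_t samples, int64_t out_rate, int64_t in_rate) {
    assert(samples >= 0 && out_rate > 0 && in_rate > 0);
    return (samples * out_rate + in_rate - 1) / in_rate;
}
```

When the input and output rates are equal, as in this post, the result is simply delay + nb_samples. Going from 44100 Hz to 48000 Hz, 1024 input samples need room for 1115 output samples.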


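        int64_t dst_nb_samples = av_rescale_rnd(swr_get_delay(swrContext,avFrame->sample_rate) + avFrame->nb_samples,
                                                out_sample_rate,avFrame->sample_rate,AV_ROUND_UP);

        // Resample. The audio device expects one uniform output format
        // (same sample rate, same channel count, ...), so every decoded
        // frame of raw PCM is converted to that format here.
        // swr_convert() returns the number of samples written per channel.
        int converted = swr_convert(swrContext,
                // output
                    &out_buffers, (int) dst_nb_samples,
                // input
                    (const uint8_t **) (avFrame->data), avFrame->nb_samples
        );

        if (converted > 0) {
            // Bytes actually produced: 2 channels (stereo) of packed S16 samples.
            int out_buffer_size = av_samples_get_buffer_size(NULL, 2, converted,
                                                             out_sample_fmt, 1);
            fwrite(out_buffers, 1, out_buffer_size, out_pcm);
        }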
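
A quick way to sanity-check the file this step produces is to compare its size with the expected playing time, since packed PCM has a fixed byte rate. The function below is a minimal sketch of that check; `pcm_duration_seconds` is a hypothetical helper, and the caller is assumed to know the output format (here stereo S16, so 2 channels and 2 bytes per sample).

```cpp
#include <cassert>
#include <cstdint>

// Playing time in seconds of a packed PCM file that is file_bytes long.
double pcm_duration_seconds(int64_t file_bytes, int sample_rate,
                            int channels, int bytes_per_sample) {
    assert(file_bytes >= 0 && sample_rate > 0 && channels > 0 && bytes_per_sample > 0);
    const int64_t bytes_per_second =
            static_cast<int64_t>(sample_rate) * channels * bytes_per_sample;
    return static_cast<double>(file_bytes) / static_cast<double>(bytes_per_second);
}
```

If the duration computed this way is noticeably shorter or longer than the source track, the byte count passed to fwrite is the first thing to check.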

Finally, here is the complete code.

#include <jni.h>
#include <cstdio>
#include <string>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
}

#include <android/log.h>

#define TAG "dsh"
#define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, TAG, __VA_ARGS__)


extern "C" JNIEXPORT jstring JNICALL
Java_com_example_audiodecode_MainActivity_stringFromJNI(
        JNIEnv *env,
        jobject /* this */) {
    std::string hello = "Hello from C++";
    return env->NewStringUTF(hello.c_str());
}
extern "C"
JNIEXPORT void JNICALL
Java_com_example_audiodecode_MainActivity_audioDecode(JNIEnv *env, jobject thiz, jstring input,
                                                      jstring output) {

    const char *src_url = env->GetStringUTFChars(input, NULL);
    const char *dst_file_path = env->GetStringUTFChars(output, NULL);

    AVFormatContext *avFormatContext = avformat_alloc_context();

    avformat_network_init();
    // Must start as NULL: av_dict_set() allocates the dictionary on first use.
    AVDictionary *avDictionary = NULL;
    av_dict_set(&avDictionary, "timeout", "20000000", 0);

    if (avformat_open_input(&avFormatContext, src_url, NULL, &avDictionary)) {
        return;
    }

    av_dict_free(&avDictionary);

    if (avformat_find_stream_info(avFormatContext, NULL) < 0) {
        return;
    }

    int audio_stream_index = -1;
    AVStream *avStream;
    for (int i = 0; i < avFormatContext->nb_streams; ++i) {
        avStream = avFormatContext->streams[i];
        if (avStream->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
            audio_stream_index = i;
            break;
        }
    }

    if (audio_stream_index == -1) {
        return;
    }
    AVCodecParameters *avCodecParameters = avStream->codecpar;
    // NULL if this FFmpeg build has no decoder for the codec.
    const AVCodec *avCodec = avcodec_find_decoder(avCodecParameters->codec_id);
    if (avCodec == NULL) {
        return;
    }

    AVCodecContext *avCodecContext = avcodec_alloc_context3(avCodec);
    if (avcodec_parameters_to_context(avCodecContext, avCodecParameters)) {
        return;
    }
    if (avcodec_open2(avCodecContext, avCodec, NULL)) {
        return;
    }

    // Data needed for resampling
    // 1. Settings needed for resampling
    // Input format, taken from the decoder
    AVSampleFormat in_sample_fmt = avCodecContext->sample_fmt;
    uint64_t in_channel_layout = avCodecContext->channel_layout;
    if (in_channel_layout == 0) {
        // Some files carry only a channel count; fall back to the default layout.
        in_channel_layout = (uint64_t) av_get_default_channel_layout(avCodecContext->channels);
    }
    int in_sample_rate = avCodecContext->sample_rate;
    LOGD("in_sample_rate = %d", in_sample_rate);

    // Output format
    int64_t out_channel_layout = AV_CH_LAYOUT_STEREO;
    int out_sample_rate = in_sample_rate;
    AVSampleFormat out_sample_fmt = AV_SAMPLE_FMT_S16;

    SwrContext *swrContext = swr_alloc();
    swr_alloc_set_opts(swrContext,
                       out_channel_layout, out_sample_fmt, out_sample_rate,
                       in_channel_layout, in_sample_fmt, in_sample_rate,
                       0, NULL);

    if (swr_init(swrContext) < 0) {
        swr_free(&swrContext);
        return;
    }

    // Output buffer: one second of audio, far more than a single frame needs.
    int out_channels = av_get_channel_layout_nb_channels(out_channel_layout);
    int bytes_per_sample = av_get_bytes_per_sample(out_sample_fmt);
    LOGD("out_channels = %d", out_channels);
    int out_buffer_size = out_channels * out_sample_rate * bytes_per_sample;
    LOGD("out_buffer_size = %d", out_buffer_size);
    uint8_t *out_buffers = static_cast<uint8_t *>(av_malloc(out_buffer_size));

    FILE *out_pcm = fopen(dst_file_path, "wb");

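    AVPacket *avPacket = av_packet_alloc();
    // One frame is reused for every call; avcodec_receive_frame() resets it first.
    AVFrame *avFrame = av_frame_alloc();

    while (out_pcm != NULL && av_read_frame(avFormatContext, avPacket) >= 0) {
        // Skip packets that belong to other streams (video, subtitles, ...).
        if (avPacket->stream_index != audio_stream_index) {
            av_packet_unref(avPacket);
            continue;
        }

        int ret = avcodec_send_packet(avCodecContext, avPacket);
        av_packet_unref(avPacket);  // the decoder holds its own reference now
        if (ret < 0) {
            break;
        }

        // One packet may yield zero, one, or several decoded frames.
        while ((ret = avcodec_receive_frame(avCodecContext, avFrame)) >= 0) {
            // Decoding of this frame is done. Upper bound on the output samples:
            // buffered delay plus this frame, rescaled to the output rate.
            int64_t dst_nb_samples = av_rescale_rnd(
                    swr_get_delay(swrContext, avFrame->sample_rate) + avFrame->nb_samples,
                    out_sample_rate, avFrame->sample_rate, AV_ROUND_UP);

            // Resample. The audio device expects one uniform output format
            // (same sample rate, same channel count, ...).
            // swr_convert() returns the number of samples written per channel.
            int converted = swr_convert(swrContext,
                                        // output
                                        &out_buffers, (int) dst_nb_samples,
                                        // input
                                        (const uint8_t **) (avFrame->data), avFrame->nb_samples);
            if (converted <= 0) {
                continue;
            }

            // Bytes actually produced: 2 channels (stereo) of packed S16 samples.
            int out_bytes = av_samples_get_buffer_size(NULL, 2, converted, out_sample_fmt, 1);
            fwrite(out_buffers, 1, out_bytes, out_pcm);
        }
        if (ret != AVERROR(EAGAIN) && ret != AVERROR_EOF) {
            break;  // a real decoding error
        }
    }

    // Release everything in reverse order of creation.
    av_frame_free(&avFrame);
    av_packet_free(&avPacket);
    if (out_pcm != NULL) {
        fclose(out_pcm);
    }
    av_free(out_buffers);
    swr_free(&swrContext);
    avcodec_free_context(&avCodecContext);
    avformat_close_input(&avFormatContext);

    env->ReleaseStringUTFChars(input, src_url);
    env->ReleaseStringUTFChars(output, dst_file_path);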
}

The Java-side code

public class MainActivity extends AppCompatActivity {

    // Used to load the 'native-lib' library on application startup.
    static {
        System.loadLibrary("native-lib");
    }

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // Example of a call to a native method
        TextView tv = findViewById(R.id.sample_text);
        tv.setText(stringFromJNI());


        File externalCacheDir = getExternalCacheDir();
        Log.d("dsh",externalCacheDir.getAbsolutePath());
        File file_in = new File(externalCacheDir,"test.mp3");
        File file_out = new File(externalCacheDir,"test.pcm");
        audioDecode(file_in.getAbsolutePath(),file_out.getAbsolutePath());
    }

    /**
     * A native method that is implemented by the 'native-lib' native library,
     * which is packaged with this application.
     */
    public native String stringFromJNI();

    public native void audioDecode(String input,String output);
}
