Using FFmpeg to Save Every Frame of a Media File's Video Stream as PPM Images

Reposted from: http://blog.chinaunix.net/uid-20846214-id-4193590.html

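Note: This article is based on http://dranger.com/ffmpeg/tutorial01.html, but that tutorial is fairly dated. The code here targets a newer FFmpeg release in which many APIs differ from the old ones, so keep that in mind.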

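In the simplest case, handling video and audio is quite straightforward:
1: Open video_stream from video.avi
2: Read a packet from video_stream and decode it into a frame
3: If the frame is incomplete, go back to step 2
4: Do something with the frame
5: Go back to step 2 and repeat
A packet is a chunk of data that gets decoded into raw frames. For our purposes, each packet contains complete frames.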
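
As a rough illustration of that loop, here is a toy sketch in plain C. It does not use FFmpeg at all: ToyFrame, toy_decode and toy_process_stream are made-up names, a "packet" is a single byte, and a frame is treated as complete after four packets. It only shows the control flow of steps 2 to 5.

```c
#include <stddef.h>

/* Hypothetical stand-ins, not FFmpeg types: a "packet" is one byte and
   a "frame" is complete once it has collected FRAME_SIZE bytes. */
enum { FRAME_SIZE = 4 };

typedef struct {
    unsigned char bytes[FRAME_SIZE];
    size_t filled;
} ToyFrame;

/* Feeds one packet into the frame; returns 1 when the frame is complete. */
int toy_decode(ToyFrame *frame, unsigned char packet)
{
    frame->bytes[frame->filled++] = packet;
    return frame->filled == FRAME_SIZE;
}

/* Walks the stream and returns how many complete frames were handled. */
int toy_process_stream(const unsigned char *stream, size_t len)
{
    ToyFrame frame = { {0}, 0 };
    int frames_done = 0;
    size_t i;

    for (i = 0; i < len; i++) {            /* step 2: read a packet */
        if (!toy_decode(&frame, stream[i]))
            continue;                      /* step 3: frame incomplete */
        frames_done++;                     /* step 4: process the frame */
        frame.filled = 0;                  /* step 5: start the next frame */
    }
    return frames_done;
}
```

With a 10-byte stream this handles two frames and leaves the last two bytes as an unfinished frame, which is the situation step 3 describes.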

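In this chapter we open a media file, read its video stream, and write the frames out as PPM files. PPM (Portable Pixmap) is a very simple image format from the Netpbm family that is widely supported on Linux. A file holds only a format identifier, the image width and height, the maximum color value, and the pixel data.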

Part 1: Opening the file
To use the FFmpeg libraries, include their header files:

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
...
...
int main(int argc, char **argv)
{
    ...
    av_register_all();
    ...
}

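av_register_all registers every file format and codec that the FFmpeg libraries support, and it only needs to be called once. When a file is opened, FFmpeg then picks the matching format and codec automatically. (Since FFmpeg 4.0 this call is deprecated and no longer needed.)

Next, open the media file: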

AVFormatContext *pFormatCtx = NULL; // must start as NULL so avformat_open_input allocates it
// Open video file
if (avformat_open_input(&pFormatCtx, argv[1], NULL, NULL) != 0)
    return -1; // Couldn't open file

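Here avformat_open_input opens the media file, whose name comes in through main's argv.
It reads the file header and stores that information in the AVFormatContext structure. The last two arguments specify the input format and a dictionary of format options; when they are NULL, libavformat detects the format automatically.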

Next we need to check out the file's stream information:

// Retrieve stream information
if (avformat_find_stream_info(pFormatCtx, NULL) < 0)
    return -1; // Couldn't find stream information

Then call the debugging helper av_dump_format, which prints some information about the media file to standard error:

// Dump information about file onto standard error
av_dump_format(pFormatCtx, 0, argv[1], 0);

At this point the file's streams are available in an array: pFormatCtx->streams points to the array, and pFormatCtx->nb_streams is its size. Now find the video stream in that array:

int i;
int videoStream;
AVCodecContext *pCodecCtx;
// Find the first video stream
videoStream = -1;
for (i = 0; i < pFormatCtx->nb_streams; i++) {
    if (pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
        videoStream = i;
        break;
    }
}
if (videoStream == -1)
    return -1; // Didn't find a video stream
// Get a pointer to the codec context for the video stream
pCodecCtx = pFormatCtx->streams[videoStream]->codec;

Now pCodecCtx points to all the information about the codec that this stream uses. Next, use pCodecCtx to find the actual decoder and open it:

AVCodec *pCodec;
// Find the decoder for the video stream
pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
if (pCodec == NULL) {
    fprintf(stderr, "Unsupported codec!\n");
    return -1; // Codec not found
}
// Open codec
if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0)
    return -1; // Could not open codec

Part 2: Storing the data
Now we need somewhere to hold the raw frames decoded from the media file:

AVFrame *pFrame;
// Allocate video frame
pFrame = avcodec_alloc_frame();

We also need somewhere to hold the RGB frame that pFrame will be converted into:

AVFrame *pFrameRGB;
// Allocate an AVFrame structure
pFrameRGB = avcodec_alloc_frame();
if (pFrameRGB == NULL)
    return -1;

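Each RGB frame will then be written to a PPM file, which stores pixels as 24-bit RGB. So the task now is to convert the frame from its native format to RGB. FFmpeg does this for us; in most projects you will want to convert frames to some specific format anyway.
But first we still need a memory buffer to hold the converted RGB data. avpicture_get_size tells us how many bytes it needs: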

uint8_t *buffer;
int numBytes;
// Determine required buffer size and allocate buffer
numBytes = avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                              pCodecCtx->height);
buffer = (uint8_t *)av_malloc(numBytes * sizeof(uint8_t));

av_malloc is just a wrapper around malloc that also takes care of memory alignment. As with malloc, avoiding leaks, overruns, and double frees remains our own responsibility.
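
To make the "malloc plus alignment" idea concrete, here is a small sketch in plain C. It is not how av_malloc is actually implemented; toy_aligned_malloc and toy_aligned_free are hypothetical names, and the sketch assumes the alignment is a power of two. It over-allocates with malloc, rounds the address up, and stores the original pointer just in front of the aligned block so it can be freed later.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns `size` bytes aligned to `align`
   (assumed to be a power of two), or NULL on failure. */
void *toy_aligned_malloc(size_t size, size_t align)
{
    /* Extra room for the rounding and for one saved pointer. */
    void *raw = malloc(size + align - 1 + sizeof(void *));
    uintptr_t start, aligned;

    if (raw == NULL)
        return NULL;
    start = (uintptr_t)raw + sizeof(void *);
    aligned = (start + align - 1) & ~(uintptr_t)(align - 1);
    ((void **)aligned)[-1] = raw;   /* remember the real block for free() */
    return (void *)aligned;
}

/* Releases a block obtained from toy_aligned_malloc. */
void toy_aligned_free(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

Just as av_malloc pairs with av_free, a block from this sketch has to go back through toy_aligned_free and not through plain free.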
Next, associate pFrameRGB with the buffer. AVPicture is a subset of AVFrame: the beginning of the AVFrame struct is identical to the AVPicture struct:

// Assign appropriate parts of buffer to image planes in pFrameRGB
// Note that pFrameRGB is an AVFrame, but AVFrame is a superset
// of AVPicture
avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
               pCodecCtx->width, pCodecCtx->height);
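
For intuition, the arithmetic behind the buffer size and the fill step can be sketched in plain C. This assumes packed RGB24 with no row padding, which as far as I understand is what avpicture_get_size and avpicture_fill produce for this format; ToyPicture and the three functions are hypothetical names, not FFmpeg API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mini version of AVPicture: a single plane and its row stride. */
typedef struct {
    uint8_t *data;
    int linesize;
} ToyPicture;

/* Bytes needed for a packed RGB24 image: 3 bytes per pixel, no padding. */
size_t rgb24_buffer_size(int width, int height)
{
    return (size_t)width * (size_t)height * 3;
}

/* Points the picture at the buffer and records the bytes per row. */
void toy_picture_fill(ToyPicture *pic, uint8_t *buffer, int width)
{
    pic->data = buffer;
    pic->linesize = width * 3;
}

/* Address of the R component of pixel (x, y). */
uint8_t *toy_pixel(const ToyPicture *pic, int x, int y)
{
    return pic->data + (size_t)y * (size_t)pic->linesize + (size_t)x * 3;
}
```

For a 640x480 video this gives 640 * 480 * 3 = 921600 bytes and a row stride of 1920 bytes.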

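Everything is now ready, so we can start reading the stream.
Part 3: Reading the data
av_read_frame reads the next packet of the file into packet. If the packet belongs to the video stream, the decoder pCodecCtx decodes packet.data into a raw frame stored in pFrame, and frameFinished reports whether a complete frame came out. Once a frame is complete, sws_scale converts the native-format pFrame into the RGB frame pFrameRGB, and SaveFrame writes that RGB frame to a PPM image. (We only save the first 15 frames of the video stream here; adjust this as needed.) At the end of each iteration, av_free_packet releases the data that av_read_frame put into the packet, and the loop goes on reading, decoding, converting, and saving until the whole file has been read.
Note that you have to set up the pSwsCtx conversion context yourself. It is fairly flexible and lets you configure how the output image is produced, for example flipping the picture.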

int frameFinished;
AVPacket packet;
// Conversion context: native pixel format to RGB24 at the same size
struct SwsContext *pSwsCtx = sws_getContext(
    pCodecCtx->width, pCodecCtx->height, pCodecCtx->pix_fmt,
    pCodecCtx->width, pCodecCtx->height, PIX_FMT_RGB24,
    SWS_BILINEAR, NULL, NULL, NULL);
if (pSwsCtx == NULL)
    return -1; // Couldn't create the conversion context
i = 0;
while (av_read_frame(pFormatCtx, &packet) >= 0) {
    // Is this a packet from the video stream?
    if (packet.stream_index == videoStream) {
        // Decode video frame
        avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
        // Did we get a video frame?
        if (frameFinished) {
            // Convert the image from its native format to RGB
            sws_scale(pSwsCtx, (const uint8_t * const *)pFrame->data,
                      pFrame->linesize, 0, pCodecCtx->height,
                      pFrameRGB->data, pFrameRGB->linesize);
            // Save the frame to disk
            if (++i <= 15)
                SaveFrame(pFrameRGB, pCodecCtx->width,
                          pCodecCtx->height, i);
        }
    }
    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
}
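
On flipping the picture: one trick that is often used (an assumption on my part, the code above does not do it) is to point the source at the last row and negate the line size, so that walking "down" the rows actually moves up through memory. The plain C sketch below shows only that pointer arithmetic; copy_rows and flip_vertical are hypothetical names and no FFmpeg calls are involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies `height` rows of `row_bytes` bytes each.
   Either stride may be negative, meaning rows are walked bottom-up. */
void copy_rows(uint8_t *dst, int dst_stride,
               const uint8_t *src, int src_stride,
               int row_bytes, int height)
{
    int y;
    for (y = 0; y < height; y++)
        memcpy(dst + (ptrdiff_t)y * dst_stride,
               src + (ptrdiff_t)y * src_stride,
               (size_t)row_bytes);
}

/* Writes a vertically flipped copy of src into dst (same stride for both). */
void flip_vertical(uint8_t *dst, const uint8_t *src,
                   int stride, int height, int row_bytes)
{
    /* Start at the last source row and step backwards one row at a time. */
    const uint8_t *last_row = src + (ptrdiff_t)(height - 1) * stride;
    copy_rows(dst, stride, last_row, -stride, row_bytes, height);
}
```

The same idea applied to an image would use the plane pointer plus (height - 1) * linesize as the start address and -linesize as the stride.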

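SaveFrame is shown below. It creates a new PPM file, writes the header first, and then writes the pixel data.
A PPM file starts with a header of three ASCII lines:
    Line 1 is the magic number, P3 (ASCII pixel data) or P6 (binary pixel data); we write P6.
    Line 2 is the image size: the width in pixels, a space, then the height in pixels; we write %d %d.
    Line 3 is the maximum value of a color component, a text integer between 1 and 65535; we write 255, so each component takes 8 bits and each pixel 24 bits.
After the header comes the raw pixel data, left to right and top to bottom. In the code, pFrame->data[0] is the start of the pixel data, y is the row currently being written, and pFrame->linesize[0] is the number of bytes per row in memory, so pFrame->data[0]+y*pFrame->linesize[0] is the address of the start of row y. width is the number of pixels per row, so width*3 is the number of bytes (one per color component) to write for each row.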

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
    FILE *pFile;
    char szFilename[32];
    int y;
    // Open file
    sprintf(szFilename, "frame%d.ppm", iFrame);
    pFile = fopen(szFilename, "wb");
    if (pFile == NULL)
        return;
    // Write header
    fprintf(pFile, "P6\n%d %d\n255\n", width, height);
    // Write pixel data
    for (y = 0; y < height; y++)
        fwrite(pFrame->data[0] + y * pFrame->linesize[0], 1, width * 3, pFile);
    // Close file
    fclose(pFile);
}
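
To try the PPM layout without FFmpeg, here is a standalone sketch of the same writing logic in plain C. write_ppm_rgb24 is a hypothetical helper, not part of any library; it takes a raw RGB24 buffer plus an explicit stride, which plays the role of linesize[0] and may be larger than width * 3 when rows are padded.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes an RGB24 buffer as a binary (P6) PPM file.
   `stride` is the number of bytes per row in memory and may exceed
   width * 3. Returns 0 on success, -1 on failure. */
int write_ppm_rgb24(const char *path, const uint8_t *pixels,
                    int width, int height, int stride)
{
    FILE *f = fopen(path, "wb");
    size_t row_bytes = (size_t)width * 3;
    int y;

    if (f == NULL)
        return -1;
    /* Header: magic number, width and height, maximum component value. */
    fprintf(f, "P6\n%d %d\n255\n", width, height);
    /* Pixel data: only the visible part of each row, padding is skipped. */
    for (y = 0; y < height; y++) {
        if (fwrite(pixels + (size_t)y * (size_t)stride, 1, row_bytes, f)
                != row_bytes) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

For a 2x2 image the file is 23 bytes long: the 11-byte header "P6\n2 2\n255\n" followed by 12 bytes of pixel data, no matter how much padding each row has in memory.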

Finally, some cleanup: free the memory and, if this code lives in main, remember to return at the end:

// Free the RGB image
av_free(buffer);
av_free(pFrameRGB);
// Free the YUV frame
av_free(pFrame);
// Free the conversion context
sws_freeContext(pSwsCtx);
// Close the codec
avcodec_close(pCodecCtx);
// Close the video file
avformat_close_input(&pFormatCtx);
return 0;

The av_free calls here pair with the av_malloc call above.

That's it! Compile and run the program, and you will find 15 PPM images in the current directory.
