Python + OpenCV (3): Saving Video Keyframes

I. Introduction

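This post is still about the sign-language detection in my graduation project, which involves a lot of images to process!
Today I need to process video. I have only dabbled in it, since video is mainly the research area of my project partner.
Here are my notes from a simple look at video processing.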

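Video recognition is an integral part of machine vision. It requires processing frame data, and OpenCV is well suited to that. Extracting images from a video, drawing key features on a video, segmenting images, and saving images are all fairly important building blocks.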

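Most of the time we process every frame of a video, which is very time-consuming when the video is large. In many situations, though, there is no need to process frames that carry no useful information. So we extract keyframes, that is, the frames that are actually useful for recognition.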

[Figure: a video clip of the sign for "school"]
[Figure: every frame of the video saved as an image]

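Requirements for sign-language recognition:
[Figures taken from "基於神經網絡的中小詞彙量中國手語識別研究" (Research on Chinese Sign Language Recognition with Small-to-Medium Vocabularies Based on Neural Networks), by 李曉旭]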

In fact, we only really need to recognize the keyframes!

II. Saving Every Frame of a Video as an Image

This part is optional.
The key function is cv2.imwrite().


import cv2
import os

# Extract images from an .avi video
def splitFrames(sourceFileName):

    # Append the file extension here
    video_path = os.path.join('video/', sourceFileName + '.avi')
    outPutDirName = 'video/img_' + sourceFileName + '/'

    if not os.path.exists(outPutDirName):
        # Create the output directory if it does not exist
        os.makedirs(outPutDirName)

    cap = cv2.VideoCapture(video_path)  # open the video file
    num = 1
    while True:
        # success: whether the read succeeded; data: image data of the current frame.
        # read() grabs one frame and advances to the next.
        success, data = cap.read()
        if not success:
            break
        # im = Image.fromarray(data, mode='RGB')  # rebuild the image
        # im.save('C:/Users/Taozi/Desktop/2019.04.30/' + str(num) + ".jpg")  # save the current frame as a still image
        cv2.imwrite(outPutDirName + str(num) + ".jpg", data)
        print(num)  # index of the frame just saved

        num = num + 1

        # if num % 20 == 0:
        #     cv2.imwrite('./Video_dataset/figures/' + str(num) + ".jpg", data)

    cap.release()

# Extract images from an .mp4 video
def splitFrames_mp4(sourceFileName):

    # Append the file extension here
    video_path = os.path.join('video/', sourceFileName + '.mp4')
    times = 0
    # Sampling frequency: keep one frame out of every 25
    # frameFrequency = 25
    # Write the images to a subfolder of the video folder under the current directory
    outPutDirName = 'video/video_' + sourceFileName + '/'

    # Create the output directory if it does not exist
    if not os.path.exists(outPutDirName):
        os.makedirs(outPutDirName)

    camera = cv2.VideoCapture(video_path)
    while True:
        times += 1
        res, image = camera.read()
        if not res:
            # print('not res , not image')
            break

        # if times%frameFrequency==0:
        #     cv2.imwrite(outPutDirName + str(times)+'.jpg', image)
        #     print(outPutDirName + str(times)+'.jpg')

        cv2.imwrite(outPutDirName + str(times) + '.jpg', image)
        print(times, end='\t')
    print('\nFinished extracting images')
    camera.release()

if __name__ == '__main__':

    im_file = 'video/'

    # for im_name in im_names:
    for im_name in os.listdir(im_file):
        suffix_file = os.path.splitext(im_name)[-1]
        if suffix_file == '.mp4':
            print('~~~~~~~~~~ Extracting images from .mp4 video ~~~~~~~~~~~~~~~')

            sourceFileName = os.path.splitext(im_name)[0]
            splitFrames_mp4(sourceFileName)

        elif suffix_file == '.avi':
            print('~~~~~~~~~~ Extracting images from .avi video ~~~~~~~~~~~~~~~')

            sourceFileName = os.path.splitext(im_name)[0]
            splitFrames(sourceFileName)
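
The commented-out `frameFrequency` lines in the listing hint at keeping only every Nth frame instead of all of them. Here is a minimal sketch of that idea. `iter_frames`, `every_nth`, and `FakeCapture` are hypothetical names that do not appear in the original code, and the capture object is only assumed to expose a `read()` method returning `(success, frame)`, as `cv2.VideoCapture` does.

```python
def iter_frames(capture):
    """Yield (index, frame) pairs, 1-based, until read() reports failure."""
    index = 0
    while True:
        success, frame = capture.read()
        if not success:
            return
        index += 1
        yield index, frame


def every_nth(capture, step):
    """Yield only the frames whose 1-based index is a multiple of `step`."""
    if step < 1:
        raise ValueError("step must be >= 1")
    for index, frame in iter_frames(capture):
        if index % step == 0:
            yield index, frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture so the sketch runs without a video file."""

    def __init__(self, frames):
        self._frames = iter(frames)

    def read(self):
        frame = next(self._frames, None)
        return (frame is not None), frame


if __name__ == "__main__":
    cap = FakeCapture([f"frame{i}" for i in range(1, 11)])
    print([index for index, _ in every_nth(cap, 5)])
```

With OpenCV, one would presumably pass `cv2.VideoCapture(video_path)` instead of `FakeCapture` and call `cv2.imwrite` on each yielded frame.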

III. Inter-Frame Difference Method

1. Two-Frame Difference Method

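Steps:

  • First, load the video and compute the difference between consecutive frames.
  • Then, pick one of the following three methods to extract the keyframes:
  1. Use the difference order
    The frames with the largest mean inter-frame differences are taken as keyframes.

  2. Use a difference threshold
    Frames whose mean inter-frame difference exceeds the threshold are taken as keyframes.

  3. Use local maxima
    Frames whose mean inter-frame difference is a local maximum are taken as keyframes.

Note that smoothing the mean difference values before computing the local maxima effectively removes noise and avoids repeatedly extracting frames from similar scenes.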
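
The steps above can be sketched on synthetic data as follows. This is an illustration under assumptions, not the code used later in this post: the helper names are hypothetical, the frames are single-channel NumPy arrays, and the threshold rule is applied to the difference value directly, as the description states.

```python
import numpy as np


def mean_abs_diffs(frames):
    """Mean absolute per-pixel difference between consecutive frames."""
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        # Cast to a signed type first so uint8 subtraction cannot wrap around.
        delta = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
        diffs.append(float(delta.mean()))
    return diffs


def top_k(diffs, k):
    """Rule 1: indices of the k largest differences, returned in time order."""
    ranked = sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)
    return sorted(ranked[:k])


def above_threshold(diffs, threshold):
    """Rule 2: indices whose difference exceeds a fixed threshold."""
    return [i for i, d in enumerate(diffs) if d > threshold]


def local_maxima(diffs):
    """Rule 3: indices whose difference is strictly greater than both neighbours."""
    return [i for i in range(1, len(diffs) - 1)
            if diffs[i - 1] < diffs[i] > diffs[i + 1]]


if __name__ == "__main__":
    # Six flat 4x4 frames whose brightness changes over time.
    frames = [np.full((4, 4), v, dtype=np.uint8) for v in (0, 10, 10, 50, 55, 55)]
    diffs = mean_abs_diffs(frames)
    print(diffs)
    print(top_k(diffs, 2), above_threshold(diffs, 4), local_maxima(diffs))
```

Entry `i` of `diffs` describes the transition from frame `i` to frame `i + 1`, so each rule returns transition indices that still have to be mapped back to frame numbers.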

(1) Processing a single video

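Source article: "python實現視頻關鍵幀提取(基於幀間差分)" (Python keyframe extraction based on inter-frame difference)
Origin: the code below comes from zyb_as's GitHub.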

The author uses the third method above, which extracts the frames at local maxima of the frame difference:

# -*- coding: utf-8 -*-
"""
Created on Tue Dec  4 16:48:57 2018
keyframes extract tool
this key frame extract algorithm is based on interframe difference.
The principle is very simple
First, we load the video and compute the interframe difference between consecutive frames
Then, we can choose one of these three methods to extract keyframes, which are 
all based on the difference method:
    
1. use the difference order
    The first few frames with the largest average interframe difference 
    are considered to be key frames.
2. use the difference threshold
    The frames whose average interframe difference is larger than the 
    threshold are considered to be key frames.
3. use local maximum
    The frames whose average interframe difference is a local maximum are 
    considered to be key frames.
    It should be noted that smoothing the average difference value before 
    calculating the local maximum can effectively remove noise to avoid 
    repeated extraction of frames of similar scenes.
After a few experiments, the third method has a better key frame extraction effect.
The original code comes from the link below, I optimized the code to reduce 
unnecessary memory consumption.
https://blog.csdn.net/qq_21997625/article/details/81285096
@author: zyb_as
""" 
import cv2
import operator  # standard operator functions (attrgetter is used for sorting below)
import numpy as np
import matplotlib.pyplot as plt
import os
import sys
from scipy.signal import argrelextrema  # finds relative extrema

 
def smooth(x, window_len=13, window='hanning'):
    """Smooth the data using a window of the requested size.
    
    This method is based on the convolution of a scaled window with the signal.
    The signal is prepared by introducing reflected copies of the signal 
    (with the window size) in both ends so that transient parts are minimized
    in the beginning and end part of the output signal.
    input:
        x: the input signal
        window_len: the dimension of the smoothing window
        window: the type of window from 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'
            flat window will produce a moving average smoothing.
    output:
        the smoothed signal
        
    example:
    import numpy as np    
    t = np.arange(-2, 2, 0.1)
    x = np.sin(t)+np.random.randn(len(t))*0.1
    y = smooth(x)
    
    see also: 
    
    numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve
    scipy.signal.lfilter
 
    TODO: the window parameter could be the window itself if an array is passed instead of a string
    """
    print(len(x), window_len)
    # Input checks, left disabled:
    # if x.ndim != 1:
    #     raise ValueError("smooth only accepts 1 dimension arrays.")
    # if x.size < window_len:
    #     raise ValueError("Input vector needs to be bigger than window size.")
    # if window_len < 3:
    #     return x
    #
    # if not window in ['flat', 'hanning', 'hamming', 'bartlett', 'blackman']:
    #     raise ValueError("Window is one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'")
 
    s = np.r_[2 * x[0] - x[window_len:1:-1],
              x, 2 * x[-1] - x[-1:-window_len:-1]]
    #print(len(s))
 
    if window == 'flat':  # moving average
        w = np.ones(window_len, 'd')
    else:
        w = getattr(np, window)(window_len)
    y = np.convolve(w / w.sum(), s, mode='same')
    return y[window_len - 1:-window_len + 1]
 

class Frame:
    """Holds information about each frame (its index and its difference value)."""
    def __init__(self, id, diff):
        self.id = id
        self.diff = diff
 
    def __lt__(self, other):
        return self.id < other.id
 
    def __gt__(self, other):
        return other.__lt__(self)
 
    def __eq__(self, other):
        return self.id == other.id
 
    def __ne__(self, other):
        return not self.__eq__(other)
 
 
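def rel_change(a, b):
    # Relative change from a to b; returns 0 when both are 0 to avoid dividing by zero
    denom = max(a, b)
    x = (b - a) / denom if denom else 0.0
    print(x)
    return x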
 
def getEffectiveFrame(videopath,dir):
    # Create the output directory if it does not exist
    if not os.path.exists(dir):
        os.makedirs(dir)
    (filepath, tempfilename) = os.path.split(videopath)  # split directory and file name
    (filename, extension) = os.path.splitext(tempfilename)  # split base name and extension
    # Use the fixed-threshold criterion
    USE_THRESH = False
    # Fixed threshold value
    THRESH = 0.6
    # Use the top-order criterion (frames with the largest differences)
    USE_TOP_ORDER = False
    # Use the local-maxima criterion
    USE_LOCAL_MAXIMA = True
    # Number of top sorted frames to keep
    NUM_TOP_FRAMES = 50
    # Smoothing window size
    len_window = int(50)

    print("target video :" + videopath)
    print("frame save directory: " + dir)
    # Load the video and compute the difference between consecutive frames
    cap = cv2.VideoCapture(str(videopath)) 
    curr_frame = None
    prev_frame = None 
    frame_diffs = []
    frames = []
    success, frame = cap.read()
    i = 0 
    while(success):
        luv = cv2.cvtColor(frame, cv2.COLOR_BGR2LUV)
        curr_frame = luv
        if curr_frame is not None and prev_frame is not None:
            #logic here
            diff = cv2.absdiff(curr_frame, prev_frame)  # absolute difference image
            diff_sum = np.sum(diff)
            diff_sum_mean = diff_sum / (diff.shape[0] * diff.shape[1])  # summed difference per pixel
            frame_diffs.append(diff_sum_mean)
            frame = Frame(i, diff_sum_mean)
            frames.append(frame)
        prev_frame = curr_frame
        i = i + 1
        success, frame = cap.read()   
    cap.release()
    
    # compute keyframe
    keyframe_id_set = set()
    if USE_TOP_ORDER:
        # Sort the list in descending order of difference
        frames.sort(key=operator.attrgetter("diff"), reverse=True)
        for keyframe in frames[:NUM_TOP_FRAMES]:
            keyframe_id_set.add(keyframe.id) 
    if USE_THRESH:
        print("Using Threshold")
        for i in range(1, len(frames)):
            if (rel_change(float(frames[i - 1].diff), float(frames[i].diff)) >= THRESH):
                keyframe_id_set.add(frames[i].id)   
    if USE_LOCAL_MAXIMA:
        print("Using Local Maxima")
        diff_array = np.array(frame_diffs)
        sm_diff_array = smooth(diff_array, len_window)  # smooth the difference curve
        frame_indexes = np.asarray(argrelextrema(sm_diff_array, np.greater))[0]  # indices of local maxima
        for i in frame_indexes:
            keyframe_id_set.add(frames[i - 1].id)  # record the frame id at each maximum
            
        plt.figure(figsize=(40, 20))
        plt.locator_params(axis="x", nbins=100)
        # stem draws a discrete plot, whereas plot draws a continuous line
        plt.stem(sm_diff_array,linefmt='-',markerfmt='o',basefmt='--',label='sm_diff_array')
        plt.savefig(dir + filename+'_plot.png')
    
    # Save all keyframes as images
    cap = cv2.VideoCapture(str(videopath))
    success, frame = cap.read()
    idx = 0
    while(success):
        if idx in keyframe_id_set:
            name = filename+'_' + str(idx) + ".jpg"
            cv2.imwrite(dir + name, frame)
            keyframe_id_set.remove(idx)
        idx = idx + 1
        success, frame = cap.read()
    cap.release()

if __name__ == "__main__":
    print(sys.executable)

    # Path of the source video
    videopath= 'video/school.mp4'
    # Directory in which to store the extracted frames
    dir = 'video/extract_result/'
    
    getEffectiveFrame(videopath,dir)

Result:
[Figure: extraction result]
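
To see why smoothing matters before looking for local maxima, here is a small sketch on a hand-made difference curve. It deliberately swaps in simpler tools than the listing above: a flat moving average instead of the Hanning window, and a plain neighbour comparison instead of `scipy.signal.argrelextrema`. The function names and the sample values are assumptions made for illustration.

```python
import numpy as np


def moving_average(values, window):
    """Flat-window moving average; the output is shorter by window - 1 samples."""
    values = np.asarray(values, dtype=float)
    return np.convolve(values, np.ones(window), mode="valid") / window


def strict_local_maxima(values):
    """Indices i where values[i] is strictly greater than both neighbours."""
    values = np.asarray(values, dtype=float)
    interior = (values[1:-1] > values[:-2]) & (values[1:-1] > values[2:])
    return (np.flatnonzero(interior) + 1).tolist()


if __name__ == "__main__":
    # A single broad bump with frame-to-frame jitter on top of it.
    diffs = [0, 2, 1, 4, 3, 6, 5, 7, 4, 3, 1, 2, 0]
    print("raw maxima:", strict_local_maxima(diffs))
    smoothed = moving_average(diffs, 3)
    print("smoothed maxima:", strict_local_maxima(smoothed))
```

On this curve the raw signal has five local maxima, while the smoothed one has a single maximum, which is the behaviour the note about noise removal describes. With `mode="valid"`, index `k` of the smoothed curve is centred on index `k + (window - 1) // 2` of the input.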

(2) Batch-processing videos


# -*- coding: utf-8 -*-

import cv2
import os
import time
import operator  # standard operator functions (attrgetter is used for sorting below)
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import argrelextrema  # finds relative extrema

 
def smooth(x, window_len=13, window='hanning'):
    """Smooth the data using a window of the requested size."""
    print(len(x), window_len)
    
    s = np.r_[2 * x[0] - x[window_len:1:-1],
              x, 2 * x[-1] - x[-1:-window_len:-1]]
    #print(len(s))
 
    if window == 'flat':  # moving average
        w = np.ones(window_len, 'd')
    else:
        w = getattr(np, window)(window_len)
    y = np.convolve(w / w.sum(), s, mode='same')
    return y[window_len - 1:-window_len + 1]
 

class Frame:
    """Holds information about each frame (its index and its difference value)."""
    def __init__(self, id, diff):
        self.id = id
        self.diff = diff
 
    def __lt__(self, other):
        return self.id < other.id
 
    def __gt__(self, other):
        return other.__lt__(self)
 
    def __eq__(self, other):
        return self.id == other.id
 
    def __ne__(self, other):
        return not self.__eq__(other)
 
 
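def rel_change(a, b):
    # Relative change from a to b; returns 0 when both are 0 to avoid dividing by zero
    denom = max(a, b)
    x = (b - a) / denom if denom else 0.0
    print(x)
    return x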
 
def getEffectiveFrame(videopath,dirfile):
    # Create the output directory if it does not exist
    if not os.path.exists(dirfile):
        os.makedirs(dirfile)
    (filepath, tempfilename) = os.path.split(videopath)  # split directory and file name
    (filename, extension) = os.path.splitext(tempfilename)  # split base name and extension
    # Use the fixed-threshold criterion
    USE_THRESH = False
    # Fixed threshold value
    THRESH = 0.6
    # Use the top-order criterion (frames with the largest differences)
    USE_TOP_ORDER = False
    # Use the local-maxima criterion
    USE_LOCAL_MAXIMA = True
    # Number of top sorted frames to keep
    NUM_TOP_FRAMES = 50
    # Smoothing window size
    len_window = int(50)

    print("target video :" + videopath)
    print("frame save directory: " + dirfile)
    # Load the video and compute the difference between consecutive frames
    cap = cv2.VideoCapture(str(videopath)) 
    curr_frame = None
    prev_frame = None 
    frame_diffs = []
    frames = []
    success, frame = cap.read()
    i = 0 
    while(success):
        luv = cv2.cvtColor(frame, cv2.COLOR_BGR2LUV)
        curr_frame = luv
        if curr_frame is not None and prev_frame is not None:
            #logic here
            diff = cv2.absdiff(curr_frame, prev_frame)  # absolute difference image
            diff_sum = np.sum(diff)
            diff_sum_mean = diff_sum / (diff.shape[0] * diff.shape[1])  # summed difference per pixel
            frame_diffs.append(diff_sum_mean)
            frame = Frame(i, diff_sum_mean)
            frames.append(frame)
        prev_frame = curr_frame
        i = i + 1
        success, frame = cap.read()   
    cap.release()
    
    # compute keyframe
    keyframe_id_set = set()
    if USE_TOP_ORDER:
        # Sort the list in descending order of difference
        frames.sort(key=operator.attrgetter("diff"), reverse=True)
        for keyframe in frames[:NUM_TOP_FRAMES]:
            keyframe_id_set.add(keyframe.id) 
    if USE_THRESH:
        print("Using Threshold")
        for i in range(1, len(frames)):
            if (rel_change(float(frames[i - 1].diff), float(frames[i].diff)) >= THRESH):
                keyframe_id_set.add(frames[i].id)   
    if USE_LOCAL_MAXIMA:
        print("Using Local Maxima")
        diff_array = np.array(frame_diffs)
        sm_diff_array = smooth(diff_array, len_window)  # smooth the difference curve
        frame_indexes = np.asarray(argrelextrema(sm_diff_array, np.greater))[0]  # indices of local maxima
        for i in frame_indexes:
            keyframe_id_set.add(frames[i - 1].id)  # record the frame id at each maximum
            
        plt.figure(figsize=(40, 20))
        plt.locator_params(axis="x", nbins=100)
        # stem draws a discrete plot, whereas plot draws a continuous line
        plt.stem(sm_diff_array,linefmt='-',markerfmt='o',basefmt='--',label='sm_diff_array')
        plt.savefig(dirfile + filename+'_plot.png')
    
    # Save all keyframes as images
    cap = cv2.VideoCapture(str(videopath))
    success, frame = cap.read()
    idx = 0
    while(success):
        if idx in keyframe_id_set:
            name = filename+'_' + str(idx) + ".jpg"
            cv2.imwrite(dirfile + name, frame)
            keyframe_id_set.remove(idx)
        idx = idx + 1
        success, frame = cap.read()
    cap.release()

if __name__ == "__main__":
    print("[INFO]Effective Frame.")
    start = time.time()
    videos_path= 'dataset/vedio/onehand/'
    outfile = 'dataset/vedio/extract_result/'  # directory for the extracted frames
    video_files = [os.path.join(videos_path, video_file) for video_file in os.listdir(videos_path)]
    for video_file in video_files:
        getEffectiveFrame(video_file,outfile)
    print("[INFO]Extract Result time: ", time.time() - start)
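
The batch script hands every entry returned by `os.listdir` to the extractor, so a stray non-video file in the folder would likely yield no frames and break the smoothing step. One possible guard is sketched below; `list_videos` and `VIDEO_EXTENSIONS` are hypothetical names, and the extension set is an assumption based on the two formats handled earlier in this post.

```python
import os
import tempfile

VIDEO_EXTENSIONS = {".mp4", ".avi"}  # assumed set; extend as needed


def list_videos(folder, extensions=VIDEO_EXTENSIONS):
    """Return sorted paths of regular files in `folder` with a video extension."""
    paths = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        suffix = os.path.splitext(name)[1].lower()
        if os.path.isfile(path) and suffix in extensions:
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Build a throwaway folder so the sketch runs without real videos.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b.AVI", "a.mp4", "notes.txt"):
            open(os.path.join(tmp, name), "w").close()
        print([os.path.basename(p) for p in list_videos(tmp)])
```

In the batch script, `video_files = list_videos(videos_path)` could then stand in for the list comprehension over `os.listdir`.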
    

(3) Further notes

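The code above uses the third method. It works only moderately well for sign-language recognition and has the following drawbacks:

  1. Few keyframes are extracted, so the characteristic gestures are not well represented.
    (Possible fix: also select points other than the frame-difference peaks.)

  2. The inter-frame difference is easily skewed by extreme frames.
    Inspecting every frame in a video tool shows that the first three frames of this video are black, which makes the mean frame difference too large, so no keyframes are extracted.
    [Figure: frame-difference curve, distorted]
    (The selected videos need preprocessing: extreme frames should be removed.)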
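
One way to remove such extreme frames before computing differences is to skip frames whose mean brightness is close to zero. The sketch below is only an assumption about how that preprocessing could look: the function names and the brightness threshold of 10 are made up for illustration and would need tuning on real footage.

```python
import numpy as np


def is_near_black(frame, brightness_threshold=10.0):
    """True when the mean pixel value falls below the (assumed) threshold."""
    return float(np.mean(frame)) < brightness_threshold


def drop_black_frames(frames, brightness_threshold=10.0):
    """Return (original_index, frame) pairs for frames that are not near-black."""
    return [(i, f) for i, f in enumerate(frames)
            if not is_near_black(f, brightness_threshold)]


if __name__ == "__main__":
    # Two dark frames at the start and one in the middle of brighter content.
    values = (0, 3, 120, 125, 0, 130)
    frames = [np.full((4, 4, 3), v, dtype=np.uint8) for v in values]
    print([i for i, _ in drop_black_frames(frames)])
```

Keeping the original indices makes it possible to save the selected keyframes under their real frame numbers afterwards.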

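Whether to use one of the other two methods instead:

  1. Use the difference order
    The frames with the largest mean inter-frame differences are taken as keyframes.

  2. Use a difference threshold
    Frames whose mean inter-frame difference exceeds the threshold are taken as keyframes.

This remains to be verified.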

2. Three-Frame Difference Method

[Figure: two-frame difference method]

[Figure: three-frame difference method]
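
Since the figures are not reproduced here, the following sketch shows the commonly used formulation of the two methods, which may differ in detail from the original diagrams: the two-frame method thresholds the absolute difference of consecutive frames, and the three-frame method keeps only the pixels that changed in both consecutive pairs. Function names and the threshold value are assumptions.

```python
import numpy as np


def abs_diff(a, b):
    """Absolute difference of two uint8 frames without wrap-around."""
    return np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)


def two_frame_mask(prev, curr, threshold):
    """Pixels whose value changed by more than `threshold` between two frames."""
    return abs_diff(curr, prev) > threshold


def three_frame_mask(prev, curr, nxt, threshold):
    """Pixels that changed both from prev to curr and from curr to nxt."""
    return (two_frame_mask(prev, curr, threshold)
            & two_frame_mask(curr, nxt, threshold))


if __name__ == "__main__":
    def dot(pos):
        # A 1x5 frame with a single bright pixel at column `pos`.
        frame = np.zeros((1, 5), dtype=np.uint8)
        frame[0, pos] = 255
        return frame

    print(two_frame_mask(dot(0), dot(1), 30).astype(int))
    print(three_frame_mask(dot(0), dot(1), dot(2), 30).astype(int))
```

For a bright pixel moving one column per frame, the two-frame mask marks both the old and the new position, while the three-frame mask marks only the position in the middle frame.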
[Figure: binocular (stereo) camera]
This is getting a bit hard, so I will wait for my project partner's results.
