Java | LIUM.jar Analysis

Original

_djinn

2019-07-31 03:19

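LIUM SpkDiarization Analysis

Introduction

LIUM_SpkDiarization is software dedicated to speaker diarization, that is, speaker segmentation and clustering.

Clustering is based on the CLR/NCLR metrics.
It includes MFCC computation.
It implements speech/silence detection and speaker diarization.
The toolkit works best on radio and TV broadcasts.

Quick start

Example

Running the jar file directly invokes the diarization method developed for broadcast news recordings. Suppose we want to compute the diarization ./showName.seg of the audio file ./showName.wav. The command line is: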

java -Xmx2048m -jar ./LIUM_SpkDiarization.jar --fInputMask=./showName.wav --sOutputMask=./showName.seg --doCEClustering showName

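The parameters are explained below:

java starts the JVM (Java Virtual Machine).
-Xmx2048m (optional) sets the maximum JVM heap size to 2048 MB, which is enough to process a one-hour TV show.
-jar ./LIUM_SpkDiarization.jar names the jar file to run.
--fInputMask=./showName.wav (optional) names the audio file to process. The toolkit expects 16 kHz / 16-bit PCM mono Wave audio (the file format is detected automatically from the file extension).
--sOutputMask=./showName.seg (optional) names the output file that will contain the segmentation.
--doCEClustering makes the program compute the NCLR/CE clustering as a final step, which minimizes the diarization error rate. Without it, the program stops after gender detection, and the resulting segmentation is sufficient for a transcription system.
showName, the last argument, is the name of the show.

The result (.seg file) is: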
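When many shows have to be processed, it can help to assemble this command line programmatically. The following is a minimal sketch that only builds the argument list from the flags described above and does not run anything. The function name `build_diarization_command` and its parameters are hypothetical, not part of LIUM.

```python
import shlex


def build_diarization_command(show, jar="./LIUM_SpkDiarization.jar",
                              heap_mb=2048, ce_clustering=True):
    """Return the argument list for one diarization run (nothing is executed)."""
    cmd = [
        "java", f"-Xmx{heap_mb}m",          # maximum JVM heap size
        "-jar", jar,
        f"--fInputMask=./{show}.wav",       # audio file to process
        f"--sOutputMask=./{show}.seg",      # segmentation file to write
    ]
    if ce_clustering:
        cmd.append("--doCEClustering")      # final NCLR/CE clustering step
    cmd.append(show)                        # the show name is the last argument
    return cmd


command = build_diarization_command("showName")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing the list to `subprocess.run` rather than a single string avoids shell quoting problems if a show name ever contains spaces.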

;; cluster S0 [ score:FS = -32.91888060194539 ] [ score:FT = -33.244850010608765 ] [ score:MS = -32.984933267352574 ] [ score:MT = -33.36216870764537 ] 
test21 1 0 281 F S U S0
test21 1 665 514 F S U S0
test21 1 1351 823 F S U S0
test21 1 2174 1939 F S U S0
test21 1 4113 265 F S U S0

Comparison with the manually verified reference:

0~2.31s M
(2.31s~6.65s F)
6.65s~11.79s M
13.51s~21.74s M&F
21.74s~41.13s M&F
41.13s~43.78s M

The early segmentation is roughly correct, but the later segmentation is very inaccurate.
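The start and length columns of the .seg output line up with the reference times if they are read as counts of 10 ms feature frames (for example, 665 corresponds to 6.65 s). The sketch below converts the result above into second-based intervals so the two listings can be compared side by side. The frame rate of 100 frames per second is an assumption inferred from that correspondence, and `segment_intervals` is a hypothetical helper.

```python
FRAMES_PER_SECOND = 100  # assumption: one feature frame every 10 ms

SEG_TEXT = """\
;; cluster S0 [ score:FS = -32.91888060194539 ] [ score:FT = -33.244850010608765 ] [ score:MS = -32.984933267352574 ] [ score:MT = -33.36216870764537 ]
test21 1 0 281 F S U S0
test21 1 665 514 F S U S0
test21 1 1351 823 F S U S0
test21 1 2174 1939 F S U S0
test21 1 4113 265 F S U S0
"""


def segment_intervals(seg_text, frames_per_second=FRAMES_PER_SECOND):
    """Yield (start_s, end_s, gender, speaker) for each segment line."""
    for line in seg_text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;"):  # skip blanks and cluster headers
            continue
        show, channel, start, length, gender, band, env, speaker = line.split()
        start_s = int(start) / frames_per_second
        end_s = (int(start) + int(length)) / frames_per_second
        yield start_s, end_s, gender, speaker


intervals = list(segment_intervals(SEG_TEXT))
for start_s, end_s, gender, speaker in intervals:
    print(f"{start_s:6.2f}s ~ {end_s:6.2f}s  {gender}  {speaker}")
```

Read this way, the boundaries at 6.65 s, 11.79 s, 13.51 s, 21.74 s, 41.13 s and 43.78 s appear in both listings, while every segment is labelled F and assigned to the single speaker S0 even though the reference contains male and mixed segments.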

Other possible options

Usage

fMask and fDesc are missing

#!/bin/bash
show="ubm"
# Input segmentation file; %s will be substituted with $show
seg="./%s.seg"
# Location of the MFCC file; %s will be substituted with the show name of the segment
fMask="./mfcc/%s.mfcc"
# The MFCC vector description; here it corresponds to 12 MFCC + energy
# spro4: the MFCC was computed by the SPro4 tools
# 1:1:0:0:0:0: 1 = present, 0 = not present
# order: static, E, delta, delta E, delta delta, delta delta E
# 13: total size of a feature vector in the MFCC file
# 1:0:0:1: CMS by cluster
fDesc="spro4,1:1:0:0:0:0,13,1:0:0:1"
# The GMM used to initialize EM; %s will be substituted with $show
gmmInit="./%s.init.gmms"
# The output GMM; %s will be substituted with $show
gmm="./%s.gmms"
# Initialize the UBM, i.e. a GMM with 8 diagonal Gaussian components
java -Xmx1024m -cp ./LIUM_SpkDiarization.jar \
  fr.lium.spkDiarization.programs.MTrainInit --sInputMask="$seg" \
  --fInputMask="$fMask" --fInputDesc="$fDesc" --kind=DIAG --nbComp=8 \
  --emInitMethod=split_all --emCtrl=1,5,0.05 --tOutputMask="$gmmInit" "$show"
# Train the UBM via EM
java -Xmx1024m -cp ./LIUM_SpkDiarization.jar \
  fr.lium.spkDiarization.programs.MTrainEM --sInputMask="$seg" \
  --fInputMask="$fMask" --emCtrl=1,20,0.01 --fInputDesc="$fDesc" \
  --tInputMask="$gmmInit" --tOutputMask="$gmm" "$show"
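The comments in the script describe fDesc as four comma-separated fields and the masks as templates containing %s. As a rough illustration, the sketch below splits the fDesc string into named parts and expands a mask. The names `FeatureDesc`, `parse_fdesc` and `expand_mask` are hypothetical, the meaning of each field is taken only from the script comments, and only the flag values 0 and 1 mentioned there are accepted.

```python
from dataclasses import dataclass

# Order of the six presence flags, as listed in the script comments.
FLAG_NAMES = ("static", "energy", "delta", "delta_energy",
              "delta_delta", "delta_delta_energy")


@dataclass
class FeatureDesc:
    source: str          # e.g. "spro4": MFCC computed by the SPro4 tools
    flags: dict          # flag name -> True if that component is present
    dimension: int       # total size of one feature vector
    normalization: str   # e.g. "1:0:0:1" (CMS by cluster), kept verbatim


def parse_fdesc(text):
    """Split an fDesc string into its four comma-separated fields."""
    source, flags, dimension, normalization = text.split(",")
    bits = flags.split(":")
    if len(bits) != len(FLAG_NAMES) or not set(bits) <= {"0", "1"}:
        raise ValueError(f"unsupported flag field: {flags!r}")
    present = {name: bit == "1" for name, bit in zip(FLAG_NAMES, bits)}
    return FeatureDesc(source, present, int(dimension), normalization)


def expand_mask(mask, show):
    """Replace %s in a mask such as './mfcc/%s.mfcc' with the show name."""
    return mask.replace("%s", show)


desc = parse_fdesc("spro4,1:1:0:0:0:0,13,1:0:0:1")
print(desc.source, desc.dimension, [n for n, on in desc.flags.items() if on])
print(expand_mask("./mfcc/%s.mfcc", "ubm"))
```

For the script's settings, only the static and energy components are marked present, which agrees with the comment that the 13-dimensional vector is 12 MFCC plus energy.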

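Data

LIUM_SpkDiarization works with three main kinds of data: the diarization, the segment format, and acoustic features.

Diarization

The most important file in the LIUM toolkit is the diarization file. Every program is driven by a segmentation file, and most programs produce one.

Segment format

The diarization file format is similar to the NIST MDTM and STM formats. Each line describes one segment.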

test 1 0 1907 F S U S0

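test is the name of the show
1 is the channel number
0 is the start of the segment
1907 is the length of the segment
F is the speaker gender (F = female, M = male, U = unknown)
S is the band type (T = telephone, S = studio)
U is the environment type (M = music, S = speech, U = unknown)
S0 is the speaker label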
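The eight whitespace-separated fields map naturally onto a small record type. The following sketch parses one segment line and spells out the single-letter codes. The `Segment` class and `parse_segment` function are hypothetical names, and the band code T is read here as telephone.

```python
from dataclasses import dataclass

GENDERS = {"F": "female", "M": "male", "U": "unknown"}
BANDS = {"T": "telephone", "S": "studio"}
ENVIRONMENTS = {"M": "music", "S": "speech", "U": "unknown"}


@dataclass(frozen=True)
class Segment:
    show: str
    channel: int
    start: int        # start of the segment, as written in the file
    length: int       # length of the segment, in the same unit as start
    gender: str
    band: str
    environment: str
    speaker: str

    @property
    def end(self):
        return self.start + self.length

    def describe(self):
        """Return a readable one-line summary with the codes spelled out."""
        return (f"{self.show}[{self.channel}] {self.start}-{self.end}: "
                f"{GENDERS.get(self.gender, self.gender)}, "
                f"{BANDS.get(self.band, self.band)}, "
                f"{ENVIRONMENTS.get(self.environment, self.environment)}, "
                f"speaker {self.speaker}")


def parse_segment(line):
    """Parse one line of a diarization file into a Segment."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    show, channel, start, length, gender, band, environment, speaker = fields
    return Segment(show, int(channel), int(start), int(length),
                   gender, band, environment, speaker)


segment = parse_segment("test 1 0 1907 F S U S0")
print(segment.describe())
```

Unknown codes are passed through unchanged rather than rejected, so the parser does not fail on values this list does not cover.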
