Use DTW to Compare Recordings
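Import, trim, and preprocess four recordings of the first sentence of Alice in Wonderland. Each recording is resampled to 11025 Hz, trimmed, mixed down to a single channel, and normalized. The lists urls and times are defined in a collapsed input that is not shown.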
In[2]:=
alice = ConformAudio[
  MapThread[
    AudioNormalize[
      AudioChannelMix[AudioTrim[AudioResample[Import[#1], 11025], #2], 1]] &,
    {urls, times}]]
Plot the waveforms of the four recordings.
In[3]:=
AudioPlot[alice, ImageSize -> Medium]
Compute and plot the MFCC features for the samples.
In[4]:=
mfcc = AudioLocalMeasurements[#, "MFCC",
      PartitionGranularity -> {0.05, 0.01}]["Values"] & /@ alice;
In[5]:=
Column[
  MatrixPlot[#, PlotTheme -> "Minimal", ImageSize -> Medium] & /@
    Transpose /@ mfcc]
Compute the dynamic time warping distance between the recordings using WarpingDistance.
In[6]:=
DistanceMatrix[mfcc, DistanceFunction -> WarpingDistance] // MatrixPlot
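Entry (i, j) of the distance matrix corresponds to WarpingDistance[mfcc[[i]], mfcc[[j]]], so the diagonal is zero. To get a feel for what that distance measures, here is a minimal sketch on toy sequences. The lists a, b, and c are made up for illustration and are not part of the original example. A sequence and a time-stretched copy of it should be at zero warping distance, while a sequence with different values should not.

```wolfram
(* Hypothetical toy data, unrelated to the recordings above *)
a = {1., 2., 3., 4.};
b = {1., 2., 2., 2., 3., 4.};  (* same values as a, with 2 held for longer *)
c = {1., 3., 5., 7.};          (* same length as a, different values *)

WarpingDistance[a, b]  (* expected to be zero: b is a stretched copy of a *)
WarpingDistance[a, c]  (* expected to be positive *)
```

This is why the measure suits recordings of the same sentence: speakers who say the same words at different speeds can still end up close to each other.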
Compute the dynamic time warping correspondence between two of the recordings using WarpingCorrespondence.
In[7]:=
{n, m} = WarpingCorrespondence[mfcc[[1]], mfcc[[2]]];
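One possible way to inspect the correspondence is sketched below. It is not necessarily what the original example does in its collapsed input. It assumes mfcc, n, and m from the inputs above. Since n and m are index lists of equal length, plotting one against the other shows the warping path, and indexing each MFCC sequence with them gives frame-aligned features. The choice of the first coefficient is arbitrary.

```wolfram
(* Warping path: frame index in recording 1 against frame index in recording 2 *)
ListLinePlot[Transpose[{n, m}],
  AxesLabel -> {"frame (recording 1)", "frame (recording 2)"}]

(* First MFCC coefficient of both recordings after alignment *)
ListLinePlot[{mfcc[[1, n, 1]], mfcc[[2, m, 1]]},
  PlotLegends -> {"recording 1", "recording 2"}]
```

A path close to the diagonal suggests the two speakers kept a similar pace. Flat or steep stretches mark places where one recording lingers relative to the other.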