Machine-Learning- HW2: Phoneme Classification Solved


Task: Multiclass Classification

Example framewise labels: M M M AH AH SH SH IH IH IH N N N N ...

Framewise phoneme prediction from speech.

What is a phoneme?

A unit of speech sound in a language that can serve to distinguish one word from the other.

bat / pat , bad / bed
Machine Learning → M AH SH IH N L ER N IH NG
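To relate the framewise labels in the task to a phoneme sequence such as M AH SH IH N, runs of identical consecutive frame labels can be collapsed into one label each. The sketch below is only an illustration of that relationship and is not part of the homework pipeline; the function name `collapse_repeats` is made up for this example.

```python
from itertools import groupby


def collapse_repeats(frame_labels):
    """Merge each run of identical consecutive frame labels into one label."""
    return [label for label, _run in groupby(frame_labels)]


# Framewise labels taken from the task example above.
frame_labels = "M M M AH AH SH SH IH IH IH N N N N".split()
phonemes = collapse_repeats(frame_labels)
print(" ".join(phonemes))  # M AH SH IH N
```

Note that this simple collapse would also merge a phoneme that genuinely occurs twice in a row, so it is a reading aid rather than a decoding method.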
Data Preprocessing
Acoustic Features - MFCCs (Mel Frequency Cepstral Coefficients)

Each input has shape (11,39) and comes with one label.

More Information About the Data
Since each frame contains only 25 ms of speech, a single frame is unlikely to represent a complete phoneme.

● Usually, a phoneme will span several frames (Hint: post-processing may help)
● Concatenate the neighboring frames for training
● In this HW, we concatenate the past and the future five frames for training (total 11 frames)
○ You may reshape the input (1,429) back to (11,39) to get separated 11 frames

○ Just remember that the label corresponds to the center frame
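The reshape described above can be sketched in plain Python as follows. It assumes the 429 values are stored frame by frame in time order (frame 0 first, 39 values per frame), which is what reshaping (1,429) to (11,39) implies; the helper name `split_frames` and the toy input are made up for this example.

```python
FRAMES = 11            # past 5 + center + future 5
FEATURE_DIM = 39       # MFCC values per frame
CENTER = FRAMES // 2   # index 5: the frame the label refers to


def split_frames(flat):
    """Split a flattened 429-value input into 11 frames of 39 values each."""
    if len(flat) != FRAMES * FEATURE_DIM:
        raise ValueError("expected %d values, got %d" % (FRAMES * FEATURE_DIM, len(flat)))
    return [flat[i * FEATURE_DIM:(i + 1) * FEATURE_DIM] for i in range(FRAMES)]


# Toy input: position k holds the value k, so frame boundaries are easy to see.
flat = list(range(FRAMES * FEATURE_DIM))
frames = split_frames(flat)
center_frame = frames[CENTER]

# With a NumPy array x of shape (1, 429), the equivalent is x.reshape(11, 39).
```

Here `center_frame` covers positions 195 to 233 of the flattened input, and it is the frame the label describes.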

Finding testing labels or doing human labeling is strictly prohibited!
Introduction to Digital Speech Processing

Dataset & Data Format
Dataset: TIMIT Acoustic-Phonetic Continuous Speech Corpus
○ Phonetically balanced for English

Data Format (The TAs have already preprocessed the data)
timit_11/npy → training data (# of training frames, 11 x feature dim)
npy → framewise phoneme label (0-38)
npy → testing data (# of testing frames, 11 x feature dim)
● Acoustic features (39-dim MFCC)
○ Concatenate the past and the future five frames (feature dim = 11 x 39)

○ The phoneme label of each input corresponds to the center frame
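The TAs have already done this concatenation, so the following is only a sketch of how such 11-frame inputs could be built from a sequence of 39-dim frames. How the real preprocessing handles the first and last five frames of an utterance is not stated; repeating the edge frame is an assumption made here, and `add_context` is a made-up name.

```python
def add_context(frames, n_context=5):
    """Build one flattened window per frame: n_context past frames, the frame
    itself, and n_context future frames. Edge frames are repeated (assumption)."""
    last = len(frames) - 1
    windows = []
    for t in range(len(frames)):
        window = []
        for offset in range(-n_context, n_context + 1):
            idx = min(max(t + offset, 0), last)  # clamp to the utterance
            window.extend(frames[idx])
        windows.append(window)
    return windows


# Toy utterance: 20 frames, frame t filled with the value t.
utterance = [[float(t)] * 39 for t in range(20)]
windows = add_context(utterance)

# Each window has 11 x 39 = 429 values; the label of window t is the
# phoneme of frame t, i.e. the center block of the window.
```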

Using additional data is prohibited. If you do, your final grade will be multiplied by 0.9!

Class  Phoneme  Example    Class  Phoneme  Example    Class  Phoneme  Example
0      iy       beet       13     l        lay        26     dx       muddy
1      ih       bit        14     r        ray        27     g        gay
2      eh       bet        15     y        yacht      28     p        pea
3      ae       bat        16     w        way        29     t        tea
4      ah       but        17     er       bird       30     k        key
5      uw       boot       18     m        mom        31     z        zone
6      uh       book       19     n        noon       32     v        van
7      aa       bob        20     ng       sing       33     f        fin
8      ey       bait       21     ch       choke      34     th       thin
9      ay       bite       22     jh       joke       35     s        sea
10     oy       boy        23     dh       then       36     sh       she
11     aw       bout       24     b        bee        37     hh       hay
12     ow       boat       25     d        day        38     sil      silence/closure sounds
Sample Code
Colab Link:

https://colab.research.google.com/github/ga642381/ML2021-Spring/blob/main/HW02/HW02-1.ipynb

● Simple baseline

○ You should be able to pass the simple baseline using the sample code provided.

● Strong baseline
○ Model architecture (layers? dimension? activation function?)

○ Training (batch size? optimizer? learning rate? epoch?)

○ Tips (batch norm? dropout? regularization?)

2 Hessian Matrix
Task Introduction
Task: Hessian Matrix
Imagine we are training a neural network and want to find out whether the model has reached a “local minima like” point, a saddle point, or none of the above. We can make this decision by calculating the Hessian matrix.

What is the Hessian?

The Hessian is the matrix of second-order partial derivatives of the loss with respect to the model parameters. It is highly recommended to watch the lecture video before starting this part.
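In symbols, the definition can be written as below. The notation is an assumption made here for illustration and does not come from the handout: L is the loss and θ = (θ_1, ..., θ_n) are the model parameters.

```latex
H(\theta) = \nabla_\theta^2 L(\theta), \qquad
H_{ij} = \frac{\partial^2 L}{\partial \theta_i \, \partial \theta_j},
\quad i, j = 1, \dots, n
```

At a point where the gradient is zero, positive eigenvalues of H correspond to directions in which the loss curves upward and negative eigenvalues to directions in which it curves downward. All eigenvalues positive indicates a local minimum; a mix of positive and negative eigenvalues indicates a saddle point.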

The target function in this task is a one-variable sinc function.

You will get

● a model checkpoint trained by TA,
● a batch of training data,
● a loss function.

You will calculate the Hessian matrix and make the decision accordingly.

Gradient Norm / Minimum Ratio
1. Gradient Norm

In a normal training process, gradients are rarely exactly zero. In this homework, we treat any gradient norm less than 1e-3 as zero.

2. Minimum Ratio
For an ideal local minimum, all the eigenvalues of the Hessian matrix are greater than zero. We define the proportion of positive eigenvalues as the minimum ratio.

In this homework, if the minimum ratio is greater than 0.5 and the gradient norm is less than 1e-3, then we assume that the model is at a “local minima like” point.
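As a rough illustration of the two quantities, the sketch below computes them from plain lists of numbers. It is not the sample code: the function names are made up, the L2 norm is an assumption (the handout only says "gradient norm"), and the gradient and eigenvalue lists are toy values rather than outputs of a real model.

```python
import math


def gradient_norm(grads):
    """L2 norm of a flat list of gradient values (norm type is an assumption)."""
    return math.sqrt(sum(g * g for g in grads))


def minimum_ratio(eigenvalues):
    """Proportion of Hessian eigenvalues that are strictly positive."""
    positive = sum(1 for ev in eigenvalues if ev > 0)
    return positive / len(eigenvalues)


# Toy values for illustration only.
toy_grads = [3e-4, -4e-4]
toy_eigenvalues = [2.0, 0.5, -0.1, 1.3]

norm = gradient_norm(toy_grads)          # about 5e-4, below the 1e-3 threshold
ratio = minimum_ratio(toy_eigenvalues)   # 3 of 4 eigenvalues are positive
```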

Gradient Norm / Minimum Ratio
In this homework, we assume that

● gradient norm < 1e-3 and minimum ratio > 0.5 => local minima like,
● gradient norm < 1e-3 and minimum ratio <= 0.5 => saddle point,
● gradient norm >= 1e-3 => none of the above.
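The three rules above can be written as a small function. This is a sketch for checking your own reasoning, not a replacement for the sample code; the function and parameter names are made up, and only the thresholds 1e-3 and 0.5 come from the rules.

```python
def classify_point(grad_norm, min_ratio, grad_tol=1e-3, ratio_threshold=0.5):
    """Apply the homework's decision rule to a gradient norm and minimum ratio."""
    if grad_norm >= grad_tol:
        return "none of the above"
    if min_ratio > ratio_threshold:
        return "local minima like"
    return "saddle point"


print(classify_point(5e-4, 0.75))  # local minima like
print(classify_point(5e-4, 0.50))  # saddle point
print(classify_point(1e-3, 0.90))  # none of the above
```

The gradient norm is checked first because the minimum ratio only matters once the gradient is treated as zero.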
Important Notice
You don’t need to and shouldn’t change any part of the code.
Run the code only on Colab. Otherwise, your result might differ because of environment differences.
You will get a different checkpoint according to your student ID, so please make sure to fill in your student ID in the sample code correctly.
Sample Code
Colab Link:

https://colab.research.google.com/github/ga642381/ML2021-Spring/blob/main/HW02/HW02-2.ipynb

After executing the sample code, you should get a result like this.
Notice that each student will get a different answer, so your answer may differ from the example.
Choose your answer from local minima like, saddle point, or none of the above
