[Speech Signal Processing 11] - Linear Prediction


따옹 2023. 5. 10. 21:26

Introduction to Linear Prediction

• Speech sounds

1. Deterministic sounds : periodic or impulsive sources (vowels, plosives)

2. Stochastic sounds : noise sources (fricatives)

• Estimation of parameters of an all-pole system function

▫ Linear prediction analysis

▫ Pitch synchronous analysis

Time-Dependent Processing

• Many analysis techniques assume that speech signals are quasi-stationary: speech characteristics change relatively slowly, so over a short-time interval, e.g., 20-40 ms, the vocal tract and its input sources can be treated as “stationary”.

• A short window gives adequate time resolution but poor frequency resolution. A long window does the opposite.

• By the uncertainty principle, good time resolution and good frequency resolution cannot be achieved simultaneously.

• Window size in practice (a window length of about 30 ms is common, and the window advances in 5-10 ms steps so that consecutive frames overlap)

▫ Duration : 20 ~ 40ms

▫ Shift : 5 ~ 10ms
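The window duration and shift above can be sketched in code. This is a minimal illustration, not taken from the lecture: the function name `frame_signal`, the 16 kHz sampling rate, and the test tone are assumptions made only for the example.

```python
import numpy as np

def frame_signal(x, fs, win_ms=30.0, shift_ms=10.0):
    """Split x into overlapping frames, one frame per row (no tapering applied)."""
    win_len = int(round(fs * win_ms / 1000.0))   # samples per window
    shift = int(round(fs * shift_ms / 1000.0))   # samples between frame starts
    if len(x) < win_len:
        raise ValueError("signal is shorter than one window")
    n_frames = 1 + (len(x) - win_len) // shift
    # Row i holds the sample indices i*shift ... i*shift + win_len - 1.
    idx = np.arange(win_len)[None, :] + shift * np.arange(n_frames)[:, None]
    return x[idx]

fs = 16000                                          # assumed sampling rate in Hz
x = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)    # 1 s test tone, not real speech
frames = frame_signal(x, fs)                        # 30 ms windows, 10 ms shift
print(frames.shape)
```

With a 30 ms window and a 10 ms shift, each frame shares two thirds of its samples with the previous one. Each row can then be analyzed as if the signal were stationary within it.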

All-Pole Modeling of Deterministic Signals (1)

Formulation

▫ Consider a transfer function model, H(z), from the glottis to the lips, and the output for deterministic signals (speech signals with a periodic or impulsive source).

G(z), V(z), and R(z) form the basic skeleton of the all-pole filter (their individual details do not need much attention here).

Modeling the whole chain gives an all-pole model (corresponding to the vocal tract).

The goal is to find A and a_k.

All-Pole Modeling of Deterministic Signals (2)

• The basic idea is that each speech sample is approximated as a linear combination of past speech samples.

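Let us accept the assumption above.

Combining the past p samples with weights gives a filter that produces s[n].

The filter output is formed by adding the source scaled by the gain A, and this added term acts as a kind of innovation.

What we want is to find suitable values of a_k.

a_k: linear prediction coefficients

• In the z-domain

In the z-domain, H is viewed as split into the vocal cords (the source) and the vocal tract.

Interpreted this way, U_g enters as the input and passes through the vocal tract to become speech.

The time-domain equation above can also be examined after converting it to the z-domain.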
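The equation images from the lecture slides are not reproduced in these notes. As a sketch, the standard form of the model that matches the symbols used here (a_k, A, u_g[n], model order p) is written out below. The sign convention, with the a_k entering with a plus sign in the time domain, is an assumption chosen to agree with the phrase "linear combination of past speech samples".

```latex
% Time domain: each sample is a weighted sum of the past p samples plus the scaled source
s[n] = \sum_{k=1}^{p} a_k \, s[n-k] + A \, u_g[n]

% z-domain: take the z-transform of both sides and solve for S(z)/U_g(z)
S(z) = \sum_{k=1}^{p} a_k z^{-k} S(z) + A \, U_g(z)
\quad\Longrightarrow\quad
H(z) = \frac{S(z)}{U_g(z)} = \frac{A}{1 - \sum_{k=1}^{p} a_k z^{-k}}
```

The denominator has only powers of z^{-1} and the numerator is a constant, which is why the model is called all-pole.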

All-Pole Modeling of Deterministic Signals (3)

• Estimation of a_k is called linear prediction analysis.

This analysis is what we use to estimate a_k.

• Quantization of a_k is called linear prediction coding (LPC).

• Assuming that u_g[n] is a train of unit samples, we can think of s[n] as a linear combination of its own past samples, except at the instants where u_g[n] ≠ 0.

All-Pole Modeling of Deterministic Signals (4)

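Since a_k has to be estimated, we denote the estimate by α.

a_k => α_k

Each past sample s[n-k] is paired with its own coefficient α_k.

In the z-domain, the prediction can be viewed as the α_k z^(-k) terms applied to S(z).

Find the α that minimizes the error: this is the linear prediction problem.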
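The quantities described above can be written out as follows. This is a sketch in standard notation. The symbols \tilde{s}[n] (predicted sample), P(z) (predictor), and e[n] (prediction error) are names assumed here, since the slide equations are not available in the notes.

```latex
% Predictor: estimate s[n] from the past p samples with the estimated coefficients
\tilde{s}[n] = \sum_{k=1}^{p} \alpha_k \, s[n-k],
\qquad
P(z) = \sum_{k=1}^{p} \alpha_k z^{-k}

% Prediction error and the corresponding prediction-error filter
e[n] = s[n] - \tilde{s}[n] = s[n] - \sum_{k=1}^{p} \alpha_k \, s[n-k],
\qquad
A(z) = \frac{E(z)}{S(z)} = 1 - P(z)
```

Comparing with the model s[n] = \sum_k a_k s[n-k] + A u_g[n], the error reduces to e[n] = A u_g[n] exactly when α_k = a_k for every k.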

All-Pole Modeling of Deterministic Signals (5)

If a_k = α_k,

that is, if the estimate equals the ideal value:

• The input sequence Au_g[n] can be recovered by passing s[n] through A(z). → Inverse filter

Speech goes in, and passing it through A(z) recovers the source.

• When we pass Au_g[n] through the all-pole system 1/A(z), we obtain the output s[n]. → Synthesis filter

These are the two structures: one takes speech in and gives the excitation (the vocal-cord source), and the other takes the source in and gives the speech output.

Au_g[n]: source (vocal cords)
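The inverse filter and synthesis filter pair can be checked numerically. The sketch below assumes A(z) = 1 - Σ a_k z^(-k) and zero initial conditions. The coefficient values, the gain, and the pulse spacing are hypothetical numbers chosen only so that the filter is stable.

```python
import numpy as np

def inverse_filter(s, a):
    """FIR filter A(z) = 1 - sum_k a[k-1] z^-k: speech in, excitation out."""
    taps = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(s, taps)[:len(s)]

def synthesis_filter(e, a):
    """All-pole filter 1/A(z): excitation in, speech out."""
    p = len(a)
    s = np.zeros(len(e))
    for n in range(len(e)):
        # s[n] = sum_k a_k s[n-k] + e[n], with s[m] = 0 for m < 0
        for k in range(1, min(p, n) + 1):
            s[n] += a[k - 1] * s[n - k]
        s[n] += e[n]
    return s

a = np.array([1.3, -0.8])      # hypothetical coefficients (poles inside the unit circle)
A_gain = 0.5                   # hypothetical gain A
u_g = np.zeros(200)
u_g[::50] = 1.0                # unit-sample train, one pulse every 50 samples

s = synthesis_filter(A_gain * u_g, a)    # source -> speech-like output
recovered = inverse_filter(s, a)         # speech -> A * u_g[n]
print(np.allclose(recovered, A_gain * u_g))
```

When the coefficients used in the inverse filter differ from those that generated s[n], the recovered sequence is no longer a clean pulse train, which is the situation the error minimization that follows addresses.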

Error Minimization (1)

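The sequence s_n[m] is defined over [n-M-p, n+M]. It has to be defined over this extended interval so that the prediction error can be computed in full: predicting the first sample of the interval [n-M, n+M] already needs the p samples before it.

Now let’s find the minimum of the error by setting the derivative of E_n to zero.

The derivative with respect to each α_k must be zero at the minimum.

α_k: filter coefficients

(Note to self: listen again to the equation part, up to ~13.)

Thinking about it geometrically:

s_n is the target value. Its approximation is the projection of s_n onto the space, which is the closest point in that space.

Because it is a projection, the error is orthogonal to the space.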

Prediction error is orthogonal to the vector space spanned by the previous samples.
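The minimization and the orthogonality property can be illustrated with a small numerical sketch. It assumes the squared error is summed over m in [n-M, n+M], so that samples from [n-M-p, n+M] are needed, and it solves the resulting normal equations directly. The function name `lpc_covariance` and all numeric values are hypothetical.

```python
import numpy as np

def lpc_covariance(s, n, M, p):
    """Solve the normal equations over the error interval [n-M, n+M].

    Uses the samples s[n-M-p .. n+M]. Returns (alpha, error), where
    error[j] is the prediction error at m = n-M+j.
    """
    m = np.arange(n - M, n + M + 1)
    # Column k-1 holds the past samples s[m-k] for k = 1..p.
    past = np.stack([s[m - k] for k in range(1, p + 1)], axis=1)
    target = s[m]
    phi = past.T @ past        # phi[i-1, k-1] = sum_m s[m-i] s[m-k]
    psi = past.T @ target      # psi[i-1]      = sum_m s[m-i] s[m]
    alpha = np.linalg.solve(phi, psi)
    return alpha, target - past @ alpha

a = np.array([1.3, -0.8])      # hypothetical "true" coefficients
s = np.zeros(120)
s[0] = 1.0                     # response to a single unit sample at n = 0
for i in range(1, len(s)):
    s[i] = a[0] * s[i - 1] + (a[1] * s[i - 2] if i >= 2 else 0.0)

alpha, err = lpc_covariance(s, n=40, M=20, p=2)
print(alpha)                   # close to a: no excitation falls inside the interval

rng = np.random.default_rng(0)
noisy = s + 0.01 * rng.standard_normal(len(s))
alpha_n, err_n = lpc_covariance(noisy, n=40, M=20, p=2)
m = np.arange(20, 61)
# Inner products of the error with each past-sample vector: both are ~0.
print(noisy[m - 1] @ err_n, noisy[m - 2] @ err_n)
```

Setting the derivative of the error to zero is what produces the linear system `phi @ alpha = psi`. The near-zero inner products in the last line are the orthogonality statement above in numerical form.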

Autocorrelation Functions of Speech

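r[k] = E[ s[n] s[n-k] ] (autocorrelation at lag k)

That is, the correlation between the signal and a copy of itself delayed by k.

It is largest at lag 0, where it equals the power, and decays from there.

For voiced speech (a vowel), the pitch is visible.

(c) Unvoiced: large at lag 0, but very weak after even one lag. (d) Voiced plosive: concentrated at low frequencies and slowly varying.

The source signal multiplied by the gain is the input signal.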
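The behavior described above can be reproduced with a short-time estimate of r[k], in which the expectation is replaced by a sum over one frame. This is a sketch with synthetic stand-ins for speech: the 8 kHz sampling rate, the two-tone "voiced-like" frame, and the white-noise "unvoiced-like" frame are all assumptions for illustration.

```python
import numpy as np

def short_time_autocorr(frame, max_lag):
    """r[k] = sum_n frame[n] * frame[n-k], k = 0..max_lag (frame is zero outside)."""
    N = len(frame)
    return np.array([frame[k:] @ frame[:N - k] for k in range(max_lag + 1)])

fs = 8000                                    # assumed sampling rate in Hz
n = np.arange(240)                           # one 30 ms frame
voiced_like = (np.sin(2 * np.pi * 100 * n / fs)
               + 0.5 * np.sin(2 * np.pi * 200 * n / fs))
noise_like = np.random.default_rng(0).standard_normal(len(n))

r_v = short_time_autocorr(voiced_like, 120)
r_u = short_time_autocorr(noise_like, 120)

pitch_lag = 40 + int(np.argmax(r_v[40:]))    # skip the main lobe around lag 0
print(pitch_lag, fs / pitch_lag)             # lag near 80 samples, roughly 100 Hz
print(np.abs(r_u[1:]).max() / r_u[0])        # well below 1 for the noise frame
```

The periodic frame shows a second peak near the pitch period, while the noise frame has almost all of its autocorrelation concentrated at lag 0.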

Criterion of “Goodness” (1)

• Time Domain

▫ Because the assumed minimum-phase all-pole model is not fully adequate, we cannot see an idealized glottal source waveform at the output of the inverse filter.

▫ The idealized residual should be an impulse train, a single impulse, or white noise.

▫ However, in reality, we can still hear the speech itself in the prediction error.

(A capture of the professor's handwritten notes is needed here.)

Inverse-filtering results by sound type:

voiced : comes out relatively well, although quite a few errors still appear in the inverse-filtered result

unvoiced : the output looks like the input with only the amplitude somewhat reduced

unvoiced fricative : a similar output, with slightly reduced amplitude

Effect of the filter order p on the envelope:

The larger p is, the more closely the envelope follows the details of the spectrum.

(d) The order is so large that the envelope follows even the pitch components, which is far from what we want.

The goal is an envelope like (b) or (c).
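The effect of the order p can be explored with a small sketch. It uses the autocorrelation method (Toeplitz normal equations on a Hamming-windowed frame), which is one common way to compute the coefficients and is not necessarily the method used for the lecture figures. The gain is taken as the square root of the minimum error energy, also an assumption. The frame is synthetic, and the function names are hypothetical.

```python
import numpy as np

def lpc_autocorr(frame, p):
    """Autocorrelation-method LPC: returns (alpha, gain) for order p."""
    N = len(frame)
    r = np.array([frame[k:] @ frame[:N - k] for k in range(p + 1)])
    R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])  # Toeplitz
    alpha = np.linalg.solve(R, r[1:])
    err_energy = r[0] - alpha @ r[1:]        # minimum prediction-error energy
    return alpha, np.sqrt(err_energy)

def lpc_envelope(alpha, gain, n_freq=256):
    """|gain / A(e^{jw})| at n_freq frequencies from 0 up to (not including) pi."""
    w = np.pi * np.arange(n_freq) / n_freq
    k = np.arange(1, len(alpha) + 1)
    A = 1.0 - np.exp(-1j * np.outer(w, k)) @ alpha
    return np.abs(gain / A)

# Synthetic voiced-like frame: pulse train through a hypothetical 2-pole resonance.
rng = np.random.default_rng(0)
excitation = np.zeros(240)
excitation[::80] = 1.0                       # pitch period of 80 samples
true_a = np.array([1.3, -0.8])
s = np.zeros(240)
for n in range(240):
    s[n] = excitation[n] + sum(true_a[k - 1] * s[n - k] for k in (1, 2) if n >= k)
frame = (s + 0.01 * rng.standard_normal(240)) * np.hamming(240)

gains = []
for p in (2, 4, 8, 16):
    alpha, gain = lpc_autocorr(frame, p)
    gains.append(gain)
    env = lpc_envelope(alpha, gain)          # spectral envelope for this order
    print(p, float(gain), env.shape)
```

The residual gain does not increase as p grows, which reflects the envelope fitting more and more spectral detail. Plotting `env` against the frame's spectrum for each p would show at which order the envelope starts tracking individual pitch harmonics.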

• (a) a windowed speech signal,

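• (b) the LPC error signal (the residual signal),

• (c) the signal spectrum with the LPC spectral envelope superimposed,

• (d) the LPC error spectrum (corresponds to the difference between the signal spectrum and the envelope, i.e., the periodic component)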

Example

• The reconstructed waveform is speech-like, but the absolute phase structure is lost because of the model's minimum-phase characteristic.

• The reconstructed waveform is more peaky, which causes a “buzzy” quality.

Here an idealized impulse excitation is generated and fed into the vocal system model.

When the minimum-phase assumption does not fit well, the result comes out smeared.

The all-pole model is mathematically simple, but when it is applied to speech it is not easy to get a result close to real speech.

Nevertheless, since there is no better model, it is widely used in speech coding to encode the a_k that represent the vocal tract.
