Non-Stationary Nature of Speech Signal (Procedure) : Speech Signal Processing Laboratory : Electronics & Communications : Amrita Vishwa Vidyapeetham Virtual Lab

Generation of single-tone sine wave and its spectrum

The first step is to generate a single-tone sine wave. A sine wave is characterized by three parameters: amplitude, frequency and phase. For studying stationarity and non-stationarity the frequency is the parameter of interest, so the amplitude is fixed at 1 and the phase at 0. On a digital machine the sine wave must be sampled for plotting or processing, and the sampling frequency must satisfy the baseband sampling theorem. That is, it must be more than twice the highest frequency component, which for a sine wave is its own frequency. For a smooth contour it is better to choose a sampling frequency much higher than the frequency of the sine wave. For instance, a 10 Hz sine wave sampled at 1000 Hz for a duration of 1 s can be generated in Scilab using the code given below.
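A minimal sketch of this step, written in Python with NumPy as an assumed stand-in for the Scilab listing. The variable names are illustrative, and the values (10 Hz, 1000 Hz, 1 s, amplitude 1, phase 0) are the ones stated above.

```python
import numpy as np

fs = 1000          # sampling frequency in Hz
f0 = 10            # sine frequency in Hz
duration = 1.0     # signal length in seconds

# Sample instants 0, 1/fs, ..., duration - 1/fs (1000 samples in total).
t = np.arange(int(round(fs * duration))) / fs

# Amplitude 1 and phase 0, as assumed in the text.
x = np.sin(2 * np.pi * f0 * t)

print(len(x), "samples, peak amplitude", x.max())
```

Plotting x against t with any plotting library shows ten full cycles, each drawn with 100 samples.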

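In the above code, different sine waves can be generated by changing the frequency value from 10 to the required value. With a 1000 Hz sampling frequency the tone must stay below the 500 Hz Nyquist limit, and in practice below about 250-300 Hz so that each cycle is still represented by several samples. A stationary signal that is non-sinusoidal can also be generated by using a suitable function in place of sine.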

The spectrum of the 10 Hz sine wave can be computed using the fast Fourier transform (FFT) command available in Scilab. The signal lasts 1 s and hence has 1000 samples. For efficient computation the next higher power of two is chosen, that is, the signal is zero-padded to 1024 samples and a 1024-point FFT is used. A Scilab code for computing the spectrum is given below. Only the first half of the spectrum is shown, since the magnitude spectrum of a real signal is symmetric about the origin. A peak near 10 Hz indicates the presence of the 10 Hz component in the signal. Because the bins of a 1024-point FFT at 1000 Hz are spaced 1000/1024 ≈ 0.98 Hz apart, the peak falls on the bin closest to 10 Hz rather than exactly on 10 Hz.
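The computation described above can be sketched as follows, again assuming Python with NumPy in place of Scilab. `np.fft.fft` zero-pads the input when `n` is larger than the signal length, and the names `nfft`, `mag` and `freqs` are illustrative.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 10 * t)          # 10 Hz tone, 1000 samples

nfft = 1024                             # next power of two above 1000
X = np.fft.fft(x, n=nfft)               # x is zero-padded to 1024 samples
half = nfft // 2
mag = np.abs(X[:half])                  # keep the first half only
freqs = np.arange(half) * fs / nfft     # bin k corresponds to k * fs / nfft Hz

peak_hz = freqs[np.argmax(mag)]
print(f"largest magnitude near {peak_hz:.2f} Hz")
```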

Generation of multitone sine wave and its spectrum

Along the same lines as for the single-tone sine wave, a multitone sine wave can be generated. The only difference is that there is more than one frequency component. For better visualization, the sampling frequency should be much higher than the highest frequency component. For instance, a multitone sine wave made of 10, 50 and 100 Hz components can be generated using the Scilab code given below. The shape of the resulting signal is more complicated than in the single-tone case, so it becomes difficult to find the frequency components by direct measurement in the time domain. The important point, however, is that all the frequency components are present at all instants of time. Hence the multitone sine wave generated by this program is also a stationary signal.
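A minimal sketch of the multitone case, in Python with NumPy as an assumed stand-in for the Scilab listing. The text does not state the amplitudes or phases of the three tones, so unit amplitude and zero phase are assumed here.

```python
import numpy as np

fs = 1000                        # Hz, well above the highest tone (100 Hz)
t = np.arange(fs) / fs           # 1 s of sample instants
tones_hz = [10, 50, 100]

# Sum of unit-amplitude, zero-phase sines (assumed); every tone is
# present at every instant, so the signal is stationary.
x = sum(np.sin(2 * np.pi * f * t) for f in tones_hz)

print(x.shape, float(np.max(np.abs(x))))
```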

To observe the different frequency components, the discrete Fourier transform (DFT) of the entire sequence can be computed using the FFT. A Scilab code for this is given below. On executing this program, a magnitude spectrum showing peaks at about 10, 50 and 100 Hz is generated.
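As a sketch of this step under the same assumptions (Python with NumPy, unit-amplitude tones), the three peaks can be located numerically instead of being read off a plot. `np.fft.rfft` returns only the non-negative-frequency half of the spectrum of a real signal, and the peak-picking rule used here is an illustrative choice, not part of the original procedure.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
x = sum(np.sin(2 * np.pi * f * t) for f in (10, 50, 100))

nfft = 1024
mag = np.abs(np.fft.rfft(x, n=nfft))       # bins 0 .. nfft/2 of a real signal
freqs = np.fft.rfftfreq(nfft, d=1 / fs)    # matching frequencies in Hz

# A bin is a local peak if it exceeds both neighbours; keep the three strongest.
inner = mag[1:-1]
peak_bins = np.flatnonzero((inner > mag[:-2]) & (inner > mag[2:])) + 1
top3 = peak_bins[np.argsort(mag[peak_bins])[-3:]]
peak_freqs = np.sort(freqs[top3])

print(np.round(peak_freqs, 1))
```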

Generation of non-stationary multitone sine wave and its spectrum

The next step is to generate a non-stationary signal. A simple way of generating such a signal is to use a different combination of the available single-tone components in each region of time. A Scilab code to generate a non-stationary multitone sine wave made of different combinations of the 10, 50 and 100 Hz components is given below. The shape of the waveform differs from region to region, indicating the change in frequency components.
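A minimal sketch of such a signal in Python with NumPy. The particular arrangement of tones in `regions_hz` and the 1 s length of each region are assumptions made for illustration; the text only says that different combinations of the three tones are used.

```python
import numpy as np

fs = 1000                        # Hz
region_len = fs                  # 1 s per region (assumed)

# Tones active in each of four regions (an assumed arrangement).
regions_hz = [(10,), (10, 50), (50, 100), (100,)]

t = np.arange(region_len) / fs
segments = [sum(np.sin(2 * np.pi * f * t) for f in tones) for tones in regions_hz]

# 4 s signal whose frequency content changes every second.
x = np.concatenate(segments)

print(len(x), "samples in", len(regions_hz), "regions")
```

Because every tone completes a whole number of cycles in 1 s, restarting `t` in each region does not introduce a phase jump at the region boundaries.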

The spectrum of a non-stationary signal is meaningful only if it is computed over regions that can be treated as stationary. For instance, the signal generated by the above code can be divided into four stationary regions, and each region can be considered separately for computing the magnitude spectrum. A Scilab code to generate the magnitude spectra of the different stationary regions of the non-stationary signal generated above is given below.
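The region-wise analysis can be sketched as below, assuming Python with NumPy and the same hypothetical four-region arrangement (1 s per region). The helper `dominant_tones` and its 0.5 threshold are illustrative choices for reporting the peaks without a plot.

```python
import numpy as np

fs = 1000
region_len = fs
regions_hz = [(10,), (10, 50), (50, 100), (100,)]      # assumed arrangement
t = np.arange(region_len) / fs
x = np.concatenate([sum(np.sin(2 * np.pi * f * t) for f in tones)
                    for tones in regions_hz])

nfft = 1024
freqs = np.fft.rfftfreq(nfft, d=1 / fs)

def dominant_tones(segment, threshold=0.5):
    """Frequencies of local spectral peaks above threshold * largest magnitude."""
    mag = np.abs(np.fft.rfft(segment, n=nfft))
    inner = mag[1:-1]
    is_peak = (inner > mag[:-2]) & (inner > mag[2:]) & (inner > threshold * mag.max())
    return freqs[np.flatnonzero(is_peak) + 1]

# One spectrum per stationary region instead of one for the whole signal.
results = []
for k in range(len(regions_hz)):
    segment = x[k * region_len:(k + 1) * region_len]
    results.append(dominant_tones(segment))
    print(f"region {k + 1}:", np.round(results[-1], 1))
```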

Plotting waveform and spectra of speech signal

The speech signal can be recorded using a microphone as the transducer, then sampled at a suitable sampling frequency and stored. The digitized speech signal is stored in a file, most commonly in the Microsoft WAV format. Such a signal can be plotted in the time domain to observe its time-varying characteristics. A Scilab code to load a speech signal stored in the Microsoft WAV format and plot its waveform is given below.
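A minimal sketch of the loading step, using Python's standard `wave` module and NumPy as an assumed stand-in for the Scilab listing. The reader function assumes a 16-bit PCM mono file. Since no recording accompanies this text, a short synthetic tone is written to a temporary file with the hypothetical name `speech_demo.wav` and read back.

```python
import os
import tempfile
import wave

import numpy as np

def read_wav_mono16(path):
    """Read a 16-bit PCM mono WAV file; return (samples in [-1, 1), fs)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        fs = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    return np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0, fs

# Stand-in for a recording: a 0.5 s, 200 Hz tone sampled at 8000 Hz.
fs_demo = 8000
tone = np.sin(2 * np.pi * 200 * np.arange(fs_demo // 2) / fs_demo)
path = os.path.join(tempfile.mkdtemp(), "speech_demo.wav")   # hypothetical file
with wave.open(path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(fs_demo)
    wf.writeframes((tone * 32767).astype("<i2").tobytes())

speech, fs = read_wav_mono16(path)
t = np.arange(len(speech)) / fs      # time axis in seconds for the waveform plot
print(len(speech), "samples at", fs, "Hz")
```

For a real recording, only the call to `read_wav_mono16` with the path of the file is needed; plotting `speech` against `t` gives the waveform.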

For all practical purposes the speech signal can be assumed to be stationary when it is considered in blocks of 10-30 ms. The spectrum of any portion of the speech signal can therefore be computed using the FFT on a block of 10-30 ms taken from that portion. A Scilab code for computing the spectra of selected portions is given below.
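The block-wise computation can be sketched as follows in Python with NumPy. A 20 ms block is chosen from the 10-30 ms range, and a two-part synthetic signal stands in for recorded speech. The sampling rate, the block start times and the helper `block_spectrum` are all assumptions for illustration.

```python
import numpy as np

fs = 8000                                    # Hz (assumed sampling rate)
n = np.arange(fs // 2)                       # 0.5 s per half
# Stand-in for speech: 200 Hz in the first half, 1000 Hz in the second.
speech = np.concatenate([np.sin(2 * np.pi * 200 * n / fs),
                         np.sin(2 * np.pi * 1000 * n / fs)])

frame_len = int(round(0.020 * fs))           # 20 ms block -> 160 samples
nfft = 256                                   # next power of two above 160
freqs = np.fft.rfftfreq(nfft, d=1 / fs)

def block_spectrum(signal, start_s):
    """Magnitude spectrum of the 20 ms block starting at start_s seconds."""
    start = int(round(start_s * fs))
    frame = signal[start:start + frame_len]
    return np.abs(np.fft.rfft(frame, n=nfft))

for start_s in (0.10, 0.70):                 # one block from each half
    mag = block_spectrum(speech, start_s)
    print(f"block at {start_s:.2f} s: largest bin at {freqs[np.argmax(mag)]:.1f} Hz")
```

With a 20 ms block the frequency resolution is coarse (bins are 31.25 Hz apart here), which is the price paid for localizing the spectrum in time.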

Limitation of Fourier Representation

The last point to be understood in this experiment is the limitation of the conventional Fourier representation in resolving the frequency components of a non-stationary signal. For this, a program can be written that takes the entire non-stationary signal at once and computes its spectrum. A Scilab code to compute the spectrum of the entire non-stationary signal made of 10, 50 and 100 Hz components is given below. The output of this program indicates which frequency components are present in the signal, but not when they occur. The crucial information in the case of non-stationary signals is the timing information, which is missing altogether in the conventional Fourier representation.
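One way to illustrate the missing timing information, sketched in Python with NumPy under the same assumed four-region arrangement: two signals containing the same tones in opposite time order yield the same set of spectral peaks when the whole signal is transformed at once. The helpers `build` and `strongest_peaks` are hypothetical names introduced for this sketch.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs                       # 1 s per region (assumed)

def build(regions_hz):
    """Concatenate 1 s regions, each the sum of the listed tones."""
    return np.concatenate([sum(np.sin(2 * np.pi * f * t) for f in tones)
                           for tones in regions_hz])

def strongest_peaks(x, count=3, nfft=4096):
    """Frequencies (Hz) of the `count` largest local peaks of the magnitude spectrum."""
    mag = np.abs(np.fft.rfft(x, n=nfft))
    freqs = np.fft.rfftfreq(nfft, d=1 / fs)
    inner = mag[1:-1]
    bins = np.flatnonzero((inner > mag[:-2]) & (inner > mag[2:])) + 1
    top = bins[np.argsort(mag[bins])[-count:]]
    return np.sort(freqs[top])

regions_hz = [(10,), (10, 50), (50, 100), (100,)]    # assumed arrangement
forward = build(regions_hz)
backward = build(regions_hz[::-1])                   # same tones, opposite order

print(np.round(strongest_peaks(forward), 1))
print(np.round(strongest_peaks(backward), 1))
```

Both calls report peaks near 10, 50 and 100 Hz, so the whole-signal spectrum alone cannot tell the two signals apart.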

The same is true of speech. A Scilab code for computing the magnitude spectrum of the whole speech signal is given below.
