Sunday 14 October 2012

Week 3: Digital Signal Processing

This week we covered quite a lot of topics. Ironically, I couldn't listen to most of the lecture due to having my fingers in my ears to block out the painful sound effects included with the slides. Turns out learning about audio processing is tricky when you have hypersensitive hearing.

Missing most of the lecture made the class test afterwards quite difficult. I could not work out what we were meant to be doing in the lab either, due to the lack of written instructions and too much background noise, so I have just read through the lecture notes, picked out the topics I hadn't yet covered in my blog, and researched them.


Harmonics

Harmonics are the specific frequencies created by standing waves. The 'fundamental' is the lowest of these frequencies and is usually the loudest tone you hear. Each harmonic is an integer multiple of the fundamental's frequency (e.g. if the fundamental is 'f' you can get 2f, 3f, 4f, etc.).
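To make this concrete, here is a minimal sketch (my own illustration, not from the lecture) that uses NumPy to build a tone out of a fundamental and its integer-multiple harmonics; the 220 Hz fundamental, the 8 kHz sampling rate and the 1/n amplitudes are arbitrary example choices.

import numpy as np

fs = 8000                        # sampling rate in Hz (arbitrary choice)
t = np.arange(0, 1.0, 1.0 / fs)  # one second of time points
f = 220.0                        # the fundamental frequency 'f' in Hz

# Sum the fundamental and its first few harmonics (2f, 3f, 4f),
# making each higher multiple a bit quieter than the last.
tone = sum((1.0 / n) * np.sin(2 * np.pi * n * f * t) for n in range(1, 5))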




Sound intensity and level

Sound intensity measurements are extremely difficult to make, so the intensity of a sound is generally expressed as an equivalent sound level. This is done by comparing the sound to a standard reference intensity (the quietest 1 kHz tone the average human can hear).
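In decibel terms that comparison is L = 10·log10(I/I0), where I0 is the intensity of that just-audible 1 kHz tone (about 10^-12 W/m^2). A minimal sketch of the calculation, assuming those standard values:

import math

I0 = 1e-12          # reference intensity in W/m^2 (threshold of hearing at 1 kHz)

def sound_level_db(intensity):
    """Convert a sound intensity in W/m^2 to a sound level in decibels."""
    return 10 * math.log10(intensity / I0)

print(sound_level_db(1e-12))   # 0 dB  -> the quietest audible 1 kHz tone
print(sound_level_db(1e-6))    # 60 dB -> roughly normal conversation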


Echoes and Reverberation

Reverberations are reflections of sound waves bouncing off various surfaces (e.g. walls, desks, ceiling, floor) before reaching your ear. Because they arrive less than about 0.1 s after the original sound, the ear processes them all together, and they may simply make the sound seem slightly prolonged.




Echoes, on the other hand, are reflections of sound that arrive more than 0.1 s after the original. Because of this delay there is a gap between the original sound and the reflection, and the second sound heard is called an echo.
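A very simple digital echo can be made by adding a delayed, quieter copy of a signal back onto itself. This is a minimal sketch of that idea (my own illustration, not from the lab); the 0.3 s delay and 0.5 gain are arbitrary example values, chosen so the delay is longer than the 0.1 s echo threshold.

import numpy as np

def add_echo(x, fs, delay_s=0.3, gain=0.5):
    """Mix a delayed, attenuated copy of x back onto itself."""
    d = int(delay_s * fs)        # delay in samples (> 0.1 s, so heard as an echo)
    y = np.copy(x)
    y[d:] += gain * x[:-d]       # add the reflected (delayed) sound
    return y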


The Inverse-Square Law

The inverse-square law says that the intensity of the sound heard varies inversely with the square of the distance 'R' from the sound source.

In open air, sound is roughly nine times less intense at a distance of 3 m from its source than at a distance of 1 m.
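Written out, intensity is proportional to 1/R², so the 1 m vs 3 m comparison above is just (1/3)² = 1/9. A tiny sketch of that arithmetic:

def relative_intensity(r_near, r_far):
    """Ratio of intensity at r_far to intensity at r_near (inverse-square law)."""
    return (r_near / r_far) ** 2

print(relative_intensity(1.0, 3.0))   # 0.111... -> about nine times less intense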


Spectrum

A spectrum is a graph of sound level (amplitude) against frequency over a short period of time. Since most sound waves contain many different frequencies at once, this graph is a useful way to see how much of the sound sits at each frequency.
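One common way to get such a graph is to take the Fourier transform of a short chunk of the signal. A minimal sketch using NumPy's FFT (the two-tone test signal is just an example I made up):

import numpy as np

fs = 8000
t = np.arange(0, 0.5, 1.0 / fs)
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)  # example signal

spectrum = np.abs(np.fft.rfft(x))          # amplitude at each frequency
freqs = np.fft.rfftfreq(len(x), 1.0 / fs)  # the matching frequency axis in Hz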



Spectrogram

Instead of a spectrum, the variation of sound intensity with both time and frequency can be displayed by representing intensity as colour or brightness on frequency-versus-time axes. This is called a spectrogram.
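SciPy can produce the underlying intensity-per-frequency-per-time data directly. A minimal sketch (the rising test tone is just an arbitrary example so the picture has something to show):

import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * (200 + 100 * t) * t)        # a rising test tone

f, t_seg, Sxx = signal.spectrogram(x, fs)          # intensity per frequency per time slice
plt.pcolormesh(t_seg, f, Sxx)                      # colour/brightness represents intensity
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.show()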



Digital Signal Processing Systems




Steps in a Digital Signal Processing System:

1. The signal is inputted via a microphone or other recording equipment.
2. The recording is then converted from analogue to digital (into binary numbers).
3. Editing is then done to the digital copy (e.g. filtering, pitch warp, echo, reverb, etc.).
4. The signal is then converted from digital back into analogue.
5. The signal is then smoothed out (filtered) to remove the steps left by the conversion.
6. The edited recording is outputted.

Computers cannot process analogue signals directly, which is why the signal must be converted to digital first and then converted back to analogue at the end so that we can listen to the result.
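To make the steps above concrete, here is a minimal sketch of that pipeline in Python (my own illustration, not the lecture's system; the 8-bit word length, the echo edit and the simple averaging smoother are arbitrary example choices):

import numpy as np

fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
analogue = np.sin(2 * np.pi * 440 * t)                        # 1. the "recorded" input signal

bits = 8
scale = 2 ** (bits - 1) - 1
digital = np.round(analogue * scale).astype(int)              # 2. analogue -> digital (binary numbers)

edited = digital.copy()                                       # 3. edit the digital copy (a crude echo)
delay = int(0.3 * fs)
edited[delay:] += edited[:-delay] // 2

restored = edited / scale                                     # 4. digital -> analogue again
smoothed = np.convolve(restored, np.ones(5) / 5, mode='same') # 5. smooth out the steps
# 6. 'smoothed' would now be sent to the output (speakers or a file)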


Why use digital processing?

1. Precision

In theory, the precision of a DSP system is limited only by the conversion processes at its input and output (analogue to digital and back again). In practice, the sampling rate and the word length (number of bits) also restrict the precision that can be achieved.
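A rough way to see the word-length effect is to quantise the same signal with different numbers of bits and compare the error; a minimal sketch (the 4-bit and 12-bit values are just example word lengths):

import numpy as np

fs = 8000
t = np.arange(0, 0.1, 1.0 / fs)
x = np.sin(2 * np.pi * 440 * t)

def quantisation_error(sig, bits):
    """RMS error introduced by rounding the signal to a given word length."""
    scale = 2 ** (bits - 1) - 1
    quantised = np.round(sig * scale) / scale
    return np.sqrt(np.mean((sig - quantised) ** 2))

print(quantisation_error(x, 4))    # short word length  -> larger error
print(quantisation_error(x, 12))   # longer word length -> much smaller error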

2. Robustness

Thanks to logic noise margins, digital systems are less susceptible to variations in component tolerance and to electrical noise (pick-up).

An important factor for complex systems is that adjustments for electrical drift and component ageing are essentially removed.

3. Flexibility

A DSP system's flexibility comes from its programmability, which allows it to be upgraded and its processing operations to be expanded easily without necessarily incurring large-scale hardware changes.



Sound card architecture





Sampling a signal

Sampling a signal means that the system measures the signal at specific instants, nT seconds, and then continues taking a sample every T seconds (the sampling period).




A signal is normally sampled at at least double the highest frequency it contains (for audio, that means roughly double the upper limit of the human hearing range). For example, a component at 10 Hz needs a sampling rate of at least 20 Hz, and audio covering the full hearing range is typically sampled at around 40 kHz or more.
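A minimal sketch of taking samples at the instants nT, using a 10 Hz example tone and a sampling rate comfortably above the 20 Hz minimum (both values are arbitrary choices for the illustration):

import numpy as np

f = 10.0                       # highest frequency in the example signal (Hz)
fs = 50.0                      # sampling rate, comfortably above the 2*f = 20 Hz minimum
T = 1.0 / fs                   # sampling period T in seconds

n = np.arange(0, 100)          # sample indices
samples = np.sin(2 * np.pi * f * n * T)   # the signal measured at the instants nT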





References:

Echoes and reverberation: http://www.physicsclassroom.com/mmedia/waves/er.cfm
Spectrum graphic: http://www.tablix.org/~avian/blog/archives/2008/11/the_sound_of_hot_tea/
Sampling: http://cnx.org/content/m15655/latest/
