AMSS-Net

Use Cases of Progressive Manipulation and an Ablation Study

In this demonstration, we show that the proposed method can be applied repeatedly to audio tracks that have already been manipulated. This approach is known as progressive manipulation, and it is also used in conversational systems.
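The idea of feeding each result back in as the next input can be sketched in a few lines. This is only an illustration: `progressive_manipulation`, `manipulate`, and `dummy_manipulate` are hypothetical names, not part of any released AMSS-Net code, and the dummy function stands in for a trained model so that the sketch runs.

```python
from typing import Callable, List, Sequence

Audio = List[float]  # placeholder type: a signal as a list of samples


def progressive_manipulation(
    audio: Audio,
    descriptions: Sequence[str],
    manipulate: Callable[[Audio, str], Audio],
) -> List[Audio]:
    """Apply each description to the output of the previous step.

    Returns every intermediate result, with the untouched input at index 0.
    """
    results = [audio]
    for description in descriptions:
        results.append(manipulate(results[-1], description))
    return results


# Hypothetical stand-in for a trained model, used only to make the sketch runnable.
def dummy_manipulate(audio: Audio, description: str) -> Audio:
    gain = 0.5 if description.startswith("decrease") else 1.0
    return [gain * sample for sample in audio]


steps = progressive_manipulation(
    [1.0, -1.0, 0.5],
    ["decrease the volume of vocals", "pan drums to the left side"],
    dummy_manipulate,
)
```

Each table on this page follows the same pattern: row 0 is the original clip, and row k is the result of applying description k to the output of row k-1.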

Use Case 1: Progressive Manipulation for Making Spatial Audio

AMSS-Net can take an audio clip recorded with a single microphone and give listeners a better, more realistic hearing experience.

For example, in the video above, people are playing instruments and singing.

With AMSS-Net, we can improve the spatial quality of this clip as follows:

order | description | Audio | Spectrogram | Wav
0 | original | | |
1 | pan bass completely to the right side | | |
2 | pan drums to the left side | | |
3 | pan vocals to the right side | | |
4 | apply light highpass to drums | | |
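To make the panning descriptions concrete, here is a minimal sketch of what "pan to the right" means at the signal level, assuming the instrument is available as an isolated stereo stem. That is an assumption for illustration only: AMSS-Net operates on the mixture and is not given separate stems. The `pan` and `mix` helpers and the simple balance-style gain law are hypothetical choices, not the definition used by the authors.

```python
from typing import List, Tuple

Stereo = List[Tuple[float, float]]  # (left, right) sample pairs


def pan(stem: Stereo, position: float) -> Stereo:
    """Balance-style pan of a stereo stem.

    position = -1.0 is fully left, 0.0 leaves the stem unchanged,
    +1.0 is fully right.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be in [-1, 1]")
    left_gain = min(1.0, 1.0 - position)
    right_gain = min(1.0, 1.0 + position)
    return [(left_gain * left, right_gain * right) for left, right in stem]


def mix(stems: List[Stereo]) -> Stereo:
    """Sum equally long stems sample by sample."""
    return [
        (sum(s[0] for s in frame), sum(s[1] for s in frame))
        for frame in zip(*stems)
    ]


# Toy stems: two frames each, identical in both channels (a mono recording).
bass = [(0.4, 0.4), (0.2, 0.2)]
drums = [(0.3, 0.3), (0.1, 0.1)]

# "pan bass completely to the right side", then "pan drums to the left side".
spatial_mix = mix([pan(bass, 1.0), pan(drums, -1.0)])
```

With stems, each operation is a one-line gain change. The point of the demo is that the same effect is requested in text and applied to the mixture directly.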

Use Case 2: Progressive Manipulation to Remove Unwanted Noises

An audio clip may contain noise, such as people shouting in the middle of a concert.

We can also hear reflections of some sounds, especially from the kick drum, which might annoy some listeners.

These reflections become more noticeable when we increase the volume of those instruments (operation 2 in the table below).

With AMSS-Net, we can then improve the clip by removing unwanted sounds or the reverberation effect, as follows:

order | description | Audio | Spectrogram | Wav
0 | original | | |
1 | decrease the volume of vocals | | |
2 | increase the volume of drums, bass | | |
3 | remove reverb from bass, drums | | |

Use Case 3: Progressive Manipulation for a More Dramatic Introduction with Musical Effects

We are used to dramatic introductions with musical effects such as fade-ins or band-pass filters.

Although some artists perform with special devices for musical effects, as shown in the figure below, it is usually hard to apply those effects in a live performance with limited devices and engineers.

In this case, we can use AMSS-Net to post-process the recorded clip for a more dramatic introduction, as follows:

order | description | Audio | Spectrogram | Wav
0 | original | | |
1 | apply light highpass to vocals | | |
2 | apply medium highpass to bass | | |
3 | apply light lowpass to drums | | |

Ablation Study: Progressive Manipulation

choi hn · AMSS-Net sample - ACM Multimedia 2021
model | desc | x times | Audio | Spectrogram
origin | N/A | | |
amssnet | apply highpass to vocals | x 1 | |
amssnet | apply highpass to vocals | x 20 | |
wo_csa | apply highpass to vocals | x 20 | |
wo_smpocm | apply highpass to vocals | x 20 | |
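As a point of reference for the "x 1" and "x 20" rows, the sketch below applies a conventional first-order highpass filter to a toy signal once and twenty times. It does not model AMSS-Net or its ablated variants. The 200 Hz cutoff, the 8 kHz sample rate, and the helper names are assumptions chosen for illustration, since the page does not state what cutoff "highpass" corresponds to.

```python
import math
from typing import Callable, List

Signal = List[float]


def one_pole_highpass(signal: Signal, cutoff_hz: float, sample_rate: float) -> Signal:
    """First-order (RC-style) highpass filter with zero initial state."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out: Signal = []
    prev_x = 0.0
    prev_y = 0.0
    for x in signal:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def apply_n_times(op: Callable[[Signal], Signal], signal: Signal, n: int) -> Signal:
    """Feed the output of op back into op, n times in total."""
    for _ in range(n):
        signal = op(signal)
    return signal


def rms(signal: Signal) -> float:
    return math.sqrt(sum(s * s for s in signal) / len(signal))


SAMPLE_RATE = 8000  # assumed sample rate for the toy signal
# One second of a 50 Hz tone, well below the assumed cutoff.
low_tone = [math.sin(2.0 * math.pi * 50.0 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]


def highpass(signal: Signal) -> Signal:
    return one_pole_highpass(signal, cutoff_hz=200.0, sample_rate=SAMPLE_RATE)


once = apply_n_times(highpass, low_tone, 1)
twenty = apply_n_times(highpass, low_tone, 20)
print(rms(low_tone), rms(once), rms(twenty))
```

With a conventional filter, repeated application only attenuates low-frequency content further. A learned model applied 20 times can also accumulate artifacts, which is what the spectrograms in the table let the reader compare across models.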