Hierarchical Regulated Iterative Network for Joint Task of Music Detection and Music Relative Loudness Estimation


By: 
Bijue Jia; Jiancheng Lv; Xi Peng; Yao Chen; Shenglan Yang

Music detection (MD) refers to the task of finding out whether a music event occurs in an audio file and when it starts and ends, i.e., splitting an audio recording and annotating each fragment as music or non-music. MD not only has the basic application of automatically retrieving and localizing audio data based on the type of content, but also the more practical application of monitoring music for copyright management. The practical application in the music industry is royalty collection in broadcasting. As elaborated in [1], the Austrian National Broadcasting Corporation (ORF) needs to know exactly where music appears in the soundtrack of a TV production, and to detect whether the music is in the foreground or the background. ORF posed this requirement for the purpose of calculating royalty fees, which are paid to a national agency according to certain rules. Ideally, the production team would provide a list of all the music segments occurring in a TV production, but in reality these lists are largely inaccurate. As a result, ORF has to more or less guess the amount of music within a production, because manually annotating all productions is impossible. Moreover, the copyright fee differs depending on whether the music is used in the foreground or the background [2]. Hence, it is highly desirable to develop a method for music relative loudness estimation (MRLE), i.e., annotating each fragment as fg-music, bg-music, or non-music.
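The segmentation-and-annotation view of MD and MRLE described above can be sketched as merging per-frame class predictions into labeled time segments. The following is a minimal illustration, not the paper's method; the function name, frame duration, and label strings are assumptions for the example.

```python
# Sketch (not the paper's method): merge per-frame MRLE class predictions
# into labeled (start, end) segments. The 0.5 s frame hop and the label
# names "non-music"/"bg-music"/"fg-music" are assumed for illustration.

def frames_to_segments(frame_labels, hop_sec=0.5):
    """Merge runs of identical frame labels into time segments.

    frame_labels: sequence of class names, one per fixed-length frame.
    hop_sec: assumed duration of each frame in seconds.
    Returns a list of (start_sec, end_sec, label) tuples.
    """
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current segment at a label change or at the end.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((start * hop_sec, i * hop_sec, frame_labels[start]))
            start = i
    return segments

labels = ["non-music", "non-music", "bg-music",
          "bg-music", "fg-music", "non-music"]
print(frames_to_segments(labels))
# [(0.0, 1.0, 'non-music'), (1.0, 2.0, 'bg-music'),
#  (2.0, 2.5, 'fg-music'), (2.5, 3.0, 'non-music')]
```

In this framing, plain MD corresponds to collapsing the fg-music and bg-music labels into a single music class before merging.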

In the past, research mainly focused on the music/speech detection task, i.e., segmenting and annotating audio as music, speech, or noise. Early work [3] explored the distinguishable features between music and speech from the perspective of signal processing. Using these handcrafted features, later research [4][5][6] added subsequent classifiers to perform music/speech detection. Recent works [7][8][9] focused on features automatically learned from spectrogram images and used neural networks as classifiers. In contrast to the simpler music/speech detection task, the emphasis of the MD task is different: music is often used to accentuate scenes, so speech and other noise signals may be present concurrently.
