We propose a novel modified Mel-discrete cosine transform (MMD) filter bank structure, which restricts the overlap of each filter response to its immediate neighbors. In contrast to the well-known triangular filters employed in the extraction of Mel-frequency cepstral coefficients (MFCC), the proposed filter structure has a smoother response and performs the discrete cosine transform and Mel-scale filtering in a single operation. Using MFCC as the only feature for voice activity detection (VAD) is known not to yield substantial improvements in performance, and even with a long-term approach we observe unsatisfactory VAD performance when MFCC features are employed. Other long-term VAD algorithms that do not use MFCC, however, are known to improve performance substantially at low SNR when the statistics of the speech and/or noise vary with time. In this work, we show that applying the MMD filter bank and then computing the long-term differential entropy of the voice signal yields significantly higher detection accuracy than other well-known long-term algorithms. This result suggests that the proposed MMD filter bank may also benefit other speech processing applications.
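For orientation, the conventional MFCC baseline that the abstract contrasts against can be sketched as below. This is not the proposed MMD filter bank, whose design the abstract does not specify. It is only a minimal illustration of the usual triangular Mel filters followed by a separate DCT step. The Mel mapping constants (2595, 700), the unnormalized DCT-II, and all function names are assumptions chosen for illustration.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel mapping; other variants exist (assumption).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_mel_filterbank(n_filters, n_fft, sample_rate):
    """Return n_filters rows of triangular weights over n_fft // 2 + 1 bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[j - 1], edges_hz[j], edges_hz[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct_ii(values):
    """Unnormalized DCT-II of a list of numbers."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n)]

def mfcc_from_power_spectrum(power, bank):
    """Mel filtering, then log, then DCT: three separate steps."""
    energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
    log_energies = [math.log(max(e, 1e-12)) for e in energies]  # floor avoids log(0)
    return dct_ii(log_energies)

# Usage: 8 filters over a 256-point FFT at 8 kHz, applied to a flat spectrum.
bank = triangular_mel_filterbank(8, 256, 8000)
coeffs = mfcc_from_power_spectrum([1.0] * 129, bank)
```

In this baseline the Mel filtering and the DCT are distinct stages separated by a logarithm, which is the two-step structure the abstract says the MMD filter bank replaces with a single operation.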