Spectral entropy
Spectral entropy captures the "peakiness" of a spectrum: a spectrum with sharp peaks has low entropy, while a spectrum with a flat distribution has high entropy. The definition is based on the Shannon entropy.
Spectral entropy is computed from the power spectrum:

$$H = -\sum_{k=1}^{N} p_k \log_2 p_k$$

where:

$$p_k = \frac{S_k}{\sum_{j=1}^{N} S_j}$$

is the probability mass function (PMF) of the power spectrum: $S_k$ is the power at frequency bin $k$, and $N$ is the total number of frequency bins.
Normalization
The entropy is normalized by dividing by $\log_2 N$, the maximum entropy of a distribution over $N$ bins, so that the normalized value lies in $[0, 1]$:

$$H_n = \frac{-\sum_{k=1}^{N} p_k \log_2 p_k}{\log_2 N}$$
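As a quick numerical check of the normalization (an illustrative NumPy sketch; the helper name is not from the references), a perfectly flat power spectrum attains the maximum entropy $\log_2 N$ and therefore a normalized value of 1, while a spectrum with all power in one bin yields 0:

```python
import numpy as np

def normalized_spectral_entropy(ps: np.ndarray) -> float:
    """Normalized Shannon entropy of a power spectrum `ps` (illustrative helper)."""
    p = ps / np.sum(ps)          # PMF over the N frequency bins
    p = p[p > 0]                 # 0 * log2(0) is taken as 0
    return -np.sum(p * np.log2(p)) / np.log2(len(ps))

flat = np.ones(8)                                           # evenly spread power
peak = np.array([0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # one sharp peak

print(normalized_spectral_entropy(flat))   # 1.0: maximum entropy
print(normalized_spectral_entropy(peak))   # ~0.0: all power in a single bin
```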
Single-pass computation
The entropy is derived from the PMF of the power spectrum:

$$H = -\sum_{k=1}^{N} p_k \log_2 p_k, \qquad p_k = \frac{S_k}{T}, \qquad T = \sum_{j=1}^{N} S_j$$

However, the entropy formula can be reformulated to allow for a single-pass computation:

$$H = -\sum_{k=1}^{N} \frac{S_k}{T} \log_2 \frac{S_k}{T} = \log_2 T - \frac{1}{T} \sum_{k=1}^{N} S_k \log_2 S_k$$

Both $T$ and $\sum_{k} S_k \log_2 S_k$ can be accumulated in a single traversal of the spectrum.
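The reformulation can be sketched as a single loop that accumulates both sums at once (an illustrative sketch in plain Python, not code from the references; it returns the unnormalized entropy, which can be divided by $\log_2 N$ as above):

```python
import math

def spectral_entropy_single_pass(power_spectrum) -> float:
    """Entropy via H = log2(T) - (1/T) * sum(S_k * log2(S_k))."""
    total = 0.0       # T: running sum of all powers
    weighted = 0.0    # running sum of S_k * log2(S_k)
    for s in power_spectrum:
        if s > 0:     # bins with zero power contribute nothing (0 * log2(0) -> 0)
            total += s
            weighted += s * math.log2(s)
    if total == 0.0:
        return 0.0    # silent spectrum
    return math.log2(total) - weighted / total

print(spectral_entropy_single_pass([1.0, 1.0, 1.0, 1.0]))  # 2.0 (flat, 4 bins)
```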
References
- Misra, H., Ikbal, S., Bourlard, H., & Hermansky, H. (2004). Spectral entropy based feature for robust ASR. 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1. https://doi.org/10.1109/ICASSP.2004.1325955
- Peeters, G. (2004). A large set of audio features for sound description (similarity and classification) in the CUIDADO project. In CUIDADO IST Project Report (Vol. 54).
- Eyben, F. (2016). Real-time Speech and Music Classification by Large Audio Feature Space Extraction. https://doi.org/10.1007/978-3-319-27299-3
- https://en.wikipedia.org/wiki/Entropy_(information_theory)
Code
INFO
The following snippet is written in a generic and unoptimized manner. It aims to be comprehensible to programmers familiar with a variety of programming languages and may not represent the most efficient or idiomatic Python. Please refer to implementations for optimized versions in different programming languages.
```python
import numpy as np

def spectral_entropy(spectrum: np.ndarray) -> float:
    # Power spectrum: squared magnitude of each frequency bin
    ps = np.abs(spectrum) ** 2
    ps_sum = np.sum(ps)
    # A silent spectrum carries no information
    if ps_sum == 0.0:
        return 0.0
    # Probability mass function of the power spectrum
    p = ps / ps_sum
    # Drop zero-probability bins: 0 * log2(0) is taken as 0
    p = p[p != 0]
    # Shannon entropy, normalized by the maximum entropy log2(N)
    return -np.sum(p * np.log2(p)) / np.log2(len(ps))
```
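A short usage sketch (the function is repeated so the example is self-contained; the test signals are illustrative): a pure tone concentrates power in a single frequency bin and scores near 0, while white noise spreads power across all bins and scores near 1.

```python
import numpy as np

def spectral_entropy(spectrum: np.ndarray) -> float:
    # Same as the snippet above, minus the empty-spectrum guard
    ps = np.abs(spectrum) ** 2
    p = ps / np.sum(ps)
    p = p[p != 0]
    return -np.sum(p * np.log2(p)) / np.log2(len(ps))

fs = 1000                                   # sample rate in Hz (illustrative)
t = np.arange(fs) / fs                      # one second of samples
tone = np.sin(2 * np.pi * 100 * t)          # pure 100 Hz tone: one sharp peak
noise = np.random.default_rng(0).standard_normal(fs)  # white noise: flat spectrum

print(spectral_entropy(np.fft.rfft(tone)))   # low, close to 0
print(spectral_entropy(np.fft.rfft(noise)))  # high, close to 1
```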