Spectral entropy
Spectral entropy captures the "peakiness" of a spectrum: a spectrum with sharp peaks has low entropy, while a spectrum with a flat distribution has high entropy. The definition is based on the Shannon entropy.
Spectral entropy is computed from the power spectrum:

$$H = -\sum_{k=1}^{N} p_k \log_2 p_k$$

where:

$$p_k = \frac{S_k}{\sum_{j=1}^{N} S_j}$$

is the probability mass function (PMF) of the power spectrum: $S_k$ is the power at frequency bin $k$, and $N$ is the total number of frequency bins.
Normalization
The entropy is normalized by dividing by $\log_2 N$, the maximum entropy of a distribution over $N$ bins, so that the normalized value lies in $[0, 1]$:

$$H_n = \frac{-\sum_{k=1}^{N} p_k \log_2 p_k}{\log_2 N}$$
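As a quick numerical check of the normalization (an illustrative NumPy sketch; the helper name is not from the references), a perfectly flat power spectrum attains the maximum entropy $\log_2 N$ and therefore a normalized value of 1, while a spectrum with all power in one bin yields 0:

```python
import numpy as np

def normalized_spectral_entropy(ps: np.ndarray) -> float:
    """Normalized Shannon entropy of a power spectrum `ps` (illustrative helper)."""
    p = ps / np.sum(ps)          # PMF over the N frequency bins
    p = p[p > 0]                 # 0 * log2(0) is taken as 0
    return -np.sum(p * np.log2(p)) / np.log2(len(ps))

flat = np.ones(8)                                           # evenly spread power
peak = np.array([0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # one sharp peak

print(normalized_spectral_entropy(flat))   # 1.0: maximum entropy
print(normalized_spectral_entropy(peak))   # ~0.0: all power in a single bin
```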
Single-pass computation
The entropy is derived from the PMF of the power spectrum:

$$H = -\sum_{k=1}^{N} p_k \log_2 p_k, \qquad p_k = \frac{S_k}{T}, \qquad T = \sum_{j=1}^{N} S_j$$

However, the entropy formula can be reformulated to allow for a single-pass computation:

$$H = -\sum_{k=1}^{N} \frac{S_k}{T} \log_2 \frac{S_k}{T} = \log_2 T - \frac{1}{T} \sum_{k=1}^{N} S_k \log_2 S_k$$

Both $T$ and $\sum_{k} S_k \log_2 S_k$ can be accumulated in a single traversal of the spectrum.
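The reformulation can be sketched as a single loop that accumulates both sums at once (an illustrative sketch in plain Python, not code from the references; it returns the unnormalized entropy, which can be divided by $\log_2 N$ as above):

```python
import math

def spectral_entropy_single_pass(power_spectrum) -> float:
    """Entropy via H = log2(T) - (1/T) * sum(S_k * log2(S_k))."""
    total = 0.0       # T: running sum of all powers
    weighted = 0.0    # running sum of S_k * log2(S_k)
    for s in power_spectrum:
        if s > 0:     # bins with zero power contribute nothing (0 * log2(0) -> 0)
            total += s
            weighted += s * math.log2(s)
    if total == 0.0:
        return 0.0    # silent spectrum
    return math.log2(total) - weighted / total

print(spectral_entropy_single_pass([1.0, 1.0, 1.0, 1.0]))  # 2.0 (flat, 4 bins)
```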
References
- Misra, H., Ikbal, S., Bourlard, H., & Hermansky, H. (2004). Spectral entropy based feature for robust ASR. 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1. https://doi.org/10.1109/ICASSP.2004.1325955
- Peeters, G. (2004). A large set of audio features for sound description (similarity and classification) in the CUIDADO project. In CUIDADO IST Project Report (Vol. 54).
- Eyben, F. (2016). Real-time Speech and Music Classification by Large Audio Feature Space Extraction. https://doi.org/10.1007/978-3-319-27299-3
- https://en.wikipedia.org/wiki/Entropy_(information_theory)
Code
INFO
The following snippet is written in a generic and unoptimized manner. It aims to be comprehensible to programmers familiar with a variety of programming languages and may not represent the most efficient or idiomatic Python. Please refer to implementations for optimized versions in different programming languages.
```python
import numpy as np

def spectral_entropy(spectrum: np.ndarray) -> float:
    # Power spectrum: squared magnitude of each frequency bin
    ps = np.abs(spectrum) ** 2
    ps_sum = np.sum(ps)
    # A silent spectrum carries no information
    if ps_sum == 0.0:
        return 0.0
    # Probability mass function of the power spectrum
    p = ps / ps_sum
    # Drop zero-probability bins: 0 * log2(0) is taken as 0
    p = p[p != 0]
    # Shannon entropy, normalized by the maximum entropy log2(N)
    return -np.sum(p * np.log2(p)) / np.log2(len(ps))
```
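A short usage sketch (the function is repeated so the example is self-contained; the test signals are illustrative): a pure tone concentrates power in a single frequency bin and scores near 0, while white noise spreads power across all bins and scores near 1.

```python
import numpy as np

def spectral_entropy(spectrum: np.ndarray) -> float:
    # Same as the snippet above, minus the empty-spectrum guard
    ps = np.abs(spectrum) ** 2
    p = ps / np.sum(ps)
    p = p[p != 0]
    return -np.sum(p * np.log2(p)) / np.log2(len(ps))

fs = 1000                                   # sample rate in Hz (illustrative)
t = np.arange(fs) / fs                      # one second of samples
tone = np.sin(2 * np.pi * 100 * t)          # pure 100 Hz tone: one sharp peak
noise = np.random.default_rng(0).standard_normal(fs)  # white noise: flat spectrum

print(spectral_entropy(np.fft.rfft(tone)))   # low, close to 0
print(spectral_entropy(np.fft.rfft(noise)))  # high, close to 1
```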