I’ve been reading a little bit about audio analysis and doing a lot of searching for audio libraries. There are two audio analysis methods I want to compute.

1. Compute the loudness of an audio file. That is add up the PCM output and divide by the length in seconds of the audio file. This outputs the average loudness of the file. It’s is an easy measure to compute and might be useful for categorizing music (I haven’t yet tried sorting my music by this “loudness” measure).

2. Do a spectral analysis of the audio file. That is compute the discrete (fast) fourier transform of the file and go from there. As the wikipedia article shows the discrete fourier transform requires significant analysis to get good spectral components for an audio file (see the Spectral Analysis section). For audio analysis there is also the Discrete Time Fourier Transform which is just a variant of the discrete fourier transform.

As of right now I’ve written the code to do **1.** but I don’t have anything for **2.**. Before I get too excited about doing the spectral analysis of audio files I need to read Example Applications DFT. And also brush up on my fourier analysis.

For implementing the spectral analysis I’m going to use the FFTW library developed at MIT.

I’ve listed below the packages and libraries for audio analysis and fourier analysis that I came across in my search.

SciPy (main python scientific platform (fourier analysis uses the FFTW library), CLAM (large audio analysis framework, lots of dependencies), Baudline Spectrum Analyzer, Matlab (no explanation needed), Octave (Matlab clone), FFTW.

I’m going to use the FFTW library for my program because the other libraries such as scipy etc. require 10’s of dependencies. And I want my program to rely on only a few dependencies. If writing C code gets too onerous I’ll switch to SciPy.