Tools

openSMILE

Authors: Florian Eyben, Martin Wöllmer, Björn Schuller

The openSMILE tool extracts large audio feature spaces in real time. SMILE is an acronym for Speech & Music Interpretation by Large Space Extraction. It is written in C++ and is available both as a standalone command-line executable and as a dynamic library (a GUI version is to follow soon). The main strengths of openSMILE are its on-line incremental processing and its modularity. Feature extractor components can be freely interconnected to create new, custom features, all via a simple configuration file. New components can be added to openSMILE through a plugin interface and a comprehensive API.
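
To illustrate what incremental processing with interconnected components means, here is a minimal conceptual sketch in Python. It is not openSMILE's API or its configuration format: the names framer, rms_energy, moving_average and sample_blocks are hypothetical and exist only for this example. Each stage consumes the output of the previous one and emits results as soon as enough input has arrived, rather than waiting for the whole recording.

```python
import math


def framer(blocks, frame_size, hop):
    """Emit overlapping frames as soon as enough samples have arrived."""
    buffer = []
    for block in blocks:
        buffer.extend(block)
        while len(buffer) >= frame_size:
            yield buffer[:frame_size]
            del buffer[:hop]  # advance by one hop, keep the overlap


def rms_energy(frames):
    """Compute the root-mean-square energy of each incoming frame."""
    for frame in frames:
        yield math.sqrt(sum(x * x for x in frame) / len(frame))


def moving_average(values, length):
    """Smooth a contour using only the most recent `length` values."""
    window = []
    for value in values:
        window.append(value)
        if len(window) > length:
            window.pop(0)
        yield sum(window) / len(window)


def sample_blocks():
    """Hypothetical stand-in for a live audio source delivering 16-sample blocks."""
    signal = [math.sin(0.1 * n) for n in range(64)]
    for start in range(0, len(signal), 16):
        yield signal[start:start + 16]


# Components are chained freely; nothing is computed until samples arrive.
pipeline = moving_average(
    rms_energy(framer(sample_blocks(), frame_size=32, hop=16)),
    length=3,
)
contour = list(pipeline)
```

Because every stage is a generator, swapping rms_energy for another per-frame function, or inserting a further stage, does not require changes to the other stages. This loosely mirrors the modularity described above, under the stated assumptions.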

openSMILE is free software licensed under the GPL and is currently available in a pre-release state via Subversion (http://subversion.tigris.org/). Commercial licensing options are available upon request.

To directly check out the Subversion repository, type the following command in a command-line prompt on a system where SVN is installed:
   svn co https://opensmile.svn.sourceforge.net/svnroot/opensmile opensmile

If you use openSMILE for your research, please cite the following paper:

Florian Eyben, Martin Wöllmer, Björn Schuller: “openSMILE – The Munich Versatile and Fast Open-Source Audio Feature Extractor”, Proc. ACM Multimedia (MM), ACM, Firenze, Italy, 25.-29.10.2010.

A brief summary of openSMILE’s features is given here:

  • Cross-platform (Windows, Linux, Mac)
  • Fast and efficient incremental processing in real-time
  • High modularity and reusability of components
  • Plugin support
  • Multi-threading support for parallel feature extraction 
  • Audio I/O:
    • WAVE file reader/writer
    • Sound recording and playback via the PortAudio library
    • Acoustic echo cancellation for full duplex recording/playback in an open-microphone setting
  • General audio signal processing:
    • Windowing functions (Hamming, Hann, Gauss, Sine, …)
    • Fast Fourier Transform (FFT)
    • Pre-emphasis filter
    • Comb filter (available soon)
    • FIR/IIR filter (available soon)
    • Autocorrelation
    • Cepstrum
  • Extraction of speech-related features:
    • Signal energy
    • Loudness (pseudo)
    • Mel-spectra
    • MFCC
    • Pitch
    • Voice quality
    • Formants (available soon)
    • LPC (available soon)
  • Music-related features:
    • Pitch classes (semitone spectrum)
    • Chroma features
    • Chroma based CENS features
    • Tatum and Meter vector
  • Moving average smoothing of feature contours
  • Moving average mean subtraction (e.g. for on-line cepstral mean subtraction)
  • Delta regression coefficients of arbitrary order
  • Functionals:
    • Means, Extremes
    • Moments
    • Segments
    • Peaks
    • Linear and quadratic regression
    • Percentiles
    • Durations
    • Onsets
    • DCT coefficients
  • Popular feature file formats supported:
    • Hidden Markov Toolkit (HTK) parameter files (write) 
    • WEKA ARFF files (read/write; currently only non-sparse)
    • Comma separated value (CSV) text
    • LibSVM feature file format
  • Fully HTK compatible MFCC, energy, and delta regression coefficient computation
  • Fast: 6k features extracted at a real-time factor (RTF) of 0.02, i.e. about 0.02 s of processing time per second of audio
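
The delta regression coefficients listed above can be sketched as follows. This is a minimal pure-Python illustration of the regression formula documented in the HTK Book, d_t = sum_{k=1..W} k (c_{t+k} - c_{t-k}) / (2 sum_{k=1..W} k^2), with the first and last frames repeated at the edges. It is not taken from openSMILE's source code; the function name delta_regression and its window parameter are assumptions made for this example.

```python
def delta_regression(frames, window=2):
    """Delta regression coefficients for a list of feature frames.

    frames: list of equal-length lists (one feature vector per frame).
    window: regression half-width W (number of frames on each side).
    """
    n = len(frames)
    if n == 0:
        return []
    dims = len(frames[0])
    denom = 2.0 * sum(k * k for k in range(1, window + 1))

    def frame_at(t):
        # Repeat the first/last frame beyond the edges of the sequence.
        return frames[min(max(t, 0), n - 1)]

    deltas = []
    for t in range(n):
        row = []
        for d in range(dims):
            num = sum(
                k * (frame_at(t + k)[d] - frame_at(t - k)[d])
                for k in range(1, window + 1)
            )
            row.append(num / denom)
        deltas.append(row)
    return deltas
```

Coefficients of higher order can be obtained by applying the function to its own output, for example delta_regression(delta_regression(frames)) for second-order (acceleration) coefficients.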

Acknowledgment: openSMILE’s development has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement No. 211486 (SEMAINE).