Data

Databases used in the Interspeech Computational Paralinguistics Challenge (ComParE) series are usually owned by individual donors. End User License Agreements (EULAs) are usually provided for participation in the Challenge. Any use of the databases outside the Challenge must always be negotiated with the data owners, not with the organisers of the Challenge. We aim to provide contact information for each database; however, this requires the consent of the data owners, which we are currently collecting.

Below, a description of the current 2021 data will soon be given. All of these corpora provide realistic data in challenging acoustic conditions. They also feature rich annotation such as speaker meta-data, transcripts, and segmentation, and are partitioned into training, development, and test data, observing subject independence (no subject appears in more than one partition). As in previous years, benchmark results for the most popular approaches will be provided using open-source toolkits, including Deep Learning and Bag-of-Audio-Words baselines.
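
To illustrate what subject independence means for a partitioning, the following is a minimal Python sketch. It does not reproduce the procedure used to partition the Challenge corpora, which is not described here. The function name `speaker_independent_split`, the partition names, and the 60/20/20 proportions are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict


def speaker_independent_split(speaker_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign item indices to train/devel/test so that no speaker spans two partitions.

    speaker_ids: one speaker label per recording (hypothetical input format).
    fractions:   approximate share of *speakers* (not recordings) per partition.
    """
    # Group recording indices by speaker.
    by_speaker = defaultdict(list)
    for index, speaker in enumerate(speaker_ids):
        by_speaker[speaker].append(index)

    # Shuffle the speakers reproducibly, then cut the speaker list into three parts.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)
    n_train = round(fractions[0] * len(speakers))
    n_devel = round(fractions[1] * len(speakers))
    groups = {
        "train": speakers[:n_train],
        "devel": speakers[n_train:n_train + n_devel],
        "test": speakers[n_train + n_devel:],
    }

    # Every recording follows its speaker into exactly one partition.
    return {
        name: sorted(i for speaker in members for i in by_speaker[speaker])
        for name, members in groups.items()
    }


# Toy example: 50 recordings from 10 hypothetical speakers.
toy_speaker_ids = ["spk%02d" % (i % 10) for i in range(50)]
toy_split = speaker_independent_split(toy_speaker_ids)
```

Because the split is made over speakers rather than over recordings, partition sizes in recordings are only approximate when speakers contribute different amounts of data.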