The Cross-Domain Mosquito Species Classification (CD-MSC) task focuses on mosquito species recognition under domain shift. Participants train systems on recordings collected from multiple source domains and test whether they can still recognise mosquito species when recording conditions change across location, device, or acoustic environment.
Description
Mosquito-borne diseases affect more than one billion people each year and cause close to one million deaths. Traditional mosquito surveillance relies on traps and manual species identification. That process is slow, labour-intensive, difficult to scale, and can expose field workers to infection risk. Audio-based mosquito monitoring offers a lower-cost and more scalable complement, but robust mosquito species classification remains difficult under real recording conditions.
Mosquito flight tones are narrow-band, often low in signal-to-noise ratio, and easily masked by background noise. Recordings for several epidemiologically relevant species are also limited. Variation across devices, environments, and collection protocols further increases the difficulty of reliable classification. In practice, a model may perform well under familiar conditions but fail when evaluated on recordings collected in different settings.
CD-MSC is designed to study this problem directly. In this task, a domain refers to an acquisition condition associated with the recording source, including differences in location, device, or acoustic environment. The task therefore evaluates not only whether a system can recognise mosquito species, but also whether it can generalise across domains.
Development dataset
The released development dataset is provided for the CD-MSC task. It is fully open and supports transparent baseline reproduction, model development, and custom split design.
Each audio file follows the naming format:
S_<speciesID>_D_<domainID>_<clipIndex>
File Structure:
Development_data
├── raw_audio
│   └── S_<speciesID>_D_<domainID>_<clipIndex>.wav
└── metadata
    ├── TrainVal_ids.txt
    ├── Training_ids.txt
    ├── Validation_ids.txt
    ├── Test_ids.txt
    └── split_summary.json
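Given the naming scheme above, species and domain labels can be recovered directly from file names. A minimal sketch (the helper name is ours, not part of the released code; it assumes the clip index is a decimal integer, possibly zero-padded):

```python
import re

def parse_clip_id(clip_id: str):
    """Extract (species_id, domain_id, clip_index) from an ID like 'S_3_D_5_001042'."""
    m = re.fullmatch(r"S_(\d+)_D_(\d+)_(\d+)(?:\.wav)?", clip_id)
    if m is None:
        raise ValueError(f"Unexpected clip ID format: {clip_id}")
    species_id, domain_id, clip_index = (int(g) for g in m.groups())
    return species_id, domain_id, clip_index

print(parse_clip_id("S_3_D_5_001042.wav"))  # (3, 5, 1042)
```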
Dataset overview
| Item | Value |
|---|---|
| Number of domains | 5 |
| Number of species | 9 |
| Total number of clips | 271380 |
| Total duration | 218388.40 seconds (60.66 hours) |
Domain Distribution
| Domain | Number of clips |
|---|---|
| D1 | 4065 |
| D2 | 784 |
| D3 | 679 |
| D4 | 200 |
| D5 | 265652 |
Species Distribution
| Species | Species ID | Number of clips |
|---|---|---|
| Aedes aegypti | 1 | 81587 |
| Aedes albopictus | 2 | 18517 |
| Culex quinquefasciatus | 3 | 72056 |
| Anopheles gambiae | 4 | 46998 |
| Anopheles arabiensis | 5 | 21117 |
| Anopheles dirus | 6 | 127 |
| Culex pipiens | 7 | 29754 |
| Anopheles minimus | 8 | 550 |
| Anopheles stephensi | 9 | 674 |
The released development dataset is uneven across both species and domains. Participants are encouraged to consider both class balance and domain balance during model development.
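One common way to account for such imbalance (an illustration, not part of the released baseline) is inverse-frequency class weighting, where rare species contribute proportionally more to the training loss:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / class frequency.

    Scaled so that a perfectly balanced label set gets weight 1.0 per class.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * counts[c]) for c in counts}

# Toy example: class 3 is 3x rarer than class 1, so it gets 3x the weight.
weights = inverse_frequency_weights([1, 1, 1, 2, 2, 3])
```

The same idea applies at the domain level, e.g. to keep D5 (which dominates the clip counts) from dominating each training batch.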
Download
Task setup
The development set is divided into two main partitions: training/validation (trainval) and test. The trainval set is intended for model development and local validation. The test set is used to analyse cross-domain performance under the released development setting.
Participants are welcome to use the species and domain information encoded in the audio IDs to construct alternative domain-aware development splits.
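For example, a leave-one-domain-out split can be built from the IDs alone. A sketch assuming the released `S_<speciesID>_D_<domainID>_<clipIndex>` naming scheme (the function name is ours):

```python
def leave_one_domain_out(clip_ids, held_out_domain):
    """Split clip IDs into train and held-out lists by the domain tag in each ID."""
    train, held_out = [], []
    for cid in clip_ids:
        # ID fields: S, <speciesID>, D, <domainID>, <clipIndex>
        domain = int(cid.split("_")[3])
        (held_out if domain == held_out_domain else train).append(cid)
    return train, held_out

ids = ["S_1_D_1_0001", "S_2_D_5_0002", "S_3_D_5_0003"]
train, held = leave_one_domain_out(ids, held_out_domain=5)
```

Holding out each domain in turn gives a local estimate of unseen-domain performance before the evaluation set is released.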
TrainVal set
| Domain | Ae.aeg | Ae.alb | Cx.qui | An.gam | An.ara | An.dir | Cx.pip | An.min | An.ste | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| D1 | 111 | 73 | 0 | 0 | 0 | 87 | 59 | 200 | 200 | 730 |
| D2 | 0 | 0 | 0 | 0 | 0 | 0 | 60 | 0 | 200 | 260 |
| D3 | 0 | 17 | 0 | 0 | 0 | 0 | 0 | 200 | 200 | 417 |
| D4 | 20 | 0 | 12 | 0 | 0 | 0 | 0 | 51 | 0 | 83 |
| D5 | 73297 | 16575 | 64838 | 42298 | 19005 | 0 | 26660 | 0 | 0 | 242673 |
Test set
| Domain | Ae.aeg | Ae.alb | Cx.qui | An.gam | An.ara | An.dir | Cx.pip | An.min | An.ste | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| D1 | 12 | 6 | 672 | 818 | 1820 | 0 | 7 | 0 | 0 | 3335 |
| D2 | 0 | 419 | 0 | 0 | 0 | 0 | 6 | 99 | 0 | 524 |
| D3 | 192 | 2 | 0 | 0 | 0 | 0 | 68 | 0 | 0 | 262 |
| D4 | 2 | 0 | 1 | 0 | 0 | 40 | 0 | 0 | 74 | 117 |
| D5 | 7953 | 1425 | 6533 | 3882 | 292 | 0 | 2894 | 0 | 0 | 22979 |
Evaluation dataset
The evaluation dataset will be released according to the challenge timeline. Please follow the official schedule for updates.
Evaluation metrics
The official evaluation reports:
- BAseen: balanced accuracy on clips from seen domains
- BAunseen: balanced accuracy on clips from unseen domains
- DSG = |BAunseen - BAseen|: domain shift gap
For the baseline development test set, each clip is treated as one sample and produces one species prediction. The main evaluation target is species classification.
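The metrics can be reproduced in a few lines of NumPy. This sketch assumes balanced accuracy is defined as the mean per-class recall, the usual convention (equivalent to scikit-learn's `balanced_accuracy_score`):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

def domain_shift_gap(ba_seen, ba_unseen):
    """DSG = |BAunseen - BAseen|: lower means better cross-domain robustness."""
    return abs(ba_unseen - ba_seen)
```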
Task Rules
- Participants may use publicly available pre-trained models and representations. However, external labelled mosquito data are not permitted.
- Participants may use ensembles of multiple models.
- Participants may define their own development split strategy. The released baseline split is provided only to reproduce the reference baseline and does not restrict participant system development.
If you have any questions, please contact the organizers.
Baseline system
System description
The released baseline provides a fully open and reproducible reference for the CD-MSC task. It is designed to be lightweight, easy to run, and easy to extend.
The baseline is built on:
- 8 kHz input audio
- 64-bin log-mel spectrogram features
- a lightweight multitemporal resolution convolutional neural network (MTRCNN) with 0.22 million parameters
- a primary species-classification head
- an auxiliary domain-classification head
The MTRCNN baseline is well suited to CD-MSC because it can process audio clips of arbitrary length, provided each clip is longer than 1.1 s. This is particularly convenient for the task, where clip duration may vary across recordings. Variable-length clips are handled through dynamic padding and masking, so participants can work with the released data without enforcing a single fixed clip length at inference time.
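Dynamic padding and masking can be sketched as a simple batch-collation step. This illustration assumes per-clip feature arrays of shape (time, n_mels) and is not the baseline's exact implementation:

```python
import numpy as np

def pad_and_mask(batch):
    """Pad variable-length (time, n_mels) arrays to the batch maximum.

    Returns the padded features plus a boolean mask marking valid
    (non-padded) frames, so padded frames can be ignored downstream.
    """
    max_len = max(x.shape[0] for x in batch)
    n_mels = batch[0].shape[1]
    feats = np.zeros((len(batch), max_len, n_mels), dtype=np.float32)
    mask = np.zeros((len(batch), max_len), dtype=bool)
    for i, x in enumerate(batch):
        feats[i, : x.shape[0]] = x
        mask[i, : x.shape[0]] = True
    return feats, mask

# Two clips of different lengths, 64 mel bins each.
a = np.ones((10, 64), dtype=np.float32)
b = np.ones((6, 64), dtype=np.float32)
feats, mask = pad_and_mask([a, b])
```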
Source code
The baseline is provided as a transparent starting point for participants.
Training Setup
The released baseline is trained under a fixed and reproducible default setup:
- optimiser: AdamW
- learning rate: 0.001
- training batch size: 64
- evaluation batch size: 8
- maximum epochs: 100
- early stopping starts after epoch 10
- early stopping patience: 5 epochs
- model selection metric: validation species balanced accuracy
The released code also supports repeated experiments with fixed random seeds, and the default setup uses 10 seeds.
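The early-stopping rule above can be sketched as follows. This is one plausible reading of the setup (patience counting begins only after the warm-up epoch), not the released code:

```python
class EarlyStopper:
    """Early stopping on a validation metric (e.g. species balanced accuracy).

    Patience counting starts only after `warmup` epochs; training stops once
    the metric has not improved for `patience` consecutive monitored epochs.
    """

    def __init__(self, warmup=10, patience=5):
        self.warmup = warmup
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, epoch, val_metric):
        """Return True if training should stop after this (1-indexed) epoch."""
        if val_metric > self.best:
            self.best = val_metric
            self.bad_epochs = 0
        elif epoch > self.warmup:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Demo with a short warm-up so the behaviour is visible in a few epochs.
stopper = EarlyStopper(warmup=2, patience=2)
metrics = [0.5, 0.6, 0.55, 0.55, 0.55]
stops = [stopper.step(epoch, m) for epoch, m in enumerate(metrics, start=1)]
```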
Participants are welcome to use the released setup directly or adapt it to develop stronger systems for cross-domain mosquito species classification.
Baseline performance
Official cross-domain results on the development test split are shown below.
| Checkpoint | BAseen | BAunseen | DSG |
|---|---|---|---|
| Best-validation checkpoint | 0.8806 ± 0.0108 | 0.1751 ± 0.0197 | 0.7055 ± 0.0248 |
| Final checkpoint | 0.8822 ± 0.0097 | 0.1704 ± 0.0180 | 0.7118 ± 0.0235 |
The released baseline performs strongly on seen domains but degrades markedly on unseen domains. This result shows that cross-domain generalisation remains the central challenge of the CD-MSC task.
Citation
If you use the development dataset, the released baseline, or refer to the BioDCASE 2026 Cross-Domain Mosquito Species Classification task, please cite the following paper.
BioDCASE 2026 CD-MSC Baseline
MTRCNN model: 📄 PDF
@INPROCEEDINGS{10890031,
  author={Hou, Yuanbo and Ren, Qiaoqiao and Wang, Wenwu and Botteldooren, Dick},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title={Sound-Based Recognition of Touch Gestures and Emotions for Enhanced Human-Robot Interaction},
  year={2025},
  pages={1-5},
  doi={10.1109/ICASSP49660.2025.10890031}
}
Support
For participant questions related to this task, please feel free to contact us. Slack and WeChat groups are available for discussion and support.
- Yuanbo Hou, Machine learning research group, University of Oxford,
- Vanja Zdravkovic, University of Oxford,