More details can be found on the task page.
Teams ranking
| Submission label | Name | Technical report | Official rank | Rank score | Test MSE (avg) | Test MSE (ARU) | Test MSE (Zebra Finch) | Validation MSE (avg) | Validation MSE (ARU) | Validation MSE (Zebra Finch) |
|---|---|---|---|---|---|---|---|---|---|---|
| NoSync | Nosync Baseline | biodcase2025Task1 | 5 | 1.35 | 1.35 | 0.85 | 1.84 | 1.15 | 0.98 | 1.31 |
| CCBaseline | Crosscorrelation Baseline | biodcase2025Task1 | 7 | 3.32 | 3.32 | 1.10 | 5.55 | 8.45 | 6.86 | 10.03 |
| DLBaseline | Deep Learning Baseline | biodcase2025Task1 | 2 | 0.58 | 0.58 | 0.62 | 0.55 | 0.89 | 0.52 | 1.26 |
| landmarks | Landmark based synchronization | harjuLandmark2025 | 6 | 1.60 | 1.60 | 1.17 | 2.04 | 0.84 | 0.41 | 1.27 |
| BEATsCA | BEATs with Cross-Attention | nihalBeats2025 | 1 | 0.30 | 0.30 | 0.14 | 0.45 | 0.31 | 0.10 | 0.52 |
| CP-GlobalLag | ChronoPrint (GlobalLag) | bhattacharjee2025 | 8 | 3.93 | 3.93 | 0.62 | 7.23 | 5.30 | 0.39 | 10.20 |
| CP-MedianLag_Regress | ChronoPrint (MedianLag + Regression) | bhattacharjee2025 | 4 | 1.33 | 1.33 | 0.23 | 2.43 | 2.96 | 0.66 | 5.26 |
| CP-MedianLag | ChronoPrint (MedianLag) | bhattacharjee2025 | 3 | 1.29 | 1.29 | 0.36 | 2.22 | 3.00 | 0.26 | 5.74 |
Technical reports
Granular Fingerprinting for Temporal Alignment of ARU Recordings
Bhattacharjee, Aditya
Queen Mary University of London
Abstract
This report presents a submission to the BioDCASE 2025 challenge task on temporal alignment of recordings from autonomous recording units (ARUs). We approach the problem as one of granular fingerprinting, learning invariant audio embeddings at a fine temporal resolution. Our method leverages a self-supervised contrastive framework designed to capture alignment-robust features in short overlapping audio segments across asynchronous sensor recordings. The contrastive setup is trained using the FSD50K dataset with artificial mixtures of "noisy" data points. The alignment is achieved in a zero-shot fashion by inferring keypoints using a combination of cosine-similarity-based lag calculation and linear regression.
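As a rough illustration of the zero-shot alignment step described in the abstract, the sketch below computes a cosine-similarity-based lag between two sequences of per-segment embeddings and then fits a linear drift model over keypoint lags. It assumes L2-normalized embeddings stored as NumPy arrays; the function names and parameters are illustrative, not taken from the submission.

```python
import numpy as np

def best_lag(emb_a: np.ndarray, emb_b: np.ndarray, max_lag: int) -> int:
    """Return the frame lag that maximizes mean cosine similarity between
    two sequences of L2-normalized segment embeddings (shape: [frames, dim])."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        # A positive lag pairs emb_a[lag + i] with emb_b[i].
        a = emb_a[max(lag, 0):]
        b = emb_b[max(-lag, 0):]
        n = min(len(a), len(b))
        if n == 0:
            continue
        # Dot products of unit vectors are cosine similarities.
        scores[lag] = float(np.mean(np.sum(a[:n] * b[:n], axis=1)))
    return max(scores, key=scores.get)

def fit_drift(keypoint_times: np.ndarray, keypoint_lags: np.ndarray):
    """Fit offset(t) = slope * t + intercept across keypoints, modeling a
    slowly drifting clock rather than a single global offset."""
    slope, intercept = np.polyfit(keypoint_times, keypoint_lags, deg=1)
    return slope, intercept
```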
BioDCASE 2025 Task 1
Hoffman, Benjamin and Gill, Lisa and Heath, Becky and Narula, Gagan
Abstract
Coming soon!
Landmark-based synchronization for Drifting Audio Recordings
Harju, Manu and Mesaros, Annamaria
Tampere University
Abstract
This technical report describes our submissions to the multichannel alignment task in the BioDCASE 2025 Challenge. Our system is based on matching and aligning audio landmarks, which are simple structures extracted from spectrogram representations. Our code and the configuration used are available on GitHub.
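As a hedged sketch of what landmark matching can look like (the submission's actual landmark structures and configuration are in the authors' GitHub repository; the names and thresholds below are illustrative), one can take local maxima of a magnitude spectrogram as landmarks and vote for the time offset that co-locates the most of them:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def extract_landmarks(spec: np.ndarray, size: int = 15) -> set[tuple[int, int]]:
    """Use local maxima of a magnitude spectrogram (freq x time) as simple
    time-frequency landmarks; the mean threshold suppresses flat regions."""
    peaks = (spec == maximum_filter(spec, size=size)) & (spec > spec.mean())
    freqs, times = np.nonzero(peaks)
    return set(zip(freqs.tolist(), times.tolist()))

def estimate_offset(lm_ref: set, lm_other: set) -> int:
    """Vote for the time offset under which the most landmarks from one
    recording coincide (same frequency bin) with landmarks in the other."""
    by_freq: dict[int, list[int]] = {}
    for f, t in lm_ref:
        by_freq.setdefault(f, []).append(t)
    votes: dict[int, int] = {}
    for f, t in lm_other:
        for t_ref in by_freq.get(f, []):
            d = t_ref - t
            votes[d] = votes.get(d, 0) + 1
    return max(votes, key=votes.get)
```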
BEATs with Cross-Attention for Multi-Channel Audio Alignment
Nihal, Ragib Amin and Yen, Benjamin and Ashizawa, Takeshi and Nakadai, Kazuhiro
Institute of Science Tokyo
Abstract
We modified the provided BEATs baseline with three main changes. First, we added cross-attention layers that allow audio embeddings from different channels to interact before alignment predictions are made. Second, we improved the training process with better data sampling, conservative augmentation (amplitude scaling and noise addition), and AdamW optimization with learning-rate scheduling. Third, we replaced the baseline's binary counting similarity metric with confidence-weighted scoring that uses the full range of model outputs. The system uses the same candidate generation approach as the baseline but processes alignment decisions differently. On validation data, the method achieved an MSE of 0.099 for ARU and 0.521 for zebra finch.
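As a hedged PyTorch sketch of the two ideas named above, channels interacting through cross-attention before the alignment head, and confidence-weighted scoring in place of binary counting (layer sizes and names here are illustrative, not the submission's):

```python
import torch
import torch.nn as nn

class ChannelCrossAttention(nn.Module):
    """Let one channel's frame embeddings attend to another channel's
    embeddings before alignment prediction. The 768-dim size matches
    typical BEATs embeddings but is an assumption here."""
    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
        # x_a, x_b: (batch, frames, dim) embeddings from two channels.
        attended, _ = self.attn(query=x_a, key=x_b, value=x_b)
        return self.norm(x_a + attended)  # residual + layer norm

def confidence_weighted_score(probs: torch.Tensor) -> torch.Tensor:
    """Score an alignment candidate by summing raw match probabilities,
    so confident frames contribute more. A binary-counting baseline
    would instead compute (probs > 0.5).sum()."""
    return probs.sum()
```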