Multichannel Alignment


Challenge results

More details can be found on the task page:

Task description page

Teams ranking

| Rank | Submission label | Name | Technical report | Rank score | Test MSE (avg) | Test MSE (ARU) | Test MSE (Zebra Finch) | Validation MSE (avg) | Validation MSE (ARU) | Validation MSE (Zebra Finch) |
|------|------------------|------|------------------|------------|----------------|----------------|------------------------|----------------------|----------------------|------------------------------|
| 1 | BEATsCA | BEATs with Cross-Attention | nihalBeats2025 | 0.30 | 0.30 | 0.14 | 0.45 | 0.31 | 0.10 | 0.52 |
| 2 | DLBaseline | Deep Learning Baseline | biodcase2025Task1 | 0.58 | 0.58 | 0.62 | 0.55 | 0.89 | 0.52 | 1.26 |
| 3 | CP-MedianLag | ChronoPrint (MedianLag) | bhattacharjee2025 | 1.29 | 1.29 | 0.36 | 2.22 | 3.00 | 0.26 | 5.74 |
| 4 | CP-MedianLag_Regress | ChronoPrint (MedianLag + Regression) | bhattacharjee2025 | 1.33 | 1.33 | 0.23 | 2.43 | 2.96 | 0.66 | 5.26 |
| 5 | NoSync | Nosync Baseline | biodcase2025Task1 | 1.35 | 1.35 | 0.85 | 1.84 | 1.15 | 0.98 | 1.31 |
| 6 | landmarks | Landmark based synchronization | harjuLandmark2025 | 1.60 | 1.60 | 1.17 | 2.04 | 0.84 | 0.41 | 1.27 |
| 7 | CCBaseline | Crosscorrelation Baseline | biodcase2025Task1 | 3.32 | 3.32 | 1.10 | 5.55 | 8.45 | 6.86 | 10.03 |
| 8 | CP-GlobalLag | ChronoPrint (GlobalLag) | bhattacharjee2025 | 3.93 | 3.93 | 0.62 | 7.23 | 5.30 | 0.39 | 10.20 |
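Reading the table, the rank score appears to be the test-set MSE averaged over the ARU and Zebra Finch subsets: it matches the test-set MSE (avg) column in every row. A trivial check with the winning BEATsCA row:

```python
# Rank score as the mean of the two per-dataset test MSEs (BEATsCA row).
mse_aru, mse_zebra_finch = 0.14, 0.45
rank_score = (mse_aru + mse_zebra_finch) / 2  # ~0.295, shown as 0.30
```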

Technical reports

Granular Fingerprinting for Temporal Alignment of ARU Recordings

Bhattacharjee, Aditya
Queen Mary University of London

Abstract

This report presents a submission to the BioDCASE 2025 challenge task on temporal alignment of recordings from autonomous recording units (ARUs). We approach the problem as one of granular fingerprinting, learning invariant audio embeddings at a fine temporal resolution. Our method leverages a self-supervised contrastive framework designed to capture alignment-robust features in short overlapping audio segments across asynchronous sensor recordings. The contrastive setup is trained using the FSD50K dataset with artificial mixtures of "noisy" data points. The alignment is achieved in a zero-shot fashion by inferring the keypoints using a combination of cosine similarity-based lag calculation and linear regression.
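The abstract leaves the final alignment step at a high level; below is a minimal sketch of one way to read it (cosine-similarity keypoint matching followed by a linear-regression fit). The array shapes, the 0.8 confidence threshold, and the `estimate_lag` helper are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def estimate_lag(emb_a, emb_b, hop_s):
    """Sketch: estimate channel B's offset relative to channel A from
    fine-grained ("granular") fingerprint embeddings, one row per segment."""
    # L2-normalise so the dot products below are cosine similarities.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                        # (n_a, n_b) similarity matrix

    # Keypoints: each frame of A paired with its best match in B,
    # keeping only confident pairs to suppress spurious matches.
    i = np.arange(len(a))
    j = sim.argmax(axis=1)
    keep = sim[i, j] > 0.8               # arbitrary illustrative threshold
    i, j = i[keep], j[keep]

    # Linear regression j ~ slope * i + intercept: the intercept gives a
    # constant lag (in frames), while a slope != 1 would indicate clock drift.
    slope, intercept = np.polyfit(i, j, deg=1)
    return intercept * hop_s, slope
```

With embeddings computed at, say, a 100 ms hop, `estimate_lag(emb_a, emb_b, 0.1)` would return the constant lag in seconds along with the fitted slope.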

PDF

BioDCASE 2025 Task 1

Hoffman, Benjamin and Gill, Lisa and Heath, Becky and Narula, Gagan

Abstract

Coming soon!

Landmark-based synchronization for Drifting Audio Recordings

Harju, Manu and Mesaros, Annamaria
Tampere University

Abstract

This technical report describes our submissions to the multichannel alignment task in the BioDCASE 2025 Challenge. Our system is based on matching and aligning audio landmarks, which are simple structures extracted from spectrogram representations. Our code and the configuration used are available on GitHub.
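The abstract does not define the landmarks beyond "simple structures extracted from spectrogram representations"; a common reading is local magnitude peaks, matched across channels by their most frequent time offset. The sketch below follows that assumption; the STFT settings, the peak-picking filter, and the `estimate_lag_frames` helper are illustrative, not the authors' configuration.

```python
import numpy as np
from scipy.ndimage import maximum_filter
from scipy.signal import stft

def landmarks(audio, sr):
    """Sketch: landmarks as local peaks of the magnitude spectrogram."""
    _, _, Z = stft(audio, fs=sr, nperseg=1024)
    mag = np.abs(Z)
    is_peak = (mag == maximum_filter(mag, size=(15, 15))) & (mag > mag.mean())
    freq_bins, frames = np.nonzero(is_peak)
    return freq_bins, frames

def estimate_lag_frames(audio_a, audio_b, sr, max_lag=500):
    """Sketch: lag of channel B (in STFT frames) via landmark matching."""
    fa, ta = landmarks(audio_a, sr)
    fb, tb = landmarks(audio_b, sr)
    # Collect time offsets between landmarks that share a frequency bin,
    # then take the most frequent offset as the lag estimate.
    offsets = []
    for f in np.intersect1d(fa, fb):
        for t in ta[fa == f]:
            offsets.extend(tb[fb == f] - t)
    offsets = np.asarray(offsets)
    offsets = offsets[np.abs(offsets) <= max_lag]
    counts, edges = np.histogram(offsets, bins=np.arange(-max_lag, max_lag + 2))
    return int(edges[counts.argmax()])
```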

PDF

BEATs with Cross-Attention for Multi-Channel Audio Alignment

Nihal, Ragib Amin and Yen, Benjamin and Ashizawa, Takeshi and Nakadai, Kazuhiro
Institute of Science Tokyo

Abstract

We modified the provided BEATs baseline with three main changes. First, we added cross-attention layers that allow audio embeddings from different channels to interact before alignment predictions are made. Second, we improved the training process with better data sampling, conservative augmentation (amplitude scaling and noise addition), and AdamW optimization with learning rate scheduling. Third, we replaced the baseline's binary counting similarity metric with confidence-weighted scoring that uses the full range of model outputs. The system uses the same candidate generation approach as the baseline but processes alignment decisions differently. On validation data, the method achieved an MSE of 0.099 for ARU and 0.521 for zebra finch.
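A rough sketch of the first change, cross-attention between per-channel embeddings ahead of the alignment decision, might look like the PyTorch module below. The embedding dimension, head count, residual layout, and scoring head are assumptions for illustration, not the submitted architecture.

```python
import torch
import torch.nn as nn

class ChannelCrossAttention(nn.Module):
    """Illustrative sketch: let frame-level embeddings from one channel
    attend to another channel's embeddings before scoring an alignment."""

    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(2 * dim, 1)   # e.g. scores one lag candidate

    def forward(self, emb_a, emb_b):
        # emb_a, emb_b: (batch, frames, dim) embeddings for the two
        # channels (e.g. from a BEATs encoder).
        attended, _ = self.attn(query=emb_a, key=emb_b, value=emb_b)
        attended = self.norm(emb_a + attended)          # residual connection
        # Pool over time and score the channel pair / lag candidate.
        pooled = torch.cat([attended.mean(dim=1), emb_b.mean(dim=1)], dim=-1)
        return self.head(pooled)
```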

PDF