Multi-Voice Intonation Adaptation via Gradient Descent

Schwär S, Müller M (2026)


Publication Type: Journal article

Publication year: 2026

Journal: IEEE Transactions on Audio, Speech and Language Processing

DOI: 10.1109/TASLPRO.2026.3675812

Abstract

Intonation in ensemble performances on instruments with flexible tuning involves a complex interaction between musicians shaped by musical context, acoustic conditions, and each performer's perception and preferences. In post-production, it is often desirable to compensate for unintended deviations while preserving expressive fluctuations and musically meaningful interaction between individual tracks and voices. In this paper, we formulate multi-voice intonation adaptation as a cost minimization problem, making three main contributions. First, we introduce a differentiable cost function that explicitly balances the adherence of each voice to an equal-tempered pitch grid and the harmonic fit between voices using sensory dissonance. Second, based on this differentiable cost function, we derive a gradient descent adaptation algorithm that produces smooth, time-varying pitch-shift curves without requiring score- or note-level information. We show how a small set of interpretable hyperparameters, including initialization, stopping criterion, step size, and momentum, allows for a controlled trade-off between the compensation of unintended intonation deviations and the preservation of expressive fluctuations. Third, we evaluate our method on string, wind, and vocal quartet multi-track recordings through objective and subjective experiments, demonstrating quality comparable to a commercial pitch-correction baseline while offering particular advantages in handling intonation drift and modeling interactions between voices. Beyond these results, the focus of this work is conceptual, making a typically heuristic post-production task transparent and controllable through an explicit cost-based optimization framework.
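
As a rough illustration of the cost-minimization idea described in the abstract, the following Python sketch adapts one static pitch shift per voice by gradient descent with momentum. It is not the authors' cost function or algorithm. The cosine grid term, the Plomp-Levelt style roughness term with Sethares-type constants, the 1/k partial amplitudes, the weights `w_grid` and `w_harm`, the use of central finite differences instead of an analytic gradient, and all function names are assumptions made for this example. It also handles a single analysis frame, whereas the paper describes smooth, time-varying pitch-shift curves.

```python
import math

def grid_cost(pitch):
    """Periodic penalty that is zero on every integer MIDI pitch (12-TET grid)."""
    return 1.0 - math.cos(2.0 * math.pi * pitch)

def midi_to_hz(pitch):
    """Convert a (fractional) MIDI pitch to frequency in Hz, with A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((pitch - 69.0) / 12.0)

def pair_dissonance(f1, f2, a1, a2):
    """Roughness of two partials (Plomp-Levelt style curve; constants are assumed)."""
    s = 0.24 / (0.0207 * min(f1, f2) + 18.96)
    d = abs(f2 - f1)
    return a1 * a2 * (math.exp(-3.51 * s * d) - math.exp(-5.75 * s * d))

def harmonic_cost(p1, p2, n_partials=6):
    """Summed roughness between the harmonic partials of two voices."""
    f1, f2 = midi_to_hz(p1), midi_to_hz(p2)
    total = 0.0
    for k in range(1, n_partials + 1):
        for m in range(1, n_partials + 1):
            total += pair_dissonance(k * f1, m * f2, 1.0 / k, 1.0 / m)
    return total

def total_cost(pitches, shifts, w_grid=1.0, w_harm=1.0):
    """Weighted sum of grid adherence per voice and roughness per voice pair."""
    shifted = [p + s for p, s in zip(pitches, shifts)]
    cost = w_grid * sum(grid_cost(x) for x in shifted)
    for i in range(len(shifted)):
        for j in range(i + 1, len(shifted)):
            cost += w_harm * harmonic_cost(shifted[i], shifted[j])
    return cost

def adapt(pitches, w_grid=1.0, w_harm=1.0, step=0.01, momentum=0.5,
          n_iter=200, eps=1e-4):
    """Return one pitch shift (in semitones) per voice after n_iter updates."""
    shifts = [0.0] * len(pitches)      # initialization: no correction
    velocity = [0.0] * len(pitches)
    for _ in range(n_iter):            # stopping criterion: fixed iteration count
        grad = []
        for i in range(len(shifts)):
            up = list(shifts)
            up[i] += eps
            down = list(shifts)
            down[i] -= eps
            # Central finite difference approximates the partial derivative.
            diff = (total_cost(pitches, up, w_grid, w_harm)
                    - total_cost(pitches, down, w_grid, w_harm))
            grad.append(diff / (2.0 * eps))
        velocity = [momentum * v - step * g for v, g in zip(velocity, grad)]
        shifts = [s + v for s, v in zip(shifts, velocity)]
    return shifts
```

Calling `adapt([60.3], w_harm=0.0)` uses only the grid term and pulls a voice that is 30 cents sharp back toward the nearest equal-tempered pitch. With several voices, for example `adapt([60.3, 64.2])`, the relative size of `w_grid` and `w_harm` in this toy model decides how much the result favors the grid over the fit between voices.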

How to cite

APA:

Schwär, S., & Müller, M. (2026). Multi-Voice Intonation Adaptation via Gradient Descent. IEEE Transactions on Audio, Speech and Language Processing. https://doi.org/10.1109/TASLPRO.2026.3675812

MLA:

Schwär, Simon, and Meinard Müller. "Multi-Voice Intonation Adaptation via Gradient Descent." IEEE Transactions on Audio, Speech and Language Processing (2026).
