Title: Deep Speaker Modeling: Theories, Applications and Practice
Presenters: Shuai Wang, Yanmin Qian, Haizhou Li

Part I: Foundations and Recent Advances (60 minutes)
- Foundational theories and review of traditional methods in speaker modeling
- Evolution of speaker representation techniques in the deep learning era
  - From i-vectors to deep speaker representations
- Applications of self-supervised and semi-supervised learning in speaker modeling
- Analysis of speaker representation capabilities in foundation speech models
  - Leveraging large pretrained models
Part II: Applications Beyond Recognition (60 minutes)
- Speaker-adaptive speech synthesis
- Voice cloning technologies and ethical considerations
- Speaker representation in few-shot and zero-shot speech synthesis
- Personalized voice conversion systems
- Speaker perception in multimodal human-computer interaction
- Target speaker speech processing
  - Target speaker extraction
  - Target speaker speech recognition
  - Target speaker verification
  - Personalized voice activity detection (VAD)
Part III: Challenges and Countermeasures (30 minutes)
- Domain adaptation and domain-invariant learning
- Privacy-preserving speaker representations
- Robustness and adversarial attack defense
- Computational efficiency and model compression
- Explainability techniques and methods
Part IV: Practical Implementation (30 minutes)
- Introduction to tools and frameworks
  - WeSpeaker toolkit for speaker embedding learning
  - WeSep toolkit for target speech extraction
- Case studies and demonstrations
- Interactive discussion and Q&A session