Prof. Dr. ir. Emanuël Habets

International Audio Laboratories Erlangen

Title: Advancing Speech Communication: From Spatial Capture to Runtime Adaptation

Abstract: Speech is at the heart of modern communication, from everyday voice and video calls to teleconferencing, telepresence, and AR/VR experiences. Yet making conversations sound natural across rooms, devices, and varying talker–microphone distances remains a challenge. In this talk, I will share how recent advances in speech and acoustic signal processing are bringing us closer to seamless, high-quality voice communication anywhere. We begin with spatial sound capture, revisiting classical beamforming before introducing a new neural beamforming approach. This learning-based method enables us to “point our ears” in any direction with enhanced robustness and precise control. Next, we explore acoustic teleportation, a technique that transforms a recording made in one environment so that it sounds as if it had been recorded in another. This is achieved by disentangling speech and acoustic embeddings. Finally, we discuss efficient deployment of these technologies on resource-constrained devices using slimmable neural networks, models that can adapt their size and computational complexity on the fly. Such architectures support both fixed (device-driven) and dynamic (data-driven) runtime adaptation, allowing resource-constrained devices to deliver advanced speech processing. Looking forward, these developments pave the way for more natural and immersive speech communication.

Emanuël Habets received the M.Sc. (2002) and Ph.D. (2007) degrees in Electrical Engineering from Eindhoven University of Technology (TU/e), The Netherlands. From 2007 to 2009, he was a Postdoctoral Fellow at the Technion—Israel Institute of Technology and Bar-Ilan University, Israel, and from 2009 to 2010, he was a Research Fellow in the Communications and Signal Processing Group at Imperial College London, U.K. He is currently a Full Professor at the International Audio Laboratories Erlangen—a joint institute of Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer IIS—and Head of the Department for Speech and Audio Research at Fraunhofer IIS (home of mp3), Germany.

Dr. Habets’ research focuses on speech and acoustic signal processing; he has authored over 100 journal papers and 250 conference papers. He was Co-Chair of the 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) in New Paltz, NY, and Co-Chair of the 2014 International Conference on Spatial Audio (ICSA) in Erlangen, Germany. He served on the IEEE Industry Digital Signal Processing Technology Standing Committee (2013–2015), as Associate Editor of IEEE Signal Processing Letters (2013–2017), and as Editor-in-Chief of the EURASIP Journal on Audio, Speech, and Music Processing (2016–2018). He is a founding member of the EURASIP Acoustic, Speech, and Music Signal Processing Technical Area Committee (Vice-Chair, 2015–2018; Chair, 2019–2021). With S. Gannot and I. Cohen, he received the 2014 IEEE Signal Processing Letters Best Paper Award. He currently serves on the Technical Program Committee of the 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), on the IEEE Audio and Acoustic Signal Processing and IEEE Speech and Language Processing Technical Committees, and as a Senior Area Editor for the IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Prof. Jane Wang

Fellow: IEEE, Canadian Academy of Engineering
Electrical and Computer Engineering Dept./Biomedical Engineering School
University of British Columbia (UBC), Vancouver, Canada

Title: Signal Processing Meets Deep Learning in Healthcare: A Case Study on Parkinson’s Disease (PD)

Abstract: The recent breakthroughs of artificial intelligence (AI), especially deep learning models, in numerous fields also come with big challenges: a lack of interpretability and explainability due to the “black box” nature of current deep learning models; vulnerability to adversarial attacks; and the scarcity of well-annotated data in real-world problems. Such challenges are particularly critical for biomedical and healthcare applications. We suggest exploring the intersection of traditional signal/image processing (SP/IP) and deep learning to make decision making clinically meaningful and explainable, combining domain knowledge with the learning ability of deep learning to mitigate the deficiencies of both traditional SP/IP and black-box deep learning approaches. In this talk, I will first provide an overview and then focus on illustrative research on Parkinson’s Disease (PD), e.g., pose-estimation-based assessment and monitoring of PD using video data. We propose innovative strategies (e.g., self-supervision, partial annotation, data synthesis) for training deep learning models that eliminate or reduce the need for explicitly annotated data. The talk will conclude by brainstorming future research directions.

Jane Wang received the B.Sc. degree from Tsinghua University in 1996 and the M.Sc. and Ph.D. degrees from the University of Connecticut (UConn) in 2000 and 2002, respectively, all in electrical engineering. While at UConn, she received the annual Outstanding Engineering Doctoral Student Award. She was a Research Associate at the University of Maryland, College Park, from 2002 to 2004. Since Aug. 2004, she has been with the ECE Dept. at the University of British Columbia (UBC), Canada, where she is currently a Professor. She is an IEEE Fellow, a Fellow of the Canadian Academy of Engineering (FCAE), and a member of the College of New Scholars, Artists and Scientists of the Royal Society of Canada.

Her research interests are in the broad areas of statistical signal processing and machine learning, with a current focus on digital media and biomedical data analytics. She co-received the EURASIP Journal on Applied Signal Processing (JASP) Best Paper Award (2004) and the IEEE SPS Best Paper Award (2005). She has published 200+ journal papers and 140+ peer-reviewed conference papers. She is a Co-Founder of Cortic Tech., a Vancouver startup focused on developing AI education products, whose team won the First Grand Prize in the OpenCV AI Competition 2021. Her professional service includes the following: she has been an Elected Member of several IEEE SPS TCs, including the Bio Imaging and Signal Processing TC, Multimedia SP TC, Machine Learning SP TC, and IFS TC (2015-2018), and has served on the IEEE Fellow Committee; she has been a key Organizing Committee Member for numerous IEEE conferences and workshops (e.g., Co-Technical Chair for ChinaSIP2014, GlobalSIP2017, ICIP2021, and ICIP2025, and Co-General Chair of MMSP2018, DSLW2021, and ICIP2026); and she has served as Associate Editor for the IEEE TSP, SPL, TMM, TIFS, TBME, and SPM, as Area Editor of SPM, and as Editor-in-Chief of IEEE SPL.

Prof. Xiao-Ping (Steven) Zhang

Tsinghua Shenzhen International Graduate School (SIGS), PhD, MBA, P.Eng., FIEEE, FEIC, FCAE

Title: Record and Represent Human Movement – A Type of Generic Sequential Symbolic Notation System

Abstract: Before the digital age, recording and representing knowledge with temporal characteristics, such as music and dance, was challenging. Brilliant inventions such as sheet music have played a crucial role in our cultural development. In this talk, we first review a symbolic notation for a type of human body movement, Labanotation. We then illustrate how to take advantage of this powerful symbolic notation to recognize body gesture elements using neural network learning. Further, we introduce a new symbolic notation system, namely HandLaban, to record and represent human hand movements. This type of generic sequential symbolic notation system translates between dynamic human movements and static sequences of symbols. It has great potential in AI applications such as digital humans, robotics, and LLM-based generative AI.

Xiao-Ping (Steven) Zhang received the B.S. and Ph.D. degrees from Tsinghua University in 1992 and 1996, respectively, both in electronic engineering. He holds an MBA in Finance and Economics with Honors from the University of Chicago Booth School of Business. He is Tsinghua Pengrui Chair Professor at Tsinghua Shenzhen International Graduate School (SIGS), Tsinghua University. He was the founding Dean of the Institute of Data and Information (iDI) at Tsinghua SIGS. He was previously with the Department of Electrical, Computer and Biomedical Engineering, Toronto Metropolitan University (formerly Ryerson University), Toronto, ON, Canada, as a Professor and the Director of the Communication and Signal Processing Applications Laboratory (CASPAL), and served as the Program Director of Graduate Studies. His research interests include sensor networks and IoT, machine learning/AI/robotics, statistical signal processing, image and multimedia content analysis, and applications in big data, finance, and marketing.

Dr. Zhang is a Fellow of the Canadian Academy of Engineering, a Fellow of the Engineering Institute of Canada, a Fellow of the IEEE, a registered Professional Engineer in Ontario, Canada, and a member of the Beta Gamma Sigma Honor Society. He is a General Co-Chair of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2021 and 2027. He was General Co-Chair of the 2017 GlobalSIP Symposium on Signal and Information Processing for Finance and Business and of the 2019 GlobalSIP Symposium on Signal, Information Processing and AI for Finance and Business. He was an elected Member of the ICME steering committee and served as General Chair of ICME2024 and BioCAS2023. He is Editor-in-Chief of the IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING. He served as Senior Area Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING and the IEEE TRANSACTIONS ON SIGNAL PROCESSING. He served as Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, the IEEE TRANSACTIONS ON SIGNAL PROCESSING, and the IEEE SIGNAL PROCESSING LETTERS. He was selected as an IEEE Distinguished Lecturer by the IEEE Signal Processing Society and by the IEEE Circuits and Systems Society.
