Research projects at the professorship
Ongoing projects:
Generative speaker-independent prosody modeling for articulatory speech synthesis
FKZ: | KK5049503FG3 |
Duration: | 01.01.2025 - 31.12.2028 |
Sponsor: | ESF Plus |
Short description: | The aim of the project is to improve the quality of speech synthesis through the use of a text generator. Information about the purpose, meaning and linguistic structure of the utterances is integrated into the synthesis process. In addition, the system should be flexibly and efficiently adaptable to different speakers, speaker groups or new domains. |
Variability of vowel formants due to the finite impedance of the glottis, the velopharyngeal port and the vocal tract walls
Duration: |
15.04.2024 - 14.04.2027 |
Funded by: |
German Research Foundation (DFG) |
Cooperation partners: |
Phoniatrics and Pedaudiology, Clinic of the Ludwig-Maximilians-University Munich (LMU) |
Short description: |
The project investigates how boundary conditions affect vowel formants, an effect that involves interactions between multiple mechanisms. Specifically, we study how finite glottal impedance and velopharyngeal opening shift the vowel resonance frequencies. To this end, we combine several methodologies, including computer simulations and physical models. We also collect and analyze different types of voices, such as pathological and singing voices, as well as typical speech by healthy speakers. |
Miniaturized radar sensor technology and analysis for silent communication (SIRANA)
Duration: |
|
Intelligent speech recognition based on diversified synthetic speech data
FKZ: | KK5049503FG3 |
Duration: |
01.08.2023 - 31.07.2026 |
Sponsor: | AiF Projekt GmbH, ZIM - Cooperation projects |
Cooperation partner: | Mediainterface GmbH |
Brief description: |
The aim of this project is to improve the accuracy of speech recognition systems by using diversified articulatory-synthetic speech data whose synthesis process can be interpreted and observed. The generation of more natural and realistic speech data is therefore the primary challenge. The aim is to be able to recognize spoken language with different accents, dialects and ways of speaking. To generate this diverse speech data, the articulatory speech synthesis model VocalTractLab is used to create different speaker models that differ in terms of anatomical and speech style-specific characteristics. By varying these parameters in a targeted manner, realistic yet controllable German utterances can be generated, providing a reliable basis for the evaluation and improvement of speech recognition systems. |
Multi-sensory non-invasive voice prosthetics using AI (MUSIK)
FKZ: | 100686372 |
Duration: |
01.06.2023 - 31.05.2026 |
Sponsor: | Chair of Radio Frequency and Photonics Engineering (TUD), Institute of Textile Machinery and High Performance Material Technology (TUD), Altavo GmbH |
Brief description: | In this project, a novel, natural-sounding substitute voice is to be developed. Specially developed radar sensor technology records the (silent) movements of the tongue, lips, etc., and artificial intelligence then generates a speech signal from the measurement data. The focus is on investigating additional sensor modalities for capturing silent speech movements, so that vocal tract movements are recorded even more reliably and the artificial voice improves further. The modalities being investigated are the intraoral optical measurement of tongue and lip movements (optical palatography, OPG) and low-frequency ultrasound for internal acoustic stimulation of the vocal tract. |
Radar-based silent speech recognition (GerKi)
FKZ: | 20D1930B |
Duration: |
01.01.2021 - 31.12.2025 |
Funded by: | German Aerospace Center (DLR) |
Brief description: | The project deals with silent speech recognition using a pulse radar. The radar system will consist of one antenna radiating through the upper vocal tract and two receiving antennas. One application of this technology is speech communication in environments with high noise levels. |
Completed projects:
Self-Learning Physical Reservoirs for Intelligent Bioelectronic Interfaces
FKZ: |
|
Duration: |
01.11.2022 - 15.04.2023 (Phase 1) and 01.05.2023 - 14.07.2023 (Phase 2) |
Sponsor: |
Federal Agency for Leapfrog Innovation, Challenge "New Computing Concepts" |
Cooperation partner: |
Dresden Integrated Center for Applied Physics and Photonic Materials (IAPP) |
Short description: |
In this project, we aim to explore the theory and practical hardware implementation of self-learning physical reservoirs as a highly power-efficient and versatile edge-computing system for on-chip classification of bioelectronic signals or environmental monitoring. With this approach, we do not follow the current paradigms of edge computing, e.g. field-programmable gate arrays or memristive networks, but instead pioneer a radically new path of analog, in-material computing using self-learning physical reservoirs. Our edge-AI system does not require massive neural networks with precisely adjustable weights, but instead utilizes the nonlinearity of the material to create a sparse, random neural network offering superior power efficiency over other edge-AI approaches. |
Non-invasive voice prosthetics using vocal tract radar sensors and real-time AI speech synthesis (Promise-AI) - Subproject: Radar-based measurement of the vocal tract signal, data collection and validation on test subjects
FKZ: |
16SV8989 |
Duration: |
01.08.2022 - 31.07.2024 |
Funded by: |
BMBF |
Cooperation partners: |
Altavo GmbH, Dresden; Chair of High Frequency Technology, TU Dresden |
Brief description: |
Voice loss is a severe disability, often accompanied by social withdrawal and inability to work. The collaborative project Promise-AI pursues a fundamentally new approach to voice rehabilitation in order to help voiceless people regain a natural-sounding, easy-to-learn artificial voice without complications or stigma. This involves capturing articulation movements of the vocal tract with non-invasive radar sensor technology, processing them in real time, and using a previously trained AI to synthesize natural-sounding speech that is output through the speaker of a smartphone. In a participatory approach, patients and other stakeholders will be involved in the design of a patient-friendly human-technology interaction (MTI) concept. The sub-project of TU Dresden focuses on investigating the radar sensor technology, signal analysis and electromagnetic compatibility (EMC), as well as on acquiring training data and carrying out the validation study. |
Development of a trajectory planning algorithm and the necessary trajectory control for the energy-efficient use of an unmanned aerial system network in urban areas (UrbanSens)
FKZ: |
20D2106C |
Duration: |
01.01.2022 - 31.12.2024 |
Funded by: |
Federal Ministry for Economic Affairs and Energy |
Cooperation partners: |
Chair of Flight Mechanics and Control (TU Dresden), Chair of Radio Frequency and Photonics Engineering (TU Dresden), Infineon Technologies, and others |
Brief description: |
The project addresses the efficient use of networked unmanned aerial vehicles in an urban environment through the development of new sensor and communication strategies, and the development of novel control methods to exploit local weather effects. It makes important contributions to the funding policy goal of "Capable and efficient aviation" with a focus on "New mobility of the future". Extensive work is being carried out in the field of unmanned systems for logistics tasks, starting with an investigation of urban environmental aerodynamics and the development of new sensors for wind field measurement through to trajectory control and evaluation in flight tests. The technologies developed are also essential for more environmentally friendly manned aviation, e.g. in the field of urban air mobility. |
Digital transformation and sovereignty of future communication networks (6G-Life)
FKZ: | 16KISK001K |
Duration: |
15.08.2021 - 14.08.2025 |
Funded by: | BMBF |
Adaptive monitoring system for ultra-long cable runs (HIMON)
Duration: | 01.04.2020 - 30.07.2023 |
Sponsor: | HIGHVOLT Prüftechnik Dresden GmbH |
Cooperation partner: | HIGHVOLT Prüftechnik Dresden GmbH |
Brief description: | Innovative method for monitoring and condition assessment of high-voltage components of critical infrastructure systems in power supply systems. |
Development of speaker recognition and verification for medical dictation systems (SEMED)
FKZ: | ZF4443005HB9 |
Duration: | 01.04.2020 - 31.12.2022 |
Sponsor: | AiF Projekt GmbH |
Short description: | The aim of this project is the development of speaker recognition and verification for medical dictation systems based on speaker-specific features. The new speaker-dependent features will enable significantly higher recognition performance both in speech assignment and speech recognition as a whole. In addition, the entire system should be robust against interference and background noise. |
EVoc-Learn: High quality simulation of early vocal learning (language acquisition)
Duration: | 01.11.2019 - 31.10.2022 |
Funded by: | Leverhulme Trust |
Link: | http://www.homepages.ucl.ac.uk/~uclyyix/EVL/project.html |
Accent improvement through pronunciation training with articulatory feedback (ADAMA)
FKZ: | 01/S19019B |
Duration: | 01.10.2019 - 30.09.2022 |
Funded by: | BMBF |
Cooperation partner: | Linguwerk GmbH Dresden |
Brief description: | The project deals with the learning of a new language and the desire to speak it with as little accent as possible. In order to make independent practice more flexible and efficient, a system is to be developed which for the first time combines three different approaches for CAPT (Computer Aided Pronunciation Training) in an automated overall concept: articulatory biofeedback, a virtual, animated teacher and the evaluation of pronunciation on the basis of acoustic and articulatory measurements. |
Development of an age-appropriate assistance system for device operation based on tongue movements
Duration: | 12.12.2018 - 30.06.2021 |
Funded by: |
Saxon Development Bank |
Silent mobile voice communication using radar sensors and articulatory speech synthesis (RadarSpeech)
Duration: | 15.08.2019 - 30.06.2022 |
Sponsor: | Sächsische Aufbaubank |
Broadband acoustic modeling of speech
FKZ: | BI 1639/7-1 |
Duration: | 13.05.2019 - 12.05.2022 |
Funded by: | German Research Foundation (DFG) |
Cooperation partner: |
|
Short description: |
The high-frequency part of the speech spectrum remains relatively unexplored and is challenging for the physical modeling of speech production. However, it contains perceptually relevant information and may, for example, play a role in the naturalness and intelligibility of speech synthesis. In this project, we will develop a wideband speech synthesis framework based on the multimodal method, which is an efficient high-frequency acoustic simulation method. Its physical relevance will be assessed experimentally, and the synthesis will be evaluated perceptually. |
EASY - Expressive Articulatory SYnthesis of Audiobooks
FKZ: | ZF4443004BZ8 |
Duration: | 01.08.2018 - 31.07.2021 |
Funded by: | ZIM - Central Innovation Program for SMEs |
Cooperation partner: |
Aristech GmbH, Heidelberg |
Brief description: |
As a simulation of the speech apparatus, articulatory speech synthesis theoretically offers all the possibilities for vocal and linguistic expression of emotions and moods that humans possess. In this project, these possibilities are being researched and applied in order to automatically synthesize audiobooks for children that are as expressive as possible and resemble a reading by a human narrator. |
Investigation of partial discharge signals
Duration: | 01.11.2018 - 31.12.2019 |
Client: | HIGHVOLT Prüftechnik Dresden GmbH |
Development of an optoelectronic measuring system for the control of interactive speech therapy exercises
FKZ: | 16SV7741 |
Duration: | 01.03.2017 - 29.02.2020 |
Funded by: | BMBF |
Cooperation partner: |
|
Brief description: |
As a result of motor impairments, many people suffer from swallowing and speech disorders after a stroke, but these can often be treated in a targeted manner under speech therapy guidance. The OSLO project aims to develop an intraoral sensor system that optically measures the entire dynamics of the tongue and enables the patient to continue training unsupervised. This is to be achieved in a playful way using various therapeutic mini-games, which can be easily controlled via the sensor system on the tablet. |
Intrinsic direction-dependent speeds of articulators
FKZ: | BI 1639/4-1 |
Duration: | July 2017 - June 2020 |
Funded by: | DFG |
The Fascination of the Speaking Machine: Technological Change in Speech Synthesis over Two Centuries
FKZ: | 01UQ1601A |
Duration: | 01.12.2016 - 31.05.2019 |
Funded by: | BMBF |
Cooperation partner: |
|
Replacement voice for laryngectomees through speech movement measurement and articulatory speech synthesis in real time (Voice 2.0)
FKZ: | 13GW0101B |
Duration: | 01.01.2016 - 31.12.2018 |
Funded by: | BMBF |
Cooperation partner: | Linguwerk GmbH |
Investigation of oral cavity variations of woodwind players
Duration: | 27.02.2015 - 27.04.2015 |
Client: | Institute for Musical Instrument Making e.V. Zwota |
Biofeedback for the therapy of swallowing disorders
Duration: | 01.11.2014 - 31.03.2015 |
Client: |
Department of Ear, Nose and Throat Medicine at the Goethe University Hospital Frankfurt/M., specializing in phoniatrics and pedaudiology |
Analysis of noise sources in the vocal tract with a new measurement method for 3D real-time reconstruction of the oral cavity
FKZ: | BI 1639 - 1/2 |
Duration: | 01.10.2012 - 31.12.2015 |
Funded by: | DFG |