Speech and Language Processing

SLTC

Two Fully-funded PhD Studentships in Spoken Language Technologies

Two Fully-funded PhD Studentships in Spoken Language Technologies, University of Sheffield, UK. Deadline for applications: 13 April 2025.

Home and International students may apply. Regardless of your fee status (Home or International), all fees will be paid, in addition to a full stipend.

Speech and Language Technologies (SLTs) are a range of Artificial Intelligence (AI) approaches for analysing, producing, modifying or responding to spoken and written language. SLTs are underpinned by a number of fundamental research fields, including acoustics, signal processing, speech processing, natural language processing, computational linguistics, mathematics, machine learning, physics, psychology, and computer science.

We are seeking two candidates to each work on an interdisciplinary SLT research project, spanning both speech and language research, on one of the following topics:
  • Accessible Democracy: The UK Houses of Parliament and cross-party Select Committees are at the core of UK democracy. Making the proceedings of these bodies accessible to citizens and journalists is key to holding politicians accountable. This research aims to develop technologies that provide access to the rich linguistic and paralinguistic information in parliamentary audio recordings. Helping journalists identify newsworthy events is one example objective, alongside more standard tasks such as search, alerting and summarisation.
  • Analytics of conversations: Spoken conversations are complex and difficult for AI systems to understand. While the words spoken are of obvious importance, paralinguistic information often plays an essential role in a satisfactory and efficient exchange. In practice, only goal-oriented metrics are used to assess the quality of an exchange, and these are of little help in describing a wide range of conversations such as interviews, storytelling or even examinations. Modelling of the participants’ knowledge and state, as well as paralinguistic signalling and perception, should be used to research novel methods for interpreting and understanding conversations.
  • Evolving communication in embodied agents: Spoken and written language developed in the course of human evolution and can be viewed as key species-wide adaptations that have enabled us to better survive on our planet. Modelling the development of language in artificial agents that are equipped with sensory apparatus and embedded in a physical environment is an exciting research methodology that promises both a deeper understanding of human languages and their origins and insights into how to build more effective autonomous agents. This research will build on the state of the art in this area.
About the School/Research Groups

You will be a member of the Speech and Hearing and Natural Language Processing research groups in the School of Computer Science at the University of Sheffield, and an affiliated member of the UKRI AI Centre for Doctoral Training (CDT) in Speech and Language Technologies (SLT) and their Applications. In the School of Computer Science, 99% of our research was rated in the highest two categories in REF2021 (world-leading or internationally excellent).

Funding

Full funding for 3.5 years, covering Home or International tuition fees, an enhanced stipend (£24,280 tax free for 2025/26), and a research and training support grant of £2,500 pa to cover research expenses and conference attendance.

Application Process
The deadline for applications is 23:59 on 13 April 2025. Eligibility and application guidance can be found on our website: https://slt-cdt.sheffield.ac.uk/apply

Assistant/Associate/Full Professor in Computing Science (Fundamental and Applied AI)

Tampere University has several professor positions open related to AI and its applications, covering various areas of signal processing. The positions include a substantial starting package, with funding for multiple research group members. Strong researchers are encouraged to apply! The deadline for applications is 9 March 2025. For more information about the positions, please visit this page.

PhD Stipend in Multimodal Reasoning with Large Language Models

Large language models (LLMs) have demonstrated increasingly powerful capabilities for reasoning tasks, especially over text. This project aims to explore and advance these capabilities in reasoning across multiple data modalities, including but not limited to text, speech and audio. Integrating multiple modalities can lead to more robust and general systems capable of understanding and reasoning about the world in a more human-like manner. The project will involve fine-tuning pre-trained models and developing self-supervised learning techniques to adapt LLMs to multimodal tasks.
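
As a rough illustration of one common way such multimodal adaptation is approached (not a description of this particular project's design), the sketch below projects audio-encoder features into an LLM's embedding space and prepends them to the text token embeddings; all module names and dimensions are hypothetical.

```python
# Illustrative sketch only: bridging an audio encoder and a text LLM by
# projecting audio features into the LLM embedding space and prepending them
# to the text sequence. Names and dimensions are hypothetical.
import torch
import torch.nn as nn

class AudioToLLMBridge(nn.Module):
    def __init__(self, audio_dim: int, llm_dim: int):
        super().__init__()
        # Learned projection mapping audio-encoder features to the LLM embedding size
        self.proj = nn.Linear(audio_dim, llm_dim)

    def forward(self, audio_feats: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        # audio_feats: (batch, audio_frames, audio_dim)
        # text_embeds: (batch, text_tokens, llm_dim)
        audio_embeds = self.proj(audio_feats)
        # The LLM then attends over audio and text in a single sequence.
        return torch.cat([audio_embeds, text_embeds], dim=1)

bridge = AudioToLLMBridge(audio_dim=768, llm_dim=4096)     # hypothetical sizes
fused = bridge(torch.randn(1, 50, 768), torch.randn(1, 20, 4096))
print(fused.shape)  # torch.Size([1, 70, 4096])
```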

Application deadline: 16 March 2025

Apply here

Tenure-Track Faculty Position in Signal Processing

The Electrical and Systems Engineering department at Washington University in St. Louis invites applications for multiple tenure-track faculty positions with an effective start date on or after July 1, 2025. Candidates should have earned a Ph.D. or equivalent degree in electrical engineering, systems engineering, computer engineering or a closely related discipline. Washington University is a highly selective national research university with a strong tradition of research excellence. It is nationally known for the exceptional quality of its student body and for its attractive campus, which borders residential neighborhoods and one of the nation’s largest urban parks. Many faculty walk or bike to work. St. Louis combines affordability with a vibrant metropolitan area, offering many cultural and entertainment opportunities.

The University’s strategic plan, released in 2022, seeks growth of top-tier research, scholarship, and creative practice with an emphasis on transdisciplinary and cross-school research. Included in this plan are foci related to applications in medicine, public health, infrastructure, and addressing pressing societal challenges. We seek both junior and senior applicants who will contribute to fundamental and applied research in signal processing and closely related areas. Examples include:

(I) Signal processing and deep learning

(II) Graph signal processing

(III) Audio, speech and image processing

(IV) Signal processing in neuroscience

(V) Statistical machine learning

Successful applicants will have a primary appointment in the Department of Electrical and Systems Engineering, with the possibility of joint appointments in other departments. The faculty member will be expected to teach undergraduate and graduate courses in electrical and systems engineering, participate in university service, and establish a thriving externally funded research program. Faculty positions are open at all levels; appointment at a senior rank (associate or full professor) will be considered for exceptional candidates with a distinguished record of achievement in research and teaching.

Applications should include: (1) a cover letter that identifies the candidate’s three most significant publications and describes their interest in the position; (2) a curriculum vitae; (3) a research plan for the next five years that should not exceed three pages, and should highlight the problem(s) or set of questions to be investigated, the envisioned approach, a mentoring strategy, and the proposed funding sources; (4) a statement of teaching interests and philosophy (not exceeding 2 pages); (5) a statement describing contributions to and future plans for enhancing diversity (not exceeding 2 pages); and (6) a list of at least three references via the link provided at

https://apply.interfolio.com/157328

Priority will be given to completed applications (including submitted reference letters) received before December 15, 2024. However, applications will be accepted at any time and will be considered until the positions are filled. Washington University in St. Louis is committed to the principles and practices of equal employment opportunity and especially encourages applications by those underrepresented in their academic fields. It is the University’s policy to recruit, hire, train, and promote persons in all job titles without regard to race, color, age, religion, sex, sexual orientation, gender identity or expression, national origin, protected veteran status, disability, or genetic information. Verification of employment eligibility will be required upon employment.

PhD Opportunities in AI for Digital Media Inclusion (Deadline 30 May 2024)

** PhD Opportunities in Centre for Doctoral Training in AI for Digital Media Inclusion
** Surrey Institute for People-Centred AI at the University of Surrey, UK, and
** StoryFutures at Royal Holloway University of London, UK

** Apply by 30 May 2024, for PhD cohort starting October 2024

URL: https://www.surrey.ac.uk/artificial-intelligence/cdt

The Centre for Doctoral Training (CDT) in AI for Digital Media Inclusion combines the world-leading expertise of the Surrey Institute for People-Centred AI at the University of Surrey, a pioneer in AI technologies for the creative industries (vision, audio, language, machine learning), and StoryFutures at Royal Holloway University of London, a leader in creative production and audience experience (arts, psychology, user research, creative production).

Our vision is to deliver unique cross-disciplinary training embedded in real-world challenges and creative practice, and to address the industry need for people with responsible AI, inclusive design and creative skills. The CDT’s challenge-led training programme will foster a responsible, AI-enabled, inclusive media ecosystem with industry. By partnering with 50+ organisations, this challenge-led model will be co-designed and co-delivered with the creative industries to remove significant real-world barriers to media inclusion.

The overall learning objective of the CDT training programme is that all PhD researchers gain a cross-disciplinary understanding of fundamental AI science, inclusive design and creative industry practice, together with responsible AI research and innovation leadership, to lead the creation of future AI-enabled inclusive media.

The CDT training programme will select PhD students to work on challenge areas including "Intelligent personalisation of media experiences for digital inclusion" and "Generative AI for digital inclusion". Example projects related to audio include:

- Audio Generative AI from visuals as an alternative to Audio Description
- Audio orchestration for neurodivergent audiences using object-based media
- AUDItory Blending for Inclusive Listening Experiences (AUDIBLE)
- Foundation models for audio (including speech, music and sound effects) to text in the wild
- Generative AI for natural language description of audio for the deaf and hearing impaired
- Generative AI with Creative Control, Explainability, and Accessibility
- Personalised audio editing with generative models
- Personalised subtitling for readers of different abilities
- Translation of auditory distance across alternate advanced audio formats

If you have any questions about the CDT, please contact Adrian Hilton or Polly Dalton.

For more information and to apply, visit:
https://www.surrey.ac.uk/artificial-intelligence/cdt

Application deadline: 30 May 2024

--

Prof Mark D Plumbley
EPSRC Fellow in AI for Sound
Professor of Signal Processing
Centre for Vision, Speech and Signal Processing
University of Surrey, Guildford, Surrey, GU2 7XH, UK
Email: m.plumbley@surrey.ac.uk

 

PhD position “Unsupervised/semi-supervised learning algorithms for speech enhancement and source localization”

The Signal Processing Division and the Collaborative Research Centre Hearing Acoustics at the University of Oldenburg in Germany are seeking to fill the position of a

Research Scientist (PhD Student) - “Unsupervised/semi-supervised learning algorithms for speech enhancement and source localization”

The position is available from 1 August 2023 for 3 years, with salary according to TV-L E13 (75%), corresponding to about €3,200 per month before tax (the exact amount depends on experience and qualifications).

The main activities of the Signal Processing Division (https://uol.de/en/mediphysics-acoustics/sigproc) centre around signal processing for acoustical and biomedical applications, with a focus on hearing aids and speech communication devices. More specifically, research topics in the areas of microphone array processing, speech enhancement and acoustic scene analysis are addressed, using a combination of model-based statistical signal processing techniques and data-driven machine learning methods. The Signal Processing Division has access to excellent high-performance computing facilities, measurement equipment and labs, e.g., a unique lab with variable acoustics.   

The Collaborative Research Centre Hearing Acoustics (https://uol.de/en/sfb-1330-hearing-acoustics) aims at a fundamentally better quantitative understanding of the principles underlying the processing of complex auditory and audio-visual scenes, the implementation of this knowledge in algorithms for perceptual enhancement of acoustic communication, and the evaluation of these algorithms for different applications. The successful candidate is expected to investigate unsupervised/semi-supervised learning algorithms for speech enhancement and source localization within a hybrid computational acoustic scene analysis (CASA) framework. Using this CASA framework, we aim to leverage the potential of recent machine learning methods while maintaining the interpretability of conventional signal processing modules through high-level interpretable latent variables.
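
As a purely illustrative example of the hybrid pattern described above (not this project's actual CASA framework), the sketch below uses a small neural network to estimate an interpretable latent variable, here a time-frequency speech-presence mask, which a conventional spectral-gain step then applies to the noisy spectrum; all components and sizes are hypothetical.

```python
# Illustrative sketch only: a learned, interpretable latent variable (a
# time-frequency mask) feeding a conventional signal processing step.
import torch
import torch.nn as nn

n_freq = 257  # e.g. number of frequency bins of a 512-point STFT

# Hypothetical mask estimator: maps noisy magnitude spectra to a mask in [0, 1]
mask_net = nn.Sequential(
    nn.Linear(n_freq, 256), nn.ReLU(),
    nn.Linear(256, n_freq), nn.Sigmoid(),
)

noisy_mag = torch.rand(100, n_freq)   # (frames, freq) dummy noisy magnitudes

with torch.no_grad():
    mask = mask_net(noisy_mag)        # interpretable latent variable in [0, 1]

# Conventional module: apply the mask as a spectral gain (Wiener-like filtering)
enhanced_mag = mask * noisy_mag
print(enhanced_mag.shape)             # torch.Size([100, 257])
```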

Responsibilities/Tasks

  • carry out research on acoustical signal processing algorithms for speech enhancement and source localization, involving algorithm design, implementation, and experimental validation;
  • write scientific papers for international conferences and journals;
  • actively participate in the research meetings and seminars at the Department of Medical Physics and Acoustics.

Profile

  • Candidates are required to have an academic university degree (Master’s or equivalent) in electrical engineering, engineering physics, hearing technology and audiology, or a related discipline, with excellent grades and a solid scientific background in at least two of the following fields: speech and audio signal processing, machine learning, acoustics.
  • Familiarity with scientific tools and programming languages (e.g., Python) as well as excellent English language skills (both oral and written) are required.
  • For the envisaged research project, experience with unsupervised/semi-supervised learning methods and acoustical signal processing algorithms is beneficial.

Applicants from outside the European Union are strongly recommended to check whether their academic university degree is equivalent to a German higher education qualification. Please consult the website of the Central Office for Foreign Education (https://www.kmk.org/zab/central-office-for-foreign-education.html) for more information and to apply for a statement of comparability.

The University of Oldenburg is dedicated to increasing the percentage of women in science. Therefore, equally qualified female candidates will be given preference. Applicants with disabilities will be preferentially considered in case of equal qualification.

To apply for this position, please send your application (ref. SP232), including a letter of motivation with a statement of skills and research interests (max. 1 page), a curriculum vitae, and copies of your university diplomas and transcripts, to simon.doclo@uol.de. The application deadline is 21 April 2023.

PhD Position in Deep Cascaded Representation Learning for Speech Modelling

The LivePerson Centre for Speech and Language offers a 3-year fully funded PhD studentship, covering standard maintenance, fees and travel support, to work on cascaded deep learning structures for modelling speech. The Centre is connected with the Speech and Hearing (SpandH) and Natural Language Processing (NLP) research groups in the Department of Computer Science at the University of Sheffield.

Auto-encoding is a powerful concept that allows us to compress signals and find essential representations. The concept was later extended to include context, an approach usually referred to as self-supervised learning. Applied to very large amounts of speech data, this has led to highly successful methods and models for representing speech for a wide range of downstream tasks; examples of such models are wav2vec 2.0 and WavLM. Using their representations often requires fine-tuning to a specific task with small amounts of data. When encoding speech, it is desirable to represent a range of attributes at different levels of temporal specificity. Such attributes often reflect a hierarchy of information.
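
As a minimal illustration (not part of the studentship description) of how such representations are typically extracted, the sketch below loads a pretrained self-supervised speech model and collects its layer-wise embeddings; it assumes the HuggingFace transformers library, PyTorch and NumPy, and the choice of checkpoint is illustrative.

```python
# Illustrative sketch only: extracting layer-wise self-supervised speech
# representations from a pretrained wav2vec 2.0 model.
import numpy as np
import torch
from transformers import AutoFeatureExtractor, AutoModel

model_name = "facebook/wav2vec2-base"   # illustrative choice of checkpoint
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

waveform = np.random.randn(16000).astype(np.float32)   # 1 s of dummy 16 kHz audio

inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# One embedding sequence per transformer layer; lower layers tend to capture
# finer acoustic detail and higher layers more phonetic and lexical structure,
# one source of the hierarchy of information mentioned above.
for layer_idx, hidden in enumerate(outputs.hidden_states):
    print(layer_idx, hidden.shape)      # (batch, frames, hidden_dim)
```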

The aim of this PhD project is to explore the use of knowledge about natural hierarchies in speech in cascaded auto- and contextual encoder/decoder models. The objective is to describe a structured way to understand such hierarchies. The successful candidate is expected to propose methods for combining different kinds of supervision (auto, context, label) and for building hierarchies of embedding extractors. These proposals may need to be considered in the context of data availability and complexity. All proposals are to be implemented and tested on speech data, and experiments should be conducted on a range of speech data sets with different speech types and data set sizes.

The student will join a world-leading team of researchers in speech and language technology. The LivePerson Centre for Speech and Language Technology was established in 2017 with the aim of conducting research into novel methods for speech recognition and general speech processing, including end-to-end modelling, direct waveform modelling and new approaches to modelling of acoustics and language. It has recently extended its research remit to spoken and written dialogue. The Centre hosts several Research Associates, PhD researchers, graduate and undergraduate project students, Researchers and Engineers from LivePerson, and academic visitors. Being fully connected with SpandH brings access to a wide range of academic research and opportunities for collaboration inside and outside the University. The Centre has access to extensive dedicated computing resources (GPUs, large storage) and local storage of over 60 TB of raw speech data.

The successful applicant will work under the supervision of Prof. Hain, who is the Director of the LivePerson Centre and Head of the SpandH research group. SpandH has been, and continues to be, involved in a large number of national and international projects funded by national bodies, EU sources and industry. Prof. Hain also leads the UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications (https://slt-cdt.ac.uk/), a collaboration between the NLP research group and SpandH. Jointly, NLP and SpandH host more than 110 active researchers in these fields. This project will start as soon as possible.

All applications must be made directly to the University of Sheffield using the Postgraduate Online Application Form. Information on what documents are required and a link to the application form can be found here: https://www.sheffield.ac.uk/postgraduate/phd/apply/applying

On your application, please name Prof. Thomas Hain as your proposed supervisor and include the title of the studentship you wish to apply for.

Your research proposal should:

  • Be no longer than 4 A4 pages, including references
  • Outline your reasons for applying for this studentship
  • Explain how you would approach the research, including details of your skills and experience in the topic area

This position is fully funded by LivePerson, covering all tuition fees and a stipend at the standard UKRI rate.

Research Associate in Integrated Multitask Neural Speech Labelling

For further information and the link to apply, please visit: https://www.jobs.ac.uk/job/CXH168/research-associate-in-integrated-multitask-neural-speech-labelling

We are seeking an outstanding Research Associate in Integrated Multitask Neural Speech Labelling to join the LivePerson Centre for Speech and Language Technology, based at the University of Sheffield and linked with the Speech and Hearing (SpandH) research group in the Department of Computer Science. You are applying to join a world-leading team of researchers in speech and language technology, to work on new ways to integrate a variety of speech technology labelling, clustering or segmentation tasks into a single algorithm or process, in the context of deep neural networks.

Even end-to-end (E2E) automatic speech recognition is typically treated as a standalone process, independent of other speech and audio technology tasks such as diarisation, acoustic event detection or intent recognition. The LivePerson Centre for Speech and Language Technology was established in 2017 with the aim of conducting research into novel methods for speech recognition and general speech processing, including end-to-end modelling, direct waveform modelling and new approaches to modelling of acoustics and language. It has recently extended its research remit to spoken and written dialogue. The Centre hosts several Research Associates, PhD researchers, graduate and undergraduate project students, Researchers and Engineers from LivePerson, and academic visitors, providing a vibrant work environment within the University. Being fully connected with SpandH brings access to a wide range of academic research and opportunities for collaboration inside and outside the University.

The post holder will work closely with Prof. Thomas Hain, who is the Director of the LivePerson Centre and Head of the SpandH group. Prof. Hain also leads the UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications (slt-cdt.ac.uk), a collaboration between the Natural Language Processing (NLP) research group and SpandH. Jointly, NLP and SpandH host more than 110 active researchers in these fields. We’re one of the best not-for-profit organisations to work for in the UK.

The University’s Total Reward Package includes a competitive salary, a generous Pension Scheme and annual leave entitlement, as well as access to a range of learning and development courses to support your personal and professional development. We build teams of people from different heritages and lifestyles from across the world, whose talent and contributions complement each other to the greatest effect. We believe diversity in all its forms delivers greater impact through research, teaching and student experience.

To find out what makes the University of Sheffield a remarkable place to work, watch this short film: www.youtube.com/watch?v=7LblLk18zmo, and follow @sheffielduni and @ShefUniJobs on Twitter for more information.
