Enhancing speech signals in complex acoustic environments remains a critical challenge in audio processing. The presenter's recent work offers several innovative approaches to key issues in this domain. They introduced a novel audio zooming technique based on deep learning, shifting from traditional direction-based beamforming to a user-defined, adjustable 3D region for sound capture. This advancement enables precise and flexible audio acquisition, supporting real-time applications such as remote conferencing, education, and live streaming.