The prevailing characteristics of micro-videos limit the descriptive power of each modality. The micro-video representations proposed by several pioneering efforts only implicitly explore the consistency between information from different modalities and ignore their complementarity.
Contactless fingerprint recognition is highly promising and is an essential component of automatic fingerprint identification systems. However, because perspective distortion is inherent in contactless fingerprints, building a highly accurate contactless fingerprint recognition system is very challenging.
A problem investigated in depth by multimedia forensics researchers is detecting which device was used to capture a video. This makes it possible to trace the owner of a video sequence, which is extremely helpful in resolving copyright infringement cases and in fighting the distribution of illicit material (e.g., child exploitation clips and terroristic threats).
Compressed sensing (CS) has recently emerged as an effective and efficient way to encrypt data. Under certain conditions, it has been shown to satisfy some notions of secrecy. In theory, it could be a perfect match for constrained devices that need to acquire and protect data using computationally cheap operations.
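As a rough illustration of why CS acquisition is computationally cheap, the sketch below treats a secret seed as the key that generates the sensing matrix, so that acquisition and protection reduce to one matrix-vector product. The function names, the Gaussian matrix, and the use of Python's `random` module are assumptions made for illustration only; they are not taken from the article, `random` is not a cryptographically secure generator, and recovery of the signal is omitted.

```python
import random


def sensing_matrix(key, m, n):
    """Build an m x n Gaussian sensing matrix from a secret seed (hypothetical scheme)."""
    rng = random.Random(key)          # NOT cryptographically secure; illustration only
    scale = 1.0 / m ** 0.5            # common normalisation for Gaussian sensing matrices
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def cs_measure(signal, key, m):
    """Acquire m linear measurements y = Phi x, with Phi derived from the key."""
    phi = sensing_matrix(key, m, len(signal))
    return [sum(p * s for p, s in zip(row, signal)) for row in phi]


# A sparse test signal of length 16, compressed to 8 measurements.
signal = [0.0] * 16
signal[3] = 1.5
signal[10] = -2.0
y = cs_measure(signal, key=1234, m=8)
```

A receiver holding the same key can regenerate the same matrix and run a sparse-recovery solver on `y`; without the key, the matrix is unknown.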
Real-world applications of laser detection and ranging (Lidar) three-dimensional (3-D) imaging pose challenges that require specialized algorithms. In this paper, a new reconstruction algorithm for single-photon 3-D Lidar images is presented that can deal with multiple tasks.
In this paper, we present a full-view optical flow estimation method for plenoptic imaging. Our method uses superpixels to exploit the structure that the four-dimensional light field provides across multiple views. These superpixels are four-dimensional and can be used to represent the objects in the scene as a set of slanted planes in three-dimensional space, so as to recover a piecewise rigid depth estimate.
Binary tomography is concerned with the recovery of binary images from a few of their projections (i.e., sums of the pixel values along various directions). To reconstruct an image from noisy projection data, one can pose it as a constrained least-squares problem.
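The constrained least-squares idea can be sketched for the simplest case of two projections, the row sums and the column sums of the image. The sketch below relaxes the binary constraint to the box [0, 1], minimises the squared projection error by projected gradient descent, and rounds the result. The relaxation, the rounding step, and the function name are assumptions chosen for illustration; they are not the method proposed in the article.

```python
def reconstruct_binary(row_sums, col_sums, iters=500):
    """Estimate a binary image from its row and column sums (illustrative sketch)."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    x = [[0.5] * n_cols for _ in range(n_rows)]   # start in the middle of the box
    # For 0.5 * ||Ax - b||^2 with row+column projections, the largest
    # eigenvalue of A^T A is n_rows + n_cols, so 1/L is a safe step size.
    step = 1.0 / (n_rows + n_cols)
    for _ in range(iters):
        # Residuals of the current estimate against the measured projections.
        row_res = [sum(x[i]) - row_sums[i] for i in range(n_rows)]
        col_res = [sum(x[i][j] for i in range(n_rows)) - col_sums[j]
                   for j in range(n_cols)]
        for i in range(n_rows):
            for j in range(n_cols):
                grad = row_res[i] + col_res[j]    # entry (i, j) of A^T (Ax - b)
                # Gradient step followed by projection onto the box [0, 1].
                x[i][j] = min(1.0, max(0.0, x[i][j] - step * grad))
    # Round the relaxed solution back to a binary image.
    return [[1 if v >= 0.5 else 0 for v in row] for row in x]


image = reconstruct_binary([3, 2, 1], [3, 2, 1])
```

The row and column sums (3, 2, 1) admit only one binary image, a staircase, so this toy input has a well-defined answer. In general, two projections do not determine the image uniquely, and noisy data would call for more projections or additional regularisation.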
Visual cues such as lip movements, when available, play an important role in speech communication. They are especially helpful for the hearing-impaired population or in noisy environments. When such cues are not available, a system that automatically generates talking faces in sync with input speech would enhance speech communication and enable many novel applications.
Automatic evaluation of singing quality can be done with the help of a reference rendition or the digital sheet music of the song. However, such a standard reference is not always available. In this article, we propose a framework to rank a large pool of singers according to their singing quality without any standard reference.
Wireless acoustic sensor networks (WASNs) can be used for centralized multi-microphone noise reduction, where the processing is done in a fusion center (FC). To perform the noise reduction, the data needs to be transmitted to the FC. Considering the limited battery life of the devices in a WASN, the total data rate at which the FC can communicate with the different network devices should be constrained.
Panoramic videos are becoming increasingly easy for ordinary users to obtain. Although these videos have
360° video/image has become increasingly popular and has drawn great attention. The spherical viewing range of 360° video/image entails a huge amount of data, which poses challenges for 360° video/image processing, such as bottlenecks in storage and transmission. Accordingly, recent years have witnessed an explosion of work on 360° video/image processing.
Recent years have witnessed the rapid development of virtual reality (VR). More than 90% of VR content is in the form of 360° video, also called omnidirectional video or panoramic video. Generally speaking, 360° video offers an immersive and interactive viewing experience, as viewers are able to freely move their heads within a range of 360° × 180° to access different viewports.
This correspondence proposes the use of a real-only equalizer (ROE), which acts on real signals derived from the received offset quadrature amplitude modulation (OQAM) symbols. For the same fading channel, we prove that both ROE and the widely linear equalizer (WLE) yield equivalent outputs.
This letter presents a high-resolution method that separates close components of a multi-component linear frequency modulated (LFM) signal and eliminates their cross-terms (CTs). We first investigate the energy distribution of the auto-terms (ATs) and CTs in the ambiguity plane.
This letter proposes a new time-domain absorption approach designed to reduce masking components of speech signals under noisy-reverberant conditions. In this method, the non-stationarity of corrupted signal segments is used to detect masking distortions based on a defined threshold.