Large-scale antenna arrays, also known as massive MIMO, are key enablers for 5G and beyond networks, but they place tremendous pressure on hardware cost and energy consumption.
Deep learning methods have revolutionized speech recognition, image recognition, and natural language processing since 2010. Each of these tasks involves a single modality in its input signals. However, many applications in the artificial intelligence field involve multiple modalities.
Solving the visual question answering (VQA) task requires recognizing many diverse visual concepts as answers. These visual concepts carry rich structural semantic meaning: some concepts in VQA are highly related (e.g., red and blue), while others are less relevant to each other (e.g., red and standing).
We consider the problem of reliable information propagation in the brain using biologically realistic models of spiking neurons. Biological neurons use action potentials, or spikes, to encode information. Information can be encoded by the rate of asynchronous spikes or by the (precise) timing of synchronous spikes. Reliable propagation of synchronous spikes is well understood in neuroscience and is relatively easy to implement by biologically-realistic models of neurons.
Visual food recognition on mobile devices has attracted increasing attention in recent years due to its roles in individual diet monitoring and social health management and analysis. Existing visual food recognition approaches usually use large server-based networks to achieve high accuracy.
This paper presents a novel approach for accurate barcode detection in real and challenging environments using compact deep neural networks. Our approach is based on a Convolutional Neural Network (CNN) and neural network compression, and it detects the coordinates of a barcode's four vertices accurately and quickly. Our approach consists of four stages: (i) feature extraction by a base network, (ii) region proposal network (RPN) training, (iii) barcode classification and coordinate regression, and (iv) weight pruning and recoding.
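As an illustration of the weight pruning in stage (iv), the following is a minimal sketch that assumes a simple magnitude-based criterion: the smallest-magnitude weights are set to zero. The abstract does not state which pruning criterion or recoding scheme is used, so the function name `prune_by_magnitude` and the `sparsity` parameter are hypothetical, and the recoding step is not shown.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out roughly the fraction `sparsity` of entries with smallest magnitude.

    Assumption: magnitude-based pruning; the source does not specify the criterion.
    Returns the pruned weights and the boolean mask of entries that were kept.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of entries to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # Magnitude of the k-th smallest entry; entries at or below it are removed.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Hypothetical 2x2 weight matrix, pruned to 50% sparsity.
layer = np.array([[0.1, -2.0], [0.05, 1.5]])
pruned, mask = prune_by_magnitude(layer, sparsity=0.5)
```

In practice, pruning of this kind is usually followed by fine-tuning with the mask held fixed, so that the remaining weights can compensate for the removed ones.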
Previous research on wake-up word detection (WWD) has focused on finding a word representation that expresses the characteristics of a word well. However, obstacles such as noise and reverberation make WWD difficult in the real-world environments where it operates.
Automatic modulation classification facilitates many important signal processing applications. Recently, deep learning models have been adopted in modulation recognition, where they outperform traditional machine learning techniques based on hand-crafted features. However, automatic modulation classification remains challenging for the following reasons.
The filtered-x least-mean-square (FxLMS) algorithm has been widely used for active noise control. A fundamental analysis of the convergence behavior of the FxLMS algorithm, including its transient and steady-state performance, can provide new insights into the algorithm and can also help in practical applications, e.g., in choosing the step size.
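To make the role of the step size concrete, the following is a minimal sketch of the standard single-channel FxLMS update. It is not the analysis of this paper. The function name `fxlms`, the paths `p` and `s`, the tap count, and the value of `mu` are all illustrative assumptions, and the secondary-path estimate is assumed to be perfect (`s_hat = s`).

```python
import numpy as np

def fxlms(x, d, s, s_hat, num_taps, mu):
    """Single-channel FxLMS sketch.

    x: reference signal, d: disturbance at the error sensor,
    s: true secondary path, s_hat: its estimate, mu: step size.
    Returns the error signal e and the final controller weights w.
    """
    w = np.zeros(num_taps)         # adaptive controller weights
    x_buf = np.zeros(num_taps)     # recent reference samples
    y_buf = np.zeros(len(s))       # recent controller outputs
    xs_buf = np.zeros(len(s_hat))  # reference samples fed to s_hat
    fx_buf = np.zeros(num_taps)    # recent filtered-reference samples
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1); x_buf[0] = x[n]
        y_buf = np.roll(y_buf, 1); y_buf[0] = w @ x_buf    # controller output
        e[n] = d[n] - s @ y_buf                            # residual at the sensor
        xs_buf = np.roll(xs_buf, 1); xs_buf[0] = x[n]
        fx_buf = np.roll(fx_buf, 1); fx_buf[0] = s_hat @ xs_buf  # filtered-x sample
        w = w + mu * e[n] * fx_buf                         # LMS step on filtered-x
    return e, w

# Hypothetical setup: white-noise reference and short FIR paths.
rng = np.random.default_rng(0)
x = rng.standard_normal(8000)
p = np.array([0.0, 0.0, 0.8, 0.4, -0.2])  # assumed primary path
s = np.array([0.0, 1.0, 0.5])             # assumed secondary path
d = np.convolve(x, p)[:len(x)]
e, w = fxlms(x, d, s, s_hat=s, num_taps=16, mu=0.005)
```

In this sketch a larger `mu` speeds up the transient but raises the steady-state misadjustment and can cause divergence, while a smaller `mu` does the opposite. That trade-off is what a convergence analysis aims to quantify.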