Tao Zhang (Starkey Hearing Technologies, USA)
Lecture Date: May 9, 2019
Chapter: Santa Clara Valley
Chapter Chair: Yang Lei
Topic: Solving the Cocktail Party Problem for Hearing Aids:
Solutions, Challenges and Opportunities
Most evolutionary algorithms and other meta-heuristic search methods assume that explicit objective functions are available for fitness evaluations. In many real-world cases, however, no such explicit objective function exists. Instead, computationally intensive numerical simulations, such as computational fluid dynamics simulations or finite element analysis, must be conducted.
Neural networks with rectified linear unit (ReLU) activation functions (a.k.a. ReLU networks) have achieved great empirical success in various domains. Nonetheless, existing results for learning ReLU networks either assume that the underlying data distribution is, e.g., Gaussian, or require the network size and/or training size to be sufficiently large.
In the April 2019 special issue of the Proceedings of the IEEE, the editors collected and presented recent work on innovative approaches and emerging technologies for coping with dynamicity, heterogeneity, and scale, which have been central to (or even enablers of) recent advances in communications and networking technologies.
The task of Heterogeneous Face Recognition consists of matching face images that are sensed in different domains, such as sketches to photographs (visual spectra images), thermal images to photographs, or near-infrared images to photographs. In this paper, we suggest that the high-level features of Deep Convolutional Neural Networks trained on visual spectra images are potentially domain independent and can be used to encode faces sensed in different image domains.
In psychology, it is known that facial dynamics benefit the perception of identity. This paper proposes a novel deep network framework to capture identity information from facial dynamics and their relations. In the proposed method, the facial dynamics arising from a smile expression are analyzed and used for facial authentication. Detailed changes in local regions of the face, such as wrinkles and dimples, are encoded in the facial dynamic feature representation.
In real-world applications, different kinds of learning and prediction errors are likely to incur different costs for the same system. Moreover, in practice, the cost label information is often available only for a few training samples. In a semi-supervised setting, label propagation is critical to infer the cost information for unlabeled training data.
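To make the idea of label propagation concrete, here is a minimal sketch of a generic graph-based scheme on a toy one-dimensional dataset. It is not the method of the paper summarized above: the function name `propagate_costs`, the Gaussian similarity, the `sigma` parameter, and the toy data are all assumptions chosen for illustration. Known cost values are held fixed, and each unlabeled sample repeatedly takes the similarity-weighted average of the current estimates of the other samples.

```python
from math import exp


def propagate_costs(points, costs, sigma=1.0, iterations=50):
    """Generic iterative label propagation (illustrative sketch only).

    points: list of floats (toy 1-D features, a hypothetical setup)
    costs:  list with a float cost for labeled samples, None for unlabeled ones
    Returns a list of estimated costs for every sample.
    """
    n = len(points)
    # Gaussian similarity between every pair of samples (no self-similarity).
    weights = [
        [
            0.0 if i == j
            else exp(-((points[i] - points[j]) ** 2) / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Start unlabeled samples at the mean of the known costs.
    known = [c for c in costs if c is not None]
    mean_cost = sum(known) / len(known)
    estimates = [c if c is not None else mean_cost for c in costs]

    for _ in range(iterations):
        updated = []
        for i in range(n):
            if costs[i] is not None:
                # Clamp: labeled samples keep their known cost.
                updated.append(costs[i])
            else:
                # Unlabeled samples take the similarity-weighted average
                # of the current estimates of all other samples.
                total = sum(weights[i])
                updated.append(
                    sum(w * e for w, e in zip(weights[i], estimates)) / total
                )
        estimates = updated
    return estimates


if __name__ == "__main__":
    # Two well-separated clusters, one labeled sample in each.
    toy_points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    toy_costs = [1.0, None, None, 10.0, None, None]
    print(propagate_costs(toy_points, toy_costs))
```

In this toy setup, the unlabeled samples end up with estimates close to the cost of the labeled sample in their own cluster, because similarity across the two clusters is negligible.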
An auction is an effective way to allocate goods or services to the bidders who value them the most. The rapid growth of e-auctions facilitates online transactions but poses new and distinctive challenges. It is difficult to establish trust among sellers, buyers, and auctioneers without centralized auction websites or platforms (the auctioneer) that collect bids and derive the auction results. However, these third parties may be untrustworthy, and malicious sellers or buyers may refuse to deliver the goods or payment according to the protocol.
In this paper, we propose a Group-Sparse Representation-based method with applications to Face Recognition (GSR-FR). The novel sparse representation variational model includes a non-convex sparsity-inducing penalty and a robust non-convex loss function. The penalty encourages group sparsity by using an approximation of the
We present an image captioning framework that generates captions under a given topic. The topic candidates are extracted from the caption corpus. A given image’s topics are then selected from these candidates by a CNN-based multi-label classifier. The input to the caption generation model is an image-topic pair, and the output is a caption of the image.