Effective Subword Segmentation for Text Comprehension

By: Zhuosheng Zhang; Hai Zhao; Kangwei Ling; Jiangtong Li; Zuchao Li; Shexia He; Guohon

Representation learning is the foundation of machine reading comprehension and inference. State-of-the-art models have broadly adopted character-level representations to alleviate the difficulty of representing rare or complex words. However, a character is not a natural minimal linguistic unit for representation or for composing word embeddings, because it ignores the linguistic coherence of consecutive characters inside a word. This paper presents a general subword-augmented embedding framework for learning and composing computationally derived subword-level representations. We survey a series of unsupervised segmentation methods for subword acquisition and different subword-augmented strategies for text understanding, showing that subword-augmented embedding significantly improves on our baselines in various types of text understanding tasks on both English and Chinese benchmarks.
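
To make "unsupervised segmentation for subword acquisition" concrete, the sketch below shows one widely known method of this kind, byte pair encoding (BPE). The abstract does not say which methods the paper surveys, so the choice of BPE is an assumption made for illustration, and the function names (learn_bpe, merge_pair, segment) are hypothetical rather than taken from the authors' code. The idea is to start from characters and repeatedly merge the most frequent adjacent pair of symbols, so that frequent character sequences become single subword units.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: frequency} mapping.

    Illustrative BPE sketch only; not the paper's implementation.
    """
    # Each word starts as a tuple of single characters.
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Segment `word` into subwords by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe({"abab": 3, "ab": 2}, num_merges=5)
    print(merges)
    print(segment("abb", merges))
```

In a subword-augmented setup of the kind the abstract describes, the subwords returned by such a segmenter would be embedded and composed alongside the word embedding. How that composition is done is not specified in the abstract and is not shown here.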
