Mei-Yuh Hwang, speech recognition, machine translation, language understanding, scene-text OCR, image recognition, GenAI

Mei-Yuh Hwang 黄美玉

Affiliate Professor at EE Department
University of Washington (UW)
IEEE Fellow
Reach me at my-svi on live.com

Mei-Yuh received her PhD in Computer Science from Carnegie Mellon University in 1993 and has worked in the AI industry and academia for three decades, publishing numerous conference and journal papers and delivering industry products in speech recognition, machine translation, language understanding, and image representation & recognition. Mei-Yuh's focus has always been on putting state-of-the-art technologies into end users' hands. She is an IEEE Fellow who is passionate about bridging the gap between academia and industry.

Vinbrain, 3/2022-11/2023

Technical advisor, remote
Attendee, RSNA 2023

Meta, 5/2022-5/2023

AI Research Scientist, Bellevue WA
Speech recognition patent: Device wake-up modeling using a word-piece model.
Scene-Text OCR
- DISGO: Automatic End-to-End Evaluation for Scene Text OCR: Evaluating machine translation from scene-text OCR output
Automatic data filtering for training & evaluating text-to-image generation

Microsoft, 4/2020-4/2022

Partner science manager, Microsoft Search for Office 365, Bellevue WA.
NLP for Outlook and Teams search, based on fine-tuning and/or few-shot prompting of various large pre-trained language models.
CTO winning project: Prompt engineering on a GPT-3 copilot to manipulate Excel spreadsheets via natural language.
See related post in GPT-3 application.

Mobvoi AI Lab, 2016-2020

Director, Redmond WA.
Mobvoi makes speech-enabled smart IoT devices, from hardware to software, all in-house.
- We are one of the most successful smartwatch companies in China.
- Mobvoi also provides the in-car voice assistant for VW automobiles in China. The technology includes both cloud-based and on-device voice navigation, enabling the majority of services without an internet connection.
- We are active in generative AI and have been very successful in personalized TTS.
- Though a young company that focuses on industry products, we actively participate in the speech research community despite our limited resources.

Microsoft China, 2012-2015

Principal science manager, Spoken language understanding for Cortana, Beijing and Suzhou.
Non-English Cortana
- To deliver non-English Cortana without human-annotated data, Mei-Yuh designed an adapted translation algorithm that offered both paraphrasing and generalization capabilities with the required semantic slot tags.
- The prototype model was further improved via iterative data augmentation using RNN and newly logged data.
Microsoft cognitive services on Azure cloud:
- Cortana non-English language understanding for LUIS.
- Language-model adaptation for customized speech recognition.
- Spontaneous speech recognition for Skype speech-to-speech translation.

Microsoft and UW, 1994-2012

Machine translation, Microsoft, Redmond WA, 2008-2012.
- Co-built the Bing Translator automated training infrastructure, including the design and implementation of map-reduce parallel processing based on DryadLINQ.
- Designed and implemented Bing Translation Hub for customized vertical-domain translation.
Speech recognition at Univ. of Washington, Seattle WA, 2004-2008.
- Led the DARPA EARS and GALE Mandarin speech recognition projects at University of Washington.
- Achieved the best Mandarin speech recognition performance in 2007.
Speech recognition, Microsoft Research & Production, Redmond WA, 1994-2004.
- Ported Sphinx-II speech recognition to Microsoft on Windows desktop, Office, and Microsoft Speech Server SDK, for the recognition of multiple languages.

SPHINX-II speech recognition at CMU, PA, 1987-1993

First to propose Markov state clustering, based on decision trees, for continuous speech recognition.
The idea of shared states (or senones, as Mei-Yuh named them in 1992) was widely adopted for two decades after its inception, until recent years, when end-to-end neural-transducer-based and transformer-based speech recognition ushered in a new era.
Participated in numerous DARPA speech recognition evaluation benchmarks (Resource Management, WSJ, ATIS) and consistently ranked first.

Awards

2023, Outstanding Reviewer, ICASSP 2023
2021, AAIA Fellow
2019, IEEE Fellow
2010, Microsoft Gold Star Award from Microsoft Research, Redmond, WA
1992, Allen Newell Research Excellence Medal, Pittsburgh, Carnegie Mellon University
1986, Phi Tau Phi Scholastic Honor Society, recommended by National Taiwan University

Professional Services

2015-2018: IEEE ISCSLP steering committee
2013: Associate editor, IEEE Transactions on Audio, Speech, and Language Processing (ASLP)
2011: Technical Chair of IWSLT
1998: Publicity Chair IEEE ICASSP
Reviewer for IEEE Transactions on ASLP
Technical committee member for ICASSP, Interspeech, ISCSLP, ACL, NAACL

Invited Talks

2020: WeCNLP Summit, Seattle online.
2019: Northwestern Polytechnical University, Xi'an, China
2018 April: Yuanchuan Telephony company, Taiwan
2017: UWEE Research Colloquium Talk
2017: Panelist, CMU Summit
2017: Talk at Southwest Forestry University, Kunming, China
2017: Talk at Soochow University, Suzhou, China
2015: Talk at Soochow University, Suzhou, China
2014: Talk at University of Science and Technology of China, Suzhou campus
2014: Keynote speech at PhD Forum, Northwestern Polytechnical University, Xi'an, China
1994-2003: Lecturer at National Taiwan University, Academia Sinica, and ITRI, Taiwan
1993: Lecturer at IBM Gaithersburg

Google Scholar

Education

PhD, Computer Science, Carnegie Mellon University, December 1993.
BA, Computer Science, National Taiwan University (台大), June 1986.