Mei-Yuh Hwang 黄美玉
Affiliate Professor at EE Department
University of Washington (UW)
IEEE Fellow
Reach me at my-svi on live.com
Mei-Yuh received her PhD in Computer Science from Carnegie Mellon University in 1993 and has worked in the AI industry and academia
for three decades,
publishing numerous conference and
journal papers, and delivering industry products in speech recognition, machine translation,
language understanding, and image representation & recognition.
Mei-Yuh's focus has always been on putting state-of-the-art technologies into end users' hands.
She is an IEEE Fellow who is passionate about bridging the gap between academia and industry.
- Technical advisor, remote
- Attendee, RSNA 2023
Meta, 5/2022-5/2023
- AI Research Scientist, Bellevue WA
- Speech recognition patent: Device wake-up modeling using a word-piece model.
- Scene-Text OCR
- Text-to-image generation
Microsoft, 4/2020-4/2022
- Partner science manager, Microsoft Search for Office 365, Bellevue WA.
- NLP for Outlook and Teams search, based on fine-tuning and/or few-shot prompting of various large pre-trained language models.
- CTO winning project: Prompt engineering on GPT-3 copilot to manipulate Excel spreadsheets via natural language.
See related post in GPT-3 application.
Mobvoi AI Lab, 2016-2020
- Director, Redmond WA.
- Mobvoi makes speech-enabled smart IoT devices, from hardware to software, all in-house.
- We are one of the most successful smartwatch companies in China.
- Mobvoi also provides the in-car voice assistant for VW automobiles in China.
The technology includes both cloud-based and on-device voice navigation, enabling the majority of services without an internet connection.
- We are active in generative AI and have been very successful in personalized TTS.
- Though a young company focused on industry products, we actively participate in the speech research community despite our limited resources.
Microsoft China, 2012-2015
- Principal science manager, Spoken language understanding for Cortana, Beijing and Suzhou.
- Non-English Cortana
- To deliver non-English Cortana without human-annotated data, Mei-Yuh designed an adapted translation algorithm that offered both paraphrasing and generalization capabilities with the required semantic slot tags.
- The prototype model was further improved via iterative data augmentation using an RNN and newly logged data.
- Microsoft cognitive services on Azure cloud:
Microsoft and UW, 1994-2012
- Machine translation, Microsoft, Redmond WA, 2008-2012.
- Co-built the Bing Translator automated training infrastructure, including the design and implementation of map-reduce parallel processing based on DryadLINQ.
- Designed and implemented Bing Translation Hub for customized vertical-domain translation.
- Speech recognition at Univ. of Washington, Seattle WA, 2004-2008.
- Led the DARPA EARS and GALE Mandarin speech recognition projects at University of Washington.
- Won the best Mandarin speech recognition.
- Speech recognition, Microsoft Research & Production, Redmond WA, 1994-2004.
- Ported Sphinx-II speech recognition to Microsoft on Windows desktop, Office, and Microsoft Speech Server SDK, for the recognition of multiple languages.
SPHINX-II speech recognition at CMU, PA, 1987-1993
- First to propose Markov state clustering, based on decision trees, for continuous speech recognition.
- The idea of shared states (or senones, as Mei-Yuh named them in 1992) was widely adopted for two decades after its inception, until recent years, when end-to-end neural-transducer-based and transformer-based speech recognition ushered in a new era.
- Participated in numerous DARPA speech recognition evaluation benchmarks (Resource Management, WSJ, ATIS)
and won the top position consistently.
Awards
- 2023, Outstanding Reviewer, ICASSP 2023
- 2021, AAIA Fellow
- 2019, IEEE Fellow
- 2010, Microsoft Gold Star Award from Microsoft Research, Redmond, WA
- 1992, Allen Newell Research Excellence Medal, Pittsburgh, Carnegie Mellon University
- 1986, Phi Tau Phi Scholastic Honor Society, recommended by National Taiwan University
Professional Services
- 2015-2018: IEEE ISCSLP steering committee
- 2013: Associate editor, IEEE Transactions on Audio, Speech, and Language Processing (ASLP)
- 2011: Technical Chair of IWSLT
- 1998: Publicity Chair, IEEE ICASSP
- Reviewer for IEEE Transactions on ASLP
- Technical committee for ICASSP, Interspeech, ISCSLP, ACL, NAACL
Invited Talks
- 2020: WeCNLP Summit, Seattle online.
- 2019: Northwestern Polytechnical University, Xi'an, China
- 2018 April: Yuanchuan Telephony company, Taiwan
- 2017: UWEE Research Colloquium Talk
- 2017: Panelist, CMU Summit
- 2017: Talk at Southwest Forestry University, Kunming, China
- 2017: Talk at Soochow University, Suzhou, China
- 2015: Talk at Soochow University, Suzhou, China
- 2014: Talk at University of Science and Technology of China, Suzhou campus
- 2014: Keynote speech at PhD Forum, Northwestern Polytechnical University, Xi'an, China
- 1994-2003: Lecturer at National Taiwan University, Academia Sinica, and ITRI, Taiwan
- 1993: Lecturer at IBM Gaithersburg
Education
- PhD, Computer Science, Carnegie Mellon University, December 1993.
- BA, Computer Science, National Taiwan University (台大), June 1986.