The logistical bits: I am a Research Staff Member (Research Scientist) in the Scalable Knowledge Intelligence department at IBM Research Almaden in San Jose, California. I completed my PhD in Computer Science and Engineering at the University of Washington in Seattle, Washington in 2019. I was part of GRIDLab, the Neural Systems Lab, and the Brunton Lab, primarily advised by Dr. Bingni Brunton and Dr. Rajesh Rao. I was also advised by Dr. Jeff Ojemann and Dr. Ali Farhadi, and collaborated with many awesome scientists.
The interesting bits: I love brains and computers, and basically any cognitive system. In my PhD research, my goal was to learn algorithms from the brain to help machine learning, and to use state-of-the-art machine learning techniques to learn more about the brain. Now I apply the same machine learning techniques to other problems, such as document understanding.
PhD Project: Decode and predict activities from multiple modalities (brain + video + audio) in natural, non-experimental data using deep learning. I am developing code for this project in Python, and the code will be made available soon. We have released parts of this dataset; see here. See the project slide for more detailed project descriptions.
Past projects: These range from decoding dreams from fMRI, to developing automatic fish behaviour classification, to performing brain surgeries in mice to study and treat PTSD. See the project slide for more details.
Me outside research: I truly enjoy travelling and living in other cultures. On the side are some of my pictures from my internship stays in Germany, Japan and Colorado. Maybe this is also why I really like RPG video games (Dragon Age being one of my favorites).
Brief History: I was born in Nanjing, China and grew up there until the age of 9, when my family immigrated to Canada. I moved within British Columbia quite a bit until college. I went to the University of British Columbia in Vancouver for my undergraduate degree and graduated with an Honours degree in computer science. I currently live in San Jose with my husband Jason (whom I met in Japan during an internship), my daughter Luna, and my dog Barkspawn (bonus points if you know where this name is from).
Document Extraction and Understanding
GTE: Technical lead and main algorithm + code developer for an end-to-end table extraction system. To learn more, click here.
TableLab: A user interface integrated with GTE, which I developed from conception to launch. It lets users easily label table structure by correcting detected tables. The labels can also be used to customize GTE models, with the improved results visible in the same user-friendly UI.
Table Tutorial: We describe in detail the challenges and current approaches for table extraction and understanding at ICDM 2019 and VLDB 2020. To learn more, click here.
Sem-Tab-Facts 2021: Table Fact Verification Competition led by me for SemEval 2021. Learn more and participate here.
Documents hold a wealth of information, especially in tables, and are easy for humans to understand but not for current machines. We use modern computer vision, natural language processing and other deep learning techniques to extract and understand tables.