MDS Computational Linguistics Capstone Project: Sharing Student Experiences
Updated: Jul 21
We designed Unigrams so researchers and analysts could quickly parse and understand large amounts of qualitative data at once. Whether the data includes open-ended comments from surveys, reviews, or interview transcripts, Unigrams aims to help analysts discover key themes and segment results into fundamental groups. This summer, four students from UBC’s Master of Data Science - Computational Linguistics program joined us to investigate new processes for Unigrams. The goal of the project was to run tests, using some of the latest language models in development, that would help us improve our application. The project was a great success, and we asked each student to introduce themselves and share a little about their experience working on it.
Gordon (Liangchen) Xia
I pursued my undergraduate degree in Statistics & Data Science at a university in the United States, and I am currently a student at UBC, working towards a Master of Data Science in Computational Linguistics.
During my time at Kai Analytics, I have found the tasks assigned to be incredibly valuable in terms of expanding my knowledge and skill set. Topic classification has been a subject of great interest to me since my school days, and I believe it holds immense practical significance. Unlike traditional classroom learning, this internship has allowed me to go beyond the acquisition of knowledge and delve into the realm of problem-solving. It has been an excellent opportunity to strengthen my teamwork abilities, as it required not only completing my individual tasks but also effectively collaborating with my fellow teammates.
I am sincerely grateful to Colin and Kevin for their invaluable support throughout this internship. Their guidance has been instrumental in helping us navigate challenges, rectify mistakes, and set clear directions.
I truly enjoyed my time at Kai Analytics and found the internship to be a highly enriching experience.
Working with Kai Analytics has been a wonderful way to finish our capstone program!
As someone who is passionate about linguistics and working with unstructured text, the Unigrams project was a great opportunity to apply my education to a real-world problem. It was really cool to see how organizations such as Kai Analytics can use natural language processing (NLP) strategies to build an application that can be used to enhance insights and benefit businesses in a tangible way.
Collaborating with Colin, Kevin, and the rest of our team has been an exceptionally supportive experience, and I’m very proud of our project!
I thoroughly enjoyed collaborating with the team at Kai Analytics throughout my internship. The project topic proved to be exceptionally intriguing, allowing us to apply our existing knowledge while also pushing us to expand our skills.
Colin and Kevin, our mentors, exhibited remarkable passion and consistently offered invaluable feedback to foster our growth. Their exceptional organizational abilities were evident through the establishment of weekly milestones, enabling us to monitor our progress toward the overarching goal.
Over the past 6 weeks I have been working with my capstone teammates to develop a new topic modelling pipeline for Kai Analytics.
In particular, I have focused on preparing our code for deployment (containerization and interactivity), creating some advanced visualizations to help users get a more intuitive sense of the data, testing out new libraries for tasks like joint topic-sentiment analysis, and implementing spelling normalization tools to make sure our modelling is robust to spelling differences (which can be quite important when working with a limited amount of data!).
I have enjoyed working on a data product that is applicable to a wide range of domains and have loved getting to implement some very cutting-edge technologies. I look forward to the rest of my time with the Kai Analytics team.
We are thrilled to have been selected as a capstone partner and grateful to have worked with four fantastic students and their professor over the past six weeks. We hope that the students found the mentorship helpful, and we look forward to collaborating further.
The discoveries we made together during the project will now be incorporated into our development pipeline. This means next-level results for users of Unigrams in the coming weeks.