Data Labeling Platform for BioTech Intelligence


This is a startup project I worked on with 3 MBAs, 5 computer engineers, and domain experts. I joined the team as the sole designer and led the end-to-end product design process.

We created MoreLife.ai, a cost-efficient, end-to-end data platform for artificial intelligence companies building tools that aid medical diagnosis from static images such as X-rays, CT scans, and MRIs. Our aim is to assist in labeling medical training data by employing an algorithm that labels with minimal human oversight.

01/2020 - 03/2020

Team Members:
1 Designer (Me!) + 5 CS Engineers + 3 MBAs + 1 Biology PhD

Led end-to-end design, from user research, product thinking, and product strategy to UX and interaction design.




With the rapid development of machine learning technology, digital image processing has been widely applied in the medical field to support medical diagnosis. A primary step in improving a computer vision model is to train and validate it on high-quality training data.

However, handling medical data from acquisition to training is an enormous obstacle for machine learning teams in healthcare today. Many ML teams spend significantly more time collecting and labeling training datasets than building machine learning models.

Our team observed this situation and decided to explore this space, asking why handling medical data is so difficult.


Research Method


SME (ML team leader) Interview

To get industry insights, we interviewed 3 SMEs currently working in the AI field. By talking to them, we hoped to learn what challenges machine learning teams face and which factors impede the fast processing of ML data, specifically medical image data.

User (Labeller) Interview

After talking to the SMEs, we realized that the direct users of a data labeling platform are not its buyers. To understand the day-to-day work of labeling data and its challenges, we conducted another round of research: we interviewed 2 data labelers who had worked for AI companies, manually labeling images to train computer vision models.




1. Labeling medical data is expensive.

Training medical AI models requires high-quality data. However, annotating medical images requires deep domain knowledge. Currently, the majority of medical data is hand-labeled by hard-to-find medical experts, and hiring domain experts like radiologists costs a lot of money.


2. Labeling medical data has a long turnaround time.

Because of the labor shortage in the US market, ML teams usually hire third-party labelers outside the US. There is no unified system for team leaders to manage these third-party labelers. In addition, inefficient communication between labelers and the ML team leads to an ambiguous understanding of the labeling taxonomy.


3. Quality control of labeling medical data is poor.

Training a robust AI model requires 250,000 to 1,000,000 images, which is a significant cost. Hand-labeling that much data is an arduous task, and even trained professionals experience burnout. Furthermore, poor data quality from overworked labelers threatens the accuracy of AI models.


Should We Invest In This Space?

Market Size Is Growing

The success of AI in cancer detection, pharmaceuticals, research, and other areas, driven by 100+ new AI companies in healthcare and life sciences, motivates the market to keep growing. The AI in healthcare market is expected to grow from $2.1 billion in 2018 to $36.1 billion by 2025, at a CAGR of 50.2% during the forecast period.
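As a quick sanity check on that forecast, the growth rate can be recomputed from the two endpoints. This is a minimal sketch that assumes the forecast period is the 7 years from 2018 to 2025; the function names are illustrative only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `start_value` at `rate` for `years` years."""
    return start_value * (1 + rate) ** years


# Market size in billions of USD, taken from the forecast quoted above.
# Assumption: 2018 -> 2025 is treated as 7 compounding periods.
implied_rate = cagr(2.1, 36.1, 7)
projected_2025 = project(2.1, 0.502, 7)
print(f"implied CAGR: {implied_rate:.1%}")
print(f"2025 size at 50.2% CAGR: ${projected_2025:.1f}B")
```

Both directions land close to the quoted figures, so the endpoints and the growth rate are consistent with each other.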


Big Players Are Joining the Game

Multiple tech giants are entering the market with massive resources and partnerships with healthcare institutions and organizations, e.g., Microsoft + Walgreens, Microsoft + Humana, Apple + several hospitals, Amazon Care, and Microsoft + Novartis.


No One Is Dominating This Space

AI companies and machine learning teams desire a more efficient way to reduce the cost of labeling medical data in terms of time and resources.

However, when we conducted a competitive analysis, we noticed that no player is dominating this space. No business provides a cost-efficient data processing service while also having expertise in the medical field.


How might we help biomedical AI companies produce better quality data more efficiently?


How to reduce cost?


How to reduce turnaround time?


How to improve quality control?

Our Solution

Automated Label Generation

We use the data programming technique to derive training labels from large data sets. These models do not identify a condition; rather, they categorize, sort, and ultimately label thousands of images with a high degree of precision.
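To make the idea concrete, here is a minimal sketch of data programming, not the MoreLife.ai implementation. Several noisy heuristics ("labeling functions") each vote on an item or abstain, and the votes are combined into one training label. Two things here are assumptions for illustration: the labeling functions read a hypothetical radiology report text and follow-up flag attached to each image, and the votes are combined by simple majority. Full data programming instead learns how accurate each labeling function is and weights the votes accordingly.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical record attached to each image (assumed fields, not from the project).
Record = Dict[str, object]
LabelingFunction = Callable[[Record], Optional[str]]


def lf_mentions_fracture(record: Record) -> Optional[str]:
    """Vote 'abnormal' if the report text mentions a fracture."""
    text = str(record.get("report_text", "")).lower()
    return "abnormal" if "fracture" in text else ABSTAIN


def lf_no_acute_findings(record: Record) -> Optional[str]:
    """Vote 'normal' if the report uses the phrase 'no acute'."""
    text = str(record.get("report_text", "")).lower()
    return "normal" if "no acute" in text else ABSTAIN


def lf_followup_ordered(record: Record) -> Optional[str]:
    """Vote 'abnormal' if a follow-up exam was ordered."""
    return "abnormal" if record.get("followup_ordered") else ABSTAIN


def majority_vote(votes: List[Optional[str]]) -> Optional[str]:
    """Combine votes; abstain when nobody voted or the top two labels tie."""
    counts = Counter(v for v in votes if v is not ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def label_records(
    records: List[Record], lfs: List[LabelingFunction]
) -> List[Optional[str]]:
    """Apply every labeling function to every record and combine the votes."""
    return [majority_vote([lf(r) for lf in lfs]) for r in records]


LFS = [lf_mentions_fracture, lf_no_acute_findings, lf_followup_ordered]
records = [
    {"report_text": "Displaced fracture of the distal radius.", "followup_ordered": True},
    {"report_text": "No acute findings.", "followup_ordered": False},
    {"report_text": "No acute fracture.", "followup_ordered": False},
]
print(label_records(records, LFS))
```

The third record gets conflicting votes and is left unlabeled, which is exactly the kind of item a human reviewer would need to look at.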

Man + Machine Model

Labelers (radiologists) act as auditors of the labeling algorithm. By combining human and artificial intelligence to reframe the medical image labeling workflow, this model can decrease labelers’ workload while increasing their accuracy.
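One way such an audit workflow could be wired up is sketched below. It assumes the labeling algorithm reports a confidence score between 0 and 1 for each label, and that a fixed threshold decides which labels a radiologist has to review. The `MachineLabel` type, the `route_for_audit` function, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MachineLabel:
    image_id: str
    label: str
    confidence: float  # assumed to be in [0, 1]


def route_for_audit(
    labels: List[MachineLabel], threshold: float = 0.9
) -> Tuple[List[MachineLabel], List[MachineLabel]]:
    """Split machine labels into auto-accepted and needs-human-review lists.

    Labels at or above `threshold` are accepted as-is; the rest go to a
    radiologist's review queue, lowest confidence first.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    accepted = [m for m in labels if m.confidence >= threshold]
    review = sorted(
        (m for m in labels if m.confidence < threshold),
        key=lambda m: m.confidence,
    )
    return accepted, review


batch = [
    MachineLabel("img-001", "normal", 0.97),
    MachineLabel("img-002", "abnormal", 0.62),
    MachineLabel("img-003", "normal", 0.88),
]
accepted, review = route_for_audit(batch)
print([m.image_id for m in accepted])
print([m.image_id for m in review])
```

In this sketch the radiologist only sees the two uncertain images rather than the whole batch, and each one arrives with a suggested label to confirm or correct.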


Benefits For Stakeholders


Benefits for Labelers (Radiologists)

It significantly decreases the workload of radiologists. Images arrive with an annotation that helps them understand what to look for. A lighter workload reduces burnout, which leads to increased accuracy and efficiency.

Benefits for AI Companies

Data is labeled accurately and in a shorter time frame than when all of it is labeled by humans. This saves AI companies cost and time, so they can ship ML models to market earlier.


Validate Our Idea

We went back to the SMEs we had interviewed and asked for their opinion of our concepts. When we talked to Daryn Nakhuda, co-founder of Mighty AI, we found that our product resonated with him. On the other hand, he raised a few concerns about our business model:

  1. Our original plan was to provide only a labeling service and a labeling platform; however, he pointed out that the margin on labor is very low.
  2. One foreseeable challenge is that most contracts would be one-time. It would be difficult for us to secure recurring contracts with the same client.

In the end, he suggested an end-to-end platform for the entire AI lifecycle, including quality control, people management, the labeling process, and result analytics.

“It's really the full-service platform that we offered helped us get there ... the quality control, people (if you want them), processes, and analytics on the data…”

-Daryn Nakhuda
Co-founder & CEO of Mighty AI


Here is a visualization (designed by me) of the platform we are trying to build.

Define Labeler User Journey

Figure Out the High-Level User Flow

Since multiple stakeholders are involved in the labeling task, I decided to map out all essential tasks to better understand the detailed user flow and task navigation among them.


Create Lo-Fi Wireframe

Design Backbone of the Product

I led and owned the entire interaction design. First, I built lo-fi wireframes to represent a high-level overview of the new ideas visually. They gave me a quick understanding of how to map out various tasks to different UI sections, and helped me work out how users navigate between interfaces.

This project is currently under NDA.

Please contact me to learn more.

© Harri Lin 2023