ML academics vs ML production

The drastic differences between machine learning in academia and machine learning in production for commercial use.



The machine learning used in academia and research is quite different from the ML used in production applications that serve real end users. Here is a description of the differences.

Requirements

  • In academia, the goal is to build the next SOTA (state-of-the-art) model.
  • Even a 0.1% gain over the existing SOTA is considered exceptional.
  • In production, there is no single fixed requirement. Stakeholders such as the sales team, the product team, and the engineering manager each have different requirements.

ML Lifecycle priority

  • In academia, the priority is fast training, so GPU/TPU machines with high throughput are required.
  • In production, the priority is fast, low-latency inference. Users need to be shown recommendations and ads quickly. A slight delay can drastically reduce the click-through rate and thereby revenue.
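To make the latency point concrete, here is a minimal sketch of how inference latency could be measured. The `predict` function is a hypothetical stand-in for a real model, and `measure_latency_ms` and `percentile` are illustrative helper names, not part of any library.

```python
import math
import time


def predict(features):
    # Hypothetical stand-in for a real model's inference call.
    return sum(features) / len(features)


def measure_latency_ms(fn, payload, n_requests=200):
    """Call fn(payload) n_requests times; return per-call latency in ms."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        fn(payload)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile, with pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


latencies = measure_latency_ms(predict, [0.1, 0.5, 0.9])
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"p50={p50:.4f} ms, p99={p99:.4f} ms")
```

The tail (p99) is often tracked alongside the median, since the slowest requests are the ones where a user is kept waiting.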

Data

  • In academia, models are mostly built on top of a static benchmark dataset.
  • In production, data is constantly being generated by users and may be biased.
  • Working with these constantly shifting datasets makes production ML a challenge.
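One way to notice a shifting dataset is to compare the distribution of a feature at training time against what is arriving in production. The sketch below uses the Population Stability Index (PSI) as an example metric. The function name `psi`, the bin count, and the toy data are assumptions made for illustration, not something taken from the reference.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    Bins are equal-width over the range of `expected`. Values of `actual`
    outside that range fall into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all expected values identical; avoid division by zero

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0)
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Toy data: the "live" feature has drifted upward by 3 units.
train_feature = [i / 100 for i in range(1000)]
live_feature = [v + 3 for v in train_feature]

psi_same = psi(train_feature, train_feature)
psi_shifted = psi(train_feature, live_feature)
print(psi_same, psi_shifted)
```

A PSI of zero means the two binned distributions match, and larger values indicate more shift. What counts as "too much" depends on the application.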

Bias and Fairness

  • In academia, fairness takes a low priority next to the goal of achieving a SOTA model.
  • In production, the fairness of the ML model cannot be ignored.
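As a rough illustration of what checking fairness can look like, the sketch below compares how often a model predicts the positive class for each group. This gap is commonly called the demographic parity difference, and it is only one of several fairness measures. The function names, the group labels, and the predictions are made up for the example.

```python
def positive_rate(predictions):
    """Fraction of predictions that are the positive class (1)."""
    return sum(predictions) / len(predictions)


def demographic_parity_gap(predictions, groups):
    """Return (gap, per-group rates) for binary predictions.

    gap is the largest difference in positive-prediction rate
    between any two groups; 0.0 means all groups are treated alike
    by this particular measure.
    """
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: positive_rate(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical model outputs for two groups of four users each.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_gap(predictions, groups)
print(rates, gap)
```

Here group "a" receives a positive prediction 75% of the time and group "b" only 25% of the time, which is the kind of gap a production team would want to investigate.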

Interpretability

  • In academia, why the model predicts a given result is often not a priority.
  • In production, explaining why the model makes a given decision is a higher priority, and the model should be more than a black box.
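One simple, model-agnostic way to look inside a black box is permutation importance: shuffle one feature's values and see how much the error grows. The sketch below is a toy version under stated assumptions. The `model` function is a hypothetical black box that happens to use only feature 0, and the data is synthetic.

```python
import random


def model(row):
    # Hypothetical black-box model: it only looks at feature 0.
    return 3.0 * row[0]


def mse(rows, targets):
    """Mean squared error of `model` on the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, feature, seed=0):
    """Increase in MSE after shuffling one feature column."""
    rng = random.Random(seed)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    shuffled = [
        r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)
    ]
    return mse(shuffled, targets) - mse(rows, targets)


# Synthetic data: the target depends on feature 0 only.
rows = [[float(i), float(i % 5)] for i in range(50)]
targets = [3.0 * r[0] for r in rows]

importances = [permutation_importance(rows, targets, f) for f in range(2)]
print(importances)
```

Shuffling feature 0 increases the error, while shuffling feature 1 changes nothing, which tells us the model's decisions rest entirely on feature 0.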

We discussed how ML in research is different from ML in production across the following categories:

  1. Requirements
  2. Lifecycle priority
  3. Data
  4. Bias and Fairness
  5. Interpretability

Reference: https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/

For more such insights follow @soumnedrak_

AI

Part 7 of 9

In this series, we discuss AI tools and tech.
