til.duyet.net

Deploying Machine Learning Models at Scale

https://algorithmia.com/blog/deploying-machine-learning-at-scale

Last updated 5 years ago

Deploying machine learning models at scale is one of the most pressing challenges data scientists face today, and as ML models get more complex, it’s only getting harder. The most common way machine learning gets deployed today is on PowerPoint slides.

We estimate that fewer than 5 percent of commercial data science projects make it to production. If you want to be part of that share, you need to understand how deployment works, why machine learning is a unique deployment problem, and how to navigate this messy ecosystem.

---

Machine learning has a few unique features that make deploying it at scale harder

Deploying regular software applications is hard, but when that software is a machine learning pipeline, it’s worse.

  1. Multiple Data Science Languages

    1. R, Python, Scala ...

  2. Data Science Languages Can Be Slow

  3. Machine Learning Can Be Extremely Compute Heavy, and Relies on GPUs

  4. Machine Learning Compute Works In Spikes

    1. Once your algorithms are trained, they’re not used consistently; your customers will only call them when they need them. That can mean that you’re only supporting 100 API calls at 8:00 AM, but 1 million at 8:30 AM. Scaling up and down like that while making sure not to pay for servers you don’t need is a nightmare.
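To make the spike concrete, here is a minimal sketch of the sizing arithmetic behind it. It assumes the call counts above are per-minute figures and that one replica can serve 500 calls per minute; both the time window and the capacity number are hypothetical, as are the function name and the replica limits.

```python
import math


def replicas_needed(calls_per_minute: int, capacity_per_replica: int,
                    min_replicas: int = 1, max_replicas: int = 5000) -> int:
    """Return the replica count that covers the load, clamped to [min, max]."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    # Round up: a fraction of a replica still needs a whole server.
    wanted = math.ceil(calls_per_minute / capacity_per_replica)
    return max(min_replicas, min(wanted, max_replicas))


# Hypothetical capacity: one replica serves 500 calls per minute.
CAPACITY = 500

quiet = replicas_needed(100, CAPACITY)        # the 8:00 AM load
spike = replicas_needed(1_000_000, CAPACITY)  # the 8:30 AM load
print(quiet, spike)
```

Under these assumptions the same service needs one replica during the quiet period and two thousand half an hour later, which is why a fixed fleet size is either too expensive or too small.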

After taking months to write out your (awesome) models, you’re going to need to hand them over to engineering to deploy at scale. That process can take months, and the models you end up with may not resemble what you originally handed over at all. And if you want to make small changes afterward, or continually improve your models with new data? Forget about it.