Deep Learning in Production: Laptop setup and system design

Hello hello hello and welcome to the Deep Learning in Production course.

In this article series, our goal is dead simple. We are going to start with a Colab notebook containing prototype deep learning code (i.e. a research project), and we're going to deploy and scale it to serve millions or billions (ok, maybe I'm overexcited) of users.

We will incrementally explore the following concepts and ideas:

how to structure and develop production-ready machine learning code,

how to optimize the model’s performance and memory requirements, and

how to make it available to the public by setting up a small server on the cloud.

But that’s not all of it. Afterwards, we will need to scale our server to handle the traffic as the user base grows and grows.

So be prepared for some serious software engineering from the perspective of machine learning. Actually, now that I think about it, a more suitable title for this series would be “Deep Learning and Software Engineering”.

To clarify why software engineering is an undeniably significant part of deep learning, take Google Assistant as an example. Behind Google Assistant is, without a doubt, an ingenious machine learning algorithm (probably a combination of BERT, LSTMs and God knows what else). But do you think that this amazing research alone is capable of answering the queries of millions of users at the same time? Absolutely not. There are dozens of software engineers behind the scenes who maintain, optimize, and build the system. This is exactly what we are about to discover.

One last thing before we begin. “Deep Learning in Production” isn’t going to be one of those high-level, abstract articles that talk a lot without offering practical value. Here we are going to go really deep into the software: we will analyze details that may seem too low-level, write a lot of code, and present the full deep learning software development cycle from start to finish. From a notebook to serving millions of users.

From research to production

But enough with the pitching (I hope you are convinced by now). Let’s cut to the chase with an overview of the upcoming articles. Note that each bullet is not necessarily a separate article (it could be two or three); I am just outlining the subjects we will touch on.

Set up your laptop and environment.

Best practices to structure your Python code and develop the model.

Optimize the performance of the code in terms of latency, memory, etc.

Train the model in the cloud.

Build an API to serve the model.

Containerize and deploy the model in the cloud.

Scale it to handle loads of traffic.
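To give you a tiny preview of the “build an API to serve the model” step, here is a deliberately oversimplified, hypothetical sketch using only Python’s standard library. The `dummy_predict` function is a stand-in for a real trained model, and the endpoint name and payload shape are made up for illustration; in the series we will of course use a proper web framework and deployment setup.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def dummy_predict(features):
    # Placeholder "model": just sums the input features.
    return {"prediction": sum(features)}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and run our "model" on it.
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        result = dummy_predict(payload["features"])
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging for the example

# Bind to an OS-assigned free port so the example never collides.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Act as a client: POST some features and print the prediction.
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/predict",
    data=json.dumps({"features": [1.0, 2.0, 3.0]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # {'prediction': 6.0}
server.shutdown()
```

Even in this toy version you can see the moving parts we will keep revisiting: a model behind a request handler, a serialization format, and a server process that has to stay up while clients come and go.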

To that end, here are some of the technologies, frameworks and tools we will use (in no particular order):