Best practices to write Deep Learning code: Project structure, OOP, Type checking and documentation
In part 1 of the Deep Learning in Production course, we defined the goal of this article series: to convert a Python deep learning notebook into production-ready code that can be used to serve millions of users. Towards that end, we continue our series with a collection of best practices for programming a deep learning model. These practices mostly concern how to write organized, modular, and extensible Python code.
As you can imagine, most of them aren’t exclusive to machine learning applications; they can be used in all sorts of Python projects. Here, though, we will see how to apply them to deep learning using a hands-on approach (so brace yourselves for some programming).
One last thing before we start, and something that I will probably repeat a lot in this course: machine learning code is ordinary software and should always be treated as such. Therefore, like any ordinary code, it has a project structure, documentation, and design principles such as object-oriented programming.
Also, I will assume that you have already set up your laptop and environment as we discussed in part 1 (if you haven’t, feel free to do that now and come back).
Project Structure
One very important aspect of writing code is how you structure your project. A good structure should obey the “separation of concerns” principle, meaning that each piece of functionality should be a distinct component. This way, it can be easily modified and extended without breaking other parts of the code. Moreover, it can be reused in many places without the need to write duplicate code.
Tip: Writing the same code once is perfect, twice is kinda fine, but thrice is not.
The way I like to organize most of my deep learning projects is something like this:
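To make the tip a bit more concrete, here is a minimal sketch of pulling a repeated step into a single reusable component. The function names (normalize, prepare_training_batch, prepare_inference_input) and the pixel-scaling step are hypothetical and chosen only for illustration; they are not part of the project in this series.

```python
def normalize(pixels, max_value=255.0):
    """Scale raw pixel values into the [0, 1] range (hypothetical helper)."""
    return [p / max_value for p in pixels]


def prepare_training_batch(batch):
    """Training path: reuse the shared helper instead of copying its logic."""
    return [normalize(image) for image in batch]


def prepare_inference_input(image):
    """Inference path: the same helper, so both paths always stay in sync."""
    return normalize(image)


if __name__ == "__main__":
    print(prepare_training_batch([[0, 255], [51, 102]]))
    print(prepare_inference_input([255, 0]))
```

If the scaling rule ever changes, only normalize needs to be edited, and every caller picks up the change.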
And that, of course, is my personal preference. Feel free to play around with this until you find what suits you best.
Python modules and packages
Notice that in the project structure I presented, each folder can be imported in other modules just by doing “import folder_name”. Here, we should make a quick distinction between what Python calls a module and what it calls a package. A module is simply a file containing Python code. A package, however, is a directory that holds sub-packages and modules. For a directory to be treated as a regular package, it should contain an __init__.py file (even if it’s empty). That’s not the case for modules. Thus, in our case, each folder is a package and contains an __init__.py file.
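The difference between a module and a package can be checked with a small, self-contained sketch. It builds a throwaway configs package in a temporary directory and imports a module from it. The module name config.py, the CFG dictionary, and its values are assumptions made for illustration only, not the actual contents of the project.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project in a temporary directory.
project_root = Path(tempfile.mkdtemp())

# A package is a directory; the (empty) __init__.py marks it as a regular package.
package_dir = project_root / "configs"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")

# A module is a single .py file. "config.py" and CFG are hypothetical names.
(package_dir / "config.py").write_text(
    'CFG = {"train": {"batch_size": 64, "epochs": 20}}\n'
)

# Make the project root importable, then import the module from the package.
sys.path.insert(0, str(project_root))
importlib.invalidate_caches()
config_module = importlib.import_module("configs.config")
configs_package = sys.modules["configs"]

# Packages expose a __path__ attribute; plain modules do not.
print(hasattr(configs_package, "__path__"))
print(hasattr(config_module, "__path__"))
print(config_module.CFG["train"]["batch_size"])
```

Importing configs.config also imports the configs package itself, which is why it appears in sys.modules afterwards.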
In our example, we have 8 different packages:
1) configs: in configs we define every single thing that can be configurable and can be changed in the future. Good examples are training hyperparameters, folder paths, the model...