Project

EG4338/EG6338 Machine Learning

Author

Yike Zhang

Published

August 21, 2025

Machine Learning Projects Template

Developing your own machine learning projects is an excellent way to showcase your skills and knowledge in real-world scenarios. Jupyter Notebooks and Google Colab are handy tools for trying out new ideas and running simple experiments. For larger and more complex projects, however, a structured approach to organizing your code, data, and documentation is recommended, because it helps you maintain clarity and efficiency as the project grows. Below is a popular template that you can use to structure your machine learning projects. It is adapted from the Cookiecutter Data Science website and contains the essential parts for organizing a machine learning project effectively.

├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like "make data" or "make train"
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third-party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         src and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── src                <- Source code for use in this project.
    │
    ├── __init__.py             <- Makes src a Python module
    │
    ├── config.py               <- Store useful variables and configuration
    │
    ├── dataset.py              <- Scripts to download or generate data
    │
    ├── features.py             <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py          <- Code to run model inference with trained models
    │   └── train.py            <- Code to train models
    │
    └── plots.py                <- Code to create visualizations
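
As an illustration of what `src/config.py` might hold, here is a minimal sketch that collects the directories from the layout above in one place so that other modules do not hard-code paths. The function names `project_paths` and `ensure_dirs` are hypothetical and not part of the Cookiecutter Data Science template; the sketch assumes only the Python standard library.

```python
from pathlib import Path


def project_paths(root: Path) -> dict:
    """Map short names to the template's directories under `root`.

    Hypothetical helper: the keys are illustrative names chosen for this sketch.
    """
    data = root / "data"
    return {
        "raw": data / "raw",              # original, immutable data dump
        "interim": data / "interim",      # intermediate, transformed data
        "processed": data / "processed",  # final data sets for modeling
        "external": data / "external",    # data from third-party sources
        "models": root / "models",        # trained and serialized models
        "figures": root / "reports" / "figures",  # generated graphics
    }


def ensure_dirs(paths: dict) -> None:
    """Create every directory in `paths` if it does not already exist."""
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
```

In a real project, the root could be derived inside `src/config.py` as `Path(__file__).resolve().parents[1]`, assuming the file sits one level below the project root as in the layout above. Scripts such as `dataset.py` or `train.py` could then import these paths instead of building them by hand.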