Project

EG4338/EG6338 Machine Learning

Author

Yike Zhang

Published

August 21, 2025

Machine Learning Projects Template

Developing your own machine learning projects is an excellent way to showcase your skills and knowledge in real-world scenarios. Jupyter Notebooks and Google Colab are handy tools for trying new ideas and running simple experiments. For larger, more complex projects, however, a structured approach to organizing your code, data, and documentation is recommended; it helps you maintain clarity and efficiency as the project grows. Below is a popular template that you can use to structure your machine learning projects. It is adapted from the Cookiecutter Data Science website and contains the essential parts for organizing a machine learning project effectively.

├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like "make data" or "make train"
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third-party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         src and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── src                <- Source code for use in this project.
    │
    ├── __init__.py             <- Makes src a Python module
    │
    ├── config.py               <- Store useful variables and configuration
    │
    ├── dataset.py              <- Scripts to download or generate data
    │
    ├── features.py             <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py          <- Code to run model inference with trained models
    │   └── train.py            <- Code to train models
    │
    └── plots.py                <- Code to create visualizations
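
To make the role of config.py in the layout above more concrete, here is a minimal sketch of how the template's directories could be collected in one place with Python's pathlib. The function names `project_paths` and `ensure_dirs` and the dictionary keys are hypothetical choices made for illustration; they are not part of the Cookiecutter Data Science template itself.

```python
from pathlib import Path


def project_paths(root: Path) -> dict[str, Path]:
    """Map short names to the template's standard directories under `root`.

    The key names are illustrative assumptions; only the directory names
    come from the template tree.
    """
    data = root / "data"
    reports = root / "reports"
    return {
        "raw": data / "raw",              # original, immutable data dump
        "interim": data / "interim",      # intermediate, transformed data
        "processed": data / "processed",  # final data sets for modeling
        "external": data / "external",    # data from third-party sources
        "models": root / "models",        # trained and serialized models
        "reports": reports,               # generated analysis
        "figures": reports / "figures",   # generated graphics for reporting
    }


def ensure_dirs(paths: dict[str, Path]) -> None:
    """Create every directory in `paths` if it does not already exist."""
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
```

One possible use, assuming the code lives in src/config.py, is to compute the project root as `Path(__file__).resolve().parents[1]`, pass it to `project_paths`, and import the resulting paths from dataset.py, features.py, and the modeling scripts, so that no script hard-codes a directory name.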

Machine Learning Final Project Ideas

  1. 🌮 Taco Sales Synthetic Dataset (2024–2025)

  2. 🌊 Flood Area Segmentation Dataset

  3. 🎓 Student Stress Monitoring Dataset

  4. 🦴 Human Bone Fracture Detection Dataset

  5. 🔥 Forest Fire Detection Dataset

  6. 🃏 Cards Image Classification Dataset

  7. 🗑️ Garbage Classification Dataset

  8. 🍋 Fruits and Vegetables Image Recognition Dataset

  9. 🌺 Flowers Classification Dataset

  10. 🪴 Plant Disease Recognition Dataset

  11. ♻️ Household Trash Recycling Dataset

  12. 💅 Nail Segmentation Dataset

  13. 🦘 Animals Detection Images Dataset

  14. 🤘 American Sign Language Dataset

  15. 🫘 Coffee Bean Classification Dataset

  16. ☁️ Weather Image Recognition Dataset

  17. 🍄 Mushroom Species Classification Dataset

  18. 🐛 Dangerous Farm Insects Classification Dataset

  19. 🪧 Synthetic Job Postings 2025 Dataset

  20. 🍿 IMDB Movies Reviews Dataset from 2015 to 2024

The examples above can serve as inspiration for your Final Machine Learning Project. You are also welcome to propose your own project idea; discuss it with the instructor first, because it requires prior approval.