Image Classification App with EfficientNet

The goal of this project is to classify all the images I have accumulated from previous phones, cloud backups, and so on. The project is a local web application that acts as a photo album, with EfficientNet running under TensorFlow to classify the images. There are about 15,000 images, and classification runs locally on an RTX 3060 GPU.

This post in the series details the system design. As this was my first project using Cursor, I wanted to specify the system design clearly and see how much help it could give with the coding. Below is the prompt I gave Cursor.


We are building a local-only web application for image classification with the following components:


1. Docker Setup

The application will use Docker and Docker Compose for containerization, ensuring clear separation of services and ease of deployment. Key points:

  • Containers:
    • Frontend: React-based container for the user interface.
    • Backend: Go-based API container for managing metadata and serving filtered/sorted images.
    • Machine Learning: Python-based container for TensorFlow image classification with GPU support.
    • Message Queue: RabbitMQ container for communication between the backend and ML containers.
  • Volume Mounts:
    • A specified directory external to the project containing the images is mounted as read-only for the Go and Python containers.
    • A volume is mounted for the SQLite database file, allowing read/write access by the Go container.
  • GPU Support:
    • The Python ML container uses nvidia/cuda:12.6.3-cudnn-runtime-ubuntu24.04 as its base image, leveraging GPU acceleration via NVIDIA Container Toolkit.

2. Frontend (React)

A user-friendly interface with the following features:

  • View All Photos: Display all images from the mounted directory.
  • Metadata Display: Show metadata (e.g., categories) retrieved from the SQLite database via the backend.
  • Sorting and Filtering:
    • Sort images by date.
    • Filter images by category.
  • Album Management:
    • Create, update, and delete albums (collections of photos).
    • Albums and their associated photos are stored in the database.

3. Backend (Go)

The backend serves as the API for the application, with the following responsibilities:

  • Database Management:
    • Manage metadata in an SQLite database, including relationships between images, categories, and albums.
  • API Endpoints:
    • Serve metadata and filtered/sorted image lists to the frontend.
    • Support CRUD operations for albums and categories.
  • CLI Features:
    • Database Migrations: Apply and manage migrations for the SQLite database.
    • ML Query: Trigger the Python ML container via the message queue, send image classification tasks, and retrieve classification results to update the database.
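
As a rough illustration of the "filtered/sorted image lists" endpoint, the query behind it could look like the sketch below. It is written in Python with the standard sqlite3 module purely for brevity (the backend in this design is Go), and the table and column names (images, categories, image_categories, taken_at) are assumptions made for the example, not a fixed schema.

```python
import sqlite3

# Hypothetical minimal schema, only for this sketch.
SCHEMA = """
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE,
    taken_at TEXT NOT NULL          -- ISO 8601 text, so it sorts chronologically
);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE image_categories (
    image_id INTEGER NOT NULL REFERENCES images(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (image_id, category_id)
);
"""


def list_images(conn, category=None, newest_first=True):
    """Return image paths sorted by date, optionally filtered by category."""
    # The sort direction is picked from two fixed strings, never from user input.
    order = "DESC" if newest_first else "ASC"
    sql = "SELECT i.path FROM images AS i"
    params = []
    if category is not None:
        sql += (
            " JOIN image_categories AS ic ON ic.image_id = i.id"
            " JOIN categories AS c ON c.id = ic.category_id"
            " WHERE c.name = ?"
        )
        params.append(category)
    sql += f" ORDER BY i.taken_at {order}, i.id {order}"
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection():
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO images (id, path, taken_at) VALUES (?, ?, ?)",
        [
            (1, "2019/beach.jpg", "2019-07-01T10:00:00"),
            (2, "2021/dog.jpg", "2021-03-15T09:30:00"),
            (3, "2020/cat.jpg", "2020-12-24T18:45:00"),
        ],
    )
    conn.executemany(
        "INSERT INTO categories (id, name) VALUES (?, ?)",
        [(1, "animal"), (2, "outdoor")],
    )
    conn.executemany(
        "INSERT INTO image_categories (image_id, category_id) VALUES (?, ?)",
        [(2, 1), (3, 1), (1, 2), (2, 2)],
    )
    return conn
```

Calling list_images(demo_connection(), category="animal") returns the two animal photos, newest first. The category value is always bound as a parameter, so a filter string coming from the frontend never ends up inside the SQL text.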

4. Machine Learning (Python)

A Python-based service handles image classification using TensorFlow and EfficientNet. Key features:

  • Image Classification:
    • Listens to a message queue (MQ) for image classification tasks sent by the Go backend.
    • Processes a batch of image file paths at a time and applies the specified model.
    • Sends the classification results back to the MQ for the Go backend to retrieve and update the database.
  • GPU Support:
    • The container uses the base image nvidia/cuda:12.6.3-cudnn-runtime-ubuntu24.04 and TensorFlow’s GPU version.
  • Volume Mounts:
    • Read-only access to the external image directory.
    • No direct database access (handled by the Go backend).
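
The receive, classify, reply loop described above can be sketched without committing to a particular queue client or model. The example below is only an assumption about how the pieces might fit: the JSON field names (task_id, model, paths, results) are hypothetical, and fake_classifier is a stand-in for the EfficientNet call, which is not shown here.

```python
import json
from typing import Callable, List

# Hypothetical classifier signature: a batch of file paths goes in, one list
# of category labels per path comes out. The real service would wrap the
# EfficientNet model behind this interface.
Classifier = Callable[[List[str]], List[List[str]]]


def handle_task(body: bytes, classify: Classifier) -> bytes:
    """Decode one task message, classify its batch, and encode the reply."""
    task = json.loads(body)
    paths = task["paths"]
    labels = classify(paths)
    if len(labels) != len(paths):
        raise ValueError("classifier must return one label list per path")
    reply = {
        "task_id": task["task_id"],
        "model": task["model"],
        "results": [
            {"path": path, "categories": labels_for_path}
            for path, labels_for_path in zip(paths, labels)
        ],
    }
    return json.dumps(reply).encode("utf-8")


def fake_classifier(paths: List[str]) -> List[List[str]]:
    """Stand-in for the model: labels each file by its lowercased extension."""
    return [[path.rsplit(".", 1)[-1].lower()] for path in paths]
```

Keeping the handler a pure function of the message body makes it easy to test without a GPU or a running queue; the queue consumer only needs to pass each message body in and publish the returned bytes.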

5. Database (SQLite)

The database manages image metadata and application data:

  • Tables:
    • Images: File paths, timestamps, and other image-related metadata.
    • Categories: Category labels, linked to images through a many-to-many relationship.
    • Albums: Collections of images created by users.
    • Models: Track the classifiers used for categorizing images.
  • Data Integrity:
    • The database is mounted as a file accessible only by the backend container.
    • The database file is excluded from version control to protect privacy.
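
One possible shape for these tables is sketched below, using Python's built-in sqlite3 module so it can be run as-is. Every table and column name is an assumption for illustration, including the two join tables (image_categories, album_images) and the choice to record which model produced each category link.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative, not prescribed.
SCHEMA = """
CREATE TABLE models (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE images (
    id       INTEGER PRIMARY KEY,
    path     TEXT NOT NULL UNIQUE,
    taken_at TEXT
);
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Many-to-many link between images and categories, per classifier.
CREATE TABLE image_categories (
    image_id    INTEGER NOT NULL REFERENCES images(id) ON DELETE CASCADE,
    category_id INTEGER NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
    model_id    INTEGER NOT NULL REFERENCES models(id),
    PRIMARY KEY (image_id, category_id, model_id)
);
CREATE TABLE albums (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Many-to-many link between albums and images.
CREATE TABLE album_images (
    album_id INTEGER NOT NULL REFERENCES albums(id) ON DELETE CASCADE,
    image_id INTEGER NOT NULL REFERENCES images(id) ON DELETE CASCADE,
    PRIMARY KEY (album_id, image_id)
);
"""


def open_database(path=":memory:"):
    """Open a connection and create the tables (expects an empty database)."""
    conn = sqlite3.connect(path)
    # SQLite does not enforce foreign keys unless asked to, per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys switched on, deleting an image row also removes its album and category links, and inserting a link to a non-existent image fails with sqlite3.IntegrityError.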

6. Message Queue (MQ)

The application will use a message queue to facilitate communication between the Go backend and the Python ML container. Key points:

  • Choice of MQ: RabbitMQ is chosen for its reliability and robust support for task queuing.
  • Purpose:
    • The Go backend sends image classification tasks to the Python container through the MQ.
    • The Python container processes the tasks, classifies the images, and sends the results back to the MQ.
    • The Go backend retrieves the results from the MQ and persists them in the SQLite database.
  • Scalability: Tasks are processed asynchronously, so the backend can enqueue the whole image collection in batches without waiting for each classification to finish.
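
To make the producer side concrete, the sketch below splits a list of image paths into batch-sized task messages. It is written in Python for brevity even though the producer in this design is the Go backend, and the message fields (task_id, model, paths), the default batch size of 64, and the model name in the usage note are all assumptions for illustration. Publishing the bytes to RabbitMQ is deliberately left out.

```python
import json
import uuid
from typing import Iterable, Iterator, List


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


def build_tasks(paths: Iterable[str], model: str, batch_size: int = 64) -> Iterator[bytes]:
    """Yield one JSON-encoded task message per batch of image paths."""
    for batch in chunked(paths, batch_size):
        task = {
            "task_id": str(uuid.uuid4()),  # lets results be matched to tasks
            "model": model,
            "paths": batch,
        }
        yield json.dumps(task).encode("utf-8")
```

For example, build_tasks(paths, "efficientnet-b0") over 150 paths yields three messages carrying 64, 64, and 22 paths. Giving each task its own id means a result arriving on the reply queue can be tied back to the batch that produced it.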

Workflow

  1. Image Viewing: The user views all images in the photo directory through the frontend.
  2. Metadata Management: The backend fetches and serves metadata to the frontend from the SQLite database.
  3. Image Classification:
    • The Go CLI triggers the ML service, passing image paths via the message queue.
    • The ML service classifies the images and sends the results back through the message queue; the Go backend then updates the database with the categories.
  4. Album Management: Users create and manage albums through the frontend, with changes reflected in the database.
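
The last hop of the classification step, where results taken off the queue are written to SQLite, might look roughly like the sketch below. It is in Python for brevity (the component doing this in the design is the Go backend), and the result message layout and the three-table schema are assumptions made for the example.

```python
import json
import sqlite3

# Hypothetical minimal schema, only for this sketch.
SCHEMA = """
CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE image_categories (
    image_id INTEGER NOT NULL REFERENCES images(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (image_id, category_id)
);
"""


def apply_result(conn: sqlite3.Connection, body: bytes) -> int:
    """Store one result message; return how many new image-category links were added."""
    message = json.loads(body)
    added = 0
    with conn:  # one transaction per message: all of its rows commit, or none do
        for item in message["results"]:
            row = conn.execute(
                "SELECT id FROM images WHERE path = ?", (item["path"],)
            ).fetchone()
            if row is None:
                continue  # unknown path: skip rather than invent an image row
            image_id = row[0]
            for name in item["categories"]:
                # INSERT OR IGNORE keeps the operation idempotent on redelivery.
                conn.execute(
                    "INSERT OR IGNORE INTO categories (name) VALUES (?)", (name,)
                )
                category_id = conn.execute(
                    "SELECT id FROM categories WHERE name = ?", (name,)
                ).fetchone()[0]
                cursor = conn.execute(
                    "INSERT OR IGNORE INTO image_categories (image_id, category_id)"
                    " VALUES (?, ?)",
                    (image_id, category_id),
                )
                added += cursor.rowcount
    return added
```

Because every insert ignores rows that already exist, applying the same result message twice leaves the database unchanged, which matters if the queue redelivers a message after a crash.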

Goals

  • Local-only deployment for privacy and security.
  • Scalability to handle ~15,000 images.
  • GPU-accelerated classification for efficient batch processing.