Running Image Classification with EfficientNet
In this post, I’ll walk you through how I set up and ran image classification using a pre-trained EfficientNet model in TensorFlow.
I have recently collected all my photos from a cloud storage service, an old phone, and various other places. I would like to categorise or tag them, as a way of helping me organise and find things.
Not being keen to hand all of them over to a big tech company, I have decided to build a tool of my own, based on the EfficientNet artificial neural network, which I can run locally on an RTX 3060 GPU.
This was my first step in learning how to use AI for image recognition, with the ultimate goal of culling and organising my photo collection. This project serves as a foundation for more advanced applications in the future.
Background
EfficientNet is a family of convolutional neural networks that achieves state-of-the-art accuracy while being computationally efficient. The model I used, EfficientNetB0, is pre-trained on the ImageNet dataset, which allows it to recognise a wide range of objects out of the box.
The goal of this project was simple: use EfficientNet to classify an image of a cat and identify its breed or related categories. This involved setting up a development environment, configuring TensorFlow to use my NVIDIA RTX 3060 GPU, and writing Python code for inference.
Setting Up the Environment
1. Installing Python and TensorFlow
I started by ensuring Python was installed on my machine. Since I’m using Ubuntu 24.10, my Python version is 3.12. To avoid interfering with the system-wide Python environment, I created a virtual environment:
python3 -m venv ~/workspace/ai/venv
source ~/workspace/ai/venv/bin/activate
pip install --upgrade pip
Next, I installed TensorFlow with GPU support:
pip install tensorflow[and-cuda]
The [and-cuda] extra pulls in the NVIDIA CUDA libraries as pip packages, so TensorFlow can take advantage of my RTX 3060 without a separate system-wide CUDA installation.
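A quick sanity check at this point is to print the installed version and confirm the build was compiled with CUDA support:
python -c "import tensorflow as tf; print(tf.__version__, tf.test.is_built_with_cuda())"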
2. Verifying GPU Usage
To confirm that TensorFlow could detect my GPU, I ran the following script:
import tensorflow as tf
print("GPUs detected:", tf.config.list_physical_devices('GPU'))
The output showed that TensorFlow recognised my RTX 3060, so I was ready to proceed.
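For reference, a successful detection prints something along these lines (the exact details depend on your setup):
GPUs detected: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]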
Writing the Inference Code
The next step was writing a Python script to load an image and classify it using EfficientNetB0. Here’s the complete code:
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.applications.efficientnet import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
# Ensure TensorFlow uses the GPU
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_visible_devices(gpus[0], 'GPU')
    print("Using GPU:", gpus[0])
else:
    print("No GPU detected. Using CPU.")
# Load the pre-trained EfficientNetB0 model
model = EfficientNetB0(weights='imagenet')
# Path to the sample image
img_path = 'sample_image.png'
# Load and preprocess the image
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
# Perform inference
predictions = model.predict(x)
decoded_predictions = decode_predictions(predictions, top=3)[0]
# Print the results
for i, (imagenet_id, label, score) in enumerate(decoded_predictions):
    print(f"{i + 1}: {label} ({score:.2f})")
This script does the following:
- Configures TensorFlow to use the GPU
- Loads the EfficientNetB0 model pre-trained on ImageNet
- Preprocesses the input image to meet the model’s requirements (resizing to 224x224 pixels and applying EfficientNet’s expected input preprocessing)
- Runs inference and decodes the top-3 predictions
Results
When I ran the script with an image of a cat, the output was:
1: tiger_cat (0.25)
2: tabby (0.21)
3: Egyptian_cat (0.09)
EfficientNet correctly identified the image as related to cats, with tiger_cat as the most likely category. This was a great validation of the setup and the model’s capabilities.
Lessons Learned
- Environment Setup: Using a virtual environment and ensuring TensorFlow had GPU support were the first steps. Debugging GPU issues taught me how to configure CUDA correctly.
- Inference Basics: Understanding how to preprocess images and decode model outputs gave me a solid foundation for working with pre-trained models.
- Efficiency: Running inference on the GPU significantly improved performance compared to the CPU, and the RTX 3060 I am using is more than sufficient for this task; a rough way to compare the two is sketched below.
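As an illustration, here is a crude, unscientific way to compare CPU and GPU inference times. It assumes the model and the preprocessed batch x from the script above are still in scope, and time_inference is just a helper name I made up for this sketch; treat the numbers as indicative only.
import time

def time_inference(device, runs=20):
    # Run the model eagerly under an explicit device scope; .numpy() forces
    # any pending GPU work to finish before the clock stops.
    with tf.device(device):
        _ = model(x, training=False).numpy()  # warm-up run
        start = time.perf_counter()
        for _ in range(runs):
            _ = model(x, training=False).numpy()
        return (time.perf_counter() - start) / runs

print("CPU average (s):", time_inference('/CPU:0'))
print("GPU average (s):", time_inference('/GPU:0'))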
Next Steps
Now that I’ve successfully run inference, I plan to make a start on training. Initially, I will train EfficientNet on a dataset known as “Cats vs. Dogs.” Eventually, I will use what I have learned to tag my personal photo collection automatically, which will help me prune the collection and keep it organised.
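To give a flavour of what that training step might look like, here is a minimal sketch of fine-tuning EfficientNetB0 on Cats vs. Dogs. It assumes the tensorflow_datasets package (tfds) is installed, and the splits, batch size, and epoch count are illustrative placeholders rather than settled choices.
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.applications import EfficientNetB0

IMG_SIZE = 224

def prepare(image, label):
    # EfficientNet expects raw 0-255 pixel values, so resizing is enough here
    return tf.image.resize(image, (IMG_SIZE, IMG_SIZE)), label

train_ds = (tfds.load('cats_vs_dogs', split='train[:80%]', as_supervised=True)
            .map(prepare).batch(32).prefetch(tf.data.AUTOTUNE))
val_ds = (tfds.load('cats_vs_dogs', split='train[80%:]', as_supervised=True)
          .map(prepare).batch(32).prefetch(tf.data.AUTOTUNE))

# Reuse the pre-trained convolutional base and train only a new binary head
base = EfficientNetB0(include_top=False, weights='imagenet',
                      input_shape=(IMG_SIZE, IMG_SIZE, 3), pooling='avg')
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=3)
Freezing the pre-trained base and training only the new classification head is the usual first pass; unfreezing some of the top layers at a lower learning rate is the natural follow-up.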