Train YOLO Custom object detection model in Windows GPU

In my last post, I showed how to detect objects with a YOLO model. The official YOLO model is trained on a fixed list of 80 object classes, such as car, person, cat, and dog. If you want to detect a custom object, you need to train a custom YOLO object detection model yourself. This step-by-step tutorial shows how to train your own custom object detector using YOLO on Windows.

In this tutorial, I will show you how to train YOLOv3 to detect custom objects. After reading it, you should be able to apply the same workflow to YOLOv4, which is also trained with Darknet. Later versions such as YOLOv5 follow the same overall steps (collect, annotate, configure, train), although the tooling differs.

I will break this tutorial into the following steps:

  1. Image Collection
  2. Image Selection (How to choose a proper set of images to train YOLO)
  3. Annotate Image
  4. Download and configure Darknet in Windows
  5. Create Train and Test Data to train YOLO model
  6. Download YOLOv3 pre-trained model
  7. Compile darknet on Windows
  8. Train YOLO custom object detection model in Windows
  9. Test YOLO model for image and video

1. Image Collection to train YOLO model

To train any custom object detection model, you need a large training dataset, which means a large number of images. Follow the steps below to download images in bulk from Google image search.

  • Install the Imageye extension in Google Chrome to download all searched images in bulk
  • Search for any image in Google image search
  • Click on the Imageye extension
  • In the Imageye window, click on the filter icon and specify your desired range of image size. I am going to use 0-20000, the default range.
  • Click on Select all, then click on Download images in Imageye

Once you have downloaded images in bulk, keep the points below in mind to select the best images, so that the model can properly detect custom objects with as few images as possible.

2. How to choose images to train YOLO?

While selecting images to train a YOLO model to detect custom objects, keep the points below in mind:

  • Variance for training data: Let’s say you want to train a multi-class YOLO model. For this example, I am going to train a custom YOLO model that can detect Messi and Ronaldo in any image or video (two-class detection). For this, you need to maintain a proper balance in your training data. To do that you can:
    1. Download some images of Messi alone
    2. Download some images where Messi appears with other persons but not with Ronaldo
    3. Download some images of Ronaldo alone
    4. Download some images where Ronaldo appears with other persons but not with Messi
    5. Download some images where Messi and Ronaldo are both present
  • Number of images: After planning the image variation, I recommend using at least 500 images in total (including Messi and Ronaldo)
  • Quality of images: I suggest using images of all kinds of quality, not only high resolution and not only low resolution.

Let me show you the count of images I have used to train my Custom YOLO model to detect Messi and Ronaldo for this tutorial:

  1. 200 images of Messi
  2. 200 images of Ronaldo
  3. 150 images where both Ronaldo and Messi present
  4. 50 images where Messi is available with other persons but not with Ronaldo
  5. 50 images where Ronaldo is available with other persons but not with Messi
  6. Total 650 images I have selected to train this model

Note: If you want to detect a custom object and cannot find enough images on the internet, you can use image augmentation to enlarge your dataset.

At this point all the images have different names. We need to use serial numbers (0, 1, 2…) as names for those images. Run the code below (rename_images.py) to rename your final images to serial numbers.

rename_images.py

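# Rename images in bulk
import os

# Full path to the image folder
folder_path = "D:/yolo_custom_object_detection/final_images"

# Sort for a reproducible order and skip anything that is not a file
files = sorted(f for f in os.listdir(folder_path)
               if os.path.isfile(os.path.join(folder_path, f)))

# Rename images to serial numbers, keeping the original extension (png, jpeg, jpg, etc.)
# Note: this assumes no file is already named with a serial number such as 0.png
for index, file in enumerate(files):
    extension = os.path.splitext(file)[1]
    os.rename(os.path.join(folder_path, file),
              os.path.join(folder_path, str(index) + extension))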

Note: I kept all downloaded images inside a folder named “final_images” in D:/yolo_custom_object_detection.

3. Annotate Image

At this stage, I hope you have finalized the images you are going to use for training the YOLO custom object detector. Now you need to annotate those images manually. This is the hardest part of the entire process of training any computer vision model.

To annotate images for the YOLO model, I am going to use the LabelImg annotator in this tutorial. It is an open-source image annotator. You can download and use it on Windows and Linux without installation.

If you are new to annotation tools, here is how to annotate with LabelImg:

  • Open the LabelImg image annotator tool
  • Click on Open Dir and select the folder of your final images
  • At the bottom right, you will see the list of all images from your image folder (“final_images”)
  • Click on PascalVOC to change the annotation format to YOLO, since we are making a training dataset for a YOLO model
  • Click each image you want to annotate
  • Click on Create RectBox
  • Draw a rectangle covering the face of the person (in our case Messi or Ronaldo)
  • Write the label name. If it is an image of Ronaldo, write Ronaldo; if it is an image of Messi, write the label as Messi
  • Finally press Ctrl+S to save the annotation

These steps generate a txt file inside the image folder (“final_images”) for each image you have annotated, with the same name as the image. Make sure you save after annotating each image.

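Now if you open any txt file, you will see values like [0 0.484167 0.222500 0.275000 0.351667]. The first value is the class index (its position in classes.txt). The other four describe the bounding box you drew during annotation: the x and y coordinates of the box center, followed by the box width and height, each normalized by the image width or height so that it lies between 0 and 1.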
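To make the annotation format concrete, here is a minimal sketch that converts one annotation line back to pixel coordinates. The function name and the 600x600 image size are assumptions made for illustration; they are not part of LabelImg or Darknet.

```python
def yolo_to_pixel_box(line, image_width, image_height):
    """Convert one YOLO annotation line to (class_id, left, top, right, bottom) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    # The remaining four values are normalized to the range 0..1
    x_center, y_center, width, height = (float(v) for v in parts[1:5])

    box_w = width * image_width
    box_h = height * image_height
    left = x_center * image_width - box_w / 2
    top = y_center * image_height - box_h / 2
    return class_id, round(left), round(top), round(left + box_w), round(top + box_h)


# Example line from the text, on a hypothetical 600x600 image
print(yolo_to_pixel_box("0 0.484167 0.222500 0.275000 0.351667", 600, 600))
# (0, 208, 28, 373, 239)
```

Drawing the resulting rectangle on the image is a quick way to confirm that an annotation file matches what you drew.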

Along with the coordinate txt files, another file called classes.txt is saved inside the image folder. This file contains the class names. For our example we have used only two classes (Messi & Ronaldo), so my classes.txt file contains only “Messi” and “Ronaldo”.

So after annotating (and saving) each image from your image folder, you should have two types of txt files inside your image folder:

  1. Coordinate txt files (one per image)
  2. A txt file containing the classes or labels (classes.txt)

So now my working directory structure looks like below:

yolo_custom_object_detection
  ├──final_images
  │	├──0.png
  │	├──0.txt
  │	├──1.png
  │	├──1.txt
  │	├── ...
  │	└──classes.txt
  │
  └──rename_images.py
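Before moving on, it can help to confirm that every image has an annotation file next to it. The sketch below is one possible check; the helper name find_unannotated_images is hypothetical and the extension list is an assumption based on the formats mentioned in this tutorial.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def find_unannotated_images(folder_path):
    """Return image file names that have no matching .txt annotation file."""
    names = set(os.listdir(folder_path))
    missing = []
    for name in sorted(names):
        stem, extension = os.path.splitext(name)
        # An image named 12.png is expected to have 12.txt next to it
        if extension.lower() in IMAGE_EXTENSIONS and stem + ".txt" not in names:
            missing.append(name)
    return missing
```

Calling find_unannotated_images("D:/yolo_custom_object_detection/final_images") should return an empty list once every image is annotated and saved.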

4. Download and configure Darknet in Windows

This is a very important step. To train a YOLO custom object detection model in Windows, you need to do a few configurations manually. I will show them step by step. Please don’t miss any step, otherwise you may run into issues.

4.1 Clone the Darknet repository

YOLO object detection uses Darknet in the background. So to train a custom YOLO model, you need to clone the Darknet GitHub repository onto your Windows machine. To do that, I am going to create a folder named “darknet” in my root directory and clone the Darknet repository inside that folder.

So now my working directory structure should look like below:

yolo_custom_object_detection
  ├──darknet
  ├──final_images
  │	├──0.png
  │	├──0.txt
  │	├──1.png
  │	├──1.txt
  │	├── ...
  │	└──classes.txt
  │
  └──rename_images.py

4.2 Install OpenCV GPU with CUDA for Windows

In the Darknet repository link check for the Requirements for Windows sections. You need to meet those requirements to train a custom YOLO object detection model in Windows using Darknet.

Basically, you need to install CUDA and cuDNN with OpenCV. This will allow you to utilize your GPU. I have a detailed article for this installation; follow that article to install CUDA and cuDNN in Windows.

Before reading that article, keep a few points in mind:

  • Install CUDA >= 10.2: To meet the Darknet requirement, you must install CUDA 10.2 or higher. The article above installs CUDA 10.1; follow the same steps, but install CUDA toolkit 10.2 instead. That is the version I used for this tutorial.
  • Install cuDNN >= 8.0.2: You must install cuDNN 8.0.2 or higher. The article above installs cuDNN v7.6.5; follow the same steps, but install cuDNN v8.0.2 (July 24th, 2020), for CUDA 10.2 instead. That is the version I used for this tutorial.

4.3 Configure Darknet folder files

At this point, I am assuming you have successfully installed OpenCV GPU with CUDA for windows. Now we need to make a few changes to some files inside our “darknet” folder (repository folder created in step 4.1).

  • Change in Makefile: Inside the “darknet” folder you should find a file called “Makefile”. Open it in any code editor (I am opening it in PyCharm to make sure there are no formatting issues).

At the beginning of that file you should see:

GPU=0
CUDNN=0
OPENCV=0

You need to change those values to 1. So now those values should look like below:

GPU=1
CUDNN=1
OPENCV=1

  • Edit yolov3.cfg from the \darknet\cfg folder: Open the “cfg” folder inside your “darknet” folder. Since we are training YOLOv3 in this tutorial, open the “yolov3.cfg” file in any code editor and change the configurations described below.

Before editing this file, let's understand what it is. It contains the parameters of the YOLO network, such as the batch settings for training and testing, the learning rate, and the other hyperparameters required to train this neural network.

You need to change some of these parameters based on the number of classes you are going to train for. In this tutorial, we use only two classes (1. Messi, 2. Ronaldo). You may have more or fewer classes, in which case you need to change the parameters accordingly. I will explain how each change depends on the number of classes.

Note: I am going to train YOLO custom object detector by utilizing the GPU of my Windows system.

Make a copy of “yolov3.cfg” file and name it “yolov3_football.cfg” (to keep backup). Now start editing “yolov3_football.cfg”:

  • At the beginning of this file, you will see the Testing and Training sections. To train the YOLOv3 model, comment out “batch” and “subdivisions” in the Testing section (lines 3 & 4) and un-comment “batch” and “subdivisions” in the Training section (lines 6 & 7)
  • line-6: batch (for training): If you have a large amount of training data, you may use a larger batch size. I am using only 650 images, so I am going to set batch = 3
  • line-7: subdivisions (for training): I am going to set subdivisions = 8
  • line-20 => max_batches: max_batches is the number of training iterations for the neural network. This number should be the number of classes * 2000. Since I have 2 classes (1. Messi & 2. Ronaldo), my max_batches = 2*2000 = 4000
  • line-22 => steps: These are the iteration numbers at which the learning rate is adjusted. I set them 20% below and 20% above max_batches. My max_batches = 4000 and 20% of 4000 = 800, so my steps = (4000-800), (4000+800) = 3200, 4800. Note that a step larger than max_batches is never reached during training; the Darknet README suggests using 80% and 90% of max_batches instead (3200, 3600 here)
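The arithmetic for max_batches and steps can be sketched as a small helper. The function name training_schedule and its step_percents parameter are assumptions made for illustration; the defaults reproduce the values used in this tutorial.

```python
def training_schedule(num_classes, step_percents=(80, 120)):
    """Return (max_batches, steps) for a given number of classes."""
    # Rule of thumb from this tutorial: 2000 iterations per class
    max_batches = num_classes * 2000
    # Each step is a percentage of max_batches (integer arithmetic avoids rounding issues)
    steps = tuple(max_batches * percent // 100 for percent in step_percents)
    return max_batches, steps


print(training_schedule(2))            # (4000, (3200, 4800))
print(training_schedule(2, (80, 90)))  # (4000, (3200, 3600))
```

The second call shows the 80% and 90% variant, which keeps both steps below max_batches.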

YOLOv3 uses a variant of Darknet. For the task of detection, the underlying architecture is a fully convolutional network with 106 layers in total. Among these layers, three are YOLO layers, which produce the output by predicting the objects.

Figure: YOLOv3 Architecture

If you are familiar with the concept of Transfer Learning, you know that you need to change the last few layers of the pre-trained model to configure it for your custom dataset.

Search the cfg file for “[yolo]”; you will find 3 YOLO layers in total. You only need to modify these YOLO layers and the convolutional layer preceding each of them. Let’s start with the first YOLO layer. I found it at line 607.

As you know, the YOLO pre-trained model comes with 80 classes. If you are not sure about it, please read my previous post (YOLO object detection using deep learning OpenCV | Real-time). Since I am training for 2 classes (Messi & Ronaldo), I need to set classes = 2 (line 610). Then I go to the preceding convolutional layer and change the number of filters at line 603. The formula is: filters = (number of classes + 5) * 3

So in our case filters = (2+5) * 3 = 21
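The same formula as a small sketch. The function name yolo_filters is hypothetical; the value 3 is the number of anchor boxes each YOLO layer uses in yolov3.cfg, and 5 covers the four box coordinates plus one objectness score.

```python
def yolo_filters(num_classes, anchors_per_layer=3):
    """Number of filters for the convolutional layer just before a [yolo] layer."""
    # Each anchor predicts 4 box values + 1 objectness score + one score per class
    return (num_classes + 5) * anchors_per_layer


print(yolo_filters(2))   # 21
print(yolo_filters(80))  # 255
```

The second call gives 255, the value found in the unmodified 80-class cfg file.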

Figure: 1st yolo layer before changing parameters
Figure: 1st yolo layer after changing parameters

Now we are done with the first YOLO layer and its preceding conv layer. You need to repeat this process two more times, since we have 3 YOLO layers in total. So do the same thing at line 689 & line 696 (for the 2nd YOLO layer and its preceding conv layer) and at line 776 & line 783 (for the 3rd YOLO layer and its preceding conv layer).
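If you prefer not to edit the three places by hand, the change can be sketched in Python. This is a minimal sketch that assumes the cfg layout described above (each [yolo] section directly follows the [convolutional] section whose filters must change); the function name patch_yolo_cfg is hypothetical and not part of Darknet.

```python
import re


def patch_yolo_cfg(cfg_text, num_classes):
    """Set classes in every [yolo] section and filters in the [convolutional] section before it."""
    filters = (num_classes + 5) * 3
    lines = cfg_text.splitlines()
    section = None            # name of the section currently being read
    last_filters_line = None  # index of the latest filters= line in a [convolutional] section

    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("["):
            section = stripped
            # A new [yolo] section starts: fix the filters of the preceding conv layer
            if section == "[yolo]" and last_filters_line is not None:
                lines[last_filters_line] = "filters=" + str(filters)
        elif section == "[convolutional]" and re.match(r"filters\s*=", stripped):
            last_filters_line = i
        elif section == "[yolo]" and re.match(r"classes\s*=", stripped):
            lines[i] = "classes=" + str(num_classes)

    return "\n".join(lines) + "\n"
```

To use it, read the text of yolov3_football.cfg, pass it through patch_yolo_cfg(text, 2), and write the result back. It is still worth opening the file afterwards to confirm that exactly three filters lines and three classes lines changed.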

5. Create Train and Test Data to train YOLO model

To train any machine learning model we need to split our total data (all annotated images created in step 3) into train and test. To do that let’s create a folder named “OutsideData” inside the “darknet” folder. Then copy and paste “final_images” folder inside it. So now your working directory structure should look like below.

Note: My root directory name is “yolo_custom_object_detection”

yolo_custom_object_detection
  ├──darknet
  │     └──OutsideData
  │           └──final_images
  │	           ├──0.png
  │	           ├──0.txt
  │	           ├──1.png
  │	           ├──1.txt
  │	           ├── ...
  │	           └──classes.txt
  │
  └──rename_images.py

Now create a blank python file (named create_train_and_test_data.py) inside the “OutsideData” folder and paste the below code into it. Then run that code.

So now your working directory structure should look like below:

yolo_custom_object_detection
  ├──darknet
  │     └──OutsideData
  │           ├──create_train_and_test_data.py
  │           └──final_images
  │	           ├──0.png
  │	           ├──0.txt
  │	           ├──1.png
  │	           ├──1.txt
  │	           ├── ...
  │	           └──classes.txt
  │
  └──rename_images.py

create_train_and_test_data.py

# Training YOLO v3 for Custom Objects Detection
# Creating files train.txt and test.txt for training YOLO in Darknet framework

# Importing needed libraries
import os
import random

# Full path to the folder with images and annotation
full_path_to_images = 'D:/yolo_custom_object_detection/darknet/OutsideData/final_images'

# Changing the current directory
# to one with images
os.chdir(full_path_to_images)

# Defining list to write paths in
p = []

# Using os.walk for going through all directories
# and files in them from the current directory
# Fullstop in os.walk('.') means the current directory
for current_dir, dirs, files in os.walk('.'):
    # Going through all files
    for f in files:
        # Checking if filename ends with an image extension
        if f.endswith('.jpeg') or f.endswith('.jpg') or f.endswith('.png'):
            # Preparing path to save into train.txt file
            path_to_save_into_txt_files = full_path_to_images + '/' + f

            # Appending the line into the list
            # We use here '\n' to move to the next line
            # when writing lines into txt files
            p.append(path_to_save_into_txt_files + '\n')

# Shuffling the list so that the train/test split is random
random.shuffle(p)

# Slicing first 15% of elements from the list
# to write into the test.txt file
p_test = p[:int(len(p) * 0.15)]

# Deleting from initial list first 15% of elements
p = p[int(len(p) * 0.15):]

# ---------------------------------------------------------

# Creating file train.txt and writing 85% of lines in it
with open('train.txt', 'w') as train_txt:
    # Going through all elements of the list
    for e in p:
        # Writing current path at the end of the file
        train_txt.write(e)

# Creating file test.txt and writing 15% of lines in it
with open('test.txt', 'w') as test_txt:
    # Going through all elements of the list
    for e in p_test:
        # Writing current path at the end of the file
        test_txt.write(e)

After running the above code two files (train.txt and test.txt) will be generated inside the “final_images” folder. So now your working directory should look like below:

yolo_custom_object_detection
  ├──darknet
  │     └──OutsideData
  │           ├──create_train_and_test_data.py
  │           └──final_images
  │	           ├──0.png
  │	           ├──0.txt
  │	           ├──1.png
  │	           ├──1.txt
  │	           ├── ...
  │                ├──train.txt
  │                ├──test.txt
  │	           └──classes.txt
  │
  └──rename_images.py

These train and test files contain the full path of each image inside the “final_images” folder: a random 85% of the images for training and the remaining 15% for testing.

Now we need to create a file called “labelled_data.data” for training YOLO in Darknet framework in Windows. This file should contain:

  1. Number of classes (in our example 2)
  2. Path for train.txt file
  3. Path for the test.txt file

Now create a blank python file (named create_labelled_data.py) inside the “OutsideData” folder and paste the below code into it. Then run that code.

create_labelled_data.py

# YOLO v3 for Objects Detection with Custom Data
# Result of this code:
# Create files classes.names and labelled_data.data needed to train YOLO in Darknet framework

# Path to the folder with images and annotation
full_path_to_images = 'D:/yolo_custom_object_detection/darknet/OutsideData/final_images'

# Defining counter for classes
c = 0

# Creating file classes.names from existing classes.txt from "final_images" folder
with open(full_path_to_images + '/' + 'classes.names', 'w') as names, \
     open(full_path_to_images + '/' + 'classes.txt', 'r') as txt:

    # Going through all lines in txt file and writing them into names file
    for line in txt:
        names.write(line)  # Copying all info from file txt to names

        # Increasing counter
        c += 1

# Creating file labelled_data.data
with open(full_path_to_images + '/' + 'labelled_data.data', 'w') as data:
    # Writing needed 5 lines
    # Number of classes
    # By using '\n' we move to the next line
    data.write('classes = ' + str(c) + '\n')

    # Location of the train.txt file
    data.write('train = ' + full_path_to_images + '/' + 'train.txt' + '\n')

    # Location of the test.txt file
    data.write('valid = ' + full_path_to_images + '/' + 'test.txt' + '\n')

    # Location of the classes.names file
    data.write('names = ' + full_path_to_images + '/' + 'classes.names' + '\n')

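    # Location where to save weights (trained YOLO algorithm)
    # The "backup_football" folder must already exist inside "OutsideData"
    data.write('backup = D:/yolo_custom_object_detection/darknet/OutsideData/backup_football')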

The above code will generate the below files and save them in your image folder (“final_images”):

  1. classes.names file (by converting classes.txt to classes.names): Contains names of classes
  2. labelled_data.data: Contains information about:
    • classes: Number of classes
    • train: the training file path
    • valid: validation / testing file path
    • names: the path of classes.names file
    • backup: the folder where model weights are saved while training

Also create a folder named “backup_football” inside the “OutsideData” folder. The code does not create it. The trained model will be saved inside this folder, so the backup entry should point to it.
Figure: labelled data for yolo model
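To check the generated file, you can read it back and confirm that all five entries are present. The sketch below is a simple parser written for this purpose; the function name read_data_file is hypothetical, and the sample text uses illustrative values rather than the exact contents of your file.

```python
def read_data_file(text):
    """Parse the 'key = value' lines of a Darknet .data file into a dictionary."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key = value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


# Illustrative content only; your paths will differ
sample = """classes = 2
train = D:/yolo_custom_object_detection/darknet/OutsideData/final_images/train.txt
valid = D:/yolo_custom_object_detection/darknet/OutsideData/final_images/test.txt
names = D:/yolo_custom_object_detection/darknet/OutsideData/final_images/classes.names
backup = OutsideData/backup_football
"""

options = read_data_file(sample)
expected_keys = ["classes", "train", "valid", "names", "backup"]
print([key for key in expected_keys if key not in options])  # []
```

An empty list means no expected entry is missing.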

So after running this code, your folder structure should look like below.

yolo_custom_object_detection
  ├──darknet
  │     └──OutsideData
  │           ├──backup_football
  │           ├──create_labelled_data.py
  │           ├──create_train_and_test_data.py
  │           └──final_images
  │	           ├──0.png
  │	           ├──0.txt
  │	           ├──1.png
  │	           ├──1.txt
  │	           ├── ...
  │                ├──classes.names
  │                ├──labelled_data.data
  │                ├──train.txt
  │                ├──test.txt
  │	           └──classes.txt
  │
  └──rename_images.py

6. Download yolov3 pre-trained model

To train YOLO on a custom dataset, we take a Transfer Learning approach. In Transfer Learning, we start from a model that is pre-trained on some set of classes and then optimize on top of that model with our custom dataset.

Click the link below to download the pre-trained weights. I am going to use YOLOv3 for this tutorial, so I download darknet53.conv.74, which holds the pre-trained convolutional layers used by YOLOv3. Other YOLO versions use their own pre-trained weight files.

Download darknet53.conv.74

Create a folder named “custom_weight” inside “OutsideData” folder. Now copy and paste this yolov3 pre-trained model (darknet53.conv.74) inside the “custom_weight” folder.

So finally my working directory structure should look like below:

yolo_custom_object_detection
  ├──darknet
  │     └──OutsideData
  │           ├──backup_football
  │           ├──custom_weight
  │           │      └──darknet53.conv.74
  │           │
  │           ├──create_labelled_data.py
  │           ├──create_train_and_test_data.py
  │           └──final_images
  │                ├──0.png
  │                ├──0.txt
  │                ├──1.png
  │                ├──1.txt
  │                ├── ...
  │                ├──classes.names
  │                ├──labelled_data.data
  │                ├──train.txt
  │                ├──test.txt
  │                └──classes.txt
  │
  └──rename_images.py

7. Compile darknet on Windows

Now we need to compile darknet to generate the darknet.exe file. To do that, follow the steps below:

  • Open the CMakeLists.txt file from the “darknet” folder
  • Search for the term “find_package(OpenCV REQUIRED)”
  • Set the OpenCV_DIR variable by adding a line before find_package(OpenCV REQUIRED), for example a line of the form set(OpenCV_DIR "<your OpenCV build path>"). OpenCV_DIR is the build folder path of OpenCV. Follow steps 6-13 in this tutorial to know your own path

If you skip the above steps, you may get the below error:

Could not find a package configuration file provided by "OpenCV" with any
of the following names:

  OpenCVConfig.cmake
  opencv-config.cmake

  • Now open the CMake GUI
  • In CMake, provide the input path to the darknet source and the output path for the binaries.
  • Then click the Configure button. In the configure window, select Visual Studio 16 2019 as the generator for the project. Then select x64 as the platform and click Finish. It will start configuring files
  • Once configured successfully, click on Generate
  • Once generated successfully, click on Open Project. It should open Visual Studio
  • In MS Visual Studio: Select x64 and Release -> Build -> Build solution. It will start building darknet and its dependencies

Once the build succeeded, a folder named “Release” will be created inside the “darknet” folder. You will find a file named “darknet.exe” inside this “Release” folder.

  1. Copy darknet.exe, darklib.dll and uselib.exe from darknet/Release folder to the darknet/ root folder
  2. Copy pthreadGC2.dll and pthreadVC2.dll from darknet\3rdparty\pthreads\bin\ folder to the darknet/ root folder

The above two steps solve the “pthreadvc2.dll was not found” error. If you are not facing this error, you can skip them.

Now copy opencv_highgui440.dll, opencv_videoio440.dll, opencv_imgproc440.dll, opencv_imgcodecs440.dll, and opencv_core440.dll from the build/bin/Release folder (follow steps 6-13 in this tutorial to know your build path) to the darknet/ root folder.

We do this to avoid the below errors while running the final training command in step 8:

  • opencv_videoio440.dll was not found
  • opencv_imgproc440.dll was not found
  • opencv_imgcodecs440.dll was not found
  • opencv_core440.dll was not found
  • opencv_highgui440.dll was not found

8. Train YOLO custom object detection model in Windows

Finally, all the configuration is done. Now we can start training our custom YOLO model. To do that:

  1. Open cmd
  2. cd inside “darknet” folder
  3. Now run the below command
darknet.exe detector train OutsideData/final_images/labelled_data.data cfg/yolov3_football.cfg OutsideData/custom_weight/darknet53.conv.74 -dont_show

Once you execute the above command, Darknet should start training the YOLO model on our custom dataset and save the trained model inside your backup folder (in my case, the “backup_football” folder inside the “OutsideData” folder).
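If you want to start training from a Python script instead of typing the command, the argument list can be assembled as in the sketch below. The helper name build_train_command and its show_window parameter are assumptions made for illustration; the arguments themselves are the ones from the command above.

```python
def build_train_command(data_file, cfg_file, weights_file, show_window=False):
    """Assemble the darknet training command as a list of arguments."""
    command = ["darknet.exe", "detector", "train", data_file, cfg_file, weights_file]
    if not show_window:
        # Same flag as in the command above: do not open the loss chart window
        command.append("-dont_show")
    return command


command = build_train_command(
    "OutsideData/final_images/labelled_data.data",
    "cfg/yolov3_football.cfg",
    "OutsideData/custom_weight/darknet53.conv.74",
)
print(" ".join(command))
```

The list can then be passed to subprocess.run with cwd set to your “darknet” folder, so that the relative paths resolve the same way as in cmd.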

Once training is completed, you will see several model weight files inside your backup folder (“backup_football”). These are the weights saved as backups during training after a certain number of iterations (1000, 2000, 3000, etc.). The final weight file is “yolov3_football_final.weights”.

Note: It took almost 3 hours to complete the training on my system with 650 training images.

9. Test YOLO model for image and video

Now that we have successfully trained YOLO model for our custom dataset, we need to test this trained model to check whether our model is performing well or not.

To do that, create a folder named “test_model” inside the “OutsideData” folder and copy yolov3_football.cfg from the darknet/cfg folder to the “test_model” folder.

Now open yolov3_football.cfg from the “test_model” folder in any code editor. Since we will use this configuration file for testing the model, comment out batch and subdivisions in the Training section and uncomment them in the Testing section (the opposite of what we did in step 4.3 to train the model).

Now use the Python object detection code for images from this tutorial to detect custom objects in any given image. You just need to change the file paths for:

  1. yolo_config: path of the yolov3_football.cfg file from the darknet/OutsideData/test_model folder (training part commented, testing part uncommented)
  2. yolo_weight: path of yolov3_football_final.weights (trained model from the darknet/OutsideData/backup_football folder, mentioned in step 8)
  3. coco.names: This file contains the names of classes. We generated the classes.names file in step 5 by running create_labelled_data.py, so replace coco.names with classes.names.

import cv2
import numpy as np
 
# Loading image
img = cv2.imread("data/car.jpg")
 
# Load Yolo
# Only need to change file path in below two lines
# Trained YOLO model for our custom dataset
yolo_weight = "D:/yolo_custom_object_detection/darknet/OutsideData/backup_football/yolov3_football_final.weights"
# Custom configured cfg file for our own dataset
yolo_config = "D:/yolo_custom_object_detection/darknet/OutsideData/test_model/yolov3_football.cfg"
# coco_labels = "data/model/coco.names"
# Class names for our custom data (Messi and Ronaldo)
coco_labels = "D:/yolo_custom_object_detection/darknet/OutsideData/final_images/classes.names"
net = cv2.dnn.readNet(yolo_weight, yolo_config)
Figure: Output of Trained YOLO Model for image
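After the network is loaded, each row it returns for a YOLOv3 model holds the box center x, center y, width, and height (relative to the image size), an objectness score, and then one score per class. The sketch below shows one way to turn such rows into labelled pixel boxes. It is written in plain Python so that it runs without a trained model; the function name decode_detections, the 0.5 threshold, and the sample rows are assumptions made for illustration.

```python
def decode_detections(rows, image_width, image_height, class_names, min_confidence=0.5):
    """Turn raw YOLO output rows into (label, confidence, left, top, width, height) tuples."""
    results = []
    for row in rows:
        class_scores = [float(score) for score in row[5:]]
        best_class = max(range(len(class_scores)), key=lambda i: class_scores[i])
        confidence = class_scores[best_class]
        if confidence < min_confidence:
            continue
        # Box values are relative to the image size: center x, center y, width, height
        box_w = float(row[2]) * image_width
        box_h = float(row[3]) * image_height
        left = float(row[0]) * image_width - box_w / 2
        top = float(row[1]) * image_height - box_h / 2
        results.append((class_names[best_class], confidence,
                        round(left), round(top), round(box_w), round(box_h)))
    return results


# Two made-up rows for a 400x300 image and the two classes of this tutorial
rows = [
    [0.5, 0.5, 0.2, 0.4, 0.9, 0.05, 0.85],
    [0.1, 0.1, 0.1, 0.1, 0.2, 0.10, 0.05],
]
print(decode_detections(rows, 400, 300, ["Messi", "Ronaldo"]))
# [('Ronaldo', 0.85, 160, 90, 80, 120)]
```

In a full pipeline the rows would come from the network's output layers, and overlapping boxes are usually filtered afterwards with non-maximum suppression (for example cv2.dnn.NMSBoxes).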

To detect custom objects in any given video, use the video object detection code from this tutorial. You just need to change the 3 file paths (yolov3_football_final.weights, yolov3_football.cfg, classes.names) mentioned above.

Conclusion

In this post, I showed you how to train your own YOLO model to detect any specific object. In this tutorial, I used only 650 training images and got decent output from the model. To deploy a custom object detection model, I recommend using at least 1000 training images with different variations.

If you have any questions or suggestions regarding this post, please let me know in the comment section below.
