Universiti Teknologi Malaysia Institutional Repository

An improved object detection model based on optimised CNN.

Mohd. Anuar, Mohd. Syahid and Jayapalan, Senthil Kumar (2022) An improved object detection model based on optimised CNN. Open International Journal Of Informatics, 10 (1). pp. 78-96. ISSN 2289-2370

Official URL: https://oiji.utm.my/index.php/oiji/article/view/23...

Abstract

Object detection is a computer vision technique that locates, recognises, and interprets multiple objects in an image individually. Modern image understanding tasks such as image classification have been improved by state-of-the-art deep learning methods, particularly convolutional neural networks (CNN). Region-based object detection algorithms such as Fast R-CNN classify objects with a CNN but take longer to run. You Only Look Once (YOLO) predicts object location and class in a single step, treating object detection as a regression problem in an end-to-end network, but its accuracy decreases when the image contains similar objects in a confined area, particularly when they are independent of the surrounding context. The aim of the current study is to improve YOLOv3 by optimising Darknet-53 with switchable normalisation techniques to address the memory issue. We investigated the performance of five pre-trained networks (SqueezeNet, GoogleNet, ShuffleNet, Darknet-53, and Inception-V3) under transfer learning, evaluating each with a confusion matrix across various epoch counts, learning rates, and mini-batch sizes. Darknet-53 took five times longer to complete training and also ran into errors, most likely due to GPU memory shortages, whereas GoogleNet obtained virtually the same results in a fraction of the time. Using switchable normalisation techniques with the 10-class CIFAR-10 dataset and the Deep Network Designer (DND) of MATLAB R2021a, optimised versions of Darknet-53 increased validation accuracy, considerably reduced training time, and resolved the memory issue; they were then used as the backbone of YOLOv3 for effective object detection. The enhanced YOLOv3 was assessed by average precision on a vehicle dataset and a sample Kuala Lumpur traffic scene. YOLOv3 with the optimised CNN dNet-CIN as the backbone produced the best experimental results, with 3.21 frames per second (FPS) and a mAP-50 of 97%.
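The mAP-50 figure reported above conventionally counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal Python sketch of that matching test, not the authors' MATLAB implementation; the corner-coordinate box format and the names `iou` and `is_true_positive` are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format (not taken from the paper): (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred, truth) >= threshold
```

Averaging precision over recall levels per class at this threshold, and then across classes, gives the mAP-50 score; that aggregation step is omitted here.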

Item Type: Article
Uncontrolled Keywords: Object Detection, Transfer Learning, Computer Vision, Optimisation, CNN.
Subjects: T Technology > TJ Mechanical engineering and machinery
Divisions: Razak School of Engineering and Advanced Technology
ID Code: 104593
Deposited By: Muhamad Idham Sulong
Deposited On: 21 Feb 2024 08:21
Last Modified: 21 Feb 2024 08:21
