Mohd. Anuar, Mohd. Syahid and Jayapalan, Senthil Kumar (2022) An improved object detection model based on optimised CNN. Open International Journal Of Informatics, 10 (1). pp. 78-96. ISSN 2289-2370
Official URL: https://oiji.utm.my/index.php/oiji/article/view/23...
Abstract
Object detection is a computer vision technique for individually locating, recognising, and interpreting multiple objects in an image. State-of-the-art deep learning methods, particularly convolutional neural networks (CNNs), have improved modern image understanding tasks such as image classification. Region-based object detection algorithms such as Fast-RCNN perform classification with a CNN but take longer to run. You only look once (YOLO) predicts object location and class in a single step, treating object detection as a regression problem in an end-to-end network; however, its accuracy decreases when the image contains similar objects in a confined area, particularly when they are independent of the surrounding context. The aim of the current study is to improve YOLOv3 by optimising Darknet-53 with switchable normalisation techniques to address the memory issue. Using transfer learning, we investigated the performance of five pre-trained networks (SqueezeNet, GoogleNet, ShuffleNet, Darknet-53, and Inception-V3) with a confusion matrix across various numbers of epochs, learning rates, and mini-batch sizes. Darknet-53 took five times longer to complete training and also ran into errors, most likely due to GPU memory shortages, whereas GoogleNet obtained virtually the same results in a fraction of the time. Using switchable normalisation techniques with the 10-class CIFAR-10 dataset in the Deep Network Designer (DND) of MATLAB R2021a, we produced optimised versions of Darknet-53 that increased validation accuracy, considerably reduced training time, and rectified the memory issue; these were then used as the backbone of YOLOv3 for effective object detection. The enhanced YOLOv3 was then assessed by average precision on a vehicle dataset and a sample Kuala Lumpur traffic scene. YOLOv3 with the optimised CNN dNet-CIN as the backbone produced the best experimental results, with 3.21 frames per second (FPS) and a mAP-50 of 97%.
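
The mAP-50 figure reported in the abstract is the mean, over object classes, of the average precision computed with an intersection-over-union (IoU) threshold of 0.5. The following is a minimal Python sketch of that metric for a single class, not the authors' MATLAB evaluation code. The function names `iou` and `average_precision_50`, the `(x1, y1, x2, y2)` box layout, the greedy matching to the best unmatched ground-truth box, and the all-point interpolation are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(
    detections: List[Tuple[float, Box]],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,
) -> float:
    """Average precision for one class (assumed all-point interpolation).

    detections: (confidence score, box) pairs; ground_truth: reference boxes.
    """
    if not ground_truth:
        return 0.0

    matched = [False] * len(ground_truth)
    hits = []  # 1 for a true positive, 0 for a false positive
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(ground_truth):
            if matched[i]:
                continue  # each ground-truth box may be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each detection in ranked order.
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Make precision non-increasing from left to right (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap
```

Averaging the value returned by `average_precision_50` over all classes would give a mAP-50 figure under these assumptions; benchmark toolkits differ in their matching and interpolation rules, so their results need not match this sketch exactly.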
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Object Detection, Transfer Learning, Computer Vision, Optimisation, CNN |
Subjects: | T Technology > TJ Mechanical engineering and machinery |
Divisions: | Razak School of Engineering and Advanced Technology |
ID Code: | 104593 |
Deposited By: | Muhamad Idham Sulong |
Deposited On: | 21 Feb 2024 08:21 |
Last Modified: | 21 Feb 2024 08:21 |