Universiti Teknologi Malaysia Institutional Repository

Object Character Recognition for automatic labelling of pharmaceutical products

Abdul Rahman, Muhammad Hanafi Akmal (2022) Object Character Recognition for automatic labelling of pharmaceutical products. Masters thesis, Universiti Teknologi Malaysia, Faculty of Engineering - School of Electrical Engineering.


Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

Storing information from images or documents on a computer is in high demand, as the information can be used for various purposes, especially in the pharmaceutical industry. The current method of storing information about pharmaceutical products is to key it into the computer system manually. A simple alternative would be to scan the image or document and save it as an image file. However, analysing the information in such an image can be exceedingly difficult, and dependable manual labour is still needed to review the information on pharmaceutical products. For this reason, a method that automatically fetches and stores the information from the image is required. Object Character Recognition (OCR) is a well-known method that can identify information in pixel-based images and convert it to text. In this thesis, OCR is implemented to extract text characters from images for the labelling of pharmaceutical products. The challenges associated with this task include variations in illumination, rotation of the image during acquisition, and the different fonts used on pharmaceutical products. In addition, the images contain too much information for the computer system to retrieve accurately. Named Entity Recognition (NER) is also implemented to identify the important information in the OCR output. The system successfully extracts all the important information for several pharmaceutical products and converts it into a sample form. The OCR stage achieves an accuracy of 92.85%, while the NER stage achieves an accuracy of 100% for MAL numbers and 90% for product names. Overall, it is hoped that this system may help to optimize work in the pharmaceutical supply chain industry and contribute towards the national industry.
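
As a rough illustration of the step that follows OCR, the sketch below pulls MAL numbers out of text that an OCR engine has already produced. It is not the thesis's method: the thesis uses Named Entity Recognition, whereas this sketch substitutes a plain regular expression. The pattern (the letters MAL, eight digits, then one to three letters), the function name extract_mal_numbers, and the sample numbers are all assumptions made for illustration.

```python
import re

# Assumed format (not taken from the thesis): "MAL", 8 digits, 1-3 letters.
# An optional single space after "MAL" is tolerated, since OCR may insert one.
MAL_PATTERN = re.compile(r"\bMAL\s?(\d{8})([A-Z]{1,3})\b")


def extract_mal_numbers(ocr_text: str) -> list[str]:
    """Return the distinct MAL-like numbers found in OCR text, in order.

    Hypothetical helper: a regex stand-in for the NER step described in
    the abstract, not a reimplementation of it.
    """
    normalized = ocr_text.upper()  # OCR output may be in mixed case
    found: list[str] = []
    for match in MAL_PATTERN.finditer(normalized):
        candidate = "MAL" + match.group(1) + match.group(2)
        if candidate not in found:  # keep first occurrence only
            found.append(candidate)
    return found


if __name__ == "__main__":
    # Made-up label text; the numbers below are not real registrations.
    sample = "Paracetamol 500 mg\nMAL 19991234X\nKeep out of reach. mal20021111az"
    print(extract_mal_numbers(sample))
```

A fixed pattern like this only works for fields with a rigid format. Free-form fields such as product names are where a trained NER model, as used in the thesis, would be expected to help.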

Item Type: Thesis (Masters)
Uncontrolled Keywords: pharmaceutical products, Object Character Recognition (OCR), MAL numbers
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Divisions: Faculty of Engineering - School of Electrical
ID Code: 99589
Deposited By: Yanti Mohd Shah
Deposited On: 08 Mar 2023 03:35
Last Modified: 08 Mar 2023 03:35
