Universiti Teknologi Malaysia Institutional Repository

Multi-scale network with integrated attention unit for crowd counting

Adel Hafeezallah, Adel Hafeezallah and Al-Dhamari, Ahlam and Abu Bakar, Syed Abd. Rahman (2022) Multi-scale network with integrated attention unit for crowd counting. Computers, Materials and Continua, 73 (2). pp. 3879-3903. ISSN 1546-2218

Official URL: http://dx.doi.org/10.32604/cmc.2022.028289

Abstract

Estimating the crowd count and density of highly dense scenes witnessed in Muslim gatherings at religious sites in Makkah and Madinah is critical for developing control strategies and organizing such large gatherings. Moreover, since the crowd images in this case can range from low density to high density, detection-based approaches are hard to apply for crowd counting. Recently, deep learning-based regression has become the prominent approach for crowd counting problems, where a density map is estimated and its integral is then computed to obtain the final count. In this paper, we put forward a novel multi-scale network (named 2U-Net) for crowd counting in sparse and dense scenarios. The proposed framework, which employs the U-Net architecture, is straightforward to implement, computationally efficient, and trained in a single step. Unpooling layers are used to recover the information erased by the pooling layers and to learn a hierarchical pixel-wise spatial representation. This helps in obtaining feature values, retaining spatial locations, and maximizing data integrity to avoid data loss. In addition, a modified attention unit is introduced and integrated into the proposed 2U-Net model to focus on specific crowd areas. The proposed model concentrates on balancing the number of model parameters, model size, computational cost, and counting accuracy, whereas other works may improve one criterion at the expense of the others. Experiments on five challenging datasets for density estimation and crowd counting have shown that the proposed model is very effective and outperforms comparable mainstream models. Moreover, it counts very well in both sparse and congested crowd scenes. The 2U-Net model has the lowest MAE on the ShanghaiTech Part A, ShanghaiTech Part B, UCSD, and Mall benchmarks, with 63.3, 7.4, 1.5, and 1.6, respectively. Furthermore, it obtains the lowest MSE on the ShanghaiTech Part B, UCSD, and Mall benchmarks, with 12.0, 1.9, and 2.1, respectively.
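
The abstract describes density-map regression, where the count is the integral of the estimated density map, and reports MAE and MSE. The following is a minimal sketch of those two steps only, not the authors' 2U-Net code. The function names and the toy data are hypothetical, and it assumes the convention common in crowd counting papers in which "MSE" denotes the root of the mean squared count error; the record itself does not state which definition the paper uses.

```python
import numpy as np


def count_from_density(density_map):
    """Integrate (sum) a density map to get the estimated count."""
    return float(np.sum(density_map))


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    pred = np.asarray(pred_counts, dtype=float)
    true = np.asarray(true_counts, dtype=float)
    return float(np.mean(np.abs(pred - true)))


def rmse(pred_counts, true_counts):
    """Root of the mean squared count error (assumed meaning of "MSE")."""
    pred = np.asarray(pred_counts, dtype=float)
    true = np.asarray(true_counts, dtype=float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))


# Toy density maps (hypothetical values, not data from the paper).
density_maps = [np.full((4, 4), 0.5), np.full((2, 2), 0.75)]
true_counts = [10.0, 2.0]

# Each predicted count is the sum over all pixels of one density map.
pred_counts = [count_from_density(d) for d in density_maps]

print("predicted counts:", pred_counts)
print("MAE:", mae(pred_counts, true_counts))
print("root MSE:", rmse(pred_counts, true_counts))
```

In this toy case the two maps sum to 8 and 3 against ground-truth counts of 10 and 2, so the absolute errors are 2 and 1 and the MAE is 1.5.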

Item Type: Article
Uncontrolled Keywords: attention units, computer vision, crowd analysis, crowd counting, max-pooling index, U-Net, unpooling
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK5101-6720 Telecommunication
Divisions: Electrical Engineering
ID Code: 103270
Deposited By: Widya Wahid
Deposited On: 24 Oct 2023 10:08
Last Modified: 24 Oct 2023 10:08
