Developing a system capable of recognizing objects and individuals in images through the application of convolutional neural networks

Youssef Omran Gdura; Wafaa Faraj Hadeia; Sara Fathi Aloudoly

doi:10.65405/kv8mt630

Authors

Youssef Omran Gdura, Libyan Academy for Postgraduate Studies
Wafaa Faraj Hadeia, Libyan Academy for Postgraduate Studies
Sara Fathi Aloudoly, Libyan Academy for Postgraduate Studies

Keywords:

Convolutional Neural Networks, Object Detection, Person Re-identification, YOLOv5, ResNet-50, Transfer Learning, Metric Learning, Triplet Loss, ArcFace, Face Recognition, Computer Vision, Real-time Inference, Fairness, Robustness.

Abstract

This study designs, tests, and analyzes a combined real-time system based on convolutional neural networks (CNNs) that performs general object detection and person re-identification simultaneously in unconstrained images. The proposed system follows a hybrid two-stream design: the first stream uses an improved version of YOLOv5 for fast, precise multi-object detection, and the second stream is built on a modified ResNet-50 backbone trained with triplet and ArcFace losses to learn highly discriminative identity representations. Further improvements include Squeeze-and-Excitation attention blocks, extensive data augmentation, transfer learning from ImageNet, mixed-precision training, and distributed multi-GPU optimization. Rigorous testing on large-scale benchmarks (COCO, CelebA, and cross-dataset evaluation) showed a mean Average Precision (mAP@0.5:0.95) of 0.62 for object detection and a top-1 identification rate of 92 percent, an improvement of 8-17 percent over baseline models. The system was shown to be highly robust to adversarial perturbations, illumination changes, and partial occlusions, with a real-time inference time of less than 200 ms on consumer-grade GPUs. Extensive ablation and demographic fairness studies confirmed the contribution of every component and showed limited bias. These findings make the proposed framework a practical, deployable tool for a wide range of applications, such as intelligent surveillance, assistive robotics, human-robot interaction, forensic analysis, and smart environments. The work provides a reproducible, scalable, state-of-the-art pipeline that can serve as a strong building block for next-generation unified visual recognition systems.
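
The two identity losses named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the margin of 0.3, the scale of 30.0, and the angular margin of 0.5 are all assumptions chosen for illustration, and the ArcFace part shows only the margin-adjusted logit for one class rather than a full softmax layer.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so distances reflect direction only."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the distance gap: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an illustrative assumption, not taken from the paper.
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, euclidean(a, p) - euclidean(a, n) + margin)


def arcface_logit(embedding, class_weight, is_target,
                  scale=30.0, angular_margin=0.5):
    """Scaled cosine logit with an additive angular margin on the target class.

    Simplified: production implementations also handle the case where
    theta + margin exceeds pi. Scale and margin are assumed values.
    """
    e, w = l2_normalize(embedding), l2_normalize(class_weight)
    cos_theta = max(-1.0, min(1.0, sum(x * y for x, y in zip(e, w))))
    theta = math.acos(cos_theta)
    if is_target:
        theta += angular_margin  # penalize the true class to tighten clusters
    return scale * math.cos(theta)


# Toy 2-D embeddings (hypothetical values, for illustration only).
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])  # negative is far away
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])  # positive is far away
plain = arcface_logit([1.0, 0.0], [1.0, 0.0], is_target=False)
with_margin = arcface_logit([1.0, 0.0], [1.0, 0.0], is_target=True)
```

In this toy setting the triplet loss is zero once the negative is farther from the anchor than the positive by at least the margin, and the ArcFace margin lowers the target-class logit, which forces the network to pull same-identity embeddings closer together to compensate.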
