Deep Learning for Image Recognition: State of the Art Techniques & Future Trends

Authors

  • Aryan Halan Student, Class 12th, Welham Boys School, Dehradun, Uttarakhand

DOI:

https://doi.org/10.29070/738grk27

Keywords:

deep learning, computer vision, image recognition, state-of-the-art techniques

Abstract

Deep learning has transformed computer vision research, particularly in object detection and image recognition. In this abstract we outline the most recent developments and state-of-the-art deep learning methods for both tasks. Image recognition is the automatic identification and classification of objects or patterns in digital images. Convolutional neural networks (CNNs), for example, have shown exceptional performance on image recognition tasks. They learn hierarchical representations of visual features directly from raw pixel input, which lets them identify complex patterns and make accurate predictions. Large annotated datasets and advances in deep learning architectures have accelerated the development of accurate and efficient systems. Autonomous vehicles, surveillance, and medical imaging are among the many areas that stand to benefit from continued progress in object detection and image recognition.
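As a rough illustration of what computing visual features directly from raw pixel input can look like, the sketch below runs one convolution, ReLU, and max-pooling stage in plain Python. It is not taken from the paper: the function names and the hand-set vertical-edge kernel are assumptions chosen for illustration, and in a real CNN the kernel weights would be learned from annotated data and many such stages would be stacked.

```python
# Illustrative sketch only: one conv -> ReLU -> pool stage with a hand-set kernel.
# A trained CNN would learn its kernels from data instead of using fixed ones.

def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of a grayscale image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses, zero out the rest."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling (a trailing odd row or column is dropped)."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# 6x6 grayscale image: dark (0) on the left, bright (1) in the last two columns.
image = [[0, 0, 0, 0, 1, 1] for _ in range(6)]

# Hypothetical vertical-edge kernel (assumed for illustration, not learned).
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

response = relu(conv2d(image, kernel))   # 4x4 feature map
features = max_pool2x2(response)         # 2x2 pooled feature map
print(features)
```

The pooled map is non-zero only on the side where the dark-to-bright edge sits, which is the kind of low-level feature that deeper layers would combine into more complex patterns.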

Published

2024-07-01

How to Cite

[1]
“Deep Learning for Image Recognition: State of the Art Techniques & Future Trends”, JASRAE, vol. 21, no. 5, pp. 89–95, Jul. 2024, doi: 10.29070/738grk27.
