Volume: 2 Issue: 1
Year: 2025, Page: 79-82, Doi: https://doi.org/10.70968/ijeaca.v2i1.D1002
Received: March 12, 2025 Accepted: June 22, 2025 Published: July 18, 2025
Deep learning methods for processing, analyzing, and summarizing multimedia content have evolved significantly in the past few years. This review concentrates on the latest developments in video summarization, key event detection, and text generation, focusing on implementations of CNNs, RNNs, transformers, and reinforcement learning models. Traditional heuristic and rule-based systems stagnated because they did not scale, whereas modern approaches exhibit contextual understanding, coherence, diversity, dynamic responsiveness, and real-time adaptability. The review covers cycle-consistent GANs, query-dependent summarization, boundary-aware event detection, and attention-based text generation. It compares and analyzes methods published after 2015 to establish performance benchmarks, survey domain applications, and identify enduring difficulties such as computational demand and personalization. The goal is to advance intelligent content analysis and AI-driven multimedia systems.
Keywords: Key Event Detection and Video Summarization System
© 2025 Belhe et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Belhe M, Diwan S, Bhalgaonkar S. (2025). Key Event Detection and Video Summarization System. International Journal of Electronics and Computer Applications. 2(1): 79-82. https://doi.org/10.70968/ijeaca.v2i1.D1002