Multimodal Video Intelligence Framework
  • Mayur Akewar
Shri Ramdeobaba College of Engineering & Management

Corresponding Author: [email protected]

Abstract

Analyzing videos is more challenging than analyzing images because video content is far richer. Processing lengthy videos efficiently requires segmenting them into scenes, and analyzing individual scenes is a more efficient alternative to analyzing an entire video at once. This scene-level approach applies to a variety of Video Intelligence tasks, from surveillance to comprehensive video analytics. By building on open-source foundation models and combining audio and text features, our framework offers a versatile solution to video analysis that serves many real-world applications.
20 Mar 2024: Submitted to TechRxiv
28 Mar 2024: Published in TechRxiv