Semantic Content Extraction, Storage and Querying of Visual, Audio, and Text Data in Videos (METU-MMDS)

In this project, semantic content is extracted automatically from the visual, audio and text data of videos (multi-modal) and stored in an appropriate format, and a prototype system is developed that can answer queries efficiently. A new video uploaded to the system is first pre-processed to obtain its visual, audio and text data. Three separate modules, one per modality, then extract the semantic content of these data. The information obtained from the three modules is analyzed and integrated. Afterwards, incomplete data are completed and duplicate data are cleaned. These steps prepare the data for storage in the database. Finally, the fusion process is applied to the data.

The fused data obtained from the video are stored in the Intelligent Fuzzy Object Oriented Database System, which was previously developed by the researchers in a TUBITAK 1001 project. This system consists mainly of a fuzzy knowledge base and a fuzzy object-oriented database. In this project, large multimedia data are stored in the object-oriented database. Furthermore, new semantic information is derived by applying domain-specific rules in the knowledge base to the data stored in the database. Additionally, an index structure is developed to answer queries on both the semantic content and the low-level features. The proposed system can also process fuzzy and uncertain data.
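The interplay between knowledge-base rules and fuzzy queries described above can be illustrated with a minimal sketch. It is not the METU-MMDS implementation: the `VideoSegment` class, the `apply_rule` and `query` functions, the concept names and the use of the minimum as the fuzzy AND are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VideoSegment:
    """Hypothetical stored segment: fuzzy memberships in [0, 1] per concept."""
    segment_id: str
    concepts: dict = field(default_factory=dict)


def apply_rule(segment, premises, conclusion):
    """Derive `conclusion` with the minimum membership of all premises (fuzzy AND)."""
    degree = min(segment.concepts.get(p, 0.0) for p in premises)
    if degree > 0.0:
        # Keep the stronger value if the concept was already present.
        current = segment.concepts.get(conclusion, 0.0)
        segment.concepts[conclusion] = max(current, degree)
    return segment


def query(segments, concept, threshold):
    """Return ids of segments whose membership in `concept` meets the threshold,
    ordered from strongest to weakest match."""
    hits = [(s.concepts.get(concept, 0.0), s.segment_id) for s in segments]
    return [sid for degree, sid in sorted(hits, reverse=True) if degree >= threshold]


# Illustrative data only; the concepts and degrees are invented.
segments = [
    VideoSegment("s1", {"ball": 0.9, "crowd_cheer": 0.7}),
    VideoSegment("s2", {"ball": 0.4, "crowd_cheer": 0.2}),
    VideoSegment("s3", {"crowd_cheer": 0.8}),
]

# Assumed domain rule: a visible ball AND a cheering crowd suggest a goal.
for seg in segments:
    apply_rule(seg, ["ball", "crowd_cheer"], "goal")

print(query(segments, "goal", 0.5))
```

In this sketch the derived concept is stored next to the extracted ones, so a later query does not need to know whether a value came from a modality module or from a rule.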

The main contribution of this project is fusing the different modalities (visual, audio and text) obtained from a video, thereby creating a more complete semantic data structure that can be stored in a database and queried effectively.
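As a rough illustration of what fusing modalities can look like, the sketch below merges per-modality detections into one record per concept. The `fuse` function, the detection tuples and the noisy-OR combination are assumptions made for this example; the project's actual fusion method is not specified here.

```python
from collections import defaultdict


def fuse(detections):
    """Merge (modality, concept, confidence) detections into one record per concept."""
    by_concept = defaultdict(dict)
    for modality, concept, confidence in detections:
        # Duplicate detections within one modality: keep the most confident one.
        previous = by_concept[concept].get(modality, 0.0)
        by_concept[concept][modality] = max(previous, confidence)

    fused = {}
    for concept, per_modality in by_concept.items():
        # Noisy-OR across modalities (an assumed choice): agreement between
        # modalities raises the fused confidence above any single one.
        miss = 1.0
        for confidence in per_modality.values():
            miss *= 1.0 - confidence
        fused[concept] = {
            "confidence": 1.0 - miss,
            "modalities": sorted(per_modality),
        }
    return fused


# Illustrative detections only; the values are invented.
detections = [
    ("visual", "goal", 0.6),
    ("audio", "goal", 0.5),
    ("audio", "goal", 0.3),  # duplicate within the audio modality
    ("text", "corner_kick", 0.9),
]
fused = fuse(detections)
print(fused["goal"]["modalities"])
```

Keeping the list of contributing modalities in each fused record makes it possible to query by source as well as by concept.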

In addition, the results of the project are considered to fill a significant gap in the academic literature. During the project, 7 journal papers and 21 conference papers (19 international, 2 national), 28 in total, were published. The project also gave 4 PhD and 6 MSc students, who took on responsibilities during different terms of the project, the opportunity to work on and complete their theses.

This project was supported by TUBITAK under the Scientific and Technological Research Projects Support Program, grant number 109E014.

The above demo video shows how to use METU-MMDS for (i) extracting semantic content from videos, and (ii) querying multimedia data using various types of queries.