KMA Solaiman and 1 more

Multi-modal information retrieval has great implications for search engines, situational knowledge delivery, and complex data management systems. Existing cross-modal learning models use a separate information model for each data modality and cannot readily exploit pre-existing features in an application domain. Moreover, supervised learning methods cannot incorporate user preferences to define data relevance without training samples and require modality-specific translation methods. To address these problems, we propose a novel multi-modal information retrieval framework (FemmIR) with two retrieval models, one based on graph similarity search (RelGSim) and one based on relational database querying (EARS). FemmIR extracts features from different modalities and translates them into a common information model. For RelGSim, we propose to build a localized graph for each data object from its features and define a novel distance metric to measure the similarity between two data objects. A neural-network-based graph similarity approximation model is trained to map pairs of data objects to a similarity score. Furthermore, for feature extraction in an open-world environment, appropriate extraction models are discussed for different application domains. To enable finer-grained attribute analysis, a novel human-attribute extraction model is proposed for unstructured text. In contrast to existing methods, FemmIR can integrate application domains with existing features and can incorporate user preferences when determining relevance for situational knowledge discovery. The single information model (a common schema or graph) reduces the data representation overhead. Comprehensive experimental results on a novel open-world cross-media dataset show the efficacy of our models.
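
The abstract does not specify the architecture of the graph similarity approximation model; as a rough illustration of the idea, the sketch below pairs a minimal shared message-passing encoder with an MLP regressor that maps two localized feature graphs to a similarity score. All class and variable names (GraphEncoder, SimilarityModel) are illustrative assumptions, not the authors' RelGSim implementation.

```python
# Minimal, illustrative sketch (not the authors' RelGSim implementation):
# a shared message-passing encoder embeds each localized feature graph,
# and an MLP maps the pair of embeddings to a similarity score in [0, 1].
import torch
import torch.nn as nn


class GraphEncoder(nn.Module):
    """Embeds a graph given node features x and an adjacency matrix adj."""

    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, adj):
        # Two rounds of neighborhood aggregation followed by mean pooling.
        h = torch.relu(adj @ self.lin1(x))
        h = torch.relu(adj @ self.lin2(h))
        return h.mean(dim=0)  # graph-level embedding


class SimilarityModel(nn.Module):
    """Maps a pair of graph embeddings to a scalar similarity score."""

    def __init__(self, in_dim, hid_dim=64):
        super().__init__()
        self.encoder = GraphEncoder(in_dim, hid_dim)
        self.scorer = nn.Sequential(
            nn.Linear(2 * hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, 1), nn.Sigmoid(),
        )

    def forward(self, x1, adj1, x2, adj2):
        g1 = self.encoder(x1, adj1)
        g2 = self.encoder(x2, adj2)
        return self.scorer(torch.cat([g1, g2])).squeeze()


# Toy usage: two localized graphs with 4 and 5 nodes, 16-dim node features.
# In the paper's setting, the score would be trained against the proposed
# distance metric between the two data objects.
model = SimilarityModel(in_dim=16)
x1, adj1 = torch.randn(4, 16), torch.eye(4)
x2, adj2 = torch.randn(5, 16), torch.eye(5)
print(float(model(x1, adj1, x2, adj2)))
```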

KMA Solaiman and 4 more

This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. We present a system for integrating multiple sources of data to find missing persons. This system can assist authorities in finding children during AMBER Alerts, mentally challenged persons who have wandered off, or persons of interest in an investigation. Authorities search for the person in question by reaching out to acquaintances, checking video feeds, or looking into prior histories relevant to the investigation. In the absence of any leads, authorities rely on public help from sources such as tweets or tip lines. A missing-person investigation requires combining information from multiple modalities and heterogeneous data sources. Existing cross-modal fusion models use a separate information model for each data modality and cannot readily exploit pre-existing object properties in an application domain. We develop a framework for multimodal information retrieval, called Find-Them. It extracts features from different modalities and maps them into a standard schema for context-based data fusion. Find-Them can integrate application domains with previously derived object properties and can deliver data relevant to the mission objective based on the context and needs of the user. Measurements on a novel open-world cross-media dataset show the efficacy of our model. The objective of this work is to assist authorities in missing-person investigations through the use of Find-Them.
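
The abstract describes mapping modality-specific features into a standard schema so that records derived from video, tweets, and tip lines can be fused and queried with the investigator's context. A minimal sketch of that idea follows; the table name, columns, and attribute values are entirely hypothetical and are not the paper's actual schema.

```python
# Illustrative sketch only: a hypothetical common schema into which
# features extracted from different modalities (video frames, tweets,
# tip-line text) are mapped, then queried with mission-specific context.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sightings (
        source     TEXT,   -- e.g. 'video', 'tweet', 'tip_line'
        location   TEXT,
        timestamp  TEXT,
        gender     TEXT,
        age_group  TEXT,
        clothing   TEXT
    )
""")

# Records produced by (hypothetical) modality-specific extractors.
conn.executemany(
    "INSERT INTO sightings VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("video", "Main St & 5th", "2021-06-01T14:05", "male", "child", "red shirt"),
        ("tweet", "Riverside Park", "2021-06-01T14:30", "male", "child", "red shirt"),
        ("tip_line", "Downtown mall", "2021-06-01T15:10", "female", "adult", "blue coat"),
    ],
)

# User-supplied context defines relevance, with no training samples needed.
query_context = {"gender": "male", "age_group": "child", "clothing": "red shirt"}
rows = conn.execute(
    "SELECT source, location, timestamp FROM sightings "
    "WHERE gender = ? AND age_group = ? AND clothing = ?",
    (query_context["gender"], query_context["age_group"], query_context["clothing"]),
).fetchall()
print(rows)  # candidate sightings fused across modalities
```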