Content-based retrieval algorithm for multimedia databases
A brief overview of MAD. MAD - Media Abstract Description. Content-based information retrieval process in MAD. Facets, weights and parameterization. Formal specification of queries: formulation, interpretation, modification. Conclusions and future work.
M.-M. Vladoiu, Assistant Professor, PhD.
Department of Informatics, Petroleum-Gas University of Ploiesti
1. INTRODUCTION
The second information revolution is already under way, and hypermedia acts as an accelerator for it. Hypermedia can both extend existing applications and lead to a revolutionary rethinking of knowledge processing in domains such as economics, art, science, education, and engineering. The use of multimedia (MM) and hyperlink techniques benefits all users of automated information systems: it enhances the quantity, quality, and impact of the information presented to the user, and it improves human-computer and human-human interaction.
Efficient organization of MM information is the responsibility of multimedia database management systems. Such a system is a high-performance database management system that supports MM data types in addition to the traditional alphanumeric types, and that can manipulate huge amounts of MM information. A MM database management system tightly integrates three fundamental technologies: databases, information retrieval, and hierarchical storage systems.
Most multimedia data must be seen as n-dimensional. Such data require special access and indexing techniques. The need for performance in the information retrieval process imposes the use of multi-dimensional indices (k-d trees, tries, grid files, point quadtrees, multi-key hash tables, MX-quadtrees, R-trees, etc.) and media object clustering. These indices are used for associative (attribute-based) retrieval. As this kind of querying is insufficient in MM databases, content-based retrieval structures had to be introduced. Inverted indices are the most widely used structure of this kind. They rely on both metadata and content indexing terms. Metadata play an important role in MM systems for several reasons: the exact query paradigm is not powerful enough in the MM context, direct content-based search has reduced efficiency, processing techniques are often inappropriate, and the semantics of media data must be captured. There are several query types for multi-key based retrieval: exact match, partial match, range, and partial range.
This paper presents a content-based retrieval algorithm which can be used for information retrieval in multimedia databases. This retrieval mechanism is developed on top of a multimedia object model called MAD [1,2]. It is a six-step iterative process and has the following stages: formulation of query-by-example queries (using logical connectives), query interpretation, query modification (extension/reduction), matching, visualization and choice, and query reformulation. Each step is characterized by various specific parameters that are used to improve the accuracy of query results. User support is necessary in order to improve the retrieval performance.
The paper is structured as follows: after this introductory section, Section 2 gives a short overview of the media object model MAD, Section 3 presents the content-based retrieval algorithm, and the last section lists some conclusions.
2. A BRIEF OVERVIEW OF MAD
MAD (Media Abstract Description) is an object-oriented multi-structure model for hypermedia data that deals with media objects from a twofold, yet unifying, perspective: it treats them both as media structures and as semantic repositories [1,2]. The MM objects considered by MAD are text, graphics, still images, speech, audio, video, and generated media. They can be simple or compound. MAD treats the smallest chunks of media, the so-called states, as simple objects. States can form groups of states. A media object (called a whole) is a hierarchy of states and groups. Compound media is a composition of media objects that can be homogeneous (for example, a list of images) or heterogeneous (such as a hypermedia document). Multimedia objects are represented in MAD using a tuple
MADSt/Gr/Wh=(MMSt/Gr/Wh, CNTSt/Gr/Wh), where:
MM=(MMSt, MMGr, MMWh) includes specific properties of the media type that correspond respectively to a state, a group and a whole MM object;
CNT=(CNTSt, CNTGr, CNTWh) describes the semantic content of the object;
The MM part has three components: MMSt, MMGr, and MMWh, where:
MMSt=(St, Specific_MtD, Appl_MtD, isSpatial, isTemporal, Gr_bto, Wh_bto), where:
St stands for the states' set;
Specific_MtD refers to media-specific metadata;
Appl_MtD are application metadata;
isSpatial stores spatial relationships within one state;
isTemporal describes, from a temporal point of view, a state;
Gr_bto refers to the group that contains the state;
Wh_bto designates the whole that contains the state.
The MMGr part of MM can be described similarly. Here, for the image example, MAD considers a multi-segment to be a group. This is a set of simple media segments. A multi-segment presents the spatial relationships between its simple segments;
MMWh=(SSt, SGr, StoG, Specific_MtD, Appl_MtD, iwSpatial, iwTemporal, R), where:
SSt refers to the set of extra-group states;
SGr designates the set of groups within the media object;
StoG is a map that shows which states compose each group;
Specific_MtD, Appl_MtD, iwSpatial, iwTemporal are similar to their counterparts above;
R is the representation of the media object R=(PhR,L&H,Sp,Temp,IR), where:
PhR=(HSM_path, Appl_MtD, format(coding)/compressing, storage_place_MtD, MMcomputer&peripherals_MtD, on/near/off-line) - the components here are self explanatory;
L&H refers to the representation of logical and hyperlink structure; it is specific to each media type. Sp and Temp stand for spatial, respectively temporal structure of the media object;
IR is the indexing representation that holds media type specific terms.
Semantic content is represented using the tuple: CNT=(CNTSt, CNTGr, CNTWh), where:
CNTSt=(St, Stfe, feATTR, StMfe, MfeATTR, StR, StVar, feVar, MfeVar, Ife, piperfe, StoF, StoMF, iSt) where:
St designates a set of states;
Stfe refers to the built-in properties of interest from a state;
feATTR denotes attributes of properties from a state;
StMfe refers to meta-features regarding a state (descriptions, information about external events etc);
MfeATTR describes the attributes associated with meta-features;
StR is a set of state-dependent relationships on fe^i × feATTR^j × St, i,j>0;
StVar is a set of state objects, called state variables;
feVar designates the set of feature variables;
MfeVar refers to the set of meta-feature variables;
Ife denotes the set of relations that can exist between groups of features, with the domain fe^i × fe^j × St (i,j>0) - for example ((ana,maria) play_tennis (ion,radu));
piperfe stands for a map that associates a probability interval with every feature;
StoF: St → 2^fe is an extraction map that shows which features appear in every state;
StoMF: St → 2^Mfe is similar to the map above, but it corresponds to meta-features;
iSt denotes inter-state relationships;
CNTGr and CNTWh can be described similarly. CNTWh, however, has some extra components: exR, which denotes state-independent feature relationships; ccMtD, which stands for classification metadata; and CR, which describes the terms used for content retrieval.
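The state/group/whole hierarchy described above can be pictured with a few small data classes. This is only a minimal sketch, not the MAD implementation: the class and method names (`State`, `Group`, `Whole`, `stog`, `stof`) are hypothetical, most tuple components are omitted, and the probability interval piperfe is simplified to a single number per feature.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class State:
    """Smallest chunk of media, e.g. an image segment or a video shot."""
    state_id: str
    # feature -> probability (assumption: one number instead of an interval)
    features: dict[str, float] = field(default_factory=dict)
    meta_features: set[str] = field(default_factory=set)


@dataclass
class Group:
    """A set of states, e.g. a multi-segment of an image."""
    group_id: str
    states: list[State] = field(default_factory=list)


@dataclass
class Whole:
    """A media object: extra-group states (SSt) plus groups (SGr)."""
    whole_id: str
    extra_group_states: list[State] = field(default_factory=list)
    groups: list[Group] = field(default_factory=list)

    def all_states(self) -> list[State]:
        grouped = [s for g in self.groups for s in g.states]
        return self.extra_group_states + grouped

    def stog(self) -> dict[str, list[str]]:
        """StoG-like map: group id -> ids of the states that compose it."""
        return {g.group_id: [s.state_id for s in g.states] for g in self.groups}

    def stof(self) -> dict[str, set[str]]:
        """StoF-like map: state id -> features that appear in that state."""
        return {s.state_id: set(s.features) for s in self.all_states()}


shot1 = State("shot1", {"man": 0.9, "briefcase": 0.7}, {"suspect atmosphere"})
shot2 = State("shot2", {"woman": 0.8})
intro = State("intro", {"watch": 0.6})
video = Whole("video", [intro], [Group("scene1", [shot1, shot2])])
```

Here `video.stof()` plays the role of the extraction map from states to features, and `video.stog()` lists the states of each group.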
This abstract description is represented using a multi-way tree, which combines structure hierarchies (logical and hyperlink, spatial, and temporal) with indexing and content retrieval components (stored in modified hash tables). The indexing scheme relies on the fact that information retrieval, for both media and content terms, can be tackled globally. MAD introduces an enhanced inverted indexing scheme that works in both directions:
it determines which features appear in any state - they are stored in the boxes associated with the state, which contain (media- and content-specific) indexing terms;
the inverse problem, i.e. in which states each feature can be found, is solved using a four-level dictionary structure that organizes the information universe into domains, sub-domains, fields, and features to be found in MM objects.
This inverted index is optimized to save space by "counter-centrifuging" the common index terms: terms shared by child nodes are moved up to their parent, recursively, in the multi-way structure. In this way the redundancy of index terms is eliminated and the final size of the multi-structure is significantly reduced.
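The space optimization described above, moving index terms shared by child nodes up to their parent, can be illustrated on a small tree. The sketch below is an assumption-laden toy, not the MAD structure: `Node`, `hoist_common_terms`, `effective_terms`, and `stored_terms` are hypothetical names, and a term is hoisted only when every child of a node carries it.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    terms: set[str] = field(default_factory=set)
    children: list[Node] = field(default_factory=list)


def hoist_common_terms(node: Node) -> None:
    """Post-order pass: move terms shared by all children into the parent."""
    for child in node.children:
        hoist_common_terms(child)
    if node.children:
        common = set.intersection(*(c.terms for c in node.children))
        node.terms |= common
        for child in node.children:
            child.terms -= common


def effective_terms(node: Node, inherited=frozenset()) -> dict[str, set[str]]:
    """Return {leaf name: full term set}, re-adding terms stored on ancestors."""
    terms = inherited | node.terms
    if not node.children:
        return {node.name: set(terms)}
    result = {}
    for child in node.children:
        result.update(effective_terms(child, terms))
    return result


def stored_terms(node: Node) -> int:
    """Count how many term entries are physically stored in the tree."""
    return len(node.terms) + sum(stored_terms(c) for c in node.children)


s1 = Node("s1", {"man", "briefcase", "indoor"})
s2 = Node("s2", {"man", "woman", "indoor"})
s3 = Node("s3", {"man", "watch"})
root = Node("video", children=[Node("g1", children=[s1, s2]), s3])

before = stored_terms(root)
original = effective_terms(root)
hoist_common_terms(root)
after = stored_terms(root)
```

In this example the number of stored entries drops from `before` to `after`, while `effective_terms(root)` still reconstructs the original term set of every state.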
3. CONTENT-BASED INFORMATION RETRIEVAL PROCESS IN MAD
The information retrieval process in MAD benefits from facets, weights, and parameterization in both media structure and content terms. It is a six-step iterative process: formulation of queries (using logical connectives) in a query-by-example manner, query interpretation (which can be strict, medium, or large), query modification (extension/reduction), matching, visualization and choice, and query reformulation. Each step is characterized by specific parameters that are used to improve the accuracy of query results.
FACETS, WEIGHTS AND PARAMETERIZATION
Traditional document models rely on attributes. A facet is an additional structuring element, which aggregates several attributes and has the semantics of a point of view on the document. The use of facets facilitates query formulation in information retrieval systems. Facets can be classified as follows:
general factual knowledge, i.e. objective data that identify the production-related elements (producer, place and date, copyrights, etc.). These facets express the librarian's point of view;
media-specific information that describes the creation process of the media entity in the chosen environment (e.g. illumination, focalization, camera focus, and so on) from the author's viewpoint;
application-specific knowledge, which expresses the expert's viewpoint (for example, in satellite imaging there are facets specific to the meteorologist, the volcanologist, or the agronomist);
content elements that show the point of view of an objective observer. They can be provided partially by pattern recognition algorithms;
connotations that retain subjective descriptions of media entities. They show the reader's perspective and try to decode the author's message.
MAD organizes the various facets of a media entity according to its own multi-structuring strategy. Thus, on the first level, we have the two parts of a mono- or hyper-media object, MM and CNT, as can be seen in Figure 1. Further on, facet levels correspond to sub-directories of the MAD hierarchy: whole media object metadata (WhMtD), physical representation (PhR), logical and hyperlink structure (L&H), spatial (Sp) and temporal (Temp) representation, and state/group/whole-specific content information (CNTSt, CNTGr, CNTWh).
FORMAL SPECIFICATION OF QUERIES
The query language involves the following basic, self-explanatory functions (the first four come from the media abstraction of Subrahmanian; w stands for a weight, a feature can be an ordinary feature or a meta-feature, and an object can be any media entity, from simple states to compound media):
FindObjectwithFeature(feature, w): determines all media entities that have a given feature. For example, the query "find objects in which a person called Ana can be found" is written FindObjectwithFeature(Ana, 0.8).
FindObjectwithFeatureandAttr(feature, attribute, value, w): returns all media objects that have the required weighted feature and the given value for the specified attribute. For instance, the query "find objects in which a woman 1.75 m tall can be found" is written FindObjectwithFeatureandAttr(woman, height, 1.75, 1).
FindFeaturesinObj(object): determines the features from a media entity.
FindFeaturesandAttrinObj(object): determines the media features from the media object, along with their attributes and associated values.
FindRelated(Group)FeaturesinObj(object): returns the related (groups of) features in the object argument.
FindObjectwithRelated(Group)Features(feature1, feature2, w1, w2, relationship, w): this is the inverse of the function above. Its input is a set of weighted related features, and it returns the objects in which the features in the given relation are found, e.g. FindObjectwithRelatedFeatures(ion,ana,1,1,dancing,1)=video[21,40].
FindRelated(Group)FeaturesandAttrinObj(object, feature1, attr1, value1, … ): finds the related (groups of) attributed features in the object argument.
FindObjwithRelated(Group)FeaturesandAttr((feature1, attribute1, value1), …): works like the sixth function above, but also takes the attributes and their values into account.
Note that weights are expected to appear mostly in complex queries, where the relative importance of the indexing terms is crucial. Moreover, because features can be either multimedia-specific or content-specific, the queries are multi-structural. The MAD experimental prototype (implemented in Java 2 from Sun) can handle the following queries:
given a state (or a set of states), the associated features are required. This is the simplest kind, since these features are obtained directly from the MAD hierarchy;
given a feature (or a set of features), the states that contain that feature are required. This operation is the inverse of the one above;
given a feature and a weight (or a set of weighted features), the states that contain that feature are required.
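The first few query functions, and the three query kinds handled by the prototype, can be sketched over a small two-way index. This is an illustration only: `MediaIndex` and its method names are hypothetical, and the meaning given to the weight w (a minimum stored weight that a feature must reach in an object) is an assumption, since the text does not spell it out.

```python
from collections import defaultdict


class MediaIndex:
    """Toy two-way index: object -> features and feature -> objects."""

    def __init__(self):
        self._by_object = defaultdict(dict)   # object -> {feature: (weight, attrs)}
        self._by_feature = defaultdict(set)   # feature -> {objects}

    def add(self, obj, feature, weight=1.0, **attrs):
        self._by_object[obj][feature] = (weight, attrs)
        self._by_feature[feature].add(obj)

    def find_features_in_obj(self, obj):
        """Direct lookup: features stored for a given object."""
        return sorted(self._by_object.get(obj, {}))

    def find_object_with_feature(self, feature, w=0.0):
        """Inverse lookup; keeps objects whose stored weight is at least w (assumption)."""
        return sorted(
            obj for obj in self._by_feature.get(feature, ())
            if self._by_object[obj][feature][0] >= w
        )

    def find_object_with_feature_and_attr(self, feature, attribute, value, w=0.0):
        """Inverse lookup restricted to a given attribute value."""
        hits = []
        for obj in self.find_object_with_feature(feature, w):
            _, attrs = self._by_object[obj][feature]
            if attrs.get(attribute) == value:
                hits.append(obj)
        return hits


idx = MediaIndex()
idx.add("video[21,40]", "ion", 1.0)
idx.add("video[21,40]", "ana", 0.9)
idx.add("image7", "ana", 0.6)
idx.add("image7", "woman", 1.0, height=1.75)

strong_ana = idx.find_object_with_feature("ana", 0.8)
tall_woman = idx.find_object_with_feature_and_attr("woman", "height", 1.75, 1)
```

Keeping both directions in memory trades space for constant-time lookups in either direction, which is the same trade-off an inverted index makes.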
1. Query formulation
Formulation of queries is based on weights instead of logical connectives, as in the model for interactive retrieval of Simonnot and Smail, because general users find these connectives (AND, OR, NOT) difficult to use. Moreover, facets are more important for users than attributes. Thus, the user can assign to each query term a weight in the range [-1,1], interpreted as follows: a strictly positive value expresses the importance of the term's presence in the media entity, a zero value says that the criterion is neutral for the current search (default), and a strictly negative value expresses the importance of the term's absence from the media object. Furthermore, facets can be weighted according to their relative importance (1 by default). An example of such a query follows (weights are given in parentheses):
WhMtD: production date - 1.04.1998 (0.8), surveillance video (1)
ShMtD: zoom (0.5)
fe: man(1), woman(1), briefcase(1), watch(0.8)
Mfe: suspect atmosphere (1)
An intuitive interpretation of the above query could be: find a zoomed shot from a surveillance video, probably produced on 1.04.1998, that contains a man, a woman, a briefcase, and a watch, and that shows a suspect atmosphere. Such a query can be formulated in MAD using a query-by-example approach.
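The weighted, faceted query above can be held in a simple structure that enforces the [-1,1] weight range and separates terms whose presence matters from terms whose absence matters. This is a minimal sketch; `FacetQuery`, `wanted`, and `unwanted` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FacetQuery:
    """Weighted terms of one facet, plus the weight of the facet itself."""
    terms: dict = field(default_factory=dict)   # term -> weight in [-1, 1]
    facet_weight: float = 1.0                   # 1 by default, as in the text

    def __post_init__(self):
        for term, weight in self.terms.items():
            if not -1.0 <= weight <= 1.0:
                raise ValueError(f"weight of {term!r} must lie in [-1, 1]")

    def wanted(self):
        """Terms whose presence matters (strictly positive weight)."""
        return {t: w for t, w in self.terms.items() if w > 0}

    def unwanted(self):
        """Terms whose absence matters (strictly negative weight)."""
        return {t: w for t, w in self.terms.items() if w < 0}


# The example query from the text, one FacetQuery per facet.
query = {
    "WhMtD": FacetQuery({"production date 1.04.1998": 0.8, "surveillance video": 1.0}),
    "ShMtD": FacetQuery({"zoom": 0.5}),
    "fe": FacetQuery({"man": 1.0, "woman": 1.0, "briefcase": 1.0, "watch": 0.8}),
    "Mfe": FacetQuery({"suspect atmosphere": 1.0}),
}
```

Terms with weight 0 fall into neither group, which matches their role as criteria that are neutral for the current search.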
2. Query interpretation
In this stage, the media entities of interest for a given query are selected. The strategic parameter here is the type of interpretation. It can be strict (the conjunction, over all facets, of the conjunction of the concepts in each facet), medium (the conjunction, over all facets, of the disjunction of the concepts in each facet), or large (the disjunction, over all facets, of the disjunction of the concepts in each facet). In this stage the negatively weighted terms are disregarded; they are used later to eliminate some objects from the selection. For now, a preliminary visualization and choice is performed. The MAD prototype displays media entities progressively, according to the algorithm below (entities that have negatively weighted features are excluded):
first, the entities that have all the features from the query, the maximum number of features weighted 1, and a feature probability (piperfe) greater than or equal to 0.5 are listed;
next, the objects that have all the features from the query, with at least 75% of these features weighted 0.5 or more and piperfe greater than or equal to 0.5, are displayed, excluding those from the previous step;
then, media entities that have 75% of the features, all of them weighted more than 0.5 and with a piperfe of 0.5, excluding those listed above;
then, media objects that have 50% of the features, all of them weighted more than 0.5 and with a piperfe of 0.5, excluding those displayed above;
then, entities that have 25% of the features, all of them weighted more than 0.25 and with a piperfe of 0.25, excluding those presented above;
finally, all remaining entities that have not been displayed previously.
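The three interpretation types (strict, medium, large) described at the start of this step can be sketched as a single predicate. The function name `matches` and the dictionary layout are assumptions made for illustration; negatively weighted and neutral terms are ignored, as the text prescribes for this stage, and the progressive display tiers are not modelled.

```python
def matches(entity, query, mode):
    """Decide whether an entity is pre-selected under an interpretation type.

    entity: {facet: set of terms present in the entity}
    query:  {facet: {term: weight}}
    mode:   "strict", "medium" or "large"
    """
    if mode not in ("strict", "medium", "large"):
        raise ValueError(f"unknown interpretation type: {mode}")
    per_facet = []
    for facet, terms in query.items():
        positive = [t for t, w in terms.items() if w > 0]
        if not positive:
            continue  # facet holds only neutral or negative terms
        present = [t in entity.get(facet, set()) for t in positive]
        # strict needs every concept of the facet, the other types need one
        per_facet.append(all(present) if mode == "strict" else any(present))
    # large is a disjunction over facets, strict and medium are conjunctions
    return any(per_facet) if mode == "large" else all(per_facet)


query = {
    "WhMtD": {"surveillance video": 1.0},
    "fe": {"man": 1.0, "woman": 1.0, "briefcase": 1.0},
}
shot_a = {"WhMtD": {"surveillance video"}, "fe": {"man", "woman"}}
shot_b = {"fe": {"briefcase"}}
```

With this query, `shot_a` fails the strict interpretation (no briefcase) but passes the medium one, while `shot_b` is selected only under the large interpretation.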
3. Query modification
If the pre-selection result is either too large or too small, query modification is needed. This can be done with user support in several ways: the weights can be changed, or the working concepts can be modified (extension, association) using a thesaurus structure. The parameters of query modification are the query components (concepts, connectives), the choice criteria, and the modification scope. The refinement of the query is followed by a new iteration of the querying process, starting with step 1.
4. Matching
The goal of matching is to display a set of media entities ranked according to the query semantics. The matching mechanism in MAD computes the matching function, i.e. the product of the similarity functions that correspond to each facet in the multi-vectorial space, and displays the entities whose value is above a given threshold, highest first. The similarity function between a query and a media entity is computed as shown below:
The parameters in this stage are exhaustivity and specificity. Exhaustivity measures the degree to which the query is included in the media entity, while specificity shows the degree to which a media entity is retrieved by the query. These parameters are computed as presented below (formula 2):
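As a rough illustration of the matching step only, and not the formulas referred to above, the sketch below multiplies one similarity value per facet and ranks the entities above a threshold. Cosine similarity is used purely as a stand-in for the per-facet similarity function; it, the function names, and the data layout are assumptions.

```python
import math


def cosine(q_vec, e_vec):
    """Cosine similarity between two sparse term -> weight vectors (stand-in)."""
    dot = sum(w * e_vec.get(t, 0.0) for t, w in q_vec.items())
    nq = math.sqrt(sum(w * w for w in q_vec.values()))
    ne = math.sqrt(sum(w * w for w in e_vec.values()))
    return dot / (nq * ne) if nq and ne else 0.0


def match_score(query, entity):
    """Product of the per-facet similarities, as the text describes."""
    score = 1.0
    for facet, q_vec in query.items():
        score *= cosine(q_vec, entity.get(facet, {}))
    return score


def rank(query, entities, threshold):
    """Entities whose score exceeds the threshold, highest score first."""
    scored = [(match_score(query, e), name) for name, e in entities.items()]
    return [(name, s) for s, name in sorted(scored, reverse=True) if s > threshold]


query = {
    "fe": {"man": 1.0, "briefcase": 1.0},
    "Mfe": {"suspect atmosphere": 1.0},
}
entities = {
    "shot_a": {"fe": {"man": 1.0, "briefcase": 1.0}, "Mfe": {"suspect atmosphere": 1.0}},
    "shot_b": {"fe": {"man": 1.0}, "Mfe": {"suspect atmosphere": 1.0}},
    "shot_c": {"fe": {"man": 1.0, "briefcase": 1.0}},
}
ranking = rank(query, entities, threshold=0.5)
```

Because the facet values are multiplied, an entity that scores zero on any queried facet (such as `shot_c`, which has no meta-features) drops out regardless of how well it matches the other facets.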
5. Visualization and choice
In this step, the results of the querying process are presented to the user. If s/he is satisfied with them, the process is complete. Otherwise, the search continues by refining the query according to some parameters:
if the query is multi-faceted, the question is which combination is the most appropriate to show to the user (ranking parameter);
in addition, a decision has to be made about which entities to list: all of them or only the most relevant ones (cut-off criterion).
In MAD, a single selection function is used. It is computed as an average of the partial results obtained for each facet for each mono-media object. For unstructured entities, these partial functions can be combined in several ways: as a conjunction, which considers the facets complementary (minimum of partial values), as a disjunction of competing facets (maximum of partial values), or as a compensatory function (average of partial values). MAD uses the last option.
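The three ways of combining partial facet results can be written in a few lines. The function name `combine` and the mode labels are hypothetical; the minimum, maximum, and average themselves come directly from the text.

```python
from statistics import mean


def combine(partials, mode="compensatory"):
    """Combine per-facet partial values into one selection value."""
    if mode == "conjunction":     # complementary facets: minimum
        return min(partials)
    if mode == "disjunction":     # competing facets: maximum
        return max(partials)
    if mode == "compensatory":    # average, the option MAD uses
        return mean(partials)
    raise ValueError(f"unknown mode: {mode}")


partials = [0.8, 0.6, 0.1]
lowest = combine(partials, "conjunction")
highest = combine(partials, "disjunction")
average = combine(partials)
```

The compensatory form lets a strong facet make up for a weak one, whereas the conjunction is dominated by the weakest facet and the disjunction by the strongest.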
6. Query reformulation
If the user is not satisfied with the results, the relevance feedback mechanism is activated to re-iterate the search process. The parameters of this step are the factors that express the various alternatives for the reformulation function. Thus, query reformulation takes into account the contribution of the previous query, of the chosen and rejected entities, and of each facet. As for the matching primitive, the reformulation is computed using the vectorial faceted representation of queries and media entities, according to the following formula:
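As an illustration of how the previous query and the chosen and rejected entities can each contribute to a new query, the sketch below applies a Rocchio-style update facet by facet. It is a stand-in and not the formula referred to above: the coefficients `alpha`, `beta`, and `gamma`, their default values, and the clamping of weights to [-1,1] are all assumptions.

```python
def reformulate(query, chosen, rejected, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style relevance feedback over faceted term vectors (assumed form).

    query:    {facet: {term: weight}}
    chosen:   list of entities the user accepted, same layout as query
    rejected: list of entities the user rejected
    """
    def average(entities, facet, term):
        if not entities:
            return 0.0
        return sum(e.get(facet, {}).get(term, 0.0) for e in entities) / len(entities)

    new_query = {}
    for facet, q_vec in query.items():
        terms = set(q_vec)
        for entity in chosen + rejected:
            terms |= set(entity.get(facet, {}))
        vec = {}
        for term in sorted(terms):
            weight = (alpha * q_vec.get(term, 0.0)
                      + beta * average(chosen, facet, term)
                      - gamma * average(rejected, facet, term))
            weight = max(-1.0, min(1.0, weight))  # keep weights in [-1, 1]
            if weight != 0.0:
                vec[term] = weight
        new_query[facet] = vec
    return new_query


old_query = {"fe": {"man": 1.0}}
chosen = [{"fe": {"man": 1.0, "briefcase": 1.0}}]
rejected = [{"fe": {"dog": 1.0}}]
new_query = reformulate(old_query, chosen, rejected)
```

Terms seen only in rejected entities end up with negative weights, which fits the earlier convention that a negative weight expresses the importance of a term's absence. A per-facet contribution could be modelled by passing separate coefficients for each facet.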
4. CONCLUSIONS AND FUTURE WORK
The content-based retrieval method presented here is an adaptation to MAD of the model for interactive retrieval of videos and still images introduced by Simonnot and Smail. This retrieval mechanism can be extended further in MAD by instantiating the parameterized model according to the nature of the searched information. The authors of the interactive retrieval model consider that, depending on the searched information, the retrieval strategy can be as follows:
Nature of searched information, followed by the corresponding retrieval strategy:
search on a comprehensive theme (for example, the user wants TV news about the emancipation of women in Europe in the previous century): interpretation is medium, extension regards Boolean connectives, matching is a compromise between exhaustivity and specificity, and the ranking mode is multi-facet. The user can reformulate the initial query by choosing facets related to the initial ones;
the user wants an overview of a specific database (for instance, a news database): interpretation is large, extension applies to low inverse frequency concepts, matching relies on exhaustivity, and the ranking mode is multi-facet. On reformulation, the initial query is given a low contribution, and the chosen and rejected documents prevail;
the user knows exactly what s/he is looking for (e.g. a news item about a concrete law that concerns women's emancipation): interpretation needs to be strict, modification is by expansion of low-weight concepts, matching uses specificity, and reformulation maximizes the importance of the previous query;
focus on the subjective impression induced by the entity to be found: modification favors inter-facet associations, and reformulation relies on objective facets.
Parameterized content-based retrieval mechanisms with user support, within the context of an appropriate retrieval strategy, offer, in our view, the best solution to the increased need for accuracy and flexibility of user queries. This is even truer in multimedia databases, where the content of media objects is more significant to users than attributes or metadata established by the database designer.
REFERENCES
Adiba M., STORM: An object-oriented multimedia DBMS, in Multimedia Database Systems, Design and Implementation Strategies, Kluwer Academic Publishers, 1996
Allen J.F., Maintaining Knowledge about Temporal Intervals, Comm. of the ACM, Vol. 26, 1983
Brusilovsky P., Kommers P., Streitz N., Lecture Notes in Computer Science: Multimedia, Hypermedia and Virtual Reality - Models, Systems and Applications, Springer-Verlag, 1996
Chung S. M., Multimedia Information Storage and Management, Kluwer Academic Pub., 1996
Hirzalla N.B., Karmouch A., A Multimedia Query Specification Language, in Nwosu K., Thuraisingham B., Bruce Berra P., Multimedia Database Systems, Design and Implementation Strategies, Kluwer Academic Publishers, 1996
Khoshafian S., Baker A. B., Multimedia and Imaging Databases, Morgan Kaufmann Publishers Inc., San Francisco, California, 1996
Lee K., Lee Y.K., Berra P.B., Management of Multi-structured Hypermedia Documents: A Data Model, Query Language, and Indexing Scheme, in Multimedia Database Management System - Research, Issues and Future Directions, Vol. 4, No.2, Kluwer Academic Pub., 1997
Nwosu K., Thuraisingham B., Bruce Berra P., Multimedia Database Systems, Design and Implementation Strategies, Kluwer Academic Publishers, 1996
Schauble P., Multimedia information retrieval - Content - Based Information Retrieval from Large Text and Audio Databases, Kluwer Academic Publishers, 1997
Sheth A., Klas W., Multimedia Data Management, McGrawHill, 1998
Simonnot B., Smail M., Model for interactive retrieval of videos and still images, in Nwosu K., Thuraisingham B., Bruce Berra P., Multimedia Database Systems, Design and Implementation Strategies, Kluwer Academic Publishers, 1996
Smeulders A.W.M., Jain R., Chang S.K., Image Databases and Multimedia Search, World Scientific Publishing Co., Vol.8, 1997
Subrahmanian V.S., Jajodia S., Multimedia Database Systems, Issues and Research Directions, Springer-Verlag, Berlin, 1996
Subrahmanian V.S., Principles of Multimedia Database Systems, Morgan Kaufmann Publishers Inc., San Francisco, California, 1998
Vladoiu M., Baza de date orientata pe obiecte - un model pentru obiecte multimedia (MAD) - PhD Thesis, Universitatea “Politehnica” Bucuresti, 2002
Vladoiu M., MAD - A Model for Media Objects' Structure and Content, the Journal of UPG Ploiesti, Vol. LV, Technical Series, Nr. 2, 2003