ADDIS ABABA UNIVERSITY
College of Natural and Computational Sciences
School of Information Science
Department of Information Science, MBA Program
Assignment I-Multimedia Data Mining
Prepared by
Name 1D No
+ Eyob Negussie GsE/0249/14
Submitted To: Million M. (Dr.)
April 22, 2023 G.C.

Abstract:
Nowadays, large multimedia data sets are becoming increasingly available, and they are mostly unstructured or semi-structured by nature, which makes it difficult for humans to derive implicit knowledge from the associations between multimedia data. Multimedia data mining is a complex and effort-intensive field of study because it deals with a systematic understanding of the richness of human perception. The aim of multimedia mining is to analyze and extract this information with the help of powerful tools.
(ii) Overview & meaning of the concept
Data mining is an applied discipline, which grew out of statistical pattern recognition, machine learning, and artificial intelligence, coupled with business decision making to optimize and enhance it. Data mining is a higher level of knowledge discovery, for it enables companies to focus on the pressing aspects of their business: telling them what they did not know and had not even thought of asking [2].
Data and information mining are exploratory processes focusing on techniques for analyzing and combining raw data and detecting patterns and regularities within the data set. Thus, we use a concept integrating multiple methods: information theory, stochastic modeling, Bayesian inference, machine learning, neural networks, and pattern recognition [4]. It is because of this use of methodologies from several disciplines that data mining is often viewed as a multidisciplinary field.
In combination with research into multimedia databases and advances in data mining in relational databases, this created a possibility for the creation of multimedia data mining systems. However, rare are the researchers who have ventured into the multimedia data mining field; most of the studies done are confined to the data filtering step [5].
Multimedia mining (also known as automatic annotation or annotation mining) is basically used for extracting hidden knowledge from multimedia data [2]. It discovers interesting patterns from multimedia databases that store and manage large collections of multimedia based on their statistical relationships. It can be framed as a transmission problem: the source of information is an image archive; the receiver is the community of users [4].
We can classify multimedia data as static media and dynamic media. Static media are basically designed to convey certain messages quickly and are mostly used in messaging with text; they include GIFs, MMS, graphical texts, and images of all sorts. Dynamic media, by contrast, change over time, as with audio and video.

OVERVIEW OF MULTIMEDIA DATA MINING
CLASSIFICATION
Image mining determines how a low-level pixel representation of a raw image can be processed to recognize "high-level" spatial objects and relationships, for example, maps.
The processing in video mining consists of indexing, automatic segmentation, content-based retrieval, classification, and trigger detection, and has diverse applications from entertainment to medicine and security.
Audio mining's major application area is automatic speech recognition, where the analysis attempts to find any speech within the audio by systematic detection of audio signals (chiefly classified as finding temporal-domain and spectral-domain features).
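As a minimal sketch of the temporal-domain features mentioned above (the frame length and the representation of the signal as a plain list of float samples are illustrative assumptions, not from the source), two classic low-level features used to locate speech within audio are short-time energy and zero-crossing rate:

```python
def short_time_energy(frame):
    """Average squared amplitude of one analysis frame (temporal-domain feature)."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign changes (rough spectral proxy)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)

def frame_features(samples, frame_len=256):
    """Slice a mono signal into fixed-length frames and compute both features."""
    feats = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        feats.append((short_time_energy(frame), zero_crossing_rate(frame)))
    return feats
```

Frames with near-zero energy can then be labeled silence, while high-energy frames are candidates for further speech analysis.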
Application of Multimedia Mining
Digitization basically produces an 'electronic photograph' of a physical object that can be stored electronically, giving the privilege of systematic access and editing via a fast computer system. But in the case of multimedia formats like images, audio, pictures, maps, video, etc., conversion and systematic presentation are not easy for digitalized access. However, effective and systematic presentation in the search and retrieval of data is essential in multimedia information extraction.
The advent of electronic resources and their increased use in libraries has brought about significant changes in the storage and communication of information [2]. One can build an optimized digital library on a genre for further multimedia information processing, such as creating interactive multimedia systems for learning/training and creative art production, by applying multimedia mining. For this purpose, there is a need to adopt data mining techniques in the multimedia library [2]. This includes media compression.
Various data mining techniques are used for medical image classification. Examples include automatic 3D delineation of highly aggressive brain tumors and automatic localization and identification of vertebrae in 3D CT scans, MRI scans, ECGs, and X-rays.
In a business setup, evaluation of the quality of services from customers' opinions delivered by telephone or audio messages is one area of multimedia data mining.
In radio stations and TV channels, broadcasting companies can apply multimedia mining to monitor their content and search for more efficient ways of presenting and improving its quality. This may include media restoration, transformation, and editing.

Process of Multimedia Data Mining
Data collection is the identification of useful data, and it is the initial stage of the learning system.
Pre-processing is to extract significant features from raw data. It includes data cleaning, transformation, normalization, data integration, and feature extraction: making choices about representing or coding certain data fields that serve as inputs to pattern discovery. For example, color histograms may be used as features to characterize the spatial distribution of color in an image.
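The color-histogram feature just mentioned can be sketched in a few lines. This is a hedged illustration, not the source's implementation: it assumes the image is given as a flat list of (r, g, b) tuples with 0-255 channel values, and the bin count is arbitrary.

```python
def color_histogram(pixels, bins=4):
    """Quantize each RGB channel into `bins` buckets and count pixels per cell.

    Returns a normalized vector of length bins**3, so images of different
    sizes yield comparable feature vectors.
    """
    hist = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    n = len(pixels)
    return [c / n for c in hist]
```

Two images can then be compared by any vector distance over their histograms, which is the basis of the similarity searches discussed later.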
The complete process depends greatly on the nature of the raw data and the difficulty (level of detail) of the field. The product of preprocessing is the training set. A learning model must be selected for the specified training set.
Video data include time-aligned sequences of images, and electronic or digital motion data are sequences of time-aligned 2D or 3D coordinates from different types of image/video sensors. Spatio-temporal segmentation is one typical feature in video data mining: it tracks moving objects in image sequences in videos, and it is useful for object segmentation.
Pre-processing Multimedia
Unstructured multimedia data is a bitstream (a sequence of bits that are transmitted or stored as a single unit), for example, a pixel representation for an image, audio, or video, and a character representation for text. These files may have an internal structure, but they are still considered "unstructured" because their data does not fit neatly in a database. Images and videos of different objects have similarities: each represents an interpretation of an object without a clear structure.
Data residing in a fixed field within a record or file is called structured data, and these data are stored in sequential form. Current data mining tools operate on structured data, which resides in huge volumes in relational databases. Hence, semi-structured or unstructured multimedia data is first converted into structured form, and then current data mining tools are used to extract the knowledge.
The data selection stage requires the user to select the databases, subsets of fields, or data for data mining. Feature extraction is the data-selection preprocessing step that involves making choices about characterizing data fields to serve as inputs to the pattern-finding stage.

Finding a similar pattern: some approaches to the similar-pattern-finding stage include association, classification, clustering, regression, time-series analysis, and visualization.

Multimedia Data Mining Techniques
Classification is the process of organizing data into categories for multimedia data analysis; the categories can be learned by creating predefined classes from the properties of a specified set of multimedia.
For example, consider sky images, which astronomers have carefully classified as the training set. One can create models for recognizing galaxies, stars, and other stellar objects based on properties like magnitudes, areas, intensity, image moments, and orientation.
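A minimal sketch of this classification idea, assuming nothing beyond the text: a nearest-centroid classifier trained on hand-labeled objects. The two-feature vectors and the class names below are illustrative stand-ins for properties like magnitude and area.

```python
def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def classify(vec, centroids):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: dist2(vec, centroids[lab]))
```

Real systems would use richer models, but the principle is the same: predefined classes are learned from the labeled training set and applied to new objects.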
A particular example which much research has pointed out is the work of Yu and Wolf [19], who used a one-dimensional Hidden Markov Model for classifying images and video as indoor or outdoor scenes.
Association rule mining is one of the most important data mining techniques; it helps find relations between data items in huge databases. The rules are generated by analyzing the co-occurrence of image objects in the database. The rules can be used to retrieve images from the database that contain specific objects or combinations of objects.
Figure: A visualization of some association rules. Association rule mining is designed to detect strong rules in the database based on some interestingness metrics, obtaining rules that determine how or why certain items are linked. Confidence is defined by the number of times an if-then statement is found to be true.
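The support and confidence metrics behind such rules can be computed directly from their definitions. In this sketch, each image is treated as a "transaction" of annotated objects, as the surrounding text describes; the object names are illustrative.

```python
def support(transactions, items):
    """Fraction of transactions that contain all of `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """How often the if-then rule antecedent -> consequent holds:
    support(antecedent + consequent) / support(antecedent)."""
    return support(transactions, antecedent + consequent) / support(transactions, antecedent)
```

A rule is kept as "strong" when both its support and its confidence exceed user-chosen thresholds.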
Method of Association Mining
Multi-Relational Association Rule Mining (MRARM) is a data mining technique used to discover association rules between items in a multi-relational database. Before this advancement, traditional association rule mining only worked on a single table. MRARM can display multiple reports for the same image and is used in many applications, such as bioinformatics, social network analysis, and web mining.
This multi-dimensional approach makes it easier to navigate the database, screen for a particular subset of data, ask for data in a particular way, and define analytical calculations. Because the data is physically stored in a multi-dimensional structure, the speed of these operations is much quicker and more consistent than in other database structures. We can treat every image as a transaction and find commonly occurring patterns among different images [2]. A multimedia data mining system prototype, MultiMediaMiner, has been designed and developed to perform these functions.
In multimedia mining, the clustering technique can be applied to group together similar images, objects,
sounds, videos and texts.
Another way to do association mining is through a collection of annotated images used to build models for the joint probability distribution that links image features and keywords. By traversing on-line directory structures, like the Yahoo directory, it is possible to create hierarchies of keywords mapped to the directories in which the images were found. These graphs are used as concept hierarchies for the dimension "keyword" in the multimedia data cube [5]. Statistical modeling techniques were applied to assess the statistical validity of test parameters [1].
A wavelet is a mathematical function used to divide a given function or continuous-time signal into different frequency components and study each component with a resolution that matches its scale. Wavelets are used for data compression, noise reduction, feature extraction, and other applications. The Haar wavelet transform is the simplest of the wavelet transforms.
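One level of the Haar transform is simple enough to show in full: each adjacent pair of samples is replaced by its average (the low-frequency approximation) and half its difference (the high-frequency detail). This sketch uses the orthonormal transform's averaging variant for readability.

```python
def haar_step(signal):
    """One Haar decomposition level; signal length must be even."""
    avg = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    diff = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return avg, diff

def haar_inverse(avg, diff):
    """Perfectly reconstruct the original signal from one level."""
    out = []
    for a, d in zip(avg, diff):
        out += [a + d, a - d]
    return out
```

Compression follows from discarding or quantizing small detail coefficients; applying `haar_step` recursively to the averages yields the multi-resolution analysis the text describes.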
Evaluation of Challenges in MMDM
Issues in multimedia data mining include similarity search and content-based retrieval, and generalization and multidimensional analysis. For multimedia database mining, storage and search techniques need to be integrated with standard data mining methods.
Promising approaches include:
A content-based image retrieval (CBIR) system processes the information contained in image data and creates an abstraction of its content in terms of visual attributes. In CBIR, a user specifies a query image and gets back the images in the database similar to the query image, which is useful when the user wants to find images that are not textually annotated.
An image contains multiple objects, each with various features such as color, shape, texture, keywords, and spatial locations, so many possible associations can be made. With the associations between multimedia objects, we can find commonly occurring patterns among different images. The mining results of association relations are applied to improve classification performance for objects and concepts (such as multiple people labeling the same image differently).
Content-based retrieval in multimedia is a challenging problem since multimedia data needs detailed interpretation from pixel values [1]. There are differences between the image features of the examples (e.g., color, texture) and the semantics (e.g., flowers in front of a lake) the user is looking for. Queries based on visual content are not powerful enough to specify semantics in queries, such as "finding images of animals in mountains". Such queries are not easy, because the links between the content and the semantics in the user's mind are not easy to declare [3].
Visual features are extracted automatically; however, textual features are annotated manually. Manual annotation is quite subjective and ambiguous, and it is very difficult to capture the visual content of an image using words.
Based on experiments, it has been shown that queries based on combined textual and visual features are more efficient than queries based on textual or (exclusively) visual features alone. What is missing is the knowledge extraction capability from specific colors, textures, and relationships.
The image data are frequently in large volumes and need substantial processing power, such as parallel and distributed processing.
As image sizes and information content continued to grow, CBIR was no longer satisfactory, and Region-Based Information Retrieval (RBIR) was developed. Each image is segmented, and individual objects are indexed by primitive attributes like color, texture, and shape [4]. Thus, RBIR is a solution for dealing with the variability of image content. The systems are designed to retrieve images matching what the user has in mind.
Data mining is the automatic extraction of patterns of information, in an effort to automate the process of information discovery.
(iv) Architecture and approach to describe the step-by-step procedure followed

Figure: Generalized architecture of multimedia mining. Input multimedia contents (text, image, audio, video) undergo spatiotemporal segmentation and feature extraction, followed by finding similar patterns and evaluation of results.
The information mining accuracy depends on:
+ information content modelling
+ modelling the user's understanding
Accurate modeling typically requires high-dimensional and multi-scale modeling.
For multimedia data mining to work effectively, it should have a standard set of visual and auditory features for meaningful knowledge extraction. Standards and guidelines associated with library digitization practices vary from project to project. Over the years, universities, public schools, and special libraries have adopted their own policies regarding digitization.
Meta features are estimations of the image features, which require the assumption of some data model. From a data aggregation perspective, a meta feature is an indicator of information commensurability.
There has been research on data resource extraction based on a meta database (a visual and statistical summary of the content of images) using a ranking algorithm in a distributed visual information system. The two selection algorithms are mean-based and histogram-based algorithms; they work on basic features [2].
Metadata standards and image quality standards and guidelines are commonly sought when planning digitization projects. Common metadata standards used to date are RDF, SGML and its descendants XML and HTML. The MARC standard has been used as the standard interchange format for representing catalog records electronically.

The most compact encoding of the data is given by the probabilistic model that describes it best; thus there is a fundamental link between information and probabilistic models that underpins image mining functions, like search by example, search by data model, and exploration in the scale space [4].
A multimedia data mining system prototype, MultiMediaMiner, has been designed and developed to include the construction of a multimedia data cube, which facilitates multi-dimensional analysis of multimedia data. For each image collected, the database contains some description information, a feature descriptor, and a layout descriptor. The original image is not directly stored in the database; only its feature descriptors are stored [5].
A multimedia data cube is a data abstraction for evaluating aggregated data from a variety of viewpoints. It is a multi-dimensional data structure that stores optimized, summarized, or aggregated data for quick and easy analysis to meet business requirements. It has several dimensions and groups data into two categories: data dimensions and measurements.
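A toy sketch of the data-cube idea may help: the measurement is pre-aggregated over every subset of the dimensions, so multi-dimensional queries become lookups. The records and dimension names below are illustrative, not from MultiMediaMiner.

```python
from itertools import combinations

def build_cube(records, dims, measure):
    """records: list of dicts. Returns {dimension-subset: {key tuple: total}}.

    The empty subset () holds the grand total; each larger subset holds the
    measure aggregated at that level of detail.
    """
    cube = {}
    for r in range(len(dims) + 1):
        for subset in combinations(dims, r):
            agg = cube.setdefault(subset, {})
            for rec in records:
                key = tuple(rec[d] for d in subset)
                agg[key] = agg.get(key, 0) + rec[measure]
    return cube
```

Rolling up or drilling down then amounts to choosing a different dimension subset, which is why cube operations are fast and consistent.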
The visual features define various ways to capture the spatial distribution of color over major image regions, with main features for color attributes such as Color Layout and Color Structure. While these features are defined with respect to an image or a part of it, for video the feature describes the color histogram aggregated over multiple frames.
Their Image Excavator also uses contextual image information, like HTML tags in web pages, to derive keywords. By traversing on-line directory structures, it is possible to create hierarchies of keywords mapped to the directories in which the images were found. These graphs are used as concept hierarchies for the dimension "keyword" in the multimedia data cube [5].
The Angular Radial Transform (ART) is used to describe the shape of a region of multimedia data and is suitable for complex objects consisting of multiple disconnected regions and for simple objects with or without holes. ART works by dividing an image into a set of concentric rings, or annuli, and then computing the average intensity of the pixels in each ring. This creates a set of intensity profiles that describe the shape of the object in the image.
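The ring-averaging step described above can be sketched for a grayscale image given as a 2D list of intensities. This illustrates only the annulus geometry; the full MPEG-7 ART descriptor projects the region onto complex angular-radial basis functions, which is omitted here.

```python
def ring_profile(image, n_rings):
    """Average pixel intensity inside each concentric annulus around the center."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    max_r = (cy ** 2 + cx ** 2) ** 0.5          # distance to the farthest corner
    sums, counts = [0.0] * n_rings, [0] * n_rings
    for y in range(h):
        for x in range(w):
            r = ((y - cy) ** 2 + (x - cx) ** 2) ** 0.5
            ring = min(int(r / max_r * n_rings), n_rings - 1)
            sums[ring] += image[y][x]
            counts[ring] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

The resulting profile is invariant to rotation about the center, which is one reason radial descriptions suit shape matching.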
The motion activity feature basically deals with the intensity and pace of action in a video clip's space and time segments (the spatial and temporal distribution of activity in a video segment).
Audio standards can be classified into two sets of audio features. The first set consists of low-level features, which are meant for a wide range of applications. The features in this set include silence, power, spectrum, and harmonicity.
There is a higher-level set of audio features meant for specific purposes, such as the audio signature designed to generate a unique identifier for identifying audio content. In addition, the high-level features can include automatic speech recognition, sound classification, and indexing.
A good implementation standard for all these representative sets of features for multimedia data is the MPEG-7 standard. The features are referred to as descriptors in MPEG-7.
One study focuses on an algorithm that discovers hidden relationships between image features. To reduce the "combinatorial explosion" of relationships arising from the large numbers of colors and textures that images in a repository may contain, it considered a visual thesaurus that groups together similar colors and textures (much like the preset font color palettes in MS Word). Thus, the visual thesaurus summarizes the image features. The visual thesaurus is created by an algorithm based on a clustering strategy. The relationships discovered permit the automatic categorization of images during their insertion into image databases and return accurate and relevant results [3].
Another automatic hierarchical image categorization is based on a training set of images with known labels in the database. Banded color correlograms are used as the image descriptor (an image descriptor is an algorithm that governs how an input image is quantified and returns a feature vector abstractly representing the image contents), and a categorization tree is constructed. Once the categorization tree is obtained, any new image can be classified. It seems to conform to the semantic content of the image database. The results suggest that correlograms have more latent semantic structure than histograms [4].
The basic idea is to split the information representation into four steps [4]:
1. image feature extraction using a library of algorithms, so as to obtain a quasi-complete signal description
2. unsupervised grouping into many clusters, to be suitable for a large set of tasks
3. data reduction by parametric modeling of the clusters
4. supervised learning of user semantics
where, instead of being programmed, the systems are trained by a set of examples; thus, the links from image contents to the users are created.
Feature extraction is equivalent to splitting the image content into different information channels. Unsupervised clustering is done for each information channel as an information encoding and data reduction operation [4].
The extracted information, represented in the form of classes, is fused in a supervised learning process. This is the semantic representation: the augmentation of data with meaning. Step 4 is a man-machine dialogue; the information exchange is done using advanced visualization tools, thus adding a label in the archive.
A Bayesian learning algorithm allows a user to visualize and interactively encapsulate his prior knowledge of certain image structures and to generate a supervised classification of the signal content index. Prior information in the form of training data sets is used to create semantic categories by association to different information classes. These labels are further used to specify queries. The system learns what the users need.
In the study [3], the content of databases is summarized by relationships among visual features: auto-correlograms of colors and Fourier descriptors of texture. Together, these two descriptors make the description of the image content accurate.

First, given semantic databases (Birds, Flowers, etc.), the system extracts image features and discovers the relationships (relationships shared by images) that discriminate each database.

In indexing, the image contents are extracted automatically: relevant regions in images are identified, and features such as the color and texture of the regions, or the visual features of the whole image, are computed. The extracted contents are represented as, or transformed into, suitable models and data structures, and then stored in the database.
Each set of relationships linked to a database summarizes the image contents of that database. Relationships contribute to database discrimination. The features and relationships extracted are saved in the database.
The distance between an image and the relationship associated with its database is the shortest one, compared to the distances between the image and the other databases. When an image is inserted into the database, it is classified "automatically" in the database hierarchy.
Ideally, the study envisions developing a fully automated system that, after extracting and storing visual features of images, clusters similar images together in databases.
When the user gives an example image (called the source image) to formulate his query and asks, "find images similar to the source image", the system will not match the source image against all images in the databases. It will match the source image features only against the target image features of suitable databases.
The centroids of the feature clusters constitute the visual thesaurus. Without the visual thesaurus, all features of all images must be considered, and then very few features shared by images are obtained, and thus very few relationships that discriminate databases. The visual thesaurus contains the most representative colors and textures of the images. The algorithm clusters together similar numerical representations of color and texture.
The quadratic distance considers the color similarity between correlograms. For texture, the Euclidean distance is extended to Fourier coefficients; this is called the "texture_Fourier_distance".
The second step, optimization, permits the correct adaptation of the visual thesaurus clusters by k-medoid algorithms: like k-means algorithms, but instead of using the mean value of each cluster to define the cluster center, they use the most centrally located point in the cluster, known as the medoid.
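The k-medoid idea can be sketched briefly. This is a simplified illustration, not the study's algorithm: points are 1-D numbers with absolute-difference distance (real feature vectors would use the quadratic or Fourier distances described above), and the initial medoids are supplied by the caller.

```python
def k_medoids(points, medoids, iters=10):
    """Alternate assignment and medoid update until the medoids stabilize."""
    for _ in range(iters):
        clusters = {m: [] for m in medoids}
        for p in points:                      # assign each point to its nearest medoid
            m = min(medoids, key=lambda m: abs(p - m))
            clusters[m].append(p)
        new = []
        for m, members in clusters.items():
            if not members:                   # keep a medoid that attracted no points
                new.append(m)
                continue
            # the medoid is the member minimizing total distance to its cluster
            new.append(min(members, key=lambda c: sum(abs(c - q) for q in members)))
        if set(new) == set(medoids):
            break
        medoids = new
    return sorted(medoids)
```

Using an actual data point as the center is what makes medoids robust to outliers, unlike means, and guarantees the center is a realizable feature value.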
Based on the visual thesaurus, the algorithm discovers relationships, i.e., relationships shared by images of the same databases.
The extraction of relationships is done in two steps: symbolic clustering and relationship discovery. In the first step, the features of the image are replaced by the most similar features defined in the visual thesaurus; the description of the image content is transformed into a symbolic form defined in the visual thesaurus. In the second step, the relationship discovery engine automatically determines common relationships among the image features.
The user should indicate the threshold above which discovered relationships will be kept (relevant relationships). Conditional probability allows the system to determine the discriminating characteristics of the considered images. Implication intensity requires a certain number of examples or counter-examples. When the doubt area is reached, the intensity value increases or decreases rapidly, contrary to the conditional probability, which is linear.

OpenAI's image generation technology allows you to create an original image given a text prompt.
Basically, these new systems work like an informal Query by Example (QBE). QBE is a graphical query language in which a user fills in some required fields to get a proper result or a wrong result, but no error. The AI systems go beyond formal QBE through the natural language processing and pattern recognition systems built onto them.
The image edits endpoint allows you to edit and extend an image by uploading a mask. The masks are used to show where images are to be edited by the AI system.
(iii) Significance, advantages, and disadvantages of the concept
Images are stored as binary objects or streams without byte boundaries. You cannot search inside the content of images, yet searching and collating are among the main reasons to use a database at all; images would also make the database much larger. In an ideal world where processing power, bandwidth, and storage were infinite, everyone would store images in databases along with related data.
Information mining from large volumes of heterogeneous images and the correlation of this information open new perspectives and a huge potential for information extraction toward the goals of different applications [4]. This study is promising for a system that can create links between the concept level and the image data and cluster levels. The user is enabled to specify semantic queries at the concept level, and the system returns images with the specified classification.
Can a system extract hidden relationships between the features in order to make semantic search possible?
One algorithm to deal with this content-based retrieval problem organizes digital databases in a meaningful manner using image categorization. Image categorization classifies images into semantic databases that are manually pre-categorized.
All images are catalogued into broad databases, and each image carries an associated description. The pre-categorization of images in semantic databases such as panorama, flowers, etc. simplifies the estimation process. According to another study [4], when exploring large image archives with rich information content, it is important to group the data according to various objective information measures; this helps users orient themselves within the search process.
The categorization of images is quite challenging in general. Images in the same semantic database may have large variations with dissimilar visual descriptions (images of persons, images of industries, etc.), and images from different semantic databases might share a common background (some flowers and sunsets have similarities). This limits the efficiency of automatic categorization based exclusively on the visual content of images.
A paper that proposes a new scheme for automatic hierarchical image categorization assumes a training set of images with known database labels. It uses low-level features, the auto-correlogram of colors and Fourier descriptors of texture, which together are efficient for content-based image retrieval. Using the low-level features of the training images, relationships among visual features are extracted automatically. These relationships discriminate image databases [3]. Relationships are selected based on two confidence measurements (conditional probability and implication intensity).
Color is the first feature considered to describe the image content. Each element of the auto-correlogram answers the question: what is the probability that two pixels separated by k pixels have the same color?
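That probability can be estimated directly from its definition. As a hedged simplification, this sketch computes one auto-correlogram entry along a single row of quantized color indices; real correlograms average over 2D neighborhoods at several distances.

```python
def autocorrelogram_entry(row, color, k):
    """P(pixel at i+k has `color` | pixel at i has `color`), along one row."""
    pairs = [(row[i], row[i + k]) for i in range(len(row) - k)]
    with_color = [p for p in pairs if p[0] == color]
    if not with_color:
        return 0.0                      # the color never occurs as a left endpoint
    return sum(1 for p in with_color if p[1] == color) / len(with_color)
```

Stacking these entries over all colors and several distances k yields the auto-correlogram feature vector used by the categorization scheme above.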
The Fourier model has very interesting advantages: the texture can be reconstructed from the features, and it has a mathematical description rather than a heuristic one. An important contribution of the study's representation is the extension of the Fourier model to texture description.
The complexity of the images is another information-theoretic measure used to rank images. The complexity is defined as the Kullback-Leibler divergence between the cluster level and the image data level [4].
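The Kullback-Leibler divergence itself is straightforward to compute for discrete distributions. In this sketch, p and q stand in for the image-level and cluster-level distributions; both are assumed already normalized, and q is assumed nonzero wherever p is.

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i); terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The divergence is zero exactly when the two distributions agree, so a large value marks images whose data are poorly summarized by their cluster model, i.e., complex images.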
The system [3] includes a visual query tool that lets users form a query by drawing, sketching, and selecting textures and colors, and "query by example" based on combinations of the visual features, meaning "find images that are similar to those specified". The query may be composed of several images; for example, several images of a waterfall give an accurate description of "waterfall". This property enables recursive query refinement based on feedback (the results of previous queries).
The relationship discovery engine automatically determines common relationships among the image features, with a certain confidence number attached to each. Weak relationships are relationships that are not representative of the shared relationship. The set is refined and expanded by using query by example and exhaustive browsing.
An interesting approach was a learning algorithm to select and combine features and to allow users to give positive and negative examples. The method refines the user interaction and enhances the quality of the queries [4]. Both previously mentioned concepts are first steps toward including the user in the search loop. This is called mining by interactive learning.
An interactive learning process allows the user to create links, i.e., to discover conditions between the low-level signal description and the target of the user [5].

Figure: The system architecture for mining by interactive learning, comprising a data acquisition, preprocessing, and archiving system.
Advantages from the evaluation of implementations
Retrieval efficiency is measured by recall and precision metrics. For an image reference, a relationship-content-based query (a hierarchical algorithm of image categorization that creates a kaleidoscope of image features for each cluster) is compared against a content-based query that does not use the relationships associated with the databases. In the content-based queries, each query is decomposed into texture and color sub-queries.
Precision and recall are better for relationship-content-based queries (queries that mix visual features and the relationships associated with databases) than for queries that use only visual features (colors and textures). The average improvements of relationship-content-based queries over content-based queries are 23% for precision and 17% for recall.
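The two metrics used above follow directly from their definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that were retrieved. A minimal sketch over sets of image identifiers:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

The trade-off noted below (a larger retrieved set raises recall but lowers precision) is visible here: enlarging `retrieved` can only grow `hits` toward `len(relevant)` while the precision denominator grows with it.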
However, there are drawbacks: relationship-content-based retrieval that selects only one database may
miss images that are visually like the query examples but belong, semantically, to several databases.
The general principle that «the larger the retrieved set, the higher the recall, and the lower the
precision» is also observed.
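The two metrics are simple set ratios, which the following minimal Python sketch makes concrete (the image identifiers are invented for the example):

```python
def precision_recall(retrieved, relevant):
    """Precision = |retrieved ∩ relevant| / |retrieved|;
    recall    = |retrieved ∩ relevant| / |relevant|."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# A larger answer set tends to raise recall but lower precision:
relevant = {"img1", "img2", "img3", "img4"}
print(precision_recall({"img1", "img2"}, relevant))
# → (1.0, 0.5)
print(precision_recall({"img1", "img2", "img3", "img5", "img6", "img7"}, relevant))
# → (0.5, 0.75)
```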
Larger data warehouses and greater computational needs are the major disadvantages of multimedia data
mining. These requirements have financial and environmental implications.
The research [4] is seen as a value-adding tool in geoinformation, and several applications in medicine
and biometrics are also foreseen. The concept was demonstrated for a variety of Earth Observation
data. Further work is being done on the development of intelligent satellite ground segment systems.
Accessing the image information content, in comparison with other data types, raises higher-complexity
problems, residing mainly in the huge volume of data, the rich information content, and the subjectivity
of user interpretation. The visual features of image databases are grouped homogeneously around the
features of the visual thesaurus.
(v) Concluding remarks with one major recommendation
Difficult problems are still under research, such as developing an image grammar and representing
image content in different contextual environments. This is a semantic problem which can arise
between different users when they define or describe the same structures differently.
However, the new AI revolution shows promise in addressing most of these shortcomings. As I
researched, two companies in this area, Clearview and OpenAI, seem to be leading the industry
revolution. Clearview gathers images from the internet to create a global facial recognition database.
Their applications range from customer transactions and protecting customers and enterprises from
fraud to providing law enforcement with highly accurate image recognition and matching to instantly
retrieve criminals' histories. Facial recognition technology was used to ensure the safety of Ukrainian
citizens and military personnel in the Russia-Ukraine war. The AI developed was effective even if the
face of an individual was disfigured. These AI technologies can detect and interrupt threats like
deepfakes and presentation attacks.
Though these advancements in multimedia data mining are seen as milestones, they are not without
controversies. Recently, Clearview was fined in the UK over privacy and security concerns. The
company uses publicly posted pictures from social media and other sources, usually without the
knowledge of the platform or any permission from the subject. This has sparked concerns in several
European countries.
Several challenges, mainly in the design of multidimensional DBMSs, man-machine interfaces, and
distributed information systems, will probably be approached soon.
AI-empowered multimedia quality enhancement is the next important area that needs attention. This is
because even the most advanced multimedia-processing AI technologies require high-quality data
inputs to build their knowledge.
My recommendation is that although research on the hierarchical algorithm of image categorization has
laid a foundation that can be used for further work, what I believe is missing is that the image features
are a shallow representation of the actual portrayed objects. If all the image features that make up the
images are used for clustering the semantically categorized images, then we will get much more
effective data mining results.
INDEX
The multimedia data cube MultiMediaMiner uses has many dimensions. The following are some
examples: (1) the size of the image or video in bytes, with an automatically generated numerical
hierarchy; (2) the width and height of the frames (or picture), which constitute two dimensions with
automatically generated numerical hierarchies; (3) the date on which the image or video was created (or
last modified), another dimension on which a time hierarchy is built; (4) the format type of the image or
video, with a two-level hierarchy containing all video and still image formats; (5) the frame sequence
duration in seconds (0 seconds for still images), with a numerical hierarchy; (6) the image or video
Internet domain, with a pre-defined domain hierarchy; (7) the Internet domain of pages referencing the
image or video (parent URL), with a pre-defined domain hierarchy; (8) the key words, with a term
hierarchy defined as described above; (9) a color dimension with a pre-defined color hierarchy; (10) an
edge-orientation dimension with a pre-defined hierarchy, etc. [5]
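As an illustration only (the field names, values, and the roll-up mapping below are invented, not taken from [5]), one cell of such a cube and a roll-up along the format hierarchy could look like:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class MediaCubeCell:
    """One fact in a MultiMediaMiner-style data cube; each field maps to a
    numbered dimension from the list above (field names are illustrative)."""
    size_bytes: int          # (1) numerical hierarchy
    width: int               # (2)
    height: int              # (2)
    created: date            # (3) time hierarchy
    format: str              # (4) e.g. "gif" rolls up to "still image"
    duration_s: float        # (5) 0 for still images
    domain: str              # (6)
    parent_domain: str       # (7)
    keywords: tuple          # (8) term hierarchy
    color: str               # (9)
    edge_orientation: str    # (10)

# Rolling up along the two-level format hierarchy (format -> media type):
MEDIA_TYPE = {"gif": "still image", "jpeg": "still image", "mpeg": "video"}
cells = [
    MediaCubeCell(4096, 320, 240, date(1998, 5, 1), "gif", 0.0,
                  "example.org", "example.org", ("logo",), "red", "horizontal"),
    MediaCubeCell(1 << 20, 320, 240, date(1998, 6, 2), "mpeg", 12.0,
                  "example.org", "example.org", ("demo",), "blue", "vertical"),
]
counts = {}
for c in cells:
    t = MEDIA_TYPE[c.format]
    counts[t] = counts.get(t, 0) + 1
print(counts)  # → {'still image': 1, 'video': 1}
```

OLAP operations such as drill-down work the same way in the opposite direction, descending a dimension's hierarchy instead of ascending it.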
References
[1] Hegadi, R., & Ravikumar, G. K. (2010). A Survey on Multimedia Data Mining and Its Relevance
Today.
[2] Jadhav, S. R., & Kumbargoudar, P. (2007). Multimedia Data Mining in Digital Libraries: Standards
and Features.
[3] Chabane, D. (2001). Relationship Extraction from Large Image Databases.
[4] Datcu, M., & Seidel, K. (2002). An Innovative Concept for Image Information Mining. Revised
Papers from MDM/KDD and PAKDD/KDMCD.
[5] Zaïane, O. R., et al. (1998). MultiMediaMiner: A System Prototype for Multimedia Data Mining.
ACM SIGMOD Conference.