Project Proposal
Title: Spatial and Temporal Relationship Analysis in Images and Video
Students
I will work alone on this project, as required for graduate students.
Overview
Finding spatial relationships among objects in images and video is an important component of image understanding, and especially of content-based image retrieval (e.g., in multimedia databases). Traditional approaches use fuzzy sets to analyze the relationships between points belonging to each pair of objects in an image. Recent research has shown that these approaches fail to yield robust, universal rules for correctly classifying object relationships. Fuzzy set methods have since been combined with machine learning approaches that allow more robust classification and can be customized to match the classification tendencies of each user.
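As a rough illustration of the point-pair idea, the sketch below builds a histogram of the angles between every point of one object and every point of a reference object, then scores a fuzzy "right of" degree from it. The angle histogram and the cosine-squared membership function are assumptions chosen for illustration, not necessarily the definitions used in the papers cited in this proposal. The function names and the (x, y) point convention are also hypothetical.

```python
import numpy as np

def angle_histogram(points_a, points_b, bins=36):
    """Normalized histogram of the angles of all vectors from B (reference) to A.

    Points are (x, y) pairs; this convention is an assumption of the sketch.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise difference vectors, shape (len(a), len(b), 2).
    diff = a[:, None, :] - b[None, :, :]
    angles = np.arctan2(diff[..., 1], diff[..., 0]).ravel()
    hist, edges = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    return hist / hist.sum(), edges

def right_of_degree(hist, edges):
    """Fuzzy degree to which A is right of B (illustrative membership only)."""
    centers = (edges[:-1] + edges[1:]) / 2
    # cos^2 membership inside (-pi/2, pi/2), zero elsewhere.
    membership = np.where(np.abs(centers) < np.pi / 2, np.cos(centers) ** 2, 0.0)
    return float(np.sum(hist * membership))

if __name__ == "__main__":
    obj_a = [(10, 0), (10, 1)]   # object lying to the right
    obj_b = [(0, 0), (0, 1)]     # reference object
    hist, edges = angle_histogram(obj_a, obj_b)
    print(right_of_degree(hist, edges))
```

Swapping the two point sets in the example drives the score to zero, since every angle then falls near plus or minus pi.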
I intend to explore the use of these new techniques in video (or in a series of still images) to extract additional metadata, such as temporal relationships. The k-NN approach of Wang et al. (see below) can be generalized for this task.
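One way this generalization might look is sketched below: a plain k-NN majority vote assigns a spatial label to each frame's feature vector, and runs of identical labels are then collapsed into time intervals. This is a generic k-NN classifier, not the algorithm of Wang et al. The feature vectors, the label names, and both function names are hypothetical placeholders.

```python
from collections import Counter
import numpy as np

def knn_classify(train_features, train_labels, query, k=3):
    """Label a query feature vector by majority vote of its k nearest neighbours."""
    train = np.asarray(train_features, dtype=float)
    q = np.asarray(query, dtype=float)
    dists = np.linalg.norm(train - q, axis=1)      # Euclidean distance (assumed metric)
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

def label_runs(frame_labels):
    """Collapse per-frame labels into (label, first_frame, last_frame) runs."""
    runs = []
    for i, label in enumerate(frame_labels):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], i)     # extend the current run
        else:
            runs.append((label, i, i))             # start a new run
    return runs

if __name__ == "__main__":
    # Toy training set: 2-D features with user-supplied labels.
    features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
    labels = ["right", "right", "left", "left"]
    frames = [[0.15, 0.85], [0.2, 0.9], [0.85, 0.15]]
    per_frame = [knn_classify(features, labels, f) for f in frames]
    print(label_runs(per_frame))
```

Because the training labels come from a user, the same code would adapt to each user's classification tendencies simply by swapping the training set.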
Background and Sources
“Generating Fuzzy Semantic Metadata Describing Spatial Relations from Images using the R-Histogram”, by Wang, Makedon, et al. (JCDL ’04, June 7–11 2004, Tucson, AZ) — Introduces the k-NN/machine learning approach to spatial relationship metadata generation. I will use these algorithms as a basis for this project.
“Comparison of Spatial Relation Definitions in Computer Vision”, by Keller and Wang (Proceedings of ISUMA-NAFIPS ’95) — Summarizes the traditional fuzzy set approaches to spatial relationship analysis.
“Mining Temporal and Spatial Object Relations in Multimedia Contents”, by Tseng, Tseng, and Lin (2005 International Conference on Wireless Networks, Communications and Mobile Computing) — Discusses the extraction of temporal metadata from time-series of photos in multimedia databases.
Many other papers on this topic, written in the 1990s by Keller, Wang, and other authors, are also available.
I anticipate using the Image Processing Toolbox provided by the authors of our class text.
Data
In the paper by Keller and Wang, synthetic test images were generated, each containing one stationary rectangle and one randomly placed ellipse.
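A minimal sketch of how comparable synthetic images could be produced is given below. The image size, the shape dimensions, and the label values are all assumptions, since the paper's actual parameters are not reproduced here. The sketch also does not prevent the ellipse from overlapping the rectangle; where they overlap, the ellipse label overwrites the rectangle label.

```python
import numpy as np

def make_test_image(height=128, width=128, rng=None):
    """Label image: 0 = background, 1 = fixed rectangle, 2 = random ellipse."""
    rng = np.random.default_rng() if rng is None else rng
    image = np.zeros((height, width), dtype=np.uint8)

    # Stationary rectangle centred in the image (size is an assumption).
    r0, r1 = height // 2 - height // 8, height // 2 + height // 8
    c0, c1 = width // 2 - width // 6, width // 2 + width // 6
    image[r0:r1, c0:c1] = 1

    # Ellipse with fixed semi-axes and a random centre kept inside the frame.
    a, b = width // 10, height // 16
    cy = rng.integers(b, height - b)
    cx = rng.integers(a, width - a)
    yy, xx = np.ogrid[:height, :width]
    ellipse = ((xx - cx) / a) ** 2 + ((yy - cy) / b) ** 2 <= 1.0
    image[ellipse] = 2          # overwrites the rectangle where they overlap
    return image

if __name__ == "__main__":
    img = make_test_image(rng=np.random.default_rng(0))
    print(img.shape, np.unique(img))
```

Calling the function repeatedly with the same generator yields a sequence of frames in which only the ellipse moves, which could stand in for video until a suitable database is found.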
The following web pages provide links to many test image databases. I am searching for an acceptable database of video files.
- http://www.cs.cmu.edu/~cil/v-images.html
- http://i21www.ira.uka.de/image_sequences/
- http://peipa.essex.ac.uk/benchmark/databases/index.html
Web Page
Please bookmark http://www.mertsock.com/blog/category/rit/cv-20052/ to track weekly updates to this project.