Grants for Taiwanese Overseas Students Attending International Conferences

Tuesday, October 13, 2009

Video Copy Detection by Fast Sequence Matching

Presenter: 葉梅珍 (Ph.D. program, Department of Electrical Engineering, University of California, Santa Barbara)

http://www.civr2009.org/

 

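Sequence matching is an effective way to compare video data. However, existing methods are computationally expensive and therefore unsuitable for large-scale applications. In this paper we treat video copy detection as a local alignment problem between two sequences and propose a new method that greatly accelerates the matching process. First, we use a "vocabulary tree" to index all frames extracted from the database. In this step, each video is treated as an unordered set of frames. This index structure not only provides a rich vocabulary but also allows the similarity between videos to be computed quickly. On this basis, the vocabulary tree filters out a portion of the non-copy videos at this stage. Second, we propose a fast second-stage matching method that skips comparisons between dissimilar frames. This step reduces the running time from quadratic to linear in the sequence length. Experiments on the MUSCLE VCD benchmark show that the method is effective and fast. Our accelerated method is 18 times faster than the original sequence matching algorithm. The technique can also be applied to other image retrieval tasks. For example, we show that it can be used to search the MPEG-7 shape database with a substantial speedup.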

 

Sequence matching techniques are effective for comparing two videos. However, existing approaches are computationally demanding and thus do not scale to large applications. In this paper we view video copy detection as a local alignment problem between two frame sequences and propose a two-level filtration approach that significantly accelerates the matching process. First, we propose to use an adaptive vocabulary tree to index all frame descriptors extracted from the video database. In this step, each video is treated as a "bag of frames." Such an indexing structure not only provides a rich vocabulary for representing videos, but also enables efficient computation of a pyramid matching kernel between videos. This vocabulary tree filters out videos that are dissimilar to the query based on their histogram pyramid representations. Second, we propose a fast edit-distance-based sequence matching method that avoids unnecessary comparisons between dissimilar frame pairs. This step reduces the runtime from quadratic to linear in the lengths of the sequences under comparison. Experiments on the MUSCLE VCD benchmark demonstrate that our approach is effective and efficient. It is 18 times faster than the original sequence matching algorithms. This technique can be applied to several other visual retrieval tasks, including shape retrieval. We demonstrate that the proposed method also achieves a significant speedup for the shape retrieval task on the MPEG-7 shape dataset.
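
To make the second stage more concrete, here is a minimal Python sketch of local alignment that scores only matching frame pairs. It is not the algorithm from the paper. It assumes each frame has already been quantized to an integer visual-word ID (for example, a leaf of a vocabulary tree), and the function names and the `match`, `gap`, and `max_gap` parameters are hypothetical choices made for illustration. Frame pairs that share a word are looked up through an inverted index, and each pair extends the best earlier pair within a small window, so the work grows with the number of matching pairs rather than with the product of the two sequence lengths.

```python
from collections import defaultdict


def matching_pairs(query, reference):
    """Return (i, j) pairs with query[i] == reference[j], ordered by i, then j.

    An inverted index over the reference avoids comparing dissimilar frames.
    """
    index = defaultdict(list)
    for j, word in enumerate(reference):
        index[word].append(j)
    return [(i, j) for i, word in enumerate(query) for j in index.get(word, [])]


def local_alignment_score(query, reference, match=2.0, gap=1.0, max_gap=3):
    """Best local alignment score computed over matching frame pairs only.

    Each matching pair either starts a new alignment or extends an earlier
    matching pair that lies at most max_gap skipped frames away in each
    sequence; every skipped frame costs `gap`.
    """
    scores = {}
    best = 0.0
    for i, j in matching_pairs(query, reference):
        score = match  # option 1: start a new alignment at this pair
        for di in range(1, max_gap + 2):
            for dj in range(1, max_gap + 2):
                prev = scores.get((i - di, j - dj))
                if prev is not None:
                    skipped = (di - 1) + (dj - 1)
                    # option 2: extend an earlier pair, paying for skipped frames
                    score = max(score, prev + match - gap * skipped)
        scores[(i, j)] = score
        best = max(best, score)
    return best
```

For example, `local_alignment_score([1, 2, 3, 4], [9, 9, 1, 2, 7, 3, 4, 9])` evaluates only the four matching pairs and returns 7.0: four matches at 2.0 each, minus 1.0 for the one inserted frame in the reference. Limiting the look-back to `max_gap` keeps the cost per matching pair constant, at the price of missing alignments with longer gaps.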