JM VILLAROSA edited results.tex  almost 8 years ago

Commit id: b6009f1ad6f095363c298b04da32f7298e64fd80

Ferret makes the search simpler by splitting the process into two steps:

\begin{enumerate}
  \item Filtering step
  \begin{itemize}
    \item Quickly filters out ``bad'' answers
    \item Generates a small candidate set
  \end{itemize}
  \item Similarity ranking step
  \begin{itemize}
    \item Uses a multi-feature object distance function
    \item Computes the distance for each candidate object
    \item Returns the $k$ nearest objects
  \end{itemize}
\end{enumerate}

Criterion for picking candidate objects: the object must have at least one segment that is close enough to one of the top segments of the query object.

By separating the search process into these two steps, one can use a forgiving approximation method in the filtering step to improve search speed, and then apply a sophisticated and perhaps inefficient ranking step to ensure search quality. If the candidate set is small, only a small number of EMD computations are needed in the second step. In addition to speed, the filtering step also provides a natural way to integrate the content-based similarity search engine with an attribute-based search engine.

The filtering step has two goals: to generate a candidate set that contains most of the similar data objects, and to generate that small, high-quality candidate set quickly. A sketch of this two-step flow is given below.
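As an illustration of the flow above, the following is a minimal Python sketch, not Ferret's actual implementation. It assumes each object is a dictionary holding a list of segment feature vectors; \texttt{segment\_distance} (Euclidean) and \texttt{object\_distance} (an averaged nearest-segment distance) are simple stand-ins for Ferret's segment comparison and multi-feature, EMD-based object distance.

\begin{verbatim}
import heapq
import math

def segment_distance(a, b):
    # Stand-in segment distance: Euclidean distance between
    # two segment feature vectors.
    return math.dist(a, b)

def object_distance(query_segments, obj_segments):
    # Stand-in for the multi-feature, EMD-based object distance:
    # average each query segment's distance to its closest
    # segment in the candidate object.
    return sum(min(segment_distance(q, s) for s in obj_segments)
               for q in query_segments) / len(query_segments)

def filter_candidates(query_top_segments, objects, threshold):
    # Filtering step: keep only objects with at least one segment
    # close enough to one of the query's top segments.
    return [obj for obj in objects
            if any(segment_distance(q, s) <= threshold
                   for q in query_top_segments
                   for s in obj["segments"])]

def rank_candidates(query_segments, candidates, k):
    # Similarity ranking step: compute the (expensive) object
    # distance for each candidate and return the k nearest objects.
    return heapq.nsmallest(
        k, candidates,
        key=lambda obj: object_distance(query_segments,
                                        obj["segments"]))

def search(query_segments, query_top_segments, objects, threshold, k):
    candidates = filter_candidates(query_top_segments, objects, threshold)
    return rank_candidates(query_segments, candidates, k)
\end{verbatim}

The point of the split is visible in the sketch: the filtering step needs only cheap segment comparisons, so the expensive object-distance computation is performed just for the small candidate set.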