Authorea

Iv Sjsn edited untitled.tex about 9 years ago

Commit id: d04fea20c2960345e48b89ff8a7dfeab6e32dd7e

We partition the video volume into a set $C^N$ of non-overlapping regions using GBH segmentation. The segmentation is based on appearance and motion similarity between local regions. Each segment $c_i \in C^N$ consists of a point cloud $x_i=\{x^0_i, x^1_i, \ldots, x^P_i\}$ of arbitrary shape and size in the video volume space $\mathbb{R}$. The practical challenge is to represent a segment $c_i$ efficiently without compromising memory or accuracy, because it is difficult to fit a regular structure such as a 3D bounding box or an ellipsoid to such a region. Our solution is to divide the video into regular $m \times m \times m$ grids and construct the representation on this structure. This reduces the memory load by a factor of $m^3$, and such grids can be composed to represent 3D regions of arbitrary shape and size.
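
The grid idea can be illustrated with a minimal sketch. It assumes that each point of a segment is an integer voxel coordinate $(x, y, t)$ and that a segment is summarised by the grid cells it touches, with a per-cell voxel count. The function name `voxels_to_grid_cells` and the count-based summary are hypothetical choices made for illustration, not necessarily the representation used here.

```python
from collections import Counter


def voxels_to_grid_cells(points, m):
    """Map (x, y, t) voxel coordinates to the m x m x m grid cells containing them.

    Returns a Counter keyed by cell index, holding the number of segment
    voxels that fall inside each cell (an assumed, illustrative summary).
    """
    if m <= 0:
        raise ValueError("m must be a positive integer")
    # Integer division assigns each voxel to the cell that contains it.
    return Counter((x // m, y // m, t // m) for x, y, t in points)


# Hypothetical segment: a 4 x 4 x 4 block of voxels, gridded with m = 2.
segment = [(x, y, t) for x in range(4) for y in range(4) for t in range(4)]
cells = voxels_to_grid_cells(segment, m=2)
print(len(segment), len(cells))  # 64 voxels are stored as 8 cells
```

For a block that is aligned with the grid, as in this example, the number of stored elements drops by exactly $m^3$. For an irregular segment the saving is smaller, since partially filled cells at the boundary are still stored.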