My first exposure to computer vision came during my Master’s. I extensively analyzed the effect of the bounding-box representation of an object, which, due to its simplicity, is widely used in object tracking and object detection. In particular, I focused on handling the ambiguity induced by the mismatch between the shape of the object and the bounding box. In my thesis, I proposed appearance models that accurately discriminate the object region from the background. I also participated in a study, presented at ECCV 2014, that used two bounding boxes to avoid relying on information from the ambiguous region around the conventional single bounding box. After graduation, I joined the Korea Institute of Science and Technology (KIST) as a research staff member and conducted research on scene flow estimation from a pair of RGB-D frames. Dense correspondence estimation is an essential problem in modeling a dynamic 3D object such as a human. With the help of an RGB-D camera, I had access to both image and depth data. I sought to generalize total variation (TV), a widely used motion prior that is robust near boundaries. Employing total generalized variation (TGV) made the estimator prefer more natural solutions. Furthermore, I adopted a deformation graph, a graph that efficiently leverages the geometry of the surface, to estimate motion with large displacements.