The gaming industry is moving toward games that can track people, objects, and space with acceptable computational effort and a modest hardware investment \cite{larssen2004understanding}. Kinect is a motion-sensing input device by Microsoft for the Xbox 360 video game console and for Windows personal computers. Built around a webcam-style add-on peripheral for the Xbox 360 console, it enables users to control and interact with the console without touching a game controller, through a natural user interface based on gestures and spoken commands. The sensor recognizes the player's body and mirrors its movements in the game, effectively making the player the controller. It was built to change the way people play games and experience enjoyment \cite{zhang2012microsoft}.

Human body part detection and tracking has a wide range of applications. In the past, camera-based motion capture systems required cumbersome markers or suits; recent research has focused on marker-free camera-based systems. The image-processing complexity of such systems depends largely on how the scene is captured. When 2D cameras are used, the variety of human motions, occlusions between limbs or with other body parts, and sensitivity to illumination changes are difficult to cope with \cite{Gonz_lez_Ortega_2014}.

To obtain a more flexible and robust approach, gesture recognition can be cast as a classification problem: each gesture is assigned one label or class in a way that is consistent with the available data about the problem. Machine learning techniques can then be applied; they use a gesture training set, in which each gesture is labeled, to generate a classifier \cite{Iba_ez_2014} (a minimal code sketch of this formulation is given at the end of this subsection).

Four major challenges to vision-based human action recognition have been identified, two of which are particularly relevant here. The first comprises low-level challenges: occlusions, cluttered backgrounds, shadows, and varying illumination conditions complicate motion segmentation and alter the way actions are perceived, which is a major difficulty for activity recognition from RGB videos. The introduction of 3D data largely alleviates these low-level difficulties by providing structural information about the scene. The second challenge is view change: the same action can produce a different ``appearance'' from different perspectives \cite{Aggarwal_2014}.

Face databases are crucial for the evaluation and validation of proposed methods. Many researchers have built their own face databases for specific applications, so much of the existing work is evaluated on private, size-limited datasets. Using a common dataset and evaluation protocol is essential for research reproducibility and for fair comparison of different works. A number of face databases acquired with Kinect sensors have recently been made available for research purposes; these databases were collected to study various problems related to human faces \cite{Boutellaa_2015}.

Related studies on depth sensors address problems such as sensor calibration, automatic integration time, and data-filtering schemes for removing outlier measurements, and they compare the accuracy of two time-of-flight (ToF) cameras and the Kinect structured-light (SL) camera against a precise laser range finder (LRF) \cite{Sarbolandi_2015}.
Lastly, some problems arise with marker-based systems (MBS) in daily practice: the accuracy, and above all the reproducibility, of such systems remains controversial for the estimation of joint centers and relative segment orientations. This is because small errors in marker placement and soft-tissue artifacts lead to larger errors in the estimated joint centers and relative segment orientations \cite{Bonnech_re_2014}.
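To make the classification formulation cited above concrete, the following minimal sketch frames gesture recognition as supervised learning: each training gesture is a fixed-length feature vector with a class label, and a classifier fit to this labeled set predicts the label of new gestures. The sketch is written in Python with scikit-learn; the feature layout (flattened skeleton joint coordinates), the number of classes, and the choice of a support vector machine are illustrative assumptions, not the specific method of the cited works.

\begin{verbatim}
# Hedged sketch: gesture recognition as a supervised classification problem.
# The synthetic features stand in for real per-gesture descriptors (e.g.,
# flattened 3D joint positions from a depth sensor); the labels stand in for
# gesture classes such as "wave" or "swipe". All names here are illustrative.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(seed=0)

N_GESTURES, N_JOINTS = 300, 20                   # assumed dataset and skeleton sizes
X = rng.normal(size=(N_GESTURES, N_JOINTS * 3))  # one feature vector per gesture
y = rng.integers(0, 4, size=N_GESTURES)          # four hypothetical gesture classes

# Hold out part of the labeled gesture set to estimate generalization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the classifier on the labeled training gestures.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)

# Assign a class label to each unseen gesture and measure accuracy.
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
\end{verbatim}

In practice, the random arrays above would be replaced by features extracted from recorded gestures, for example the joint positions reported by the Kinect skeleton tracker, and the choice of classifier would be validated against the labeled gesture training set described in the cited work.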