Design of data analysis platform

4.1 MongoDB database
The data analysis platform is designed to receive, analyze, and present data from the monitoring SDK. MongoDB is used for data storage for the following reasons:
  1. The volume of data stored on the platform grows with the number of applications registered with the monitoring SDK and with the number of users. MongoDB can expand its capacity dynamically, which keeps database operations efficient as the data grows.
  2. MongoDB is collection-oriented: data are organized as key-value pairs (KVPs) and saved in BSON format, which is similar to JSON. Data received from the SDK in JSON format can therefore be stored in MongoDB directly, without first being converted to Java objects, which reduces processing time.
  3. MongoDB supports a variety of data types, including arrays, which appear in the JSON data from the SDK. In addition, the fields of each record are not fixed, so the database is much easier to extend because the existing data structure does not need to be rebuilt.
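As a rough illustration of points 2 and 3, the sketch below parses a JSON payload and hands the resulting dictionary to a collection object without mapping it onto a fixed class. The payload fields (`appId`, `device`, `events`) and the function `store_report` are hypothetical, and `collection` is assumed to be any object with an `insert_one` method, such as a pymongo `Collection`.

```python
import json
from typing import Any, Dict


def store_report(collection: Any, payload: str) -> Dict[str, Any]:
    """Parse a JSON payload and store it as one document.

    `collection` is assumed to expose insert_one(document), as a pymongo
    Collection does. No intermediate model class is involved, so payloads
    with different fields or nested arrays are stored unchanged.
    """
    document = json.loads(payload)
    if not isinstance(document, dict):
        raise ValueError("expected a JSON object at the top level")
    collection.insert_one(document)
    return document


# Hypothetical SDK payload: the field names are illustrative only.
sample_payload = json.dumps({
    "appId": "0123456789abcdef0123456789abcdef",
    "device": {"model": "ExampleHMD"},
    "events": [{"t": 0.0, "action": "gaze"}, {"t": 1.2, "action": "grab"}],
})
```

Because the document is passed through as parsed, a later SDK version that adds a field or another array needs no change on the storage side.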
4.2 Evaluation of Frequent Interact Sequence
Mining Frequent Interact Sequences (FIS) is similar to mining Frequent Element Sequences (FES); the difference is that FIS takes the time variable into account, whereas FES does not. Both the Apriori and FP-Tree algorithms can be applied to FIS mining. Apriori scans the database repeatedly and generates a large number of candidate results, whereas FP-Tree scans the database only twice during the entire process and is therefore more efficient. This study applies the FP-Tree algorithm to FIS evaluation, taking the time order of interactions into account. Only Maximum Frequent Interact Sequences (MFIS) are considered, since any FIS can be regarded as part of an MFIS. The major steps are as follows:
  1. Define the frequency threshold Minsup and initialize MFISTree as empty.
  2. Scan the user interaction records stored in the database and compare the frequency of each element with Minsup. Elements whose frequency exceeds Minsup are placed in the header, ordered by frequency.
  3. Scan the user interaction records a second time to generate MFISTree, in a process similar to building an FP-Tree. The major difference is that building MFISTree does not reorder the elements of an interaction record according to the order of the header. In other words, the time order of each interaction record is preserved in MFISTree.
  4. Generate an InvTree for each element, following the order of elements in the header, to obtain the FISs that end with that element.
  5. Examine all FISs obtained from the InvTrees and store only the MFISs, using the KMP string-matching algorithm.
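Steps 1, 2 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the frequency of an element is assumed to be the number of records that contain it, an FIS is assumed to be a contiguous run of interactions (so that containment can be tested by KMP matching), and the names `build_header`, `kmp_contains` and `keep_maximal` are hypothetical. The construction of MFISTree and InvTree (steps 3 and 4) is omitted.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Sequence, Tuple

Seq = Tuple[Hashable, ...]


def build_header(records: Iterable[Sequence[Hashable]],
                 minsup: int) -> List[Tuple[Hashable, int]]:
    """Steps 1-2: keep elements whose frequency exceeds minsup.

    Assumption: frequency = number of records containing the element.
    The result is ordered by descending frequency; ties are broken by the
    element's string form so that the order is deterministic.
    """
    counts: Counter = Counter()
    for record in records:
        counts.update(set(record))
    frequent = [(elem, n) for elem, n in counts.items() if n > minsup]
    frequent.sort(key=lambda pair: (-pair[1], str(pair[0])))
    return frequent


def kmp_contains(text: Sequence[Hashable], pattern: Sequence[Hashable]) -> bool:
    """Return True if `pattern` occurs as a contiguous run inside `text`."""
    if not pattern:
        return True
    # Failure table: length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # Scan the text once, reusing the table instead of backtracking.
    k = 0
    for item in text:
        while k and item != pattern[k]:
            k = fail[k - 1]
        if item == pattern[k]:
            k += 1
        if k == len(pattern):
            return True
    return False


def keep_maximal(sequences: Iterable[Sequence[Hashable]]) -> List[Seq]:
    """Step 5: drop every sequence that is contained in another one."""
    unique = list(dict.fromkeys(tuple(s) for s in sequences))
    unique.sort(key=len, reverse=True)  # longest first, stable for ties
    maximal: List[Seq] = []
    for candidate in unique:
        if not any(kmp_contains(kept, candidate) for kept in maximal):
            maximal.append(candidate)
    return maximal
```

For example, with the hypothetical records `("gaze", "grab", "throw")`, `("gaze", "grab", "drop")` and `("grab", "throw")` and `minsup = 1`, `build_header` returns `[("grab", 3), ("gaze", 2), ("throw", 2)]`. Given the candidate sequences `("gaze", "grab")`, `("grab",)`, `("grab", "throw")` and `("gaze",)`, `keep_maximal` returns `[("gaze", "grab"), ("grab", "throw")]`, since the two single-element sequences are contained in longer ones.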
4.3 Layout of data analysis platform
The data analysis platform consists of account management, data storage, data analysis, and data report modules, as shown in Fig. 4.
The account management module provides developers with account registration and management functions. It also handles application management: developers register an application to obtain a unique appId, a UUID-generated string of 32 hexadecimal digits. Data collection and upload for an application can be carried out only when the appId is configured in the VR monitoring SDK.
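An identifier of this shape can be produced, for instance, as in the sketch below. The text does not state which UUID version is used; a random (version 4) UUID is assumed here, and the function name `new_app_id` is hypothetical.

```python
import uuid


def new_app_id() -> str:
    """Return a 32-character hexadecimal identifier.

    Assumption: a random (version 4) UUID; the hyphens of the canonical
    UUID text form are dropped, leaving 32 hexadecimal digits.
    """
    return uuid.uuid4().hex
```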
The data storage module receives and stores data from the SDK. Once JSON data is received from the SDK, it is checked so that illegal or abnormal data can be removed, ensuring the accuracy of subsequent evaluations. Data that pass the check are stored in the MongoDB database.
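The kind of check described here might look like the following sketch. The text does not specify the actual rules, so the two rules shown (a well-formed appId and non-negative event timestamps) and the field names `appId`, `events` and `t` are assumptions made purely for illustration.

```python
import re
from typing import Any, Dict, List

# 32 hexadecimal digits, matching the appId format described above.
APP_ID_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def validation_errors(document: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the data passed."""
    errors: List[str] = []

    app_id = document.get("appId")
    if not isinstance(app_id, str) or not APP_ID_PATTERN.fullmatch(app_id):
        errors.append("appId must be 32 hexadecimal digits")

    events = document.get("events")
    if not isinstance(events, list):
        errors.append("events must be an array")
    else:
        for i, event in enumerate(events):
            t = event.get("t") if isinstance(event, dict) else None
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(t, bool) or not isinstance(t, (int, float)) or t < 0:
                errors.append(f"events[{i}].t must be a non-negative number")
    return errors
```

A caller would store the document only when the returned list is empty and discard or log it otherwise.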
The data analysis module computes aggregate statistics on the application data stored in MongoDB, covering user devices, operating performance, crash reports, and user interaction. It also evaluates frequent interact sequences (FIS). Since FIS reflect how users interact with a VR application, developers can upgrade the application according to user behavior to improve the user experience.
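One of these statistics, the number of reports per device model for a given application, could be expressed as a MongoDB aggregation pipeline along the lines of the sketch below. The field names `appId` and `device.model` and the function name `device_count_pipeline` are hypothetical; only the stage operators `$match`, `$group`, `$sum` and `$sort` are standard MongoDB.

```python
from typing import Any, Dict, List


def device_count_pipeline(app_id: str) -> List[Dict[str, Any]]:
    """Build a pipeline that counts stored reports per device model."""
    return [
        # Keep only the documents of one application.
        {"$match": {"appId": app_id}},
        # One output row per device model, with the number of documents.
        {"$group": {"_id": "$device.model", "count": {"$sum": 1}}},
        # Most common device first.
        {"$sort": {"count": -1}},
    ]
```

With pymongo, such a list can be passed to a collection's `aggregate` method, so that the grouping runs inside the database rather than in application code.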
The data report module presents the results of the application data analysis in an intuitive way. Developers can either view the results on a web page or download the full report in various formats. ECharts [9], an open-source JavaScript toolbox developed by Baidu, is used to generate data reports in this study because of its strong compatibility: it supports the major browsers on both desktop and mobile devices.