Design of data analysis platform
4.1 MongoDB database
The data analysis platform is designed to receive, analyze and present
data from the monitoring SDK. MongoDB is used for data storage for the
following reasons:
- The volume of data stored on the platform grows with the number of
applications registered with the monitoring SDK and with the number of
users. MongoDB can expand its capacity dynamically, which keeps
database operations efficient as the data grows.
- MongoDB is collection-oriented: each collection holds documents made
up of key-value pairs (KVPs) and saved in BSON, a binary format similar
to JSON. Data received from the SDK in JSON format can therefore be
stored in MongoDB directly, without first being converted to Java
objects, which reduces processing time.
- MongoDB supports various data types, including arrays, which appear
in the JSON data from the SDK. In addition, the fields of each record
are not restricted to a fixed format, so the database is easy to
extend: the existing data structure does not have to be rebuilt.
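
As a minimal sketch of the last two points, the Python snippet below
parses two JSON payloads whose field sets differ, one of which contains
an array. The field names (appId, device, events, fps) and the helper
to_document are hypothetical and are not taken from the SDK. The parsed
dictionaries keep their nested structure, so no fixed schema or
intermediate object mapping is needed before storage.

```python
import json

# Two hypothetical SDK payloads; the field names are illustrative assumptions.
payload_a = '{"appId": "demo", "device": "headset-x", "events": ["gaze", "grab"]}'
payload_b = '{"appId": "demo", "fps": 72.5}'


def to_document(payload: str) -> dict:
    """Parse a JSON payload into a dict that can be stored as one document."""
    doc = json.loads(payload)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")
    return doc


# The two documents have different fields, and the JSON array stays a list.
docs = [to_document(payload_a), to_document(payload_b)]
```

With a driver such as PyMongo, dictionaries like these could be passed
to insert_one or insert_many on a collection; the connection setup is
omitted here.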
4.2 Evaluation of Frequent Interact Sequence
Mining of Frequent Interact Sequences (FIS) is similar to mining of
Frequent Element Sequences (FES); the difference is that FIS takes the
time variable into account, whereas FES does not. Both the Apriori and
FP-Tree algorithms can be applied to FIS mining. Apriori scans the
database repeatedly and generates a large number of candidate results,
while FP-Tree scans the database only twice over the entire process,
which makes it more efficient. An FP-Tree-based algorithm is therefore
used in this study for FIS evaluation, adapted to account for the time
variable in interaction. Only Maximum Frequent Interact Sequences
(MFIS) are considered, since any FIS can be regarded as part of an
MFIS. The major steps are as follows:
- Define the frequency threshold Minsup and initialize MFISTree as
empty.
- Scan the user interaction records stored in the database and compare
the frequency of each element with Minsup. Elements whose frequency
exceeds Minsup are placed in the header in order of frequency.
- Scan the user interaction records a second time to generate MFISTree,
in a process similar to FP-Tree construction. The major difference is
that MFISTree generation does not reorder the interaction records
according to the order of the header. In other words, the time order of
the interaction records is preserved in MFISTree.
- For each element, in header order, generate an InvTree so as to
obtain the FISs that end with that element.
- Examine all FISs obtained from the InvTrees and store only the MFISs,
using the KMP algorithm.
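
Parts of the steps above can be sketched in Python. The sketch is a
simplification under stated assumptions and does not build MFISTree or
InvTree: it covers only the first scan (header construction), the
order-preserving removal of infrequent elements, and the final
maximality filter. It assumes that the frequency of an element is the
number of records containing it, that the header is sorted by
descending frequency, and that one FIS is "part of" another when it
occurs in it as a contiguous run, which is the relation KMP matching
tests. All function names are hypothetical.

```python
from collections import Counter


def build_header(records, minsup):
    """First scan: keep elements whose record frequency exceeds minsup."""
    counts = Counter(e for record in records for e in set(record))
    frequent = [(e, c) for e, c in counts.items() if c > minsup]
    # Descending frequency; ties broken by element name (assumption).
    frequent.sort(key=lambda item: (-item[1], item[0]))
    return [e for e, _ in frequent]


def filter_record(record, header):
    """Drop infrequent elements while keeping the original time order."""
    keep = set(header)
    return [e for e in record if e in keep]


def prefix_table(pattern):
    """KMP failure table: longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_contains(text, pattern):
    """Return True if pattern occurs as a contiguous run inside text."""
    if not pattern:
        return True
    table = prefix_table(pattern)
    k = 0
    for item in text:
        while k > 0 and item != pattern[k]:
            k = table[k - 1]
        if item == pattern[k]:
            k += 1
        if k == len(pattern):
            return True
    return False


def keep_maximal(sequences):
    """Keep only sequences that are not contained in another kept sequence."""
    unique = sorted(set(map(tuple, sequences)), key=len, reverse=True)
    maximal = []
    for seq in unique:
        if not any(kmp_contains(m, seq) for m in maximal):
            maximal.append(seq)
    return maximal


# Toy interaction records; element names are illustrative only.
records = [["a", "b", "c"], ["a", "b", "d"], ["b", "c"], ["a", "b", "c"]]
header = build_header(records, minsup=1)
ordered = filter_record(["a", "b", "d"], header)
mfis = keep_maximal([("a", "b"), ("a", "b", "c"), ("b", "c"), ("c",)])
```

Processing candidates from longest to shortest means each sequence only
has to be compared with the sequences already accepted as maximal.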
4.3 Layout of data analysis platform
The data analysis platform consists of account management, data
storage, data analysis and data report modules, as shown in Fig. 4.
The account management module provides developers with account
registration and management functions. It also handles application
management: developers register an application to obtain a unique
appId, a UUID-generated string of 32 hexadecimal digits. The VR
monitoring SDK collects and uploads application data only when the
appId is configured in it.
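
An identifier of this shape can be produced with Python's standard uuid
module, as in the sketch below. The UUID version is not specified in the
text, so a random (version 4) UUID is assumed; the function names are
hypothetical.

```python
import re
import uuid

HEX32 = re.compile(r"[0-9a-f]{32}")


def generate_app_id() -> str:
    """Return a random UUID rendered as 32 lowercase hexadecimal digits."""
    # Assumption: version 4 (random) UUID; the text does not name a version.
    return uuid.uuid4().hex


def is_valid_app_id(app_id: str) -> bool:
    """Check that the string is exactly 32 lowercase hexadecimal digits."""
    return HEX32.fullmatch(app_id) is not None


app_id = generate_app_id()
```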
The data storage module receives and stores data from the SDK. Once a
JSON file is received from the SDK, the data is checked and illegal or
abnormal data is removed, to ensure the accuracy of subsequent
evaluations. Data that passes the check is stored in the MongoDB
database.
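
The kind of check described here might look like the following sketch.
The actual validation rules are not given in the text, so the required
fields (appId, timestamp), their types and the non-negative timestamp
rule are assumptions chosen only for illustration.

```python
import json

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"appId": str, "timestamp": int}


def parse_and_validate(raw: str):
    """Return the parsed record, or None if it is malformed or fails a check."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            return None
    if record["timestamp"] < 0:
        return None
    return record


def filter_valid(raw_records):
    """Keep only the records that pass every check."""
    return [r for r in map(parse_and_validate, raw_records) if r is not None]


raws = [
    '{"appId": "a1", "timestamp": 10}',
    '{"appId": "a1"}',                    # missing timestamp
    'not json',                           # malformed payload
    '{"appId": "a1", "timestamp": -5}',   # abnormal value
]
valid = filter_valid(raws)
```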
The data analysis module computes aggregate statistics on the
application data stored in MongoDB, covering user devices, operating
performance, crash reports and user interaction. It also evaluates
frequent interact sequences (FIS). Since FIS represents how users
interact with a VR application, developers can upgrade the application
according to user behavior so as to improve user experience.
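
As an illustration of such an aggregate statistic, the sketch below
counts records per device model in memory. The record layout and the
device field are hypothetical; on the platform itself the grouping
would presumably be run inside MongoDB, for example with a $group stage
of an aggregation pipeline.

```python
from collections import Counter


def count_by_field(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(r[field] for r in records if field in r)


# Hypothetical stored records; field names are illustrative assumptions.
records = [
    {"appId": "demo", "device": "headset-x"},
    {"appId": "demo", "device": "headset-y"},
    {"appId": "demo", "device": "headset-x"},
    {"appId": "demo"},  # no device information, so it is skipped
]
device_counts = count_by_field(records, "device")
```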
The data report module presents the results of the application data
analysis in an intuitive way. Developers can either examine the results
on a Web page or download the full report in various formats.
Echarts [9], an open-source JavaScript-based charting library developed
by Baidu, is selected as the tool for generating data reports in this
study because of its compatibility with major browsers on desktop as
well as mobile devices.