Serum protein changes over time
For each analyte, a mixed-effects model with year, gender, and race as fixed effects was fitted to the log2-transformed analyte concentration. Pairwise comparisons of each analyte between time points (Y1 vs. Y2, Y1 vs. Y5, and Y2 vs. Y5) were made, and significance was assessed using the Tukey multiple comparison test. Volcano plots were generated and used to identify analytes that changed at Y5 relative to Y1 and at Y2 relative to Y1 (p-value < 0.05 and fold change > 1.5 or < -1.5).
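As an illustration of the volcano-plot thresholds, the following minimal Python sketch flags analytes from a log2 difference and a p-value. It assumes the signed fold-change convention, in which a decrease is reported as the negative reciprocal of the ratio (a ratio of 0.5 becomes -2). The function names and the example values are hypothetical and are not taken from the study data.

```python
import math


def signed_fold_change(log2_diff):
    """Convert a log2 difference into a signed fold change (e.g. -1.0 -> -2.0)."""
    ratio = 2.0 ** log2_diff
    return ratio if ratio >= 1.0 else -1.0 / ratio


def flag_analyte(log2_diff, p_value, fc_cutoff=1.5, alpha=0.05):
    """Label an analyte as 'up', 'down', or 'unchanged' using both thresholds."""
    fc = signed_fold_change(log2_diff)
    if p_value < alpha and fc > fc_cutoff:
        return "up"
    if p_value < alpha and fc < -fc_cutoff:
        return "down"
    return "unchanged"


# Hypothetical inputs: analyte -> (mean log2 difference Y5 - Y1, adjusted p-value)
results = {
    "analyte_A": (1.0, 0.01),   # 2-fold increase, significant
    "analyte_B": (-0.8, 0.03),  # about 1.74-fold decrease, significant
    "analyte_C": (0.3, 0.001),  # significant but below the fold-change cutoff
    "analyte_D": (1.2, 0.20),   # large change but not significant
}
flags = {name: flag_analyte(d, p) for name, (d, p) in results.items()}
print(flags)
```

Because the model is fitted on the log2 scale, the fold-change cutoff of 1.5 corresponds to an absolute log2 difference greater than log2(1.5), which is approximately 0.585.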
Prediction model of SCORAD change
A prediction model for the change in SCORAD from Y1 to Y5, with all baseline (Y1) analyte concentrations as candidate predictors, was built using JMP [13]. Forward selection was used: starting from the null model, the analyte that contributed most to predicting the dependent measure (e.g., the one with the smallest p-value) was added, one at a time, until including further analytes provided no additional predictive power. The stopping criterion was the minimum corrected Akaike Information Criterion (AICc) [14]. To evaluate the performance of the prediction model, we calculated the R-squared and the root mean square error (RMSE) between predicted and observed values for all patients. Furthermore, cross-validation of the proposed prediction model with the selected markers was performed with 100 iterations [15]. At each iteration, the entire baseline patient set was split into a training set (75%) and a testing set (25%). The training set was used to build the model and estimate the model parameters. The model was then used to predict the testing set. The R-squared and RMSE between predicted and observed values were calculated for each iteration to assess the robustness of the prediction model.
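The selection and validation steps can be sketched in plain Python as follows. This is not the JMP procedure itself but a simplified illustration under stated assumptions: ordinary least squares with Gaussian errors, candidates ranked by AICc at each step (for a single added term this gives the same ordering as the smallest p-value), selection stopping at the first step that fails to lower AICc, random 75/25 splits, and synthetic data. All function names and data are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_ols(X, y, cols):
    """Least-squares coefficients (intercept first) via the normal equations."""
    D = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    XtX = [[sum(d[i] * d[j] for d in D) for j in range(p)] for i in range(p)]
    Xty = [sum(d[i] * yi for d, yi in zip(D, y)) for i in range(p)]
    return solve(XtX, Xty)


def predict(X, beta, cols):
    return [beta[0] + sum(b * row[c] for b, c in zip(beta[1:], cols)) for row in X]


def r2_rmse(y, yhat):
    """R-squared and RMSE between observed and predicted values."""
    mean = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))
    sst = sum((a - mean) ** 2 for a in y)
    return 1.0 - sse / sst, math.sqrt(sse / len(y))


def aicc(X, y, cols):
    """Corrected AIC for a Gaussian linear model (additive constants dropped)."""
    n = len(y)
    k = len(cols) + 2  # slopes + intercept + error variance
    yhat = predict(X, fit_ols(X, y, cols), cols)
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))
    return n * math.log(sse / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def forward_select(X, y):
    """Add one predictor at a time while AICc keeps decreasing."""
    selected, best = [], aicc(X, y, [])
    remaining = list(range(len(X[0])))
    while remaining:
        score, cand = min((aicc(X, y, selected + [c]), c) for c in remaining)
        if score >= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected


def repeated_split_cv(X, y, cols, iterations=100, train_frac=0.75, seed=0):
    """Refit on a random 75% of patients and score on the held-out 25%."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    scores = []
    for _ in range(iterations):
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        train, test = idx[:cut], idx[cut:]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train], cols)
        yhat = predict([X[i] for i in test], beta, cols)
        scores.append(r2_rmse([y[i] for i in test], yhat))
    return scores


# Synthetic data: 60 "patients", 5 "analytes"; only columns 0 and 3 carry signal.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [2.0 * r[0] - 1.5 * r[3] + rng.gauss(0, 0.5) for r in X]

selected = forward_select(X, y)
r2_all, rmse_all = r2_rmse(y, predict(X, fit_ols(X, y, selected), selected))
cv = repeated_split_cv(X, y, selected)
print(selected, round(r2_all, 3), round(rmse_all, 3), len(cv))
```

In this sketch the markers are selected once on the full data set and only the coefficients are re-estimated within each split, which follows the description of cross-validating the model "with selected markers". The spread of the 100 test-set R-squared and RMSE values then indicates how stable the fitted model is across different subsets of patients.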