Feature name                 Type     Description and values                                    % missing

Encounter ID                 Numeric  Unique identifier of an encounter                         0%
Patient number               Numeric  Unique identifier of a patient                            0%
Race                         Nominal  Values: Caucasian, Asian, African American, Hispanic,     2%
                                      and other
Gender                       Nominal  Values: male, female, and unknown/invalid                 0%
Age                          Nominal  Grouped in 10-year intervals:                             0%
                                      [0, 10), [10, 20), …, [90, 100)
Weight                       Numeric  Weight in pounds                                          97%
Admission type               Nominal  Integer identifier corresponding to 9 distinct values,    0%
                                      for example, emergency, urgent, elective, newborn,
                                      and not available
Discharge disposition        Nominal  Integer identifier corresponding to 29 distinct values,   0%
                                      for example, discharged to home, expired, and
                                      not available
Admission source             Nominal  Integer identifier corresponding to 21 distinct values,   0%
                                      for example, physician referral, emergency room, and
                                      transfer from a hospital
Time in hospital             Numeric  Integer number of days between admission and discharge    0%
Payer code                   Nominal  Integer identifier corresponding to 23 distinct values,   52%
                                      for example, Blue Cross/Blue Shield, Medicare, and
                                      self-pay
Medical specialty            Nominal  Integer identifier of the specialty of the admitting      53%
                                      physician, corresponding to 84 distinct values, for
                                      example, cardiology, internal medicine, family/general
                                      practice, and surgeon
Number of lab procedures     Numeric  Number of lab tests performed during the encounter        0%
Number of procedures         Numeric  Number of procedures (other than lab tests) performed     0%
                                      during the encounter
Number of medications        Numeric  Number of distinct generic names administered during      0%
                                      the encounter
Number of outpatient visits  Numeric  Number of outpatient visits of the patient in the year    0%
                                      preceding the encounter
Number of emergency visits   Numeric  Number of emergency visits of the patient in the year     0%
                                      preceding the encounter
Number of inpatient visits   Numeric  Number of inpatient visits of the patient in the year     0%
                                      preceding the encounter
Diagnosis 1                  Nominal  The primary diagnosis (coded as first three digits of     0%
                                      ICD9); 848 distinct values. ICD9: International
                                      Classification of Diseases, Ninth Revision
Diagnosis 2                  Nominal  Secondary diagnosis (coded as first three digits of       0%
                                      ICD9); 923 distinct values
Diagnosis 3                  Nominal  Additional secondary diagnosis (coded as first three      1%
                                      digits of ICD9); 954 distinct values
Number of diagnoses          Numeric  Number of diagnoses entered into the system               0%
Glucose serum test result    Nominal  Indicates the range of the result or if the test was      0%
                                      not taken. Values: “>200,” “>300,” “normal,” and
                                      “none” if not measured
A1c test result              Nominal  Indicates the range of the result or if the test was      0%
                                      not taken. Values: “>8” if the result was greater than
                                      8%, “>7” if the result was greater than 7% but less
                                      than 8%, “normal” if the result was less than 7%, and
                                      “none” if not measured
Change of medications        Nominal  Indicates if there was a change in diabetic medications   0%
                                      (either dosage or generic name). Values: “change” and
                                      “no change”
Diabetes medications         Nominal  Indicates if there was any diabetic medication            0%
                                      prescribed. Values: “yes” and “no”
23 features for medications  Nominal  For the generic names: metformin, repaglinide,            0%
                                      nateglinide, chlorpropamide, glimepiride, acetohexamide,
                                      glipizide, glyburide, tolbutamide, pioglitazone,
                                      rosiglitazone, acarbose, miglitol, troglitazone,
                                      tolazamide, examide, sitagliptin, insulin,
                                      glyburide-metformin, glipizide-metformin,
                                      glimepiride-pioglitazone, metformin-rosiglitazone, and
                                      metformin-pioglitazone, the feature indicates whether
                                      the drug was prescribed or there was a change in the
                                      dosage. Values: “up” if the dosage was increased during
                                      the encounter, “down” if the dosage was decreased,
                                      “steady” if the dosage did not change, and “no” if the
                                      drug was not prescribed
Readmitted                   Nominal  Days to inpatient readmission. Values: “<30” if the       0%
                                      patient was readmitted in less than 30 days, “>30” if
                                      the patient was readmitted in more than 30 days, and
                                      “No” for no record of readmission
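
Several of the nominal features above (Age, A1c test result, Readmitted) carry an implicit order or a natural numeric reading. The following is a minimal Python sketch of one way such values might be encoded before analysis. The record keys (`age`, `weight`, `a1c`, `readmitted`), the `"?"` marker for missing values, and the helper names are assumptions made for illustration; they are not taken from the table.

```python
import re

# Assumption: missing entries (e.g. the 97% missing Weight) are stored as "?".
MISSING = "?"

# Matches half-open 10-year intervals such as "[70, 80)".
AGE_PATTERN = re.compile(r"\[(\d+),\s*(\d+)\)")


def age_midpoint(interval):
    """Map an age interval such as "[70, 80)" to its numeric midpoint."""
    match = AGE_PATTERN.fullmatch(interval.strip())
    if match is None:
        raise ValueError(f"unrecognized age interval: {interval!r}")
    low, high = (int(group) for group in match.groups())
    return (low + high) / 2


def early_readmission(value):
    """Collapse the three Readmitted values into a binary label (True for "<30")."""
    if value not in {"<30", ">30", "No"}:
        raise ValueError(f"unrecognized readmission value: {value!r}")
    return value == "<30"


def a1c_level(value):
    """Order the A1c categories; None means the test was not taken."""
    order = {"none": None, "normal": 0, ">7": 1, ">8": 2}
    if value not in order:
        raise ValueError(f"unrecognized A1c value: {value!r}")
    return order[value]


def clean_record(raw):
    """Encode one hypothetical raw record (a dict of strings)."""
    return {
        "age": age_midpoint(raw["age"]),
        "weight": None if raw["weight"] == MISSING else float(raw["weight"]),
        "a1c": a1c_level(raw["a1c"]),
        "early_readmission": early_readmission(raw["readmitted"]),
    }


example = {"age": "[70, 80)", "weight": "?", "a1c": ">8", "readmitted": "<30"}
print(clean_record(example))
```

Treating only “<30” as a positive label is one possible choice; an analysis interested in any readmission would instead group “<30” and “>30” together.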