The goal of this work is to infer personality traits from the images of the PsychoFlickr dataset by fine-tuning a CNN pre-trained on ImageNet.
The first idea was to adapt the prototxt that defines the network structure so that it takes the PsychoFlickr images as input data and the trait scores, both self-assessed and attributed, as targets. The ImageNet model, trained to classify 1000 object classes, would then be fine-tuned for our task of predicting personality traits from an image.
The initial goal was indeed to perform surgery on the prototxt: changing the last layer so that the net has to learn it for the new task, and replacing the classification layer with a regression layer. We divided the dataset into 75% for the training set and the remaining 25% for the testing set. One observation: we cannot build list files that contain the training and testing image paths together with all of their labels.
To work with regression, we build a txt file containing the paths of the images and an HDF5 file for the 10 trait labels, since these are floating-point numbers and not integers.
When we launched the first training run, we noticed that all the network layers, both weights and data, were set to zero. To overcome this problem, we decreased the learning rate and built the training files using a random permutation of the images.
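The 75%/25% split with a random permutation of the images can be sketched as follows. This is a minimal illustration, not the code used in these experiments: the function name `shuffled_split`, the fixed seed, and the choice to split at the level of individual images are all assumptions.

```python
import random


def shuffled_split(items, train_fraction=0.75, seed=0):
    """Shuffle items and split them into (train, test) lists.

    The seed is fixed only to make this sketch reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)  # random permutation of the images
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image paths, used only to show the call.
paths = [f"img_{i:05d}.jpg" for i in range(100)]
train_paths, test_paths = shuffled_split(paths)
```

Shuffling before the cut means that neither the training list nor the testing list follows the original ordering of the dataset.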
Then we tested the fine-tuned model by building a deploy.prototxt file that takes images as input and predicts their traits. Testing showed that the net was not able to learn much. The task may be too hard: we expect the net to generalize a personality trait from 45000 images and to predict a float number for a new image.
The next idea was to select the images of those users whose trait score is less than or equal to the first quartile, or greater than or equal to the third quartile, of the distribution of a given trait, and to assign them labels 0 and 1 respectively. In this case the structure of the net was the same as that of the ImageNet model, but we modified the last layer to allow the net to learn the new data representation. Here we want to classify an image as belonging to a low or a high level of a trait, so the task is a binary classification. We had to treat each trait independently, so we built 10 files, one per trait, each a single txt file containing both the image paths and the targets, which is possible in this case because the labels are integers (randomly sampled for the training set). With this idea we finally obtained some results.
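The quartile-based labelling for one trait can be sketched as below. This is an illustration under stated assumptions, not the original code: `user_scores` and `image_owner` are hypothetical mappings, the "inclusive" quantile method is one of several reasonable choices, and the output uses a plain "path label" line per image.

```python
import statistics


def quartile_labels(user_scores):
    """Return {user: 0 or 1} for users in the bottom or top quartile of a trait.

    user_scores is a hypothetical mapping from user id to that user's score
    on one trait. Users strictly between the quartiles are left out.
    """
    # Assumption: method="inclusive" treats the scores as the full population.
    q1, _median, q3 = statistics.quantiles(
        list(user_scores.values()), n=4, method="inclusive"
    )
    labels = {}
    for user, score in user_scores.items():
        if score <= q1:
            labels[user] = 0  # low level of the trait
        elif score >= q3:
            labels[user] = 1  # high level of the trait
    return labels


def list_file_lines(image_owner, labels):
    """Build "path label" lines for images whose owner received a label."""
    return [
        f"{path} {labels[user]}"
        for path, user in image_owner.items()
        if user in labels
    ]
```

Running the two functions once per trait would give the 10 per-trait files described above; images of users in the middle of the distribution are simply not listed.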
The latest batch of 2C experiments corrected some errors in the prototxt files (traits 7 to 10 were affected) and completed the missing ones (traits 1 to 4). Moreover, in preliminary experiments, we found that it was possible to reach nearly 100% training accuracy by fixing the first 4 conv layers and letting only the last 3 fully connected layers (the InnerProduct ones) fine-tune on the PsychoFlickr data. Unfortunately, the training tended to overfit, with the test accuracy curve showing an "elbow" pattern. To reduce overfitting, we increased the weight decay parameter dramatically in order to avoid any "neuron specialization" (neural units with high weight values). This limited the maximum training accuracy, which was not needed anyway. By combining the two measures, fixing the conv layers and raising the weight decay, we achieved the best 2C results so far. See Figure 1, where the self traits are still unimpressive, but the attributed ones clearly show predictions above chance.
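One common way to fix layers in a Caffe prototxt is to set `lr_mult: 0` on their `param` entries. The sketch below only generates those lines and is not taken from the experiment files: the layer names are hypothetical, and the multipliers for the layers left free to learn are an assumption that follows the usual reference-model convention.

```python
def param_lines(frozen):
    """Return the two Caffe `param` lines (weights, biases) for one layer."""
    if frozen:
        # lr_mult: 0 stops the solver from updating the blob.
        weights = "param { lr_mult: 0 decay_mult: 0 }"
        biases = "param { lr_mult: 0 decay_mult: 0 }"
    else:
        # Assumed multipliers for layers that keep learning.
        weights = "param { lr_mult: 1 decay_mult: 1 }"
        biases = "param { lr_mult: 2 decay_mult: 0 }"
    return [weights, biases]


# Hypothetical layer names; the real prototxt may name its layers differently.
FROZEN_LAYERS = ["conv1", "conv2", "conv3", "conv4"]
TUNED_LAYERS = ["fc6", "fc7", "fc8"]


def build_plan():
    """Map each layer name to the param lines to place in its definition."""
    plan = {name: param_lines(True) for name in FROZEN_LAYERS}
    plan.update({name: param_lines(False) for name in TUNED_LAYERS})
    return plan


for layer_name, lines in build_plan().items():
    print(layer_name, *lines, sep="\n  ")
```

The weight decay itself is set through the `weight_decay` field of the solver prototxt. The value used in these experiments is not stated, so none is shown here.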