Deyan Ginev edited subsection_Average_Paper_Size_The__.tex  almost 9 years ago

Commit id: 659681f3a304b291a41ecfd0449c982f25f1839e


The practical motivation for this measurement run was to understand the overall distribution of paper sizes, so that we can design a processing framework that does not run into avoidable buffer overflows.  We have already seen that paper sizes grow super-linearly over time, so we proceed with this caveat in mind.  Collecting the disk size of each paper directory, we see an average size of $\approx 50$ MB per arXiv paper, with a range from the tens of kilobytes to a maximum of $5.1$ GB, as of the end of May 2015. You can explore the detailed paper size dataset in the interactive figure below.
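
For readers who want to repeat this kind of measurement on their own copy of the corpus, the following is a minimal Python sketch, not the script used for the numbers above. It assumes a hypothetical layout in which every immediate subdirectory of a corpus root holds exactly one paper, and it sums apparent file sizes in bytes, which can differ from the block-level disk usage reported by tools such as \verb|du|. The function names \verb|directory_size|, \verb|paper_sizes| and \verb|summarize| are illustrative only.

```python
import os
import statistics


def directory_size(path):
    """Sum the sizes in bytes of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symbolic links so that linked files are not counted twice.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def paper_sizes(corpus_root):
    """Map each immediate subdirectory to its total size in bytes.

    Assumption: one subdirectory per paper (hypothetical layout).
    """
    return {
        entry.name: directory_size(entry.path)
        for entry in os.scandir(corpus_root)
        if entry.is_dir(follow_symlinks=False)
    }


def summarize(sizes):
    """Return count, mean, median, minimum and maximum of the paper sizes."""
    values = list(sizes.values())
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }
```

Calling \verb|summarize(paper_sizes(corpus_root))| on a corpus directory returns the summary as a dictionary. Reporting the median next to the mean is a deliberate choice here: with a range that spans from kilobytes to gigabytes, a few very large papers can dominate the mean.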