Science communication is primarily based on publishing research results in research papers. Anecdotally, authors feel that the publication cycle takes too long \cite{Himmelstein2015-me}. A better understanding of the publication lag could provide solace when a delay feels substantial; the main question is whether any factors predict the time taken from submission to publication. This paper attempts to model publication times for the Public Library of Science (PLoS) journals using the metadata available for research papers. The PLoS journals include PLoS Medicine, PLoS Biology, PLoS ONE, PLoS Pathogens, PLoS Genetics, PLoS Computational Biology, PLoS Neglected Tropical Diseases, and PLoS Clinical Trials (which was later merged into PLoS Medicine).

Previous research indicated that statistically nonsignificant results take longer to be published \cite{ioannidis1998}, that review times have decreased \cite{lyman2013}, and that the number of figures or tables does not predict publication time \cite{lee2013}. Other research into the academic publication cycle has focused on rejection rates of submitted manuscripts or the types of decisions made after the peer-review process \cite{Rosenkrantz2015-uj}. These studies primarily relied on sampling research papers from journals, but with the rise of APIs and scrapers to mine the literature \cite{Smith-Unna2014-cd}, such sampling is becoming redundant. In this paper, I analyze the entire population of PLoS research articles and separately predict review time (i.e., time from submission through acceptance) and production time (i.e., time from acceptance through publication), in order to investigate whether publication time can be predicted from paper metadata.
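
As a notational sketch of this split, let $t_{\mathrm{sub}}$, $t_{\mathrm{acc}}$, and $t_{\mathrm{pub}}$ denote the submission, acceptance, and publication dates of a paper. These symbols, and $T_{\mathrm{review}}$, $T_{\mathrm{production}}$, and $T_{\mathrm{publication}}$, are introduced here for illustration only and are assumptions of this sketch. Under the definitions above, the total publication time decomposes additively into the two components:
\begin{equation}
T_{\mathrm{publication}} = t_{\mathrm{pub}} - t_{\mathrm{sub}}
= \underbrace{(t_{\mathrm{acc}} - t_{\mathrm{sub}})}_{T_{\mathrm{review}}}
+ \underbrace{(t_{\mathrm{pub}} - t_{\mathrm{acc}})}_{T_{\mathrm{production}}}
\end{equation}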