Should I Stay or Should I Go? Career Tracks in Astronomy Based on Publication Records

Abstract


Introduction

Understanding the career tracks of PhD recipients is important for several reasons (expand into a paragraph?):

1. Setting appropriate expectations for students entering graduate programs and a realistic standard of what typical careers look like

2. Providing adequate mentoring and resources to departments and students

3. Understanding the effects of large scale economic and social trends on the astronomy workforce

4. Others.

There is a perceived worsening of the funding climate and tightening of the job market as of late. Or maybe this is a persistent feeling that each generation experiences anew. Suggestions have appeared that there is an increasing over-production of PhDs relative to jobs in the field (figure which Jessica Kirkpatrick showed at AAS). However, career tracks are not a pipeline from graduate school into a postdoc into a tenure-track position. Instead, astronomy PhDs follow a variety of paths and end up in a wide range of positions, institutions and industries. And while the number of astronomers has increased (based on AAS membership?) and the number of astronomy PhDs each year has risen (based on NSF stats), it is not yet clear that a larger than usual fraction of people are forced to seek employment outside of academia as a result of these trends.

Academics produce a trail of publications during their career. This publication record and its associated meta-data can be used to draw some conclusions about the career tracks of astronomers. In this analysis we address the following questions:

1. What fraction of astronomy PhDs drop out of research at different intervals following their graduation? What fraction of graduates are still actively engaged in research?

2. Is there a change in these numbers as a function of time?

3. Are the rates different based on the PhD recipient’s gender?

4. Do graduates of the departments ranked in the top-20 fare differently?

5. Do the recipients of prize fellowships fare differently?

Data and Methods

All astronomical publications are indexed in the Astrophysics Data System abstract service (ADS).

ADS also indexes thesis publications from ProQuest, the largest publisher of PhD theses in the US. This limits the analysis to US institutions.

Recently, ADS created an API which allows automated batch searches of the ADS archive.

Additional data come from the NSF surveys of PhD recipients.

How do we do it?

First we construct a query which requests all PhD theses from the ADS Astronomy database for a given year. The following query requests the first 200 theses of 2000:

http://adslabs.org/adsabs/api/search/?q=bibstem:PhDT&filter=year:2000&filter=database:astronomy&rows=200&start={1}&dev_key=XXXXXXXXXXXXXXXX
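Since each request returns at most `rows` records, collecting a full year means stepping the `start` parameter. The following is a minimal Python sketch of how such query URLs could be assembled. It only builds the URLs and does not contact the service. The function names `build_thesis_query` and `page_starts` are hypothetical, and the assumption that paging begins at `start=1` is ours; the base URL and the parameters are taken from the query above.

```python
from urllib.parse import urlencode

# Endpoint copied from the query above.
BASE_URL = "http://adslabs.org/adsabs/api/search/"


def build_thesis_query(year, start, rows=200, dev_key="XXXXXXXXXXXXXXXX"):
    """Return the search URL for one page of PhD theses from a given year."""
    params = [
        ("q", "bibstem:PhDT"),
        ("filter", f"year:{year}"),
        ("filter", "database:astronomy"),
        ("rows", rows),
        ("start", start),
        ("dev_key", dev_key),  # placeholder key, as in the text
    ]
    # safe=":" keeps the field:value colons unescaped, matching the query above.
    return BASE_URL + "?" + urlencode(params, safe=":")


def page_starts(n_records, rows=200, first=1):
    """Offsets needed to page through n_records results, `rows` at a time.

    ASSUMPTION: the first record sits at offset `first`; adjust if the
    service counts from 0.
    """
    return list(range(first, first + n_records, rows))


if __name__ == "__main__":
    for start in page_starts(439, rows=200):
        print(build_thesis_query(2000, start))
```

Looping the year from 1990 to 2013 and the offset over `page_starts` for each year would cover the whole period considered here.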

There are a total of 12112 PhDs in ADS in the period from 1990 to 2013.

Affiliations are given as a text string and there is no standard way to separate US from non-US institutions. We compile the list of all unique institution names (about 1700 of them) and then handpick from it the names of US institutions (about 800). A certain fraction of dissertations (about 1800) do not have author affiliations. For a given year this is typically 5 to 30%, but for the 1996 sample 60% of dissertations do not have an author affiliation. For these, we go a step further and collect the publishing institution directly from the ADS HTML page. We then follow the same procedure as previously outlined to divide these into US and non-US institutions. The final tally of the number of PhDs as a function of time is presented in Table 1 and Figure 1.

Then we pick out the names of the PhD recipients. This group of names is the US graduating cohort with PhDs in astronomy for a given year.
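The classification step can be sketched as a lookup against the handpicked lists. This is a minimal illustration in Python, not the actual procedure used: the helper names (`normalize`, `classify_affiliation`, `tally`) are hypothetical, exact matching after simple normalization is an assumption, and the two short institution sets at the bottom are stand-ins for the real handpicked lists.

```python
def normalize(name):
    """Lower-case and collapse whitespace so trivially different strings match."""
    return " ".join(name.lower().split())


def classify_affiliation(affiliation, us_names, foreign_names):
    """Return 'US', 'Foreign', or 'Unknown' for one affiliation string."""
    if affiliation is None or not affiliation.strip():
        return "Unknown"  # thesis record without an author affiliation
    key = normalize(affiliation)
    if key in us_names:
        return "US"
    if key in foreign_names:
        return "Foreign"
    return "Unknown"  # name not on either handpicked list


def tally(affiliations, us_names, foreign_names):
    """Count theses per category and return (counts, percentage shares)."""
    counts = {"US": 0, "Foreign": 0, "Unknown": 0}
    for aff in affiliations:
        counts[classify_affiliation(aff, us_names, foreign_names)] += 1
    total = sum(counts.values())
    shares = {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}
    return counts, shares


if __name__ == "__main__":
    # HYPOTHETICAL stand-ins for the handpicked lists described above.
    us = {normalize("University of Arizona"), normalize("Harvard University")}
    foreign = {normalize("University of Toronto")}
    sample = ["University of  Arizona", "university of toronto", "", None]
    print(tally(sample, us, foreign))
```

Records that come back as "Unknown" are the ones for which the publishing institution would then be collected from the ADS HTML page and run through the same classification.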

Using the first names of the PhD recipients we can assign gender with some probability. X fraction of names have only an initial and cannot be gendered at all. Methods for gendering. What fraction of names don’t get gendered?
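Because the gendering method is not yet fixed, the following Python sketch only illustrates the general shape of a probabilistic assignment. Everything in it is an assumption: author strings are taken to have the form "Last, First M.", the function names are hypothetical, the 0.9 threshold is arbitrary, and the probabilities in `P_FEMALE` are made-up placeholders rather than real name statistics. Names with only an initial, and names missing from the table, are left ungendered.

```python
def first_name(author):
    """Extract the given name from 'Last, First M.'; None if only an initial."""
    if "," not in author:
        return None
    given = author.split(",", 1)[1].split()
    if not given:
        return None
    candidate = given[0].rstrip(".")
    # A single letter (with or without a trailing period) is an initial.
    if len(candidate) <= 1:
        return None
    return candidate


# PLACEHOLDER values for illustration only, not real name statistics.
P_FEMALE = {"mary": 0.99, "john": 0.01, "jordan": 0.30}


def assign_gender(author, p_female=P_FEMALE, threshold=0.9):
    """Return 'F', 'M', or None when the name cannot be gendered confidently."""
    name = first_name(author)
    if name is None:
        return None  # initial only, or unparseable author string
    p = p_female.get(name.lower())
    if p is None:
        return None  # name not in the lookup table
    if p >= threshold:
        return "F"
    if p <= 1 - threshold:
        return "M"
    return None  # ambiguous name


if __name__ == "__main__":
    for author in ["Doe, Mary A.", "Doe, John", "Doe, Jordan", "Smith, J."]:
        print(author, "->", assign_gender(author))
```

Counting the `None` results over a cohort gives the fraction of names that do not get gendered.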

Caveats:

1. ProQuest does not contain all theses; some schools which grant astronomy degrees may not work with ProQuest. Based on our sample it seems that most major institutions do work with ProQuest. In an attempt to determine this incompleteness we compared the number of theses from ADS to that from NSF, which by itself led to an interesting result (see below).

2. Name disambiguation is difficult (ORCID is trying to remedy that). Issues arise in two general cases:

• Author changes last name during their career. Additionally, some universities require students to provide their legal last name for their thesis, which may differ from the name they use to publish. This issue is more common for women, who may change their last name at marriage. To estimate the effect we will look at the gender-split sample.

• Author has a common name such as “Smith” or “Williams”. This issue manifests itself as authors with unusually large publication lists. To remedy it we remove

\section{Analysis}

\subsection{Number of PhDs}

|Year| Total PhDs| US PhDs | Foreign PhDs| Unknown Origin|
|----|-----------|-------------|-------------|---------------|
|1990| 437| 312 (71.40\%)| 70 (16.02\%)| 55 (12.59\%)|
|1991| 454| 318 (70.04\%)| 67 (14.76\%)| 69 (15.20\%)|
|1992| 446| 317 (71.08\%)| 62 (13.90\%)| 67 (15.02\%)|
|1993| 495| 337 (68.08\%)| 47 (9.49\%)| 111 (22.42\%)|
|1994| 516| 325 (62.98\%)| 45 (8.72\%)| 146 (28.29\%)|
|1995| 581| 369 (63.51\%)| 47 (8.09\%)| 165 (28.40\%)|
|1996| 424| 133 (31.37\%)| 35 (8.25\%)| 256 (60.38\%)|
|1997| 623| 333 (53.45\%)| 78 (12.52\%)| 212 (34.03\%)|
|1998| 582| 325 (55.84\%)| 81 (13.92\%)| 176 (30.24\%)|
|1999| 460| 280 (60.87\%)| 106 (23.04\%)| 74 (16.09\%)|
|2000| 439| 299 (68.11\%)| 113 (25.74\%)| 27 (6.15\%)|
|2001| 380| 260 (68.42\%)| 95 (25.00\%)| 25 (6.58\%)|
|2002| 402| 251 (62.44\%)| 116 (28.86\%)| 35 (8.71\%)|
|2003| 392| 250 (63.78\%)| 120 (30.61\%)| 22 (5.61\%)|
|2004| 480| 302 (62.92\%)| 126 (26.25\%)| 52 (10.83\%)|
|2005| 485| 301 (62.06\%)| 135 (27.84\%)| 49 (10.10\%)|
|2006| 525| 320 (60.95\%)| 170 (32.38\%)| 35 (6.67\%)|
|2007| 554| 344 (62.09\%)| 191 (34.48\%)| 19 (3.43\%)|
|2008| 574| 406 (70.73\%)| 138 (24.04\%)| 30 (5.23\%)|
|2009| 632| 431 (68.20\%)| 159 (25.16\%)| 42 (6.65\%)|
|2010| 572| 358 (62.59\%)| 157 (27.45\%)| 57 (9.97\%)|
|2011| 597| 379 (63.48\%)| 152 (25.46\%)| 66 (11.06\%)|
|2012| 566| 376 (66.43\%)| 157 (27.74\%)| 33 (5.83\%)|
|2013| 496| 358 (72.18\%)| 129 (26.01\%)| 9 (1.81\%)|

\subsection{Publications as an Indicator of Research Activity}