
The Use of Computational Phenotypes within Electronic Healthcare Data to Identify Transgender People in the United States: A Narrative Review
  • Theo G. Beltran,
  • Elle Lett,
  • Tonia Poteat,
  • Juan Hincapie-Castillo
Theo G. Beltran
The University of North Carolina at Chapel Hill Gillings School of Global Public Health
Elle Lett
Center for Applied Transgender Studies
Tonia Poteat
The University of North Carolina at Chapel Hill School of Medicine
Juan Hincapie-Castillo
The University of North Carolina at Chapel Hill Gillings School of Global Public Health

Corresponding Author: [email protected]



Purpose: Research using electronic healthcare data to study transgender (TG) population health trends is expanding, yet the validity of the computational phenotype (CP) algorithms used to identify TG patients is not well understood. We review the literature that has used CPs to identify TG people within electronic healthcare data, assess the validity of those CPs, identify potential gaps, and synthesize recommendations for future work based on past studies.

Methods: We searched the National Library of Medicine's PubMed, Scopus, and the American Psychological Association's PsycInfo databases to identify studies published in the United States that applied CPs to identify TG people within electronic healthcare data.

Results: Twelve studies validated or enhanced the positive predictive value (PPV) of their CP through manual chart reviews (n=5), hierarchy of code mechanisms (n=4), key text-strings (n=2), or self-surveys (n=1). The CPs with the highest PPV for identifying TG patients within their study population combined diagnosis codes with other components such as key text-strings. When key text-strings were not available, however, researchers were still able to find most TG patients within their electronic healthcare databases through diagnosis codes alone.

Conclusion: The CPs that identified TG patients most accurately combined diagnosis codes with components such as procedural codes or key text-strings. CPs with high validity are essential for identifying TG patients when self-reported gender identity is not available. Still, self-reported gender identity should be collected within electronic healthcare data, as it is the gold standard for understanding TG population health patterns.
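As a rough illustration of the ideas in the abstract, the sketch below shows how a CP that combines diagnosis codes with key text-strings might be expressed, and how its PPV could be computed against a chart-review reference. It is a minimal sketch under stated assumptions, not the algorithm of any reviewed study: the `Patient` record, the code prefix list, the text-string list, and both function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Illustrative assumptions only: the reviewed studies each defined their own
# code lists and text strings, which are not reproduced here.
DIAGNOSIS_PREFIXES = ("F64",)       # assumed ICD-10-CM gender identity codes
KEY_TEXT_STRINGS = ("transgender",)  # assumed free-text search term


@dataclass
class Patient:
    """Hypothetical minimal patient record."""
    patient_id: str
    diagnosis_codes: List[str] = field(default_factory=list)
    notes: str = ""


def flagged_by_cp(patient: Patient, require_text: bool = False) -> bool:
    """Return True if the patient meets the phenotype definition.

    With require_text=False, a diagnosis code OR a key text-string suffices.
    With require_text=True, both components must be present.
    """
    has_code = any(
        code.upper().startswith(DIAGNOSIS_PREFIXES)
        for code in patient.diagnosis_codes
    )
    has_text = any(s in patient.notes.lower() for s in KEY_TEXT_STRINGS)
    return (has_code and has_text) if require_text else (has_code or has_text)


def positive_predictive_value(
    flagged_ids: Iterable[str], confirmed_ids: Iterable[str]
) -> float:
    """PPV = confirmed true positives / all patients flagged by the CP."""
    flagged = set(flagged_ids)
    if not flagged:
        raise ValueError("PPV is undefined when the CP flags no patients")
    return len(flagged & set(confirmed_ids)) / len(flagged)
```

In practice, `confirmed_ids` would come from a reference standard such as manual chart review or self-reported gender identity. Requiring both components (`require_text=True`) would generally be expected to trade sensitivity for a higher PPV.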
15 Mar 2023 Submitted to Pharmacoepidemiology and Drug Safety
15 Mar 2023 Submission Checks Completed
15 Mar 2023 Assigned to Editor
15 Mar 2023 Review(s) Completed, Editorial Evaluation Pending
28 Apr 2023 Reviewer(s) Assigned
23 Oct 2023 Editorial Decision: Revise Minor
08 Nov 2023 1st Revision Received
08 Nov 2023 Submission Checks Completed
08 Nov 2023 Assigned to Editor
08 Nov 2023 Review(s) Completed, Editorial Evaluation Pending
12 Nov 2023 Editorial Decision: Accept