EMRBots

From Wikipedia, the free encyclopedia


EMRBots is a set of synthetic electronic health record datasets and related source code for artificially generated patient records. The project was created to support research and education using synthetic medical data rather than real patient records.[1]

Synthetic health data is used in biomedical informatics when access to real patient records is limited by privacy, confidentiality, or institutional restrictions. A 2018 letter in Communications of the ACM described EMRBots as a system for generating synthetic patient populations containing demographics, admissions, comorbidities, and laboratory values.[2]

Use in research

EMRBots has been cited or used in biomedical informatics research as a source of synthetic clinical data. A 2018 article in Bioinformatics on the R package comoRbidity included artificially generated clinical data from EMRBots to demonstrate systematic analysis of disease comorbidities.[3]

A 2020 article in JMIR Medical Informatics, which compared analyses based on synthetic and real medical data, discussed EMRBots among methods for generating synthetic patient repositories.[4]

Availability

EMRBots datasets have been made available through Figshare, including 100-patient, 10,000-patient, and 100,000-patient synthetic databases.[5][6][7] The 10,000-patient dataset also appears in the National Institute of Allergy and Infectious Diseases data discovery portal.[8]

Criticism

The developers of Synthea, another synthetic patient generator, criticized EMRBots as a set of pregenerated synthetic electronic health record datasets with limited explanation of the generation process and with inconsistencies between health problems, age, and gender.[9]

References

  1. ^ Kartoun, Uri (September 2019). "Advancing informatics with electronic medical records bots (EMRBots)". Software Impacts. 2 100006. doi:10.1016/j.simpa.2019.100006.
  2. ^ CACM Staff (1 January 2018). "A leap from artificial to intelligence". Communications of the ACM. 61 (1): 10–11. doi:10.1145/3168260.
  3. ^ Gutiérrez-Sacristán, Alba; Bravo, Àlex; Giannoula, Alexia; Mayer, Miguel A; Sanz, Ferran; Furlong, Laura I; Kelso, Janet (15 September 2018). "comoRbidity: an R package for the systematic analysis of disease comorbidities". Bioinformatics. 34 (18): 3228–3230. doi:10.1093/bioinformatics/bty315. PMC 6137966. PMID 29897411.
  4. ^ Reiner Benaim, Anat; Almog, Ronit; Gorelik, Yuri; Hochberg, Irit; Nassar, Laila; Mashiach, Tanya; Khamaisi, Mogher; Lurie, Yael; Azzam, Zaher S.; Khoury, Johad; Kurnik, Daniel; Beyar, Rafael (2020). "Analyzing Medical Research Results Based on Synthetic Data and Their Relation to Real Data Results: Systematic Comparison from Five Observational Studies". JMIR Medical Informatics. 8 (2) e16492. doi:10.2196/16492. PMC 7059086. PMID 32130148.
  5. ^ Kartoun, Uri (3 September 2018). "EMRBots: A 100-patient database". figshare. doi:10.6084/m9.figshare.7040039.v3.
  6. ^ Kartoun, Uri (3 September 2018). "EMRBots: A 10,000-patient database". figshare. doi:10.6084/m9.figshare.7040060.v3.
  7. ^ Kartoun, Uri (3 September 2018). "EMRBots: A 100,000-patient database". figshare. doi:10.6084/m9.figshare.7040198.v1.
  8. ^ "EMRBots: a 10000-patient database". NIAID Data Ecosystem Discovery Portal. National Institute of Allergy and Infectious Diseases. Retrieved 20 June 2026.
  9. ^ Walonoski, J; et al. (2018). "Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record". Journal of the American Medical Informatics Association. 25 (3): 230–238. doi:10.1093/jamia/ocx079. PMC 7651916. PMID 29025144.