Original research | Volume 101, Issue 10, P1739-1746, October 2020

Challenges of Developing a Natural Language Processing Method With Electronic Health Records to Identify Persons With Chronic Mobility Disability



      Objective

      To assess the utility of applying natural language processing (NLP) to electronic health records (EHRs) to identify individuals with chronic mobility disability.


      Design

      We used EHRs from the Research Patient Data Repository, which contains EHRs from a large Massachusetts health care delivery system. This analysis was part of a larger study assessing the effects of disability on diagnosis of colorectal cancer. We applied NLP text extraction software to the longitudinal EHRs of patients with colorectal cancer to identify persons who use a wheelchair (our indicator of mobility disability for this analysis). We then manually reviewed the clinical notes flagged by NLP, using directed content analysis to identify true cases of wheelchair use, the duration or chronicity of use, and documentation quality.
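The screening step described above can be sketched as a keyword match over note text, grouped by patient. This is only an illustrative sketch: the abstract does not name the NLP software or list its keywords, so the pattern `KEYWORD_PATTERN`, the function `screen_notes`, and the sample notes are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical keyword pattern; the study's actual keyword list is not given
# in the abstract. Matches "wheelchair", "wheel chair", "wheel-chair", plurals.
KEYWORD_PATTERN = re.compile(r"\bwheel[\s-]?chairs?\b", re.IGNORECASE)


def screen_notes(notes):
    """Flag notes containing a wheelchair keyword.

    notes: iterable of (patient_id, note_id, text) tuples.
    Returns a dict mapping patient_id to the list of flagged note_ids.
    """
    flagged = defaultdict(list)
    for patient_id, note_id, text in notes:
        if KEYWORD_PATTERN.search(text):
            flagged[patient_id].append(note_id)
    return dict(flagged)


# Invented sample notes, for illustration only.
sample_notes = [
    ("p1", "n1", "Patient arrived in a wheelchair after knee surgery."),
    ("p1", "n2", "Colonoscopy scheduled for next month."),
    ("p2", "n3", "Uses wheel chair at home for the past 5 years."),
    ("p3", "n4", "Ambulates independently."),
]

flagged = screen_notes(sample_notes)
print(flagged)
```

In this toy example, note `n1` is flagged even though it describes what is probably temporary wheelchair use, which is the kind of case that manual review of flagged notes is meant to catch.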


      Setting

      EHRs from a large health care delivery system.


      Participants

      Patients (N=14,877) aged 21-75 years who were newly diagnosed with colorectal cancer between 2005 and 2017.


      Interventions

      Not applicable.

      Main Outcome Measures

      Confirmation of patients’ chronic wheelchair use in NLP-flagged notes; quality of disability documentation.


      Results

      We identified 14,877 patients with colorectal cancer and 303,182 associated clinical notes. NLP screening identified 1482 notes (0.5%) that contained at least 1 wheelchair-associated keyword. These notes were associated with 420 patients (2.8% of the colorectal cancer population). Of the 1482 notes, 286 (19.3%), representing 105 patients (0.7% of the total), documented both the reason for wheelchair use and its duration. Directed content analysis identified 3 themes concerning disability documentation: (1) wheelchair keywords used in specific EHR contexts; (2) reason for wheelchair use not clearly stated; and (3) duration of wheelchair use not consistently documented.
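The percentages reported above follow directly from the stated counts, as this short check shows. The variable names and the helper `pct` are introduced here for illustration. The last figure (share of flagged patients with adequate documentation) is derived from the reported counts and is not stated in the abstract.

```python
# Counts as reported in the abstract.
total_patients = 14_877
total_notes = 303_182
flagged_notes = 1_482
flagged_patients = 420
documented_notes = 286
documented_patients = 105


def pct(part, whole):
    """Percentage of whole represented by part, rounded to 1 decimal place."""
    return round(100 * part / whole, 1)


note_flag_rate = pct(flagged_notes, total_notes)              # flagged notes / all notes
patient_flag_rate = pct(flagged_patients, total_patients)     # flagged patients / all patients
documented_note_rate = pct(documented_notes, flagged_notes)   # documented / flagged notes
documented_patient_rate = pct(documented_patients, total_patients)

# Derived here, not reported in the abstract: documented patients among flagged patients.
documented_among_flagged = pct(documented_patients, flagged_patients)

print(note_flag_rate, patient_flag_rate, documented_note_rate,
      documented_patient_rate, documented_among_flagged)
```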


      Conclusions

      NLP offers an option to screen for patients with chronic mobility disability in much less time than manual chart review requires. Nonetheless, manual chart review is still needed to confirm that flagged patients have chronic mobility disability (ie, are not false positives). Notes, however, often contain inadequate disability documentation.


      List of abbreviations:

      EHR (electronic health record), ICD-9-CM (International Classification of Diseases–9th Revision–Clinical Modification), ICD-10-CM (International Classification of Diseases–10th Revision–Clinical Modification), NLP (natural language processing)

