Background: Public health nurses (PHNs) provide home visiting services to at-risk clients and document the care they deliver. To increase efficiency and reduce documentation burden, it would be useful for PHNs to identify the critical data elements most strongly associated with patient care priorities and outcomes. Machine learning techniques can aid in the retrospective identification of these critical data elements.
Objective: We applied two machine learning feature selection techniques: minimum redundancy-maximum relevance (mRMR), and LASSO (least absolute shrinkage and selection operator) and elastic net regularized generalized linear models (glmnet in R).
Methods: We demonstrated the application of these techniques on an Omaha System database of 205 data elements (features) for a cohort of 756 family home visiting clients who received at least one visit from PHNs at a local Midwest public health agency. A dichotomous maternal risk index served as the outcome for feature selection.
Application: With mRMR as the feature selection technique, 50 of 206 features were selected with scores greater than zero, and a generalized linear model applied to those 50 features achieved a highest accuracy of 86.2% on a held-out test set. With glmnet as the feature selection technique and feature importance as the ranking, 63 features had importance scores greater than zero, and a generalized linear model applied to them achieved a highest accuracy of 95.5% on a held-out test set.
Discussion: Feature selection techniques show promise for reducing public health nursing documentation burden by identifying the data elements most critical for predicting risk status. Further studies that refine the feature selection process can help PHNs focus on client-specific, targeted interventions in the delivery of care.
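To make the mRMR idea named in the abstract concrete, the following is a minimal sketch in Python, not the authors' implementation (the study itself names glmnet in R and does not say which mRMR software was used). It assumes discrete features, estimates mutual information from simple counts, and uses the "difference" form of the criterion: relevance to the outcome minus mean redundancy with the features already chosen. The toy data and the names `feature_a`, `feature_a_copy`, and `feature_c` are hypothetical and are not Omaha System elements.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, outcome, k):
    """Greedy mRMR (difference criterion): relevance minus mean redundancy.

    features: dict mapping a feature name to a list of discrete values.
    outcome:  list of discrete outcome labels, same length as each feature.
    Returns a list of (name, score) pairs in selection order.
    """
    relevance = {
        name: mutual_information(col, outcome) for name, col in features.items()
    }
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(name):
            # The first pick is by relevance alone; later picks are penalized
            # by their average mutual information with features already chosen.
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s, _ in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append((best, score(best)))
        remaining.remove(best)
    return selected


# Hypothetical toy data: a dichotomous outcome and three binary features.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "feature_a": [1, 1, 1, 0, 0, 0, 0, 0],       # informative
    "feature_a_copy": [1, 1, 1, 0, 0, 0, 0, 0],  # duplicate of feature_a
    "feature_c": [1, 1, 0, 1, 1, 0, 0, 0],       # weaker but less redundant
}

picked = mrmr_select(features, outcome, k=2)
print([name for name, _ in picked])
```

In this sketch the duplicate feature is passed over on the second pick because its redundancy with the first pick outweighs its relevance. Keeping only features whose score is greater than zero, as the abstract describes, would be a filter applied to the returned scores.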
Bibliographical note
Publisher Copyright:
© Wolters Kluwer Health, Inc. All rights reserved.
- Omaha System
- machine learning
- nursing informatics
- public health nursing