Abstract
Nurse practitioners (NPs) represent the fastest growing segment of the U.S. primary care workforce. Surveys of primary care NPs can help to better understand the care NPs deliver across different health care settings, the factors that impact NP job satisfaction and burnout, and the structural capabilities required to support their practice. The purpose of this article is to provide an overview of national sampling frames that can be used by researchers interested in surveying or studying the U.S. primary care NP workforce. We conducted an environmental scan and review of published literature on the NP workforce to identify data sources that can be used to sample primary care NPs. In this article, we (a) identify the data elements needed to develop an NP sampling frame and (b) describe national data sets that can be used to sample primary care NPs, including the strengths and weaknesses of each. This information is intended to facilitate research on the primary care NP workforce to inform practice and policy.
Background
Nurse practitioners (NPs) are a vital component of the U.S. primary care workforce. With escalating patient demand for health care services that exceeds the supply of providers, the current shortfall of primary care physicians is projected to reach between 21,100 and 55,200 physicians by 2032 (Dall et al., 2019). However, the growing supply of NPs, particularly in rural and underserved communities, shows promise for meeting demands for care (Davis et al., 2018; Xue et al., 2019). Growth in the NP workforce has accelerated in recent years, with the number of full-time NPs more than doubling between 2010 and 2017, from approximately 91,000 to 190,000 (Auerbach et al., 2020). NPs disproportionately care for patients in underserved areas (Buerhaus et al., 2015; Davis et al., 2018; Xue et al., 2019), and primary care practices increasingly rely on NPs to deliver care (Barnes et al., 2018; Friedberg et al., 2017). As of 2016, NPs represented 25.2% of providers in rural primary care practices and 23.0% in nonrural practices (Barnes et al., 2018). Shifts in the composition of the primary care workforce, coupled with the demands of an aging population, will require models of primary care delivery that expand the role of NPs (Auerbach et al., 2018).
Primary care practices that employ NPs show wide variation in NP clinical roles and models of care delivery (Buerhaus et al., 2015; Donelan et al., 2019; Poghosyan et al., 2014). Primary care NPs practice independently or as part of interprofessional teams; some manage independent patient panels, while others share a patient panel with a physician (Poghosyan et al., 2014). NP clinical roles, including time spent on chronic care management and care coordination, often vary depending on the presence of other clinicians in the practice, such as social workers or registered nurses (RNs; Donelan et al., 2019). Primary care NPs also work in a wide range of community settings, including long-term care, home- and community-based settings, walk-in or retail clinics, and student health services (Buerhaus et al., 2015).
In previous studies, researchers have surveyed NPs to better understand the supply of NPs, the settings in which they work, and their roles and practice patterns (Buerhaus et al., 2018; Donelan et al., 2013, 2019; Health Resources and Services Administration [HRSA], 2014, 2019a). Surveys of primary care NPs can help to better understand the care NPs deliver across different health care settings, the factors that impact NP job satisfaction and burnout, and the structural capabilities required to support their practice, such as electronic health record functionality, disease registries, and quality improvement infrastructure (Martsolf et al., 2018). However, identification of a representative sample of primary care NPs presents a challenge for a variety of reasons, not least of which is that NP specialty information is not easily identified in most data sources, including state licensure data.
The purpose of this article is to provide an overview of national sampling frames that can be used by researchers interested in surveying or studying the U.S. primary care NP workforce. In this article, we (a) identify the data elements needed to develop an NP sampling frame and (b) describe national data sets that can be used to sample primary care NPs, including the strengths and weaknesses of each. This information is intended to facilitate research on the primary care NP workforce to inform practice and policy.
Methods
We conducted an environmental scan and review of published literature on the NP workforce to identify data sources that can be used to sample NPs and describe the strengths and weaknesses of each. Our search strategy included three approaches. First, we reviewed documentation (i.e., articles and reports) for NP sampling frames known to the study team and compiled a list of sampling frames and their strengths and weaknesses. We solicited input from four experts in health workforce and survey research, who reviewed our list of sampling frames, suggested additional sampling frames, and provided input on the strengths and weaknesses of each. Second, with expert input, we identified nine key articles and reports (listed in Table 1) that describe NP sampling frames and/or physician sampling frames that are relevant to NPs (e.g., National Plan and Provider Enumeration System [NPPES], Masterfile, SK&A). We reference mined the key articles and reports to identify additional documents describing sampling frames used in U.S.-based surveys and studies of the NP workforce. We used forward and reverse citation mining (i.e., reviews of article citations and citations of the articles), as well as the find similar articles tool from PubMed. For feasibility and relevance, we focused primarily on surveys and studies conducted within the past 10 years (2010–2020). Third, we conducted internet searches on Google and targeted searches of organization websites to identify relevant details about each sampling frame, including the variables available to researchers and the process for researchers to obtain the data. If information was not available online, we contacted organizations and sampling vendors by phone or email.
Table 1. Key Articles and Reports.
We identified eight sampling frames suitable for identification of a national sample of primary care NPs. For each sampling frame, we summarized the following information: description, cost, strengths, and weaknesses. Based on our existing knowledge and review of key articles and reports, we also identified a list of data elements needed to develop an NP sampling frame.
Results
Data Elements for a Nurse Practitioner Sampling Frame
A sampling frame is defined as the list of members of the population from which the sample is selected (DiGaetano, 2013). Surveys of the NP workforce have used a variety of sampling frames, including state licensure data, national certification data, and National Provider Identifiers (NPIs; Barnes & Novosel, 2018). Ideally, a sampling frame should cover the entire target population without duplication, and each provider in the sampling frame should be eligible for inclusion in the study (DiGaetano, 2013). The data elements required to conduct a survey of primary care NPs include contact information (telephone number, mailing address, and/or email), practice location (address and ZIP code), and specialty information (provider-level or practice-level) to identify NPs practicing in primary care settings. Existing data sources vary in the extent to which information about NPs is available and up to date. Factors to consider when choosing a sampling frame for health care providers include cost, contact information, specialty information, and availability of sampling variables (DiGaetano, 2013).
Cost
Some data sources such as the NPPES are publicly available, but others can be costly to obtain. Provider data collected by sampling vendors such as Masterfile and IQVIA OneKey typically vary in cost depending on the number of records and variables requested.
Contact Information
Completeness and accuracy of contact information is a key consideration for survey research. The contact information required to conduct a survey includes provider telephone and mailing address (home or work) or email address. Contact information for practices and providers becomes out of date quickly due to provider turnover. Missing or out-of-date contact information can create additional costs to track providers and can potentially lead to nonresponse bias if a provider’s eligibility status cannot be determined (DiGaetano, 2013).
The available contact information for NPs will determine which survey mode(s) are feasible (e.g., mail, phone, online). Response rates vary by survey mode. Mailed surveys are cost efficient and typically have the highest response rate, while online-only surveys tend to have lower response rates (Dillman et al., 2009). Mixed-mode surveys may achieve a higher response rate, and a sequential strategy alternating between modes (e.g., mailed survey with phone follow-up) is generally most effective (Dillman et al., 2009). A consideration for mixed-mode surveys is that responses often vary by mode—for example, responses to telephone surveys tend to be more positive than responses to mailed surveys; therefore, gains in response rate should be weighed against potential measurement bias (Elliott et al., 2009).
Specialty Information
Most databases used to sample primary care physicians (e.g., NPPES, the American Medical Association Masterfile, and the SK&A physician file) contain detailed information about physician specialties listed in their licensure data (DesRoches et al., 2015). NPs also have specialization, but specialty information is not listed in their licensure data and often must be inferred based on the specialty of their practice or the specialties of physicians in the practice. For this reason, identification of NPs in primary care is not straightforward. Spetz et al. (2015) used national survey data from the National Sample Survey of Nurse Practitioners (NSSNP) and state survey data from California and North Carolina to compare several approaches to identification of primary care NPs including education, fields of certification, and employment setting. Estimates of the number and proportion of NPs providing primary care varied widely depending on the sampling frame. National estimates of the proportion of NPs in primary care from the NSSNP were as high as 75% of NPs if based on current or past fields of certification, 68% if based on current employment setting, and as low as 48% if based on current field of clinical specialization (e.g., family medicine, internal medicine, geriatrics). Comparatively, state-level estimates based on education (type of NP program completed) were 83.5% in North Carolina and 90.7% in California (Spetz et al., 2015).
Sampling Variables
To ensure that the sample is representative of the population, researchers often use variables available in the sampling frame for sampling and weighting purposes, including provider characteristics such as age, gender, and years of experience and practice characteristics such as ZIP code. These variables can be used for stratification (i.e., dividing the population into homogeneous subgroups before sampling) to ensure proportional allocation within each stratum, or for weighting (i.e., assigning a value to each case) to ensure that the sample is representative of the population. These variables can also be used to evaluate nonresponse bias by comparing characteristics of survey respondents and nonrespondents (DiGaetano, 2013).
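As an illustration of the stratification and weighting described above, the following minimal sketch draws a stratified random sample with proportional allocation and attaches a design weight to each sampled record. The frame layout, the field name `region`, and the function name are hypothetical and chosen only for illustration; a real study would use whichever sampling variables the chosen frame provides.

```python
import random
from collections import defaultdict


def proportional_stratified_sample(frame, stratum_key, n_total, seed=0):
    """Draw a stratified random sample with proportional allocation.

    frame: list of dicts, one per provider in the sampling frame.
    stratum_key: field used to form strata (hypothetical, e.g. "region").
    Each sampled record gets a design weight N_h / n_h, where N_h is the
    stratum size in the frame and n_h is the number sampled from it.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in frame:
        strata[rec[stratum_key]].append(rec)

    n_frame = len(frame)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Stratum share of the frame times the target sample size, with at
        # least one record per stratum. Rounding means the realized total
        # can differ slightly from n_total.
        n_h = min(len(members), max(1, round(n_total * len(members) / n_frame)))
        weight = len(members) / n_h
        for rec in rng.sample(members, n_h):
            sample.append({**rec, "design_weight": weight})
    return sample


# Toy frame: 200 rural and 800 urban providers (made-up data).
frame = ([{"id": i, "region": "rural"} for i in range(200)]
         + [{"id": 200 + i, "region": "urban"} for i in range(800)])
sample = proportional_stratified_sample(frame, "region", n_total=100)
n_rural = sum(1 for rec in sample if rec["region"] == "rural")
total_weight = sum(rec["design_weight"] for rec in sample)
```

In this toy frame, the weights sum to the frame size, which is the property that lets weighted estimates generalize to the population. Nonresponse adjustments would be applied on top of these design weights and are not shown here.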
National Sampling Frames for Primary Care NPs
Here, we provide an overview of each sampling frame identified in our search. Details about each sampling frame, including strengths, weaknesses, and cost information, are summarized in Table 2.
Table 2. Overview of National Sampling Frames for Nurse Practitioners.
Note. NPPES = National Plan and Provider Enumeration System; NPIs = National Provider Identifiers; NP = nurse practitioner; NCSBN = National Council of State Boards of Nursing; RNs = registered nurses; LPN = licensed practical nurse; LVNs = licensed vocational nurses; APRNs = advanced practice registered nurses; ANCC = American Nurses Credentialing Center; AANP = American Association of Nurse Practitioners; SQPP = Survey Question Procurement Program; DEA = Drug Enforcement Administration; NTIS = National Technical Information Service; MMS = Medical Marketing Service.
National Plan and Provider Enumeration System
One approach to identification of NPs is through NPIs. NPIs are 10-digit numeric identifiers assigned by the federal government to all health care providers and health care organizations for use in health care billing. NPIs provide a unique identifier for providers consistent across all health plans, although issuance of an NPI does not indicate that the provider is licensed or credentialed. NPPES maintains current information about health care providers with NPIs, publicly available through the searchable NPI Registry and the NPPES Downloadable File (U.S. Department of Health and Human Services, 2016). Advantages of the NPPES are that the database is searchable, and there is no cost to download the files (Klabunde et al., 2012). The NPPES Downloadable File includes provider name, taxonomy codes for identification of provider type (e.g., nurse practitioner) and specialty, primary business address (including mailing address and practice location address), and phone number. Another file lists secondary addresses and telephone numbers, if applicable.
If provider information (e.g., practice location) changes, providers are required to update their NPPES records within 30 days of the effective change (U.S. Department of Health and Human Services, 2016). However, there is no formal effort to keep provider contact information in NPPES up to date, and there is no link provided in the NPPES system between the individual providers and the organizations for which they work (DiGaetano, 2013). In some cases, it is possible that provider addresses listed may be for group billing NPI rather than the provider’s practice location, or providers may list their home address rather than their practice address. Previous work has found that physician contact information in NPPES is reasonably accurate and up to date for physicians billing public and private insurers (DesRoches et al., 2015), but this data source has not been validated for NPs.
In addition to the taxonomy code for nurse practitioner, NPs can select additional taxonomy codes to identify their specialty. Taxonomy codes to describe NP specialty include acute care, adult, family, gerontology, pediatrics, primary care, psychiatric/mental health, and so forth (NPIdb, 2020). These taxonomy codes have been used to identify primary care NPs for research purposes (Muench et al., 2019), although they have not been used to survey NPs. Taxonomies are self-selected by providers and are not verified by NPPES.
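The taxonomy-based approach can be sketched as follows. This is a minimal illustration, not a validated method. The column names (`NPI`, `Healthcare Provider Taxonomy Code_1` through `_15`) and the taxonomy codes are assumptions based on our understanding of the NPPES Downloadable File and the NUCC code set, and should be checked against current documentation. Which codes count as primary care is a research decision, and the set below is only one possible choice.

```python
import csv
import io

# Assumed NUCC taxonomy codes for NP specialties; verify before use.
PRIMARY_CARE_NP_CODES = {
    "363LF0000X",  # family
    "363LA2200X",  # adult health
    "363LG0600X",  # gerontology
    "363LP0200X",  # pediatrics
    "363LP2300X",  # primary care
}
NP_PREFIX = "363L"  # assumed prefix shared by nurse practitioner codes


def taxonomy_codes(row, max_slots=15):
    """Collect the non-empty taxonomy codes listed on one provider record."""
    prefix = "Healthcare Provider Taxonomy Code_"  # assumed column naming
    codes = {(row.get(f"{prefix}{i}") or "").strip()
             for i in range(1, max_slots + 1)}
    return codes - {""}


def classify(row):
    """Label a record as a primary care NP, another NP, or not an NP."""
    codes = taxonomy_codes(row)
    if codes & PRIMARY_CARE_NP_CODES:
        return "primary_care_np"
    if any(code.startswith(NP_PREFIX) for code in codes):
        return "other_np"
    return "not_np"


# Made-up records standing in for rows of the downloadable file.
demo_csv = (
    "NPI,Healthcare Provider Taxonomy Code_1,Healthcare Provider Taxonomy Code_2\n"
    "0000000001,363L00000X,363LF0000X\n"
    "0000000002,363LA2100X,\n"
    "0000000003,207Q00000X,\n"
)
rows = list(csv.DictReader(io.StringIO(demo_csv)))
labels = {row["NPI"]: classify(row) for row in rows}
```

Because taxonomies are self-selected and unverified, labels produced this way reflect what providers reported rather than where they currently practice.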
A related data source based on provider NPIs is the Medicare Data on Provider Practice and Specialty (MD-PPAS). This provider-level data set is built around two identifiers: NPI and tax identification number. The MD-PPAS assigns Medicare providers to medical practices based on tax identification numbers identified through Medicare fee-for-service claims (Centers for Medicare & Medicaid Services, 2019). However, the data do not include provider contact information, and the only location information provided is state and core-based statistical area.
State Licensure Data and the National Council of State Boards of Nursing Nursys Database
Another approach to identification of NPs is through state licensure data. All 50 states and the District of Columbia maintain lists of actively licensed NPs. NP licensure data are not reported at the national level; however, Nursys, the license verification system of the National Council of State Boards of Nursing (NCSBN), collects licensure data for advanced practice registered nurses (APRNs) in 27 states, updated on a frequent basis (NCSBN, 2020). Researchers seeking to use NCSBN data must submit a written request describing the research question, methodology, requested data elements, and intended use of the findings, as well as documentation of Institutional Review Board approval. Available data elements are confidential and must be submitted as part of the data request. As state licensure data do not include specialty information, an additional data source is required to identify primary care NPs.
Multiple large-scale surveys have used state licensure data to sample NPs, including the NSSNP. The 2012 NSSNP was conducted by the HRSA to provide national estimates of the NP workforce supply and collect detailed information on their licensure, education, practice characteristics, and demographics. A national sampling frame was created based on lists of actively licensed NPs obtained from state licensing boards for all 50 states and the District of Columbia. A single national sampling frame was developed from the 51 individual lists, using probability matching to account for NPs with licenses in multiple states. A random sample of 22,000 NPs was drawn from the unduplicated list, allocated by state in proportion to the number of licensed NPs in each state (HRSA, 2014). As described earlier, these survey data have been used to estimate the number and proportion of NPs delivering primary care (Spetz et al., 2015).
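To make the unduplication step concrete, the sketch below merges several state license lists into one frame in which each NP appears once. It uses exact matching on a normalized name and birth year as a simple stand-in for the probability matching used for the NSSNP, which it does not reproduce. The field names (`last_name`, `first_name`, `birth_year`) are hypothetical; actual licensure files differ by state.

```python
import re


def normalize(text):
    """Lowercase and drop everything except letters (punctuation, spaces)."""
    return re.sub(r"[^a-z]", "", text.lower())


def match_key(record):
    """Deterministic key; a simplification of probabilistic record linkage."""
    return (normalize(record["last_name"]),
            normalize(record["first_name"]),
            record["birth_year"])


def unduplicate(state_lists):
    """Merge per-state lists so that each NP appears once.

    state_lists: dict mapping a state code to a list of license records.
    Each merged record keeps the list of states where a license was found.
    """
    merged = {}
    for state in sorted(state_lists):
        for rec in state_lists[state]:
            entry = merged.setdefault(match_key(rec), {**rec, "states": []})
            entry["states"].append(state)
    return list(merged.values())


# Made-up records: the same NP is licensed in two states.
state_lists = {
    "CA": [
        {"last_name": "O'Brien", "first_name": "Dana", "birth_year": 1980},
        {"last_name": "Lee", "first_name": "Sam", "birth_year": 1975},
    ],
    "NV": [
        {"last_name": "OBrien", "first_name": "DANA", "birth_year": 1980},
    ],
}
national_frame = unduplicate(state_lists)
multi_state = [rec for rec in national_frame if len(rec["states"]) > 1]
```

Exact keys miss matches when names are misspelled or changed and can merge distinct people who share a name and birth year, which is why probabilistic methods are generally preferred for this task.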
The National Sample Survey of Registered Nurses (NSSRN) is the longest running survey of RNs in the United States, fielded by HRSA every 4 years between 1980 and 2008. The NSSRN is widely regarded as the gold standard for descriptive data on the nursing workforce in the United States, with detailed information about RNs’ demographic characteristics, education, and employment (Auerbach et al., 2012). After a 10-year lapse following the 2008 survey, the survey was fielded again in 2018. The 2018 NSSRN included questions derived from the NSSNP and was the first version of the survey that provided data for both RNs and NPs at the state and national levels. The sampling frame was compiled from files provided by NCSBN and individual state boards of nursing. Weighted estimates from the NSSRN data generalize to state and national RN and NP populations to give a broader perspective on the national nursing workforce (HRSA, 2019b). Variables reported for NPs include demographics, areas of certification, educational background, years since completing NP degree program, employment status, employment setting (e.g., hospital, other inpatient setting, clinic or ambulatory care), and details regarding NP practice (e.g., presence of physicians where they work, whether they are the primary provider for a patient panel, perceptions of and satisfaction with their work, insurance and billing practices). Survey data files for the NSSRN and NSSNP are publicly available on the HRSA website (HRSA, n.d.).
Credentialing Organizations
American Nurses Credentialing Center Nurse Practitioner Certification Data
The credentialing programs of the American Nurses Credentialing Center (ANCC) certify and recognize individual nurses and APRNs in specialty practice areas. NPs can obtain national certifications in primary care and acute care in various patient populations (e.g., adult, adult-gerontology, family, pediatric) and in psychiatric-mental health (across the life span). Primary care NPs can be identified based on their certification(s), with the caveat that many NPs practice in specialties that do not match their certification (Spetz et al., 2015). The number of NPs certified to provide primary care is much larger than the number currently practicing in primary care settings (Spetz et al., 2015). ANCC maintains a comprehensive database of nationally certified NPs that has been used to study workforce trends and inform the Consensus Model, which provides guidance for states to adopt uniformity in the regulation of APRN roles (APRN Joint Dialogue Group Report, 2008). The main drawback to this data source is that NP certification data from ANCC are not typically available to researchers. An additional caveat is that NPs can obtain certification through other organizations such as the American Association of Nurse Practitioners (AANP) and therefore may not be represented in the ANCC database.
AANP National Nurse Practitioner Database
AANP maintains the National Nurse Practitioner Database, a comprehensive list of nationally certified NPs composed of information obtained annually from state boards of nursing. This database is used to identify a random sample of NPs for the annual AANP National NP Sample Survey, which captures information regarding NP practice patterns, education, specialization, compensation, and benefits (AANP, 2020). The data are typically not available to researchers. However, through AANP’s Survey Question Procurement Program, researchers have the opportunity to add up to five questions to the annual survey and receive the raw data and accompanying demographic data already included on the survey (e.g., certifications, gender, state, work setting, and self-reported clinical focus). A research proposal must be submitted to AANP, and the cost depends on the number of questions added to the survey.
In addition, professional organizations such as AANP and the National Association of Pediatric Nurse Practitioners maintain membership databases that include members’ contact information. Access to these data is typically restricted to members of the organization. In some cases, members may be permitted to access these data for research purposes.
Drug Enforcement Administration Registrant File
The Drug Enforcement Administration (DEA) Registrant File, available for purchase from the National Technical Information Service, includes all clinicians registered with the DEA. Federal law requires all health care providers to obtain a DEA number to write prescriptions for medications classified as controlled substances. A provider’s DEA registration number acts as an individual identifier to track prescriptions for controlled substances. NPs now have prescriptive authority for controlled substances in all 50 states, although scope of practice restrictions in many states limit independent prescribing (Phillips, 2020). The majority of NPs have DEA registration; in 2020, there were 290,000 licensed NPs in the United States (AANP, 2020), and approximately 272,000 NPs had DEA registration (DEA Registrant File, 2020). The DEA file includes provider name, provider type (e.g., NP, physician, physician assistant), DEA registration number, drug schedules handled by the provider, and business address. As the DEA file does not include specialty information, another data source is required to identify NP specialties. The DEA–NPI crosswalk (National Bureau of Economic Research, 2018) may be used to link the DEA file to other data sources. The cost to download the DEA file varies based on the number of data users and frequency of data updates (files are updated daily, weekly, monthly, or quarterly).
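The linkage described above can be sketched as a pair of lookups: DEA number to NPI through a crosswalk, then NPI to specialty information from another source. All field names, identifiers, and the function name below are hypothetical placeholders. The sketch also computes the approximate share of licensed NPs with DEA registration implied by the two 2020 counts cited in the text, which come from different sources and so give only a rough figure.

```python
def link_dea_to_specialty(dea_records, crosswalk, specialty_by_npi):
    """Attach an NPI and a specialty label to each DEA registrant record.

    dea_records: list of dicts with a "dea_number" field (hypothetical).
    crosswalk: dict mapping DEA number to NPI.
    specialty_by_npi: dict mapping NPI to a specialty label drawn from a
        second data source (for example, taxonomy-based labels).
    Records that cannot be linked keep None for the missing fields.
    """
    linked = []
    for rec in dea_records:
        npi = crosswalk.get(rec["dea_number"])
        linked.append({**rec,
                       "npi": npi,
                       "specialty": specialty_by_npi.get(npi)})
    return linked


# Made-up identifiers for illustration only.
dea_records = [
    {"dea_number": "XX0000001", "provider_type": "NP"},
    {"dea_number": "XX0000002", "provider_type": "NP"},
]
crosswalk = {"XX0000001": "0000000001"}
specialty_by_npi = {"0000000001": "primary_care_np"}

linked = link_dea_to_specialty(dea_records, crosswalk, specialty_by_npi)
unlinked = [rec for rec in linked if rec["npi"] is None]

# Approximate coverage implied by the counts cited in the text.
dea_coverage_pct = round(272_000 / 290_000 * 100, 1)
```

Tracking the unlinked records matters in practice, because registrants missing from the crosswalk drop out of any specialty-based sample and may differ systematically from those who link.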
The DEA also maintains a file of clinicians waivered to prescribe buprenorphine, although this includes only a small proportion of NPs. The data have been used to survey NPs and physician assistants regarding the barriers they experience to prescribing buprenorphine for treatment of opioid use disorder (Andrilla et al., 2020).
Physician Compare
Physician Compare is an online resource for health care consumers to find and review information about clinicians and groups enrolled in Medicare (Centers for Medicare & Medicaid Services, 2020). Data come from the Provider Enrollment and Chain/Ownership System, the electronic portal through which providers enroll in Medicare, and are verified with Medicare claims. The Physician Compare National Downloadable File lists providers that participate in the Medicare program, including NPs and physician assistants. The file is updated twice per month with the most current demographic information available. The data are free to download and include provider name, NPI, graduation year (medical school or other), practice name, group practice ID, practice street address, and practice phone number. No additional contact information is listed. The data can be linked to Medicare claims using NPIs. Physician Compare does not include specialty information for NPs; however, one study identified a sample of primary care NPs on Physician Compare based on string searches of organization names for “internal,” “family,” or “primary care” (Ellenbogen & Segal, 2020).
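A string search of this kind might look like the sketch below. The three search terms come from the study cited above; the case-insensitive substring matching, the function name, and the example organization names are our assumptions, and the exact matching rules used in that study may have differed.

```python
import re

# Search terms reported in the text; the matching rules are an assumption.
PRIMARY_CARE_PATTERN = re.compile(r"internal|family|primary care", re.IGNORECASE)


def looks_like_primary_care(org_name):
    """Return True if an organization name contains any of the search terms."""
    if not org_name:
        return False
    return PRIMARY_CARE_PATTERN.search(org_name) is not None


# Made-up organization names for illustration.
org_names = [
    "Lakeside Family Medicine",
    "RIVERSIDE INTERNAL MEDICINE ASSOCIATES",
    "Community Primary Care Clinic",
    "Summit Orthopedics",
    "Bright Smiles Family Dentistry",
    "",
]
flags = {name: looks_like_primary_care(name) for name in org_names}
```

The dentistry example is flagged even though it is not a primary care practice, which illustrates why name-based heuristics produce false positives and usually need manual review or a second data source for confirmation.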
Nurse Practitioner Masterfile
The Nurse Practitioner Masterfile is a comprehensive list of practicing state-licensed NPs maintained by the Medical Marketing Service (MMS). The NP Masterfile is updated monthly and includes practice locations, provider contact information (e.g., mailing address, phone number, email), and specialty information (often self-designated practice specialties). These data can be used to identify a targeted sample of NPs in a particular specialty such as primary care. MMS maintains similar files for other health care professions including the American Medical Association Masterfile, a comprehensive listing of all licensed physicians in the United States. Researchers can obtain a list of provider mailing addresses and phone numbers and can also request email broadcasts to a list of providers (MMS does not release provider email addresses and performs the broadcast on researchers’ behalf). The cost to obtain the data depends on the number of records and variables requested. For email broadcasts, the cost depends on the size of the target audience and number of sends.
Masterfile data have previously been used to study the NP workforce as part of the National Survey of Primary Care Nurse Practitioners and Physicians (Buerhaus et al., 2015; Donelan et al., 2013). This mailed survey was conducted between 2011 and 2012 to examine roles, scope of work, and practice characteristics among primary care NPs and physicians, as well as their perspectives on primary care practice (Buerhaus et al., 2015; Donelan et al., 2013). The NP Masterfile and the American Medical Association Masterfile were used to select a random sample of NPs and physicians, respectively, in primary care specialties (Donelan et al., 2013).
IQVIA OneKey (SK&A)
OneKey is a health care industry database maintained by IQVIA that collects data on health care providers and practices across the United States (IQVIA Inc., 2020). The OneKey database integrates data from IMS Health, Healthcare Data Solutions, and SK&A, a commercial market research firm that collects national provider and practice data. Provider data from various sources, including the DEA, NPPES, and state licensing boards, are regularly updated and audited through both automated and manual processes. OneKey data include provider name, practice name and location, contact information, network affiliation, state license number, and NPI. OneKey has several advantages over other data sources, including information about practices such as patient volume, number and type of providers, site specialty, and ownership that are not available elsewhere. This information can be used to identify NPs working in primary care practices. A mailing list for NPs and physician assistants is also available. The cost to obtain the data varies depending on the number of records and variables requested.
This data source was used in the 2018 Survey of Primary Care and Geriatrics Clinicians, a national cross-sectional survey designed to measure how practices with primary care and geriatrics physicians and NPs organize and deliver care to older adults (Donelan et al., 2019). The researchers selected a nationally representative random sample of practices that employed primary care or geriatrics physicians and NPs. Practices were sampled in six strata by presence of physician and NP (i.e., physician only, physician and NP, or NP only) and specialty (i.e., primary care or geriatrics; Donelan et al., 2019).
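The six strata arise from crossing three provider mixes with two specialties. The sketch below assigns practice records to those strata. The field names (`n_physicians`, `n_nps`, `site_specialty`) are hypothetical stand-ins for practice-level variables of the kind the text says are available in OneKey, and the assignment rules are our reading of the design rather than the survey's actual procedure.

```python
from itertools import product

PROVIDER_MIX = ("physician_only", "physician_and_np", "np_only")
SPECIALTIES = ("primary_care", "geriatrics")
STRATA = list(product(PROVIDER_MIX, SPECIALTIES))  # 3 x 2 = 6 strata


def assign_stratum(practice):
    """Return the (provider mix, specialty) stratum, or None if out of scope."""
    has_physician = practice["n_physicians"] > 0
    has_np = practice["n_nps"] > 0
    if has_physician and has_np:
        mix = "physician_and_np"
    elif has_physician:
        mix = "physician_only"
    elif has_np:
        mix = "np_only"
    else:
        return None  # no physicians or NPs on record
    if practice["site_specialty"] not in SPECIALTIES:
        return None  # specialty outside the survey's scope
    return (mix, practice["site_specialty"])


# Made-up practice records for illustration.
practices = [
    {"n_physicians": 3, "n_nps": 2, "site_specialty": "primary_care"},
    {"n_physicians": 0, "n_nps": 1, "site_specialty": "geriatrics"},
    {"n_physicians": 4, "n_nps": 0, "site_specialty": "cardiology"},
]
assigned = [assign_stratum(p) for p in practices]
```

Once each practice carries a stratum label, sampling within strata can proceed independently, which allows small groups such as NP-only practices to be sampled at a higher rate than they appear in the frame.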
Discussion
Each of the sampling frames we identified has trade-offs, including the availability and accuracy of key variables, coverage of the population of primary care NPs, and cost to obtain the data. Researchers should select a sampling frame based on the population of interest, research question, and study design, keeping in mind the availability of key data elements necessary to conduct the study (e.g., specialty information, contact information, sampling variables). For example, the DEA Registrant File may be useful to study opioid prescribing practices among NPs, while Physician Compare may be useful to study care delivered to Medicare beneficiaries by NPs. For survey studies, the quality of contact information is a key consideration. Sampling vendors such as Masterfile and IQVIA OneKey are useful for surveys because they regularly collect updated provider contact information and specialty information, allowing researchers to target NPs practicing in particular specialties such as primary care.
One of the central challenges to surveying primary care NPs is the lack of specialty information needed to identify them accurately. There is currently no national file with comprehensive NP specialty information. Barriers to collecting these data include lack of funding and lack of coordination across federal, state, and local organizations to standardize collection of nursing workforce data (Barnes & Novosel, 2018; Spetz et al., 2016; University of North Carolina [UNC], n.d.). The Future of Nursing report from the National Academy of Medicine (formerly the Institute of Medicine, 2011) recommended that the National Health Workforce Commission and HRSA collaborate with state licensing boards to standardize collection of minimum health care workforce data sets at the state level, to be collected during license renewal. State-level nursing workforce data sets could be used to monitor supply, demand, and education pipelines and could be aggregated to create a national nursing workforce data set (UNC, n.d.). However, it is unclear how NP specialty information should be collected and reported. NP accreditation is not regulated by state boards of nursing, and NPs’ education and certification do not always align with their practice setting (Martsolf et al., 2020). To address this issue, specialty information could potentially be self-reported by NPs or determined by an algorithm that includes a combination of self-reported specialty, certifications, and employment setting (UNC, n.d.). More consistent and comprehensive data collection for NPs and other APRNs, including specialty information, education and training, certification and licensure, and practice characteristics, can inform nursing workforce planning and policies (Barnes & Novosel, 2018; Spetz et al., 2016). In particular, better data collection using unique NP identifiers would enable linkage of NP workforce data to other data sources to evaluate the impact of NPs on patient outcomes (Barnes & Novosel, 2018; Campaign for Action, 2016).
This article is one of the first to provide a detailed overview of national NP sampling frames for researchers interested in surveying or studying the primary care NP workforce. We did not perform a systematic review of published studies of the NP workforce, and there may be additional data sources that we have not identified. The scope of this article is limited to the United States; however, previous studies have described the evolving roles of advanced practice nurses in primary care internationally (Maier & Aiken, 2015; Maier et al., 2016).
The primary care NP workforce plays a key role in addressing national health care priorities, including care of an aging population, expanding access to care, and promoting health education and disease prevention (Buerhaus et al., 2019). An important research priority for the next decade is to examine the role of NPs in high-performing models of primary care delivery (Buerhaus et al., 2019). Despite the current challenges to surveying and studying primary care NPs, multiple existing data sources can be leveraged to better understand their evolving roles and practice patterns and identify strategies to support NPs in delivering high-quality care.
Footnotes
Acknowledgments
The authors would like to acknowledge the following researchers for providing their expertise on nurse practitioner surveys and sampling frames: Holly Andrilla, Karen Donelan, Erin Fraher, Ryan Kandrack, and Joanne Spetz.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by National Institute on Minority Health and Health Disparities (NIMHD R01MD011514).
