The likelihood of requiring a diagnostic test: Classifying emergency department patients with logistic regression

Abstract

Background:

Emergency departments (EDs) play an important role in health systems since they are the front line for patients with emergency medical conditions who frequently require diagnostic tests and timely treatment.

Objective:

To improve decision-making and accelerate processes in EDs, this study proposes predictive models for classifying patients according to whether or not they are likely to require a diagnostic test based on referral diagnosis, age, gender, triage category and type of arrival.

Method:

Retrospective data were categorised into four output patient groups: not requiring any diagnostic test (group A); requiring a radiology test (group B); requiring a laboratory test (group C); requiring both tests (group D). Multivariable logistic regression models were used, with the outcome classifications represented as a series of binary variables: test (1) or no test (0); in the case of group A, no test (1) or test (0).

Results:

For all models, age, triage category, type of arrival and referral diagnosis were significant predictors whereas gender was not. The referral diagnoses with the highest model coefficients varied across the four output groups (A, B, C and D). The overall accuracies of the logistic regression models for groups A, B, C and D were, respectively, 74.11%, 73.07%, 82.47% and 85.79%. Specificity metrics were higher than the sensitivities for groups B, C and D, meaning that these models were better able to predict negative outcomes.

Implications:

These results provide guidance for ED triage staff, researchers and practitioners in making rapid decisions regarding patients’ diagnostic test requirements based on specified variables in the predictive models. This is critical in ED operations planning as it potentially decreases waiting times, while increasing patient satisfaction and operational performance.

Keywords

data mining, data analysis, algorithms, emergency department, diagnostic test, referral diagnosis, health information management, classification techniques, logistic regression, electronic medical records

Introduction

An emergency department (ED) is a medical treatment unit responsible for providing medical and surgical care for patients presenting without prior appointment, either by their own means or via ambulance (Etu, 2018). Due to the unplanned nature of patient arrivals, EDs face overcrowding. This has become one of the biggest barriers preventing ED managers and practitioners from providing high-quality and timely medical care (Kobayashi et al., 2019; Linder and Woitok, 2020; Sarıyer et al., 2018). Because EDs provide a 24/7 service as the front line for patients presenting with a wide range of complaints, ED personnel (physicians, technicians, nursing, administration and security personnel) must be well prepared. In particular, to diagnose sometimes very complex ED cases, detailed investigations and diagnostic tests are required. However, ordering any type of diagnostic test increases the length of patient stay in ED, potentially causing bottlenecks in the already overcrowded environment. Therefore, it is crucial from a process planning perspective to identify which patients may require any type of diagnostic test at the point of triage. Time lost at this bottleneck can be minimised if patients can be classified on arrival as requiring diagnostic tests or not by staff responsible for triage, and if diagnostic test-related preparations can be completed before examination by an ED physician. This also accelerates and improves the accuracy of the ED physicians’ decision-making.

To efficiently classify arriving ED patients, it may be advantageous to generate information based on past raw data stored in hospital databases as electronic medical records. Here, the science of data mining is key. Data mining is a “process to locate non-obvious, unknown, and potential possible usable information from data” (Frawley et al., 1992) while Reinschmidt et al. (1999) described its goal as being “to extract effective, useful and unknown comprehensible information to serve as a foundation of decision-making for enterprises.” Thus, appropriate use of information generated by data mining can provide organisations with a sound basis for decision-making (Lin et al., 2010), with health systems being no exception.

The use of data mining techniques in health systems has received significant attention recently. Rather than the commonly used techniques of clustering (Lin et al., 2011; Resta et al., 2018) and association rule mining (Huang, 2013; Lee et al., 2013; Nahar et al., 2013), most research utilises classification techniques. There have been various approaches. By comparing the accuracies of different classification techniques, Hu et al. (2017) classified patients based on their return probabilities. Classification techniques have also been used to predict patient arrivals or admissions (Golmohammadi, 2016; Taşar and Sarıyer, 2018; Xu et al., 2013) and to model the occurrence conditions of different types of diseases in patients (Arslan et al., 2016; Safdari et al., 2018; Yeh et al., 2011). However, the majority of studies have classified patients based on length of stay (Chuang et al., 2018; Gül and Güneri, 2015; Hachesu et al., 2013; Pendharkar and Khurana, 2014; Rowan et al., 2007). This research suggests that data mining can be particularly applied in the medical field to improve decision-making, such as prognosis, diagnosis and treatment planning (Bellazzi and Zupan, 2008). Although the use of diagnostic tests is very important in treatment planning, to the best of our knowledge it has rarely been used as an output variable for data mining models. Instead, it has more frequently been treated as an input variable or predictor (Arslan et al., 2016; Golmohammadi, 2016; Gül and Güneri, 2015; Hachesu et al., 2013; Roy et al., 2019).

The aim of this research was to determine the likelihood that a patient presenting to an ED will need to have tests ordered, and whether triage staff could arrange for these tests to be carried out before the patient is seen by a physician. Results of such tests would then be available earlier to assist physicians to make decisions and potentially reduce waiting times in the ED. In this study, patients were classified by characteristics (gender, age, triage category, type of arrival, referral diagnosis) to determine whether they required a diagnostic test. Additionally, using two diagnostic test groups, radiology and laboratory, classification models were developed for four output groups: patients requiring no diagnostic test, any type of radiology test, any type of laboratory test and at least one type of radiology and laboratory tests. By addressing this gap in the literature, this study can contribute to ED-related research by defining diagnostic test requirements as an output, and generating models to serve as guides for ED triage staff, practitioners and researchers in making decisions on diagnostic test requests.

Method

Study design

This was a retrospective study to model the diagnostic test requirements of patients based on various inputs at a single ED. The local institutional review board approved the study.

Study setting and participants

The data were obtained from a large-scale urban training hospital in İzmir, Turkey. Since this hospital is located in a metropolitan district and is easily accessible due to its proximity to a metro station, its ED is extremely busy, with an average daily census of 900 patients. All patients registered to this ED during the study period of March–May 2017 were included in the study. Data for these patients were extracted from the hospital’s electronic data warehouse.

Data sources

The raw data were obtained from three databases. Patients arriving at this ED are registered by triage staff, and the required demographic data are entered into the ED database as “Data of Arriving Patient.” The “laboratory database” includes data on patients receiving any type of laboratory investigation while the “radiology database” stores data on patients receiving any type of radiology investigation.

Variables

The input variables for this study were defined using the first database. This includes data on protocol ID (unique to patient), gender, age, triage category, type of arrival, referral diagnosis (assigned based on patient’s complaints and vital signs in accordance with International Classification of Diseases, 10th Revision [ICD-10] codes) (World Health Organization, 2018), date of arrival, timestamps of arrival and departure, and final diagnosis. All these were used as input variables except for the timestamps since a time-based analysis was not within the scope of this study. Because the study aimed to improve decision-making and accelerate ED operations before patients are assigned a final diagnosis, the data on final diagnosis were also not considered. The output variables, obtained from the laboratory and radiology databases, were defined as nominal variables representing whether the test was needed. These two databases store the patient’s protocol ID, type of the examination and important timestamps. Many different types of tests were listed for both databases. Most of the laboratory tests were in the haemogram, biochemistry, enzyme, hormone or blood type categories while radiology tests included X-ray, tomography, ultrasound and magnetic resonance imaging. Based on whether they needed tests from each group (laboratory and/or radiology), patients were classified according to the following four questions:

Group A: Did the patient require neither laboratory nor radiology test?

Group B: Did the patient require any type of radiology test?

Group C: Did the patient require any type of laboratory test?

Group D: Did the patient require at least one type of radiology and one type of laboratory test?
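The four questions above amount to four binary flags per patient. The following is a minimal sketch of that labelling, assuming the laboratory and radiology databases have already been reduced to sets of protocol IDs. The function name `output_flags` and the example IDs are hypothetical and are not taken from the study.

```python
def output_flags(protocol_id, lab_ids, rad_ids):
    """Return the yes/no answers for groups A-D as booleans."""
    had_lab = protocol_id in lab_ids
    had_rad = protocol_id in rad_ids
    return {
        "A": not had_lab and not had_rad,  # neither laboratory nor radiology test
        "B": had_rad,                      # any type of radiology test
        "C": had_lab,                      # any type of laboratory test
        "D": had_lab and had_rad,          # at least one test of each type
    }


# Hypothetical protocol IDs, for illustration only.
lab_ids = {"P002", "P003"}
rad_ids = {"P003", "P004"}

for pid in ["P001", "P002", "P003", "P004"]:
    print(pid, output_flags(pid, lab_ids, rad_ids))
```

Note that the flags are not mutually exclusive: a patient who answers "yes" for group D also answers "yes" for groups B and C, which is why each group is modelled as its own binary outcome.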

Data preprocessing

Any patient data with missing entries in input or output variables were excluded from the analysis. The percentage of missing entries was below 1%. All databases may repeat a protocol ID because a patient can have more than one referral diagnosis and may need different types of test (e.g. a patient requiring a haemogram test may also have a biochemistry test). Given the research design (i.e. the four output groups), these repeated entries for radiology and laboratory tests were not significant because, if a patient’s protocol ID was entered in either the radiology or laboratory database, the patient was classified as receiving a diagnostic test. Thus, these repeated entries were removed from the radiology and laboratory databases. However, each entry was significant for the arriving patients’ database because differences in referral diagnoses could affect whether a test is needed or not. Thus, these entries were used repeatedly in the classification analysis with a different referral diagnosis entered while all other inputs remained the same.

In the first database, gender had two categories, and it was kept in this form in the analysis; however, the other input variables were transformed. In the raw data, age was continuous, but was converted into an ordinal scale based on the World Health Organization description. Triage level had seven different categories, which were combined into two levels based on the Emergency Severity Index description. Similarly, over 10 different categories of type of arrival were combined into two levels. Since referral diagnoses were entered based on ICD-10 groups, their form was “LXX.XX,” where L denotes different letters of the English alphabet and X denotes numbers from 0 to 9. According to ICD-10, these diagnoses can be combined into 21 levels. The output variable had two levels. The input and output variables and their levels are summarised in Table 1.

Table 1.

Description of input and output variables.

Type	Variable	Levels
Input	Gender	Male, female
	Age	(Age 0–14), (age 15–64), $age \geq 65$
	Triage category	Urgent or emergent, not urgent
	Type of arrival	Walk in, by ambulance
	Referral diagnosis	A00-B99, C00-D49, D50-D89, E00-E89, F01-F99, G00-G99, H00-H59, H60-H95, I00-I99, J00-J99, K00-K95, L00-L99, M00-M99, N00-N99, O00-O9A, P00-P96, Q00-Q99, R00-R99, S00-T88, V00-Y99, Z00-Z99
Output	Needed or not for group A	Yes: if neither laboratory nor radiology test is needed; No: if at least one of the tests is needed
	Needed or not for group B	Yes: if any type of radiology test is needed; No: if no radiology test is needed
	Needed or not for group C	Yes: if any type of laboratory test is needed; No: if no laboratory test is needed
	Needed or not for group D	Yes: if at least one type of radiology and laboratory test is needed; No: if neither of the tests or just one is needed
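The recoding of age and referral diagnosis into the levels of Table 1 can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual procedure: age is taken to be in whole years, referral codes are taken to follow the “LXX.XX” form, and the helper names `age_band` and `referral_block` are hypothetical.

```python
# The 21 referral diagnosis levels listed in Table 1.
ICD10_BLOCKS = [
    "A00-B99", "C00-D49", "D50-D89", "E00-E89", "F01-F99", "G00-G99",
    "H00-H59", "H60-H95", "I00-I99", "J00-J99", "K00-K95", "L00-L99",
    "M00-M99", "N00-N99", "O00-O9A", "P00-P96", "Q00-Q99", "R00-R99",
    "S00-T88", "V00-Y99", "Z00-Z99",
]


def referral_block(code):
    """Map a code such as 'J06.9' to its Table 1 level, or None if no level matches."""
    key = code.strip().upper()[:3]  # letter plus two characters, e.g. 'J06'
    for block in ICD10_BLOCKS:
        low, high = block.split("-")
        # Fixed-width three-character keys compare correctly as plain strings.
        if low <= key <= high:
            return block
    return None


def age_band(age):
    """Map age in years to the three ordinal levels of Table 1."""
    if age < 15:
        return "0-14"
    if age < 65:
        return "15-64"
    return ">=65"


print(referral_block("J06.9"), age_band(42))
```

Codes that fall outside every listed range return `None`, so they can be reviewed rather than silently assigned to a level.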

Statistical analysis

For each output group, multivariable binary logistic regression models were built to classify patients into one of two mutually exclusive and exhaustive outcomes (yes/no). This method models the posterior probability of a sample being classified in the positive class as a logistic function of the linear combination of input variables. This enables logistic regression to show which of the various factors being assessed has the strongest relationship with an output variable, and provides a measure of the magnitude of this potential influence. One advantage of logistic regression is adjustment for confounding variables (input variables that are related to both the output variable and other input variables). This prevents the measure of the influence of the input variable of interest from being distorted by the effect of the confounder. Given n predictors, the logistic model predicts the natural logarithm of the odds, defined as the logit. Assuming the input or independent variables are denoted $X_{1}, X_{2}, \dots, X_{n}$ and the output variable $Y$, the model has the following mathematical structure:

ln (\frac{π}{1 - π}) = log (odds) = logit = α + β_{1} X_{1} + β_{2} X_{2} + \dots + β_{n} X_{n}

Hence:

π = Pr (Y = yes) = \frac{e^{α + β_{1} X_{1} + β_{2} X_{2} + \dots + β_{n} X_{n}}}{1 + e^{α + β_{1} X_{1} + β_{2} X_{2} + \dots + β_{n} X_{n}}}

where $π$ represents the probability of an event, $α$ is the intercept and $β_{i}, i = 1, \dots, n$ are the slope parameters. The intercept ($α$) and slope ($β_{i}, i = 1, \dots, n$) parameters are estimated using the maximum likelihood method.
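The link between the slope parameters and odds ratios of the form exp(β) follows directly from the logit equation. A short derivation, assuming a binary (0/1) coded input $X_{i}$ with all other inputs held fixed:

```latex
\frac{π}{1 - π} = e^{α + β_{1} X_{1} + \dots + β_{n} X_{n}} = e^{α} \prod_{j=1}^{n} e^{β_{j} X_{j}}

\frac{odds(X_{i} = 1)}{odds(X_{i} = 0)}
  = \frac{e^{α} \, e^{β_{i}} \prod_{j \neq i} e^{β_{j} X_{j}}}{e^{α} \prod_{j \neq i} e^{β_{j} X_{j}}}
  = e^{β_{i}}
```

Under this coding, exp($β_{i}$) is the factor by which the odds of a “yes” outcome are multiplied when $X_{i}$ switches from its reference level to the level in question, and exp($α$) is the odds for a patient at the reference level of every input.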

In the medical literature, logistic regression is frequently used to estimate the probability that a patient will have a specific outcome (presence/absence of disease, presence/absence of drugs or presence/absence of treatment) depending on characteristics thought to be associated with this outcome. In this study, this outcome was defined as receiving any type of laboratory and/or radiology examination.

Outcome measures

In the dataset, actual values of the output variable (either yes or no) were available. Logistic regression models were built to make predictions for all instances. To evaluate the performance of the models, matches between the actual and predicted values of the output variable were used. Four situations occurred based on the following matches/mismatches:

True positive (TP): Actual and prediction values of the output variable are both “yes.”

False positive (FP): While the actual value of the output variable is “no,” the model incorrectly predicts “yes.”

True negative (TN): Actual and prediction values of the output variable are both “no.”

False negative (FN): While the actual value of the output variable is “yes,” the model incorrectly predicts “no.”

Three performance evaluation metrics of sensitivity, specificity and accuracy were defined based on these four statistics as follows:

sensitivity = \frac{TP}{TP + FN}

specificity = \frac{TN}{TN+FP}

accuracy = \frac{TP+TN}{TP+TN+FP+FN}

These three metrics were then used to evaluate the performance of the logistic regression models for the four patient groups. All of the steps defined in the Method section are summarised in Figure 1.
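The three metrics can be computed directly from the four counts. The sketch below uses hypothetical counts chosen only for illustration; they are not results from this study, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # share of actual "yes" predicted "yes"
    specificity = tn / (tn + fp)                # share of actual "no" predicted "no"
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that match
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = classification_metrics(tp=60, fp=10, tn=120, fn=30)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

In this made-up example the “no” class is larger and specificity exceeds sensitivity, which is the pattern a model shows when it is better at predicting negative outcomes.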

Figure 1.

Data mining flow of the research. ED: emergency department; LR: logistic regression.

Results

Descriptive results

During the study period, 89,236 patients arrived at this ED, of whom 75,930 (85.09%) had a unique referral diagnosis. The remaining 13,306 patients (14.91%) had at least two different diagnostic codes, for example, J00-J99 and R00-R99 together. Since differences between these codes may affect the decision to order specific test types, each was given a separate entry for the classification models. Thus, for the classification analysis, the dataset included 107,746 instances. The frequency and percentage distributions of the input and output variables of the arriving patients are given in Table 2.

Table 2.

Patients’ frequency and percentage distributions based on input and output variables.

Type	Variables	Levels	Frequencies	Percentages (%)
Input	Gender	Male	43,741	49.02
	Gender	Female	45,495	50.98
	Age	(Age 0–14)	17,979	20.15
		(Age 15–64)	62,455	69.99
		$Age \geq 65$	8802	9.86
	Triage category	Urgent or emergent	46,537	52.15
	Triage category	Not urgent	42,699	47.85
	Type of arrival	Walk in	84,759	94.98
	Type of arrival	By ambulance	4477	5.02
Output	Needs test or not	Group A	53,435	59.88
		Group B	26,110	29.26
		Group C	20,268	22.71
		Group D	10,577	11.85

Table 2 shows that patients were evenly distributed across the levels for two input variables, gender and triage category. However, the frequency distributions differed markedly across levels for both age and type of arrival. Specifically, a majority of patients were between 15 and 64 years old and walked in. Regarding the output groups, most of the arriving patients needed neither a laboratory nor a radiology test (group A). Of the remaining patients, more needed some type of radiology test (group B) than some type of laboratory test (group C). Patients needing at least one radiology and one laboratory test (group D) formed the smallest group. Table 3 presents the characteristics of the study participants in terms of output groups. The associations between the input variables and output groups were analysed using the χ² test. These results are also in Table 3. For each group, the results are interpreted in terms of the output level “yes” since this category mainly represents the output group definition. For all output groups, male–female frequencies were very close, with female frequencies being somewhat higher for groups A, C and D. However, for group B, the reverse was true. The frequency distribution for age differed significantly between groups, although the second age group, (age 15–64), had the highest frequencies in all groups. Elderly patients ($age \geq 65$) appeared most frequently in group D and least in group A whereas the youngest age group, (age 0–14), showed the opposite pattern.

As Table 3 shows, most patients in group A were coded as not urgent in triage, whereas the majority of patients in the other three groups were coded as urgent or emergent, with the largest proportion in group D and the smallest in group B. This indicates that the requirement for some type of diagnostic test significantly increased for urgent or emergent patients. In addition, the ratio of urgent or emergent patients to non-urgent patients was higher for patients requiring laboratory tests than radiology tests. Type of arrival showed a similar pattern. That is, almost all group A patients walked in whereas the proportion arriving by ambulance was markedly greater in the other groups. The ratio of patients arriving by ambulance to those walking in was highest for group D and lowest for group B. This indicates that there were significantly more requests for diagnostic tests for patients arriving by ambulance, who required laboratory tests more frequently than radiology tests. Table 3 additionally shows that some referral diagnosis codes (P00-P96, Q00-Q99) were rare while others (H00-H59, H60-H95, L00-L99) had low frequencies. Almost all these arriving patients were classified as group A, meaning that they required no diagnostic tests. In contrast, for other low-frequency codes (C00-D49, D50-D89, E00-E89, O00-O9A), patients were classified as B, C or D, meaning that they required at least one diagnostic test. More patients were coded as S00-T88, V00-Y99 or Z00-Z99 and most required some type of radiology test. Patients coded as I00-I99, N00-N99 or R00-R99 generally required at least one type of laboratory or radiology test, most often a laboratory test. Most patients coded M00-M99 required some type of radiology test whereas most coded J00-J99 required no tests. No significant group differences were found for A00-B99, F01-F99, G00-G99 and K00-K95, meaning that referral diagnosis alone was not an important predictor of requiring a diagnostic test for these patients.

Finally, Table 3 shows that there were significant associations between output groups and the input variables (gender, age, triage category, type of arrival, referral diagnosis). Thus, all these variables were included as input variables for the logistic regression models of each output group.

Table 3.

Demographic characteristics of patients by diagnostic test requirement groups.

Inputs	Levels	Output levels (group A)			Output levels (group B)			Output levels (group C)			Output levels (group D)
		Yes	No	p Value	Yes	No	p Value	Yes	No	p Value	Yes	No	p Value
		n	n	p Value	n	n	p Value	n	n	p Value	n	n	p Value
Gender	Male	26,629	17,112	<0.001	13,214	30,527	<0.001	9030	34,711	<0.001	5132	38,609	<0.001
Gender	Female	26,806	18,689	<0.001	12,896	32,599	<0.001	11,238	34,257	<0.001	5445	40,050	<0.001
Age	≤14 years	11,708	6271	<0.001	4916	13,063	<0.001	2413	15,566	<0.001	1058	16,921	<0.001
	15–64 years	39,150	23,305		16,611	45,844		12,641	49,814		5947	56,508
	≥65 years	2577	6225		4583	4219		5214	3588		3572	5230
Triage category	Urgent or emergent	19,578	26,959	<0.001	19,646	26,891	<0.001	17,049	29,488	<0.001	9736	36,801	<0.001
Triage category	Not urgent	33,857	8842	<0.001	6464	36,235	<0.001	3219	39,480	<0.001	841	41,858	<0.001
Type of arrival	Walk in	53,062	31,697	<0.001	22,790	61,969	<0.001	16,905	67,854	<0.001	7998	76,761	<0.001
Type of arrival	By ambulance	373	4104	<0.001	3320	1157	<0.001	3363	1114	<0.001	2579	1898	<0.001
Referral diagnosis	A00-B99	2142	836	<0.001	280	2698	<0.001	801	2177	<0.001	245	2733	<0.001
	C00-D49	22	188		144	66		184	26		140	70
	D50-D89	88	291		152	227		290	89		151	228
	E00-E89	94	396		192	298		393	97		189	301
	F01-F99	831	481		234	1078		445	867		198	1114
	G00-G99	1327	536		450	1413		428	1435		342	1521
	H00-H59	1113	24		18	1119		16	1121		10	1127
	H60-H95	1282	105		68	1319		88	1299		51	1336
	I00-I99	1189	2009		1372	1826		1959	1239		1322	1876
	J00-J99	17,638	2986		2177	18,357		2174	18,360		1455	19,079
	K00-K95	3707	1846		1006	4547		1630	3923		790	4763
	L00-L99	1336	98		42	1392		76	1358		20	1414
	M00-M99	9853	12,040		11,243	10,650		1809	20,084		1012	20,881
	N00-N99	2183	2687		934	3936		2551	2319		798	4072
	O00-O9A	237	794		750	281		230	801		186	845
	P00-P96	107	22		5	124		20	109		3	126
	Q00-Q99	10	5		4	11		3	12		2	13
	R00-R99	4223	7090		3944	7369		6314	4999		3168	8145
	S00-T88	2215	1354		1273	2296		279	3290		198	3371
	V00-Y99	1772	1010		877	1905		294	2488		161	2621
	Z00-Z99	2066	1093		945	2214		284	2875		136	3023
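To illustrate how a χ² result in Table 3 can arise, the sketch below computes the Pearson χ² statistic for one 2 × 2 slice of the table, gender by group A. It assumes no continuity correction, since the paper does not state which variant was used, and the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Gender by group A (yes, no), counts taken from Table 3:
# male 26,629 / 17,112 and female 26,806 / 18,689.
stat = chi_square_2x2(26629, 17112, 26806, 18689)

# Critical value of the chi-square distribution with 1 degree of freedom at alpha = 0.001.
CRITICAL_0_001_DF1 = 10.828

print(round(stat, 2), stat > CRITICAL_0_001_DF1)
```

The statistic comes out at about 35.6, well above the critical value, which is consistent with the p value below 0.001 reported for gender and group A. With a sample of this size, even the small male–female difference in proportions reaches statistical significance.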

Logistic regression model results

The predicted probabilities of the logistic regression models were obtained for membership of the “yes” level of each output group. The following categories of the independent variables were taken as reference categories for the models: female for gender, $age \geq 65$ for age, arriving by ambulance for type of arrival, urgent or emergent for triage category and Z00-Z99 for referral diagnosis. Table 4 presents the odds ratios (exp(β)), significance of predictors, 95% confidence intervals for the odds ratios, model constants and Nagelkerke R² values of the four models.

Table 4.

Summary statistics of logistic regression models.

	Group A			Group B			Group C			Group D
	Exp(β)	95% CI for exp(β)		Exp(β)	95% CI for exp(β)		Exp(β)	95% CI for exp(β)		Exp(β)	95% CI for exp(β)
	Exp(β)	Lower	Upper	Exp(β)	Lower	Upper	Exp(β)	Lower	Upper	Exp(β)	Lower	Upper
Male	1.02	0.95	1.10	0.97	0.94	0.99	0.97	0.93	1.00	1.03	0.96	1.12
(Age 0−14)	1.41^a	1.33	1.50	0.67^a	0.64	0.71	0.40^a	0.38	0.43	0.32^a	0.30	0.35
(Age 15−64)	2.14^a	2.03	2.25	0.47^a	0.45	0.49	0.41^a	0.39	0.43	0.40^a	0.38	0.42
Walk in	11.62^a	10.54	12.80	0.19^a	0.17	0.20	0.12^a	0.11	0.13	0.16^a	0.15	0.17
Not urgent	4.18^a	4.05	4.32	0.29^a	0.28	0.30	0.19^a	0.18	0.20	0.14^a	0.13	0.15
A00-B99	1.58^a	1.42	1.75	0.17^a	0.15	0.19	3.46^a	3.09	3.88	1.02	0.87	1.19
C00-D49	0.24^a	0.15	0.38	1.12	0.82	1.54	15.75^a	10.10	24.57	4.91^a	3.55	6.79
D50-D89	0.28^a	0.22	0.36	0.54^a	0.43	0.66	18.04^a	13.86	23.49	2.91^a	2.31	3.66
E00-E89	0.31^a	0.24	0.39	0.41^a	0.34	0.50	15.19^a	11.91	19.37	2.05^a	1.67	2.50
F01-F99	2.12^a	1.85	2.42	0.20^a	0.17	0.24	2.04^a	1.77	2.34	0.88	0.74	1.04
G00-G99	1.76^a	1.56	2.00	0.52^a	0.46	0.59	1.99^a	1.74	2.30	2.18^a	1.87	2.53
H00-H59	19.96^a	14.30	27.86	0.04^a	0.03	0.06	0.18^a	0.12	0.27	0.17^a	0.10	0.28
H60-H95	7.56^a	6.11	9.34	0.10^a	0.08	0.13	0.67^a	0.53	0.85	0.65^a	0.48	0.87
I00-I99	0.86^a	0.78	0.94	0.58^a	0.53	0.63	5.01^a	4.52	5.55	2.69^a	2.42	2.99
J00-J99	2.93^a	2.74	3.14	0.30^a	0.28	0.32	1.44^a	1.33	1.57	1.60^a	1.46	1.76
K00-K95	1.18^a	1.09	1.27	0.40^a	0.37	0.44	3.73^a	3.41	4.09	1.98^a	1.78	2.21
L00-L99	5.30^a	4.43	6.34	0.10^a	0.08	0.13	0.86	0.71	1.05	0.47^a	0.36	0.63
M00-M99	0.45^a	0.42	0.48	2.11^a	1.98	2.24	0.63^a	0.58	0.69	0.52^a	0.47	0.58
N00-N99	0.39^a	0.36	0.43	0.51^a	0.47	0.56	12.18^a	11.11	13.34	2.74^a	2.47	3.04
O00-O9A	0.30^a	0.26	0.35	3.49^a	3.03	4.02	1.36^a	1.17	1.58	1.73^a	1.47	2.03
P00-P96	4.58^a	3.22	6.50	0.05^a	0.03	0.09	1.39	0.97	1.99	0.30^a	0.14	0.64
Q00-Q99	0.47^b	0.21	1.06	1.60	0.73	3.49	6.05^a	2.70	13.55	6.20^a	2.66	14.49
R00-R99	0.40^a	0.37	0.42	0.86^a	0.81	0.92	9.40^a	8.69	10.15	3.91^a	3.59	4.26
S00-T88	0.99	0.92	1.08	1.11^b	1.02	1.20	0.47^a	0.41	0.53	0.59^a	0.52	0.67
V00-Y99	1.22^a	1.12	1.34	0.77^a	0.70	0.84	0.78^a	0.69	0.81	0.72^a	0.63	0.82
Constant	0.04^a			9.49^a			4.56^a			2.21^a
Nagelkerke R²	0.45			0.38			0.55			0.47

CI: confidence interval.

^a Significant at 99% CI.

^b Significant at 95% CI.
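As a rough point of comparison for the adjusted odds ratios in Table 4, unadjusted odds ratios can be computed from the group A counts in Table 3. This is only an indicative sketch: the Table 3 counts are per patient, whereas the models were fitted on instances with repeated referral diagnoses, and the function name `odds_ratio` is hypothetical.

```python
def odds_ratio(yes_level, no_level, yes_reference, no_reference):
    """Unadjusted odds ratio of a level against its reference level."""
    return (yes_level / no_level) / (yes_reference / no_reference)


# Group A (no diagnostic test), counts taken from Table 3.
# Not urgent (33,857 yes / 8842 no) against urgent or emergent (19,578 / 26,959).
or_not_urgent = odds_ratio(33857, 8842, 19578, 26959)
# Walk in (53,062 yes / 31,697 no) against by ambulance (373 / 4104).
or_walk_in = odds_ratio(53062, 31697, 373, 4104)

print(round(or_not_urgent, 2), round(or_walk_in, 2))
```

The unadjusted values (about 5.3 and 18.4) are larger than the adjusted estimates of 4.18 and 11.62 in Table 4. One plausible reading is that part of the raw association is shared with the other inputs in the model, which is the kind of adjustment multivariable logistic regression is designed to make.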

The following observations can be made based on Table 4. For each of the output groups, males and females had similar probabilities of inclusion in the “yes” categories; that is, exp(β) values were around 1 for A, B, C and D. In contrast, for output group A, participants in the first two age categories ((age 0–14), (age 15–64)) were more likely to be included in the “yes” category than the reference age group whereas the reverse was seen for B, C and D. Similar results held for type of arrival and triage category. While patients in the “walk in” and “not urgent” categories were more likely to be included in the “yes” level of output group A than those in the “arriving by ambulance” and “urgent or emergent” categories, the opposite trend held for output groups B, C and D. Turning to referral diagnosis categories, patients coded as H00-H59, H60-H95, J00-J99 or L00-L99 had significantly higher odds than the reference category for output group A. On the other hand, the odds ratios for patients coded as C00-D49, D50-D89, E00-E89, I00-I99, N00-N99, Q00-Q99 or R00-R99 were significantly higher in output groups C and D. Finally, the odds ratios for patients coded as M00-M99 or O00-O9A were significantly higher in output group B.
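The odds ratios in Table 4 can be combined into a predicted probability for an individual patient profile. The sketch below does this for group A, assuming that the “Constant” row is reported on the exp scale like the other rows (that is, as the odds for a patient at every reference level) and that the rounded published values are accurate enough for illustration. The function name `probability_yes` and the example profile are hypothetical.

```python
# Group A values of exp(beta) taken from Table 4.
GROUP_A = {
    "constant": 0.04,    # assumed to be the odds at all reference levels
    "male": 1.02,
    "age 15-64": 2.14,
    "walk in": 11.62,
    "not urgent": 4.18,
    "J00-J99": 2.93,
}


def probability_yes(constant, odds_ratios):
    """Multiply the baseline odds by each applicable odds ratio, then convert to a probability."""
    odds = constant
    for ratio in odds_ratios:
        odds *= ratio
    return odds / (1 + odds)


# Hypothetical profile: male, aged 15-64, walk in, not urgent, referral code in J00-J99.
levels = ["male", "age 15-64", "walk in", "not urgent", "J00-J99"]
p_profile = probability_yes(GROUP_A["constant"], [GROUP_A[k] for k in levels])

# Reference profile: female, aged 65 or over, by ambulance, urgent or emergent, Z00-Z99.
p_reference = probability_yes(GROUP_A["constant"], [])

print(round(p_profile, 3), round(p_reference, 3))
```

Under these assumptions, the hypothetical walk-in profile has a predicted probability of roughly 0.93 of needing no diagnostic test, against roughly 0.04 for the reference profile. Because the inputs are rounded to two decimals, these figures are approximate.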

The performance of the logistic regression models is shown in Figure 2. Figure 2 shows that the models were reasonably accurate for groups A (74.11%) and B (73.07%), but much more accurate for groups C (82.47%) and D (85.79%). That is, the logistic regression model performed best in classifying patients requiring both tests while also performing well in classifying patients as requiring no diagnostic test or requiring at least one radiology test. The model sensitivity was higher than its specificity for group A whereas the reverse was true for groups B, C and D. Thus, the model was better able to predict positive outcomes for group A and negative outcomes for the other groups. However, the sensitivity–specificity comparison of the models was consistent since, based on the model design, positive outcomes for group A represent patients who did not require any diagnostic test.

Figure 2.

Logistic regression model performance for each output group.

Discussion

Interpretation of descriptive analyses

Regarding the frequency and percentage distributions, one interesting result concerns the influence of gender, specifically that there were more male than female patients in output group B. This could be because males are more likely to suffer traffic or occupational accidents (Sadeghi-Bazargani et al., 2018), which require X-ray, tomography or ultrasound tests. The distribution of patients by age level could be interpreted as follows. As age increases, the patient’s case becomes more critical and complex, which in turn increases diagnostic test requirements. The results for triage category and type of arrival can be interpreted together as indicating diagnostic test requirements markedly increase for patients who arrive by ambulance and are then triaged as urgent or emergent. There were also significant differences in the percentage distributions based on referral diagnosis codes. Overall, the descriptive analysis shows that these input variables significantly determined whether patients required a diagnostic test.

Discussion of logistic regression model results

Using a structured dataset, this study built logistic regression models for all four output groups. For output groups B, C and D, “yes” represented requiring a diagnostic test (radiology and/or laboratory). Thus, high odds ratios for the levels of the input variables were associated with an increased probability of requiring diagnostic tests for these output groups. Conversely, for group A, the output level “yes” represented patients who did not require any diagnostic tests. Therefore, the meaning of the model coefficients was reversed: higher odds ratios were associated with a lower likelihood of requiring diagnostic testing. Thus, based on the logistic regression model statistics summarised in Table 4, it was concluded that gender had no significant effect on the probability of requiring a diagnostic test whereas age, triage category and type of arrival were significant predictors. This result both supports this study’s descriptive results and confirms previous studies (Cheung et al., 2002; Izady and Worthington, 2012). The odds ratios for 21 ICD-10 codes representing the referral diagnoses of the ED patients are given in Table 4. This section interprets the model coefficients in comparison with existing laboratory and radiology testing guidelines and the research literature.

Various inferences can be drawn about the relationship between patient codes and output group. First, patients coded H00-H59, H60-H95, J00-J99 or L00-L99 generally required no diagnostic testing. For example, if a patient arrives in ED complaining of a sore throat, fever or asthenia, experienced coders assign the ICD-10 code JXX.XX as the referral diagnosis. Subsequently, ED physicians can generally assign a final diagnosis of upper respiratory tract infection, J06.9, just from physical examination without any diagnostic testing (Gonzales et al., 2001). On the other hand, referral diagnosis codes C00-D49, D50-D89 and E00-E89 generally required laboratory tests or laboratory and radiology tests together (hence, the odds ratios of these diagnosis categories were higher in output groups C and D). For example, if a patient with a known history of diabetes mellitus arrives at ED unconscious and is assigned the referral code E16.X, physicians generally order both laboratory and radiology tests to determine the patient’s state of consciousness (Ben-Ami et al., 1999). Patients coded I00-I99, N00-N99 or R00-R99 generally required diagnostic testing, especially laboratory tests (hence, the odds ratios were high for output group C). For example, when a patient arrives at ED with chest pain, they are generally coded as R07.X. To determine whether the final diagnosis should be acute coronary syndrome or not, the ED physician then orders a cardiac enzyme test, specifically a high-sensitivity troponin T laboratory test, on arrival, which is repeated after 1 and 3 hours (Vigen et al., 2018). These few studies have been discussed to demonstrate how the model results of this study are in line with the research literature. However, many other studies could also be cited (e.g. Hot et al., 2007; Martin and Rossi, 1997; Schaefer, 2011; Shulman et al., 2012; Zuberbier and Maurer, 2007).

While previous studies have offered laboratory and radiology testing guidelines, the present study contributes to the literature by offering a novel template model for the diagnostic testing requirements of ED patients by simultaneously considering both referral ICD-10 classifications and different types of demographic inputs, which may affect decisions about whether a patient requires a specific diagnostic test.

Limitations of the study

There are some limitations to the present work. First, the use of retrospective data from one hospital means that the descriptive results are context-specific and hence not generalisable. However, the logistic regression model results can be generalised because they are supported by existing research findings. Another limitation concerns the range of types of data stored in this hospital. If other potentially relevant data were available, such as patient and family medical history, type of drugs used, height and weight, and body mass index, they could be added as input variables. This might improve the performance of the classification model. Finally, the dataset had a “no dominant” structure because the frequencies of ordering any test or both test types were low, and the frequencies of ordering radiology tests only or laboratory tests only were lower still. This creates an obstacle to using a multinomial logistic regression model with four mutually exclusive outcomes of no diagnostic test, radiology only, laboratory only and both.
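The relationship between the four mutually exclusive outcomes and the series of binary outcome variables used in place of a multinomial model can be sketched as follows. This is a minimal Python illustration, assuming the groups are mutually exclusive as in the four-outcome formulation above; the function names and the two boolean flags are hypothetical and do not reflect the hospital dataset's actual fields.

```python
def output_group(radiology_ordered, laboratory_ordered):
    """Map the two test flags to one of the four output groups."""
    if radiology_ordered and laboratory_ordered:
        return "D"  # both radiology and laboratory tests
    if laboratory_ordered:
        return "C"  # laboratory test only
    if radiology_ordered:
        return "B"  # radiology test only
    return "A"      # no diagnostic test


def binary_targets(radiology_ordered, laboratory_ordered):
    """One binary outcome per group, each usable in a separate binary logistic model."""
    group = output_group(radiology_ordered, laboratory_ordered)
    return {g: int(g == group) for g in "ABCD"}


# Hypothetical patient for whom only a laboratory test was ordered.
targets = binary_targets(radiology_ordered=False, laboratory_ordered=True)
print(targets)
```

Each patient has the value 1 for exactly one group, so four separate binary models can be fitted even when some outcome categories are too infrequent for a single multinomial model.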

Practical implications

Since time is critical in emergency situations (Sarıyer et al., 2017), models that can decrease time spent per patient are highly beneficial from the perspective of both individual patients and the social system. Thus, a model for classifying patients in terms of their diagnostic test requirements has many practical implications in planning ED operations. First, with this model, triage staff can estimate a patient’s diagnostic test requirements and inform related personnel and units so that necessary preparations can be made before the physician’s examination, thereby saving valuable time. Secondly, the high volume and wide range of sometimes very complex cases that EDs deal with cause diagnostic difficulties for ED physicians. These difficulties include determining whether a patient with specific demographic characteristics requires a particular diagnostic test type. In such cases, this model can be used to improve decision-making by ED personnel because it is developed based solely on past experience in the ED. For instance, the model can propose a radiology test for a patient based on previous ED test selection decisions for patients with similar demographic characteristics. This can also improve ED budget planning by reducing costs from unnecessary use of resources (capacities, equipment, radiology/laboratory units, etc.). Finally, and perhaps most importantly, for medium- to long-term planning, the knowledge generated by this classification model can be integrated with population demographics (percentages based on age, gender, diagnosis and any other variables identified by future studies). This integration can guide the planning of various ED processes, such as facility design (laboratories, radiology units), capacity planning, stock management and quality management (improving service quality, decreasing waste of clinical resources).

Conclusion

This study proposed classification models based on logistic regression to empower decision-making and process planning in EDs by classifying patients according to their requirement for diagnostic tests. The structure of the logistic regression equations (model coefficients and input variables) reveals the role of particular characteristics in determining the need for diagnostic testing. In addition, by dividing the diagnostic tests into radiology or laboratory tests, the proposed models provide a guide to patient characteristics that predict requiring no tests, a radiology test, a laboratory test or both a radiology and a laboratory test. The logistic regression models were reasonably accurate for each group (with respective values of 74.11%, 73.07%, 82.47% and 85.79%), and the models were generally better able to predict negative outcomes than positive ones (specificities were higher in most models). In the ED literature, requests for diagnostic tests have frequently been used as an input variable for the generated models. However, to the best of our knowledge, these tests have never been considered as an output factor. The models proposed in this study can thus contribute to the literature and are worthy of the attention of practitioners and researchers. Regarding future research directions, deeper insights may be gained by including different input factors to improve classification accuracy and sensitivity. Future work could also define the output groups more specifically, for example by analysing patients requiring particular radiology tests (such as X-ray, tomography and ultrasound) rather than radiology in general or, similarly, by analysing patients in terms of requests for specific laboratory tests, such as haemogram, biochemistry or blood type.
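For readers less familiar with the metrics summarised above, the following minimal Python sketch shows how accuracy, sensitivity and specificity are computed from a binary confusion matrix. The counts are hypothetical and are not the study's data, and the function name `classification_metrics` is an assumption made for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,  # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),  # share of positive cases correctly identified
        "specificity": tn / (tn + fp),  # share of negative cases correctly identified
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=30, fp=10, tn=140, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With counts of this shape, where negative cases are far more frequent than positive ones, specificity exceeds sensitivity, which is the pattern described above as the models being better able to predict negative outcomes.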

Footnotes

Acknowledgements

The authors acknowledge Dr İlker Kızıloğlu for his general support and Hüseyin Çelik for his technical support. For writing assistance, the authors acknowledge Lecturer Simon Mumford, who is the English coordinator of the School of Foreign Languages and Academic Writing Center of İzmir University of Economics, İzmir, Turkey.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship and/or publication of this article.

ORCID iD

Görkem Sarıyer, PhD

Mustafa Gökalp Ataman, MD

References

Arslan, Colak, Sarihan (2016) Different medical data mining approaches based prediction of ischemic stroke. Computer Methods and Programs in Biomedicine 130: 87–92.

Bellazzi, Zupan (2008) Predictive data mining in clinical medicine: current issues and guidelines. International Journal of Medical Informatics 77(2): 81–97.

Ben-Ami, Nagachandran, Mendelson, et al. (1999) Drug-induced hypoglycemic coma in 102 diabetic patients. Archives of Internal Medicine 159(3): 281–284.

Cheung WWH, Heeney, Pound (2002) An advance triage system. Accident and Emergency Nursing 10: 10–16.

Chuang (2018) Predicting the prolonged length of stay of general surgery patients: a supervised learning approach. International Transactions in Operational Research 25(1): 75–90.

Etu (2018) The impact of machine learning algorithms on benchmarking process in healthcare service delivery. PhD Thesis, Wayne State University.

Frawley, Piatetsky-Shapiro, Matheus (1992) Knowledge discovery in databases: an overview. AI Magazine 13(3): 57.

Golmohammadi (2016) Predicting hospital admissions to reduce emergency department boarding. International Journal of Production Economics 182: 535–544.

Gonzales, Bartlett, Besser, et al. (2001) Principles of appropriate antibiotic use for treatment of nonspecific upper respiratory tract infections in adults: background. Annals of Emergency Medicine 37(6): 698–702.

Gül, Güneri (2015) Forecasting patient length of stay in an emergency department by artificial neural networks. Journal of Aeronautics and Space Technologies 8(2): 43–48.

Hachesu, Ahmadi, Alizadeh, et al. (2013) Use of data mining techniques to determine and predict length of stay of cardiac patients. Healthcare Informatics Research 9(2): 121–129.

Hot, Schmulewitz, Viard, et al. (2007) Fever of unknown origin in HIV/AIDS patients. Infectious Disease Clinics of North America 21(4): 1013–1032.

Tai, Chen, et al. (2017) Predicting return visits to the emergency department for pediatric patients: applying supervised learning techniques to the Taiwan National Health Insurance Research Database. Computer Methods and Programs in Biomedicine 144: 105–112.

Huang (2013) Mining association rules between abnormal health examination results and outpatient medical records. Health Information Management Journal 42(2): 23–30.

Izady, Worthington (2012) Setting staffing requirements for time dependent queueing networks: the case of accident and emergency departments. European Journal of Operational Research 219(3): 531–540.

Kobayashi, Knuesel, White, et al. (2019) Impact on length of stay of a hospital medicine emergency department boarder service. Journal of Hospital Medicine 20(14): E1–E7.

Lee, Ryu, Bashir, et al. (2013) Discovering medical knowledge using association rule mining in young adults with acute myocardial infarction. Journal of Medical Systems 37(2): 9896.

Lin, Wang, Chiang, et al. (2010) Abnormal diagnosis of Emergency Department triage explored with data mining technology: an Emergency Department at a Medical Center in Taiwan taken as an example. Expert Systems with Applications 37(4): 2733–2741.

Lin, Zheng, et al. (2011) Analysis by data mining in the emergency medicine triage database at a Taiwanese regional hospital. Expert Systems with Applications 38(9): 11078–11084.

Linder, Woitok (2020) Emergency department overcrowding: analysis and strategies to manage an international phenomenon. Wien Klin Wochenschr. Epub ahead of print 13 January 2020. DOI: 10.1007/s00508-019-01596-7

Martin, Rossi (1997) The acute abdomen: an overview and algorithms. Surgical Clinics of North America 77(6): 1227–1243.

Nahar, Imam, Tickle, et al. (2013) Association rule mining to detect factors which contribute to heart disease in males and females. Expert Systems with Applications 40(4): 1086–1093.

Pendharkar, Khurana (2014) Machine learning techniques for predicting hospital length of stay in Pennsylvania federal and specialty hospitals. International Journal of Computer Science & Applications 11(3): 45–56.

Reinschmidt, Gottschalk, Kim, et al. (1999) Intelligent miner for data: enhance your business intelligence. IBM International Technical Support Organization (IBM Redbooks). IBM Corporation.

Resta, Sonnessa, Tànfani, et al. (2018) Unsupervised neural networks for clustering emergent patient flows. Operations Research for Health Care 18: 41–51.

Rowan, Ryan, Hegarty, et al. (2007) The use of artificial neural networks to stratify the length of stay of cardiac patients based on preoperative and initial postoperative factors. Artificial Intelligence in Medicine 40(3): 211–221.

Roy, Bhattacharya, Guin (2019) A methodology for customizing clinical tests for esophageal cancer based on patient preferences. Artificial Intelligence in Medicine 95: 16–26.

Sadeghi-Bazargani, Samadirad, Shahedifar, et al. (2018) Epidemiology of road traffic injury fatalities among car users; a study based on forensic medicine data in East Azerbaijan of Iran. Bulletin of Emergency and Trauma 6(2): 146–154.

Safdari, Rezaei-Hachesu, GhaziSaeedi, et al. (2018) Evaluation of classification algorithms vs knowledge-based methods for differential diagnosis of asthma in Iranian patients. International Journal of Information Systems in the Service Sector 10(2): 22–35.

Sarıyer, Ataman, Akay, et al. (2017) An analysis of emergency medical services demand: time of day, day of the week, and location in the city. Turkish Journal of Emergency Medicine 17(2): 42–47.

Sarıyer, Ataman, Kızıloğlu (2018) Factors affecting length of stay in the emergency department: a research from an operational viewpoint. International Journal of Healthcare Management. Epub ahead of print 27 June 2018. DOI: 10.1080/20479700.2018.1489992

Schaefer (2011) Urticaria: evaluation and treatment. American Family Physician 83(9): 1078–1084.

Shulman, Bisno, Clegg, et al. (2012) Clinical practice guideline for the diagnosis and management of group A streptococcal pharyngitis: 2012 update by the Infectious Diseases Society of America. Clinical Infectious Diseases 55(10): 86–102.

Taşar, Sarıyer (2018) The use of data mining and neural networks for forecasting patient volume in an emergency department. In: 4th international researchers, statisticians, and young statisticians congress, Izmir, Turkey, 28–30 April 2018, book of abstracts. Available at: http://www.irsysc2018.com/abs.pdf (accessed 27 February 2020).

Vigen, Kutscher, Fernandez, et al. (2018) Evaluation of a novel rule-out myocardial infarction protocol incorporating high-sensitivity troponin T in a US hospital. Circulation 138(18): 2061–2063.

World Health Organization (2018) Classifications. Available at: http://www.who.int/classifications/icd/en/ (accessed 27 February 2020).

Wong, Chin (2013) Modeling daily patient arrivals at Emergency Department and quantifying the relative importance of contributing variables using artificial neural network. Decision Support Systems 54(3): 1488–1498.

Yeh, Cheng, Chen (2011) A predictive model for cerebrovascular disease using data mining. Expert Systems with Applications 38(7): 8970–8977.

Zuberbier, Maurer (2007) Urticaria: current opinions about etiology, diagnosis and therapy. Acta Dermato-Venereologica 87(3): 196–205.