Identification of risk factors and development of the nomogram for delirium

Min-Seok Shin$^{a}$, Ji-Eun Jang$^{a}$, Jea-Young Lee$^{1,a}$

$^{a}$Department of Statistics, Yeungnam University, Korea
$^{1}$Correspondence to: Department of Statistics, Yeungnam University, 280 Daehak-Ro, Gyeongsan, Gyeongbuk 38541, Korea. E-mail: jlee@yu.ac.kr
Received December 24, 2020; Revised March 16, 2021; Accepted March 16, 2021.
Abstract

In medical research, the risk factors associated with human diseases need to be identified to predict the incidence rate and determine the treatment plan. Logistic regression analysis is primarily used to select risk factors. However, people who are unfamiliar with statistics have trouble interpreting and using its results. In this study, we identify the risk factors for delirium and develop a nomogram that graphically represents the numerical association between the disease and its risk factors, so that the results can be interpreted and used more effectively. Using a logistic regression model, we identify risk factors related to delirium, construct a nomogram, and predict incidence rates. We then validate the developed nomogram using a receiver operating characteristic (ROC) curve and a calibration plot. Nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics were selected as risk factors. In validation, the area under the ROC curve (AUC) was 0.893 for the training set and 0.717 for the test set, both statistically significant. The coefficient of determination of the calibration plot was 0.820. Using the nomogram developed in this paper, health professionals can easily predict the incidence rate of delirium for individual patients, and the nomogram can serve as a useful tool for establishing an individual patient’s treatment plan.

Keywords: delirium, incidence rate, logistic regression analysis, nomogram, risk factor
1. Introduction

Modern people suffer from various diseases due to irregular lifestyles and stress. In addition to physical illness, mental illnesses such as depression, impulse control disorders, and dementia are increasing rapidly. Interest in these mental disorders has also increased because they cause various social problems. In this study, we deal with delirium, one of these mental disorders. Delirium is a reversible organic psychosis that is closely associated with a variety of mental illnesses and is characterized by a confused state and excited behavior. In elderly patients in particular, the incidence rate of delirium is reported to be 60%, and delirium is clinically significant because of its effect on the mortality and morbidity of the underlying diseases (Cole and Primeau, 1993; Inouye, 1994; Inouye et al., 1999; Jang, 2017). Therefore, various studies on delirium risk factors have been conducted. These include a prospective study of delirium risk factors in intensive care unit (ICU) inpatients (Dubois et al., 2001), a large cohort study of the risk factors and prognosis of delirium in ICU inpatients (Skrobik et al., 2007), and a review that established 32 risk factors for delirium in ICU inpatients (Arend and Christensen, 2009). However, domestic research on delirium risk factors in patients admitted through the emergency room rather than the ICU has rarely been performed (Kwak et al., 2011). Hence, this study identified risk factors for delirium using 31 delirium risk factors in elderly patients admitted through the emergency room. In medical research, the logistic regression model is mainly used to screen the risk factors associated with a disease. However, people who are unfamiliar with statistics have difficulty interpreting its results.
Thus, many nomograms, which graphically express the numerical relationship between a disease and its risk factors, have been developed and used so that health care workers can identify risk factors and interpret and use the results more easily (Lee et al., 2009). A nomogram has the advantage of predicting the incidence rate or survival rate of an individual patient through a scoring system, without the need to evaluate a complex formula (Iasonos et al., 2008). Nomograms for various diseases have already been developed overseas, and in Korea nomograms have been developed for diseases such as gastric cancer, invasive cancer, prostate cancer, and osteosarcoma (Jun, 2015; Lee, 2015; Ahn, 2013; Kim et al., 2014). However, a nomogram for delirium has not yet been developed, particularly for elderly patients in emergency rooms. In this paper, we use the logistic regression model to identify risk factors for delirium and develop a nomogram to predict the incidence rate of delirium.

2. Methods

### 2.1. Data collection and management

The data used in this study were collected from elderly patients over 70 years of age who were admitted to the internal medicine ward via the emergency room of a general hospital during the 24-month period from January 2008 to December 2009. Gender, age, medical history, social history, and vital signs were obtained from a review of medical records. Delirium was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) (Kwak et al., 2011). The study included 414 subjects, of whom 42 were diagnosed with delirium due to underlying disease through psychiatric treatment. The risk factors used in this study are 31 of the 32 factors reported by Arend and Christensen (2009), excluding age. All risk factors were coded as binary variables according to a predetermined criterion for each factor. We randomly divided the data into a training set (n = 276) and a test set (n = 138). The nomogram was constructed using the training set and validated using the test set. Table 1 classifies the 31 risk factors according to their characteristics.
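The random division into a training set and a test set described above can be sketched as follows. This is only an illustration: the paper does not state how the split was generated, so the integer patient identifiers, the seed, and the helper name `split_train_test` are assumptions.

```python
import random


def split_train_test(ids, n_train, seed=0):
    """Randomly partition ids into a training list and a test list."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers for the 414 subjects (not the study's real IDs).
patient_ids = range(414)
train_ids, test_ids = split_train_test(patient_ids, n_train=276)
print(len(train_ids), len(test_ids))  # 276 138
```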

### 2.2. Logistic regression analysis and nomogram

2.2.1. Logistic regression analysis

The logistic regression model is often applied in exploratory analyses that identify risk factors associated with a disease in pathological studies and important factors in clinical research (Lee et al., 2005; Heo and Lee, 2008). The dependent variable in a logistic regression model is usually a binary response variable, but it can also be a categorical response variable with several levels. In this paper, we consider only a binary response variable. A binary response variable is also called a Bernoulli variable, and its distribution is specified by the probabilities of success and failure. Assume that the response variable $Y$ follows a Bernoulli distribution with success probability $p_x$ when the explanatory variable $X$ equals $x$. Then the probability that $Y$ equals $y$ given $X = x$ is

$P(Y = y \mid X = x) = p_x^{y} (1 - p_x)^{1-y}, \quad y = 0, 1.$

Here, the success probability $p_x$ is a non-linear function of the $k$ explanatory variables $X_1, X_2, \ldots, X_k$:

$p_x = \frac{\exp(\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}{1 + \exp(\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)}.$

From this equation, the odds of success are

$\frac{p_x}{1 - p_x} = \exp(\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k).$

Taking the natural logarithm of both sides gives

$\ln\left(\frac{p_x}{1 - p_x}\right) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k.$

The above equation is called the logistic regression model. In this paper, we build a logistic regression model with delirium as the response variable and identify the risk factors. Finally, we develop a nomogram, a statistical tool designed to predict the probability of a particular outcome, and use it to predict the incidence rate of delirium.
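The relationship between the success probability, the odds, and the log-odds derived above can be checked numerically. The sketch below uses made-up values for the intercept and the coefficients (they are not estimates from this study) and confirms that the log-odds of the computed probability recover the linear predictor.

```python
import math


def success_probability(alpha, betas, xs):
    """Success probability of the logistic model for one covariate vector."""
    eta = alpha + sum(b * x for b, x in zip(betas, xs))  # linear predictor
    # 1 / (1 + exp(-eta)) is algebraically equal to exp(eta) / (1 + exp(eta)).
    return 1.0 / (1.0 + math.exp(-eta))


def log_odds(p):
    """Natural logarithm of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Hypothetical parameters and covariates, chosen only for illustration.
alpha = -2.0
betas = [1.2, 0.7]
xs = [1, 0]

p = success_probability(alpha, betas, xs)
odds = p / (1.0 - p)
print(round(p, 4), round(odds, 4), round(log_odds(p), 4))
```

Here the linear predictor is $-2.0 + 1.2 = -0.8$, so the odds equal $\exp(-0.8)$ and the log-odds return $-0.8$.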

2.2.2. Nomogram

In medical research, the risk factors associated with human diseases need to be identified in order to predict the incidence rate and establish a treatment plan. Logistic regression analysis is primarily used to select risk factors. However, people who have never studied statistics find it difficult to understand its results. A nomogram is a statistical tool that addresses this problem. It is built from a number of variables associated with a specific disease, together with the characteristics of the patient, and is designed to predict the probability of a particular outcome. A nomogram has a point line (with values between 0 and 100) that indicates the score assigned to each level of each risk factor, calculated from the estimated regression coefficients, and a total points line that represents the cumulative sum of the scores assigned to the risk factors. It also has a probability line whose values are obtained from the total points. Each risk factor is scored using the estimated regression coefficients and is represented by an integer from 0 to 100. The factor whose estimated regression coefficient has the largest absolute value is the most influential factor and receives 100 points. Each remaining factor receives the absolute value of its regression coefficient divided by the absolute value of the most influential factor’s regression coefficient, multiplied by 100. The specific development process of a nomogram is as follows (Iasonos et al., 2008; Yang, 2014).

• Step 1. Definition of patient population: This study includes 414 elderly patients over 70 years of age admitted to the internal medicine ward via the emergency room of a general hospital during the 24-month period from January 2008 to December 2009.

• Step 2. Designation of the outcome of interest: What is to be predicted with this nomogram? The incidence rate of delirium.

• Step 3. Designation of risk factors that are expected to affect the outcome of interest.

• Step 4. Construction of the nomogram

• Model selection: Because the outcome variable is binary, a logistic regression model is used in this study.

• Risk factor selection: Chi-square test and Fisher’s exact test

• Analysis using the selected model (the logistic regression model): When a logistic regression model is used, goodness of fit is assessed with the Hosmer-Lemeshow test.

• Nomogram construction

• 1) Calculation of the linear predictor value $LP_{ij}$ using the estimated regression coefficients obtained from the logistic regression model. As an example, we use the model in Iasonos et al. (2008). The risk factors are indexed by $i = 1, 2, \ldots, m$; in the example there are four risk factors: flow, age, clinical size, and sex. The categories of each risk factor are indexed by $j = 1, 2, \ldots, n_i$; in the example, flow and sex have two categories each. The LP of the factor whose estimated regression coefficient has the largest absolute value is denoted $LP_{*j}$. Since the LP value of the flow variable has the largest absolute value, $LP_{*j}$ is the LP value of the flow variable.

• 2) Point calculation for each risk factor using the LP value: In Figure 1, the point is the length of the line that indicates the impact of each risk factor (flow, age, clinical size, and sex). The longer this line, the greater the effect of the risk factor on the disease.

$LP_{ij} = \beta_{ij} \times X_{ij}, \qquad \text{Point}_{ij} = \frac{LP_{ij} - \min LP_{ij}}{\max LP_{*j} - \min LP_{*j}} \times 100.$

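In the example, to calculate the point for sex = male, $\max LP_{*j}$ and $\min LP_{*j}$ are $LP_{\text{flow}=\text{yes}}$ and $LP_{\text{flow}=\text{no}}$, respectively, and $\min LP_{ij}$ is $LP_{\text{sex}=\text{female}}$. Thus,

$\text{Point}_{\text{sex}=\text{male}} = \frac{LP_{\text{sex}=\text{male}} - LP_{\text{sex}=\text{female}}}{LP_{\text{flow}=\text{yes}} - LP_{\text{flow}=\text{no}}} \times 100 = 15.$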

• 3) Calculation of the points per unit of linear predictor:

$\frac{100}{\max LP_{*j} - \min LP_{*j}}.$

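• 4) Calculation of the LP values for the total points (TP):

• (a) $LP_{\text{for } TP=0}$: hold each factor at its reference level (intercept)

• (b) $LP_{\text{for } TP>0}$: $\ln\left(\frac{\text{Risk of } Y=1}{1 - \text{Risk of } Y=1}\right)$

• 5) Total points = Points per unit of linear predictor × ($LP_{\text{for } TP>0}$ − $LP_{\text{for } TP=0}$): This is the value shown in the total points in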

The nomogram can be constructed from the scores calculated with the above formulas, as shown in Figure 1. The developed nomogram can be used to predict the incidence rate or survival rate of an individual patient for a specific disease. For example, for a patient with flow (no), age (75), clinical size (14), and sex (male), the total points are obtained by summing the points of each factor read from the nomogram in Figure 1. From this value, an incidence rate of about 50% is predicted.
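The scoring steps above can be sketched in a few lines. This is a minimal illustration, not the SAS implementation used in the paper. It assumes that every risk factor is binary with reference level 0 and a positive coefficient, so that the LP range of a factor equals its coefficient and the LP for zero total points equals the intercept. The factor names, coefficients, and intercept are hypothetical.

```python
import math


def points_per_unit(coefs):
    """100 divided by the LP range of the most influential factor."""
    return 100.0 / max(coefs.values())


def factor_points(coefs):
    """Points for the 'present' level of each binary factor (absent = 0)."""
    ppu = points_per_unit(coefs)
    return {name: beta * ppu for name, beta in coefs.items()}


def probability_from_total_points(total_points, intercept, coefs):
    """Invert step 5: LP = LP(TP=0) + TP / points-per-unit, then apply the logistic function."""
    lp = intercept + total_points / points_per_unit(coefs)
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical model, for illustration only (all coefficients positive).
intercept = -3.0
coefs = {"factor_a": 2.0, "factor_b": 1.0, "factor_c": 0.5}

points = factor_points(coefs)
total = points["factor_a"] + points["factor_c"]  # patient with a and c present
risk = probability_from_total_points(total, intercept, coefs)
print(points, total, round(risk, 4))
```

In this toy model the patient collects 125 points, which maps back to a linear predictor of $-3 + 2 + 0.5 = -0.5$, the same value obtained by plugging the patient's covariates directly into the regression equation.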

• Step 5. Validation

• (a) Discrimination: area under the ROC curve (AUC)

The receiver operating characteristic (ROC) curve is one of the methods traditionally used to evaluate the performance of a predictive model in discriminant analysis (Kang et al., 2014). It is drawn by plotting sensitivity on the vertical axis against 1 − specificity on the horizontal axis and connecting the points. The better the model performs, the farther the ROC curve lies above the diagonal, toward the upper left. The area under the ROC curve (AUC) can be used to measure the performance of a predictive model. For a useful model this value lies between 0.5 and 1, and the closer it is to 1, the better the performance. Thus, AUC was used to assess the discrimination of the nomogram.

• (b) Calibration: The calibration plot checks how well the probability predicted by the nomogram matches the observed probability (D’Agostino et al., 2001; Nam and D’Agostino, 2002). The observed probability can be calculated from a group of patients with the same predicted probability. If there are not enough patients with the same predicted probability, patients are grouped by ranges of predicted probability, and the observed probability is calculated as the proportion of onset within each group (Vuk and Curk, 2006). The ideal line is the line drawn at a 45-degree angle in the calibration plot, on which the predicted probability equals the observed probability. When the points of predicted versus observed probability lie along the ideal line, the nomogram can be regarded as having accurate predictive ability (Iasonos et al., 2008). Therefore, we used the calibration plot to validate how well the predicted incidence rate corresponds to the observed incidence rate.
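The discrimination measure in step (a) can be illustrated with a small sketch. It computes the AUC as the proportion of (case, non-case) pairs in which the case receives the higher predicted score, counting ties as one half; this pairwise proportion equals the area under the empirical ROC curve. The scores and labels below are invented for the example and are not study data.

```python
def auc(scores, labels):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical predicted probabilities and outcomes (1 = delirium).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))  # 8 of the 9 pairs are ranked correctly
```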

### 2.3. Risk factor selection

Table 2 lists the variables that were statistically significant in the chi-square test or Fisher’s exact test, applied to the training set to test the association between delirium and each of the 31 risk factors. Twelve of the 31 risk factors showed significant results. Table 3 shows the result of entering these 12 risk factors into a multiple logistic regression model, with stepwise selection as the variable selection method. The Hosmer-Lemeshow goodness-of-fit test indicated a good fit (p = 0.166). The risk factors retained in the final model are nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics.
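As an illustration of the screening step, the sketch below applies a two-sided Fisher’s exact test to the nursing home counts reported in Table 2 (30 and 8 for "Yes", 217 and 21 for "No"). It is a plain-Python implementation written for this example, assuming the usual two-sided definition that sums the probabilities of all tables no more likely than the observed one; the paper does not state which software option was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for floating-point ties
    )


# Nursing home (rows: yes / no) by outcome (columns: non-delirium / delirium).
p_value = fisher_exact_two_sided(30, 8, 217, 21)
print(f"{p_value:.3f}")
```

The result should agree, up to rounding, with the p-value of 0.040 listed for nursing home in Table 2.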

3. Results

### 3.1. Nomogram construction for delirium

In this study, we constructed a nomogram to predict the incidence rate of delirium using the 5 risk factors finally selected by the multiple logistic regression analysis. Figure 2 shows the values required for the development of the nomogram, calculated according to the construction procedure introduced in Section 2.2. First, the estimated regression coefficients obtained from the logistic regression model were used to obtain the LP values for each risk factor, and the point values, which are the lengths of the lines representing the influence of each risk factor, were calculated. In addition, the points per unit of LP, the LP value for zero total points, and the LP values for positive total points were calculated, and the corresponding total points and incidence rates were determined. Next, a SAS data set containing the points, total points, and incidence values for each risk factor was created, and the nomogram was drawn using a SAS program. Figure 3 shows the SAS code used in the development. The developed nomogram is shown in

In Figure 4, the nomogram has a point line that indicates the score assigned to each level of each risk factor, and one line for each of the 5 risk factors (nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics) that indicates its influence. The total points line represents the cumulative sum of the scores assigned to the risk factors, and the risk of event line gives the predicted incidence rate of delirium corresponding to the total points. The score on each risk factor line is derived from the estimated regression coefficient of the multiple logistic regression model, and the longer the line, the greater the factor’s impact on the onset of delirium. In Figure 4, stroke/epilepsy is assigned 100 points and is the factor with the greatest impact on the incidence of delirium. Metabolic abnormality is second, followed by nursing home, analgesics, and hemodynamic instability. The lines of all factors except stroke/epilepsy are similar in length. Because the stroke/epilepsy line is long relative to the others, a history of stroke/epilepsy has a very large effect on delirium onset. The incidence rate of delirium for an individual patient can be predicted by summing the scores the patient receives for each factor using the nomogram in Figure 4. The higher the total points obtained from the nomogram, the higher the risk of delirium.

For instance, for a patient with nursing home (no), stroke/epilepsy (yes), metabolic abnormality (no), hemodynamic instability (yes), and analgesics (yes), the total points are obtained by summing the points of each factor as shown in Figure 5, and from this value the predicted incidence rate of delirium is greater than 75%. Considering that the incidence rate of delirium in hospitalized elderly patients is 60% (Cole and Primeau, 1993; Inouye, 1994; Inouye et al., 1999), the incidence rate for this patient is judged to be very high. Therefore, a treatment plan should be established as soon as possible, taking into account the results obtained from the nomogram and the general characteristics of the patient.
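The ranking of the five factors and the total points of the example patient can be approximated from Table 3. The sketch below recovers each regression coefficient as the natural logarithm of the reported odds ratio, which is an assumption of this example: the odds ratios are rounded, and the exact values in Figure 2 are not reproduced here, so the points are approximate. Because the intercept is not reported, the sketch stops at the total points and does not compute the predicted probability.

```python
import math

# Odds ratios from Table 3; coefficient = ln(odds ratio).
odds_ratios = {
    "Nursing home": 4.640,
    "Stroke/Epilepsy": 11.674,
    "Metabolic abnormality": 4.989,
    "Hemodynamic instability": 3.680,
    "Analgesics": 4.336,
}
coefs = {name: math.log(value) for name, value in odds_ratios.items()}

# The largest coefficient gets 100 points; the others are scaled relative to it.
largest = max(coefs.values())
points = {name: 100.0 * (beta / largest) for name, beta in coefs.items()}
ranking = sorted(points, key=points.get, reverse=True)

# Example patient from the text: stroke/epilepsy, hemodynamic instability,
# and analgesics present; nursing home and metabolic abnormality absent.
patient = {
    "Nursing home": False,
    "Stroke/Epilepsy": True,
    "Metabolic abnormality": False,
    "Hemodynamic instability": True,
    "Analgesics": True,
}
total_points = sum(points[name] for name, present in patient.items() if present)

for name in ranking:
    print(f"{name}: {points[name]:.1f}")
print(f"Total points: {total_points:.1f}")
```

Under this approximation the order of the factors matches the order described for Figure 4, with metabolic abnormality, nursing home, analgesics, and hemodynamic instability at roughly 65, 62, 60, and 53 points.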

### 3.2. Validation of the nomogram for delirium

3.2.1. Discrimination: the area under the ROC curve (AUC)

Figure 6 shows the ROC curves drawn using the training data and the test data. The AUC values for the training set and the test set were 0.893 (p < 0.0001) and 0.717 (p = 0.010), respectively, both statistically significant.

3.2.2. Calibration: Calibration plot

Figure 7 is the calibration plot, which compares the mean predicted probability with the observed probability in each group after the patients were divided into six groups according to the probability predicted by the nomogram. The coefficients of determination ($R^2$) of the two calibration plots were 0.871 and 0.820, respectively (Qiu et al., 2016). Since the estimated regression line did not deviate significantly from the ideal line, the developed nomogram is judged to be suitable for predicting the incidence rate of delirium.
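The calibration summary described above can be sketched as follows. The example sorts patients by predicted probability, cuts them into six groups of nearly equal size, and computes the coefficient of determination of the least-squares line through the (mean predicted, observed) points. Equal-size grouping is an assumption of this sketch, since the paper does not say how the six groups were formed, and the data are synthetic.

```python
def calibration_groups(predicted, observed, n_groups=6):
    """Return (mean predicted probability, observed proportion) for each group."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    groups = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        groups.append((mean_pred, obs_rate))
    return groups


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Synthetic data: 10 patients at each predicted level, with a chosen number of events.
predicted, observed = [], []
for level, events in zip([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [1, 1, 3, 5, 5, 6]):
    predicted += [level] * 10
    observed += [1] * events + [0] * (10 - events)

groups = calibration_groups(predicted, observed)
r2 = r_squared([g[0] for g in groups], [g[1] for g in groups])
print(groups)
print(round(r2, 3))
```

Points lying exactly on the 45-degree ideal line would give an $R^2$ of 1; the synthetic data above deviate from it slightly and give a value a little below 1.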

4. Discussion

The purpose of this study was to select the risk factors for delirium, a reversible organic mental disorder, and to develop a nomogram that predicts the incidence rate of delirium using the selected risk factors. The data were collected from elderly patients over 70 years of age who were admitted to the internal medicine ward via the emergency room of a general hospital during the 24-month period from January 2008 to December 2009. Of these patients, 42 were diagnosed with delirium due to underlying disease through psychiatric treatment. We used 31 of the 32 delirium risk factors reported in 2009, excluding age. We randomly divided the data into a training set (n = 276) and a test set (n = 138), constructed the nomogram using the training set, and validated it using the test set. First, we performed the chi-square test or Fisher’s exact test on the training set to test the association between delirium and each of the 31 risk factors. As a result, 12 of the 31 risk factors showed significant results. When these 12 risk factors were entered into the multiple logistic regression model, nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics were selected as the final risk factors. We then constructed a nomogram to predict the incidence rate of delirium using these 5 risk factors. In validation, the AUC values for the training set and the test set were 0.893 (p < 0.0001) and 0.717 (p = 0.010), respectively, both statistically significant. The coefficients of determination of the calibration plots were 0.871 and 0.820, respectively.
Meanwhile, the relative importance of the risk factors for delirium can easily be seen in the developed nomogram shown in Figure 4. With the levels of the other risk factors held fixed, it is easy to see how the incidence rate changes with the value of a particular risk factor. Therefore, health care workers can easily predict the incidence rate of delirium for an individual patient and can use the nomogram as a useful tool in establishing a treatment plan for the patient. However, additional external validation using data from patients at other institutions may be necessary, because this study developed the nomogram based on patients diagnosed with delirium at a single institution. Further study is ongoing so that the nomogram can be generalized to a larger patient population.

Figures
Fig. 1. Example of nomogram.
Fig. 2. Calculation result of LP, point, and total point.
Fig. 3. SAS codes for nomogram development.
Fig. 4. Nomogram for predicting the probability of delirium.
Fig. 5. Incidence rate calculation example using the nomogram.
Fig. 6. Receiver operating characteristic (ROC) curve.
Fig. 7. Calibration plot.
TABLES

### Table 1

Risk factors associated with delirium

| Predisposing | Acute illness | Pharmacology |
|---|---|---|
| Dementia | Infection | Anaesthetics |
| Nursing home | Hypoxia | Analgesics |
| Alcohol abuse | Metabolic abnormality | Antibiotics |
| Smoking | Electrolyte imbalance | Anticholinergic |
| Visual impairment | Malnutrition | Antihistamine |
| Hearing loss | Hemodynamic instability | Antihypertensives |
| Bun/Cre | CNS disorder | Bronchodilator |
| Congestive heart failure | Seizure | Diuretic |
| Depression | Vascular problem | Sedative |
| | | Steroid |

### Table 2

Chi-square, Fisher’s exact test results for 12 risk factors

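| Risk factor | Level | Non-delirium (%) | Delirium (%) | p-value |
|---|---|---|---|---|
| Nursing home | Yes | 30 (78.9) | 8 (21.1) | 0.040* |
| | No | 217 (91.2) | 21 (8.8) | |
| Stroke/Epilepsy | Yes | 20 (64.5) | 11 (35.5) | < 0.0001* |
| | No | 227 (92.7) | 18 (7.3) | |
| Bun/Cre | Abnormal | 22 (75.9) | 7 (24.1) | 0.021* |
| | Normal | 225 (91.1) | 22 (8.9) | |
| Hypoxia | Abnormal | 78 (83.9) | 15 (16.1) | 0.030 |
| | Normal | 169 (92.3) | 14 (7.7) | |
| Metabolic abnormality | Abnormal | 31 (66.0) | 16 (34.0) | < 0.0001* |
| | Normal | 216 (94.3) | 13 (5.7) | |
| Electrolyte imbalance | Abnormal | 57 (81.4) | 13 (18.6) | 0.011 |
| | Normal | 190 (92.2) | 16 (7.8) | |
| Hemodynamic instability | Abnormal | 24 (72.7) | 9 (27.3) | 0.003* |
| | Normal | 223 (91.8) | 20 (8.2) | |
| Analgesics | Yes | 24 (72.7) | 9 (27.3) | 0.003* |
| | No | 223 (91.8) | 20 (8.2) | |
| Bronchodilator | Yes | 79 (83.2) | 16 (16.8) | 0.013 |
| | No | 168 (92.8) | 13 (7.2) | |
| Cardiac drug | Yes | 20 (76.9) | 6 (23.1) | 0.04* |
| | No | 227 (90.8) | 23 (9.2) | |
| Diuretic | Yes | 37 (80.4) | 9 (19.6) | 0.036* |
| | No | 210 (91.3) | 20 (8.7) | |
| Sedative | Yes | 25 (75.8) | 8 (24.2) | 0.012* |
| | No | 222 (91.4) | 21 (8.6) | |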

*Fisher’s exact test

### Table 3

Multiple logistic regression analysis results using the 12 risk factors

| Variable | Variable contrast | Odds ratio | 95% CI | p-value |
|---|---|---|---|---|
| Nursing home | Yes versus no | 4.640 | 1.565–13.76 | 0.006 |
| Stroke/Epilepsy | Yes versus no | 11.674 | 3.854–35.359 | 0.000 |
| Metabolic abnormality | Yes versus no | 4.989 | 1.934–12.864 | 0.001 |
| Hemodynamic instability | Yes versus no | 3.680 | 1.262–10.73 | 0.017 |
| Analgesics | Yes versus no | 4.336 | 1.43–13.145 | 0.010 |

*The p-value of the Hosmer–Lemeshow goodness-of-fit test is 0.166

References
1. Ahn JH (2013). Nomogram for Prediction of Prostate Cancer in Korean Men with Serum Prostate-Specific Antigen Less Than 10ng/mL, Busan, Busan University.
2. Arend E and Christensen M (2009). Delirium in the intensive care unit: A review, British association of critical care nurses. Nursing in Critical Care, 14, 145-154.
3. Cole M and Primeau F (1993). Prognosis of delirium in elderly hospital patients. Canadian Medical Association Journal, 149, 41-46.
4. D’Agostino RB, Grundy S, Sullivan LM, and Wilson P (2001). Validation of the Framingham coronary heart disease prediction scores. Journal of the American Medical Association, 286, 180-187.
5. Dubois M, Skrobik Y, Bergeron N, Dumont M, and Dial S (2001). Delirium in an intensive care unit: A study of risk factors. Intensive Care Medicine, 27, 1297-1304.
6. Heo MH and Lee YG (2008). Data-Mining Modeling and Example, Seoul, Hannarae.
7. Iasonos A, Schrag D, Raj GV, and Panageas KS (2008). How to build and interpret a nomogram for cancer prognosis. Journal of Clinical Oncology, 26, 1364-1370.
8. Inouye S (1994). The dilemma of delirium: Clinical and research controversies regarding diagnosis and evaluation of delirium in hospitalized elderly medical patients. The American Journal of Medicine, 97, 278-288.
9. Inouye S, Schlesinger M, and Lyndon T (1999). Delirium: A symptom of how hospital care is failing older persons and a window to improve quality of hospital care. The American Journal of Medicine, 106, 565-573.
10. Jang JE (2017). Identification of Risk Factors and Development of the Nomogram for Delirium (Master’s thesis), Yeungnam University, Gyeongsan.
11. Jun HJ (2015). Establishment of a Nomogram to Predict the Prognosis of Metastatic or Recurrent Gastric Cancer Patients, Seoul, Yonsei University.
12. Kang HC, Han ST, Choi JH, Lee SG, Kim ES, and Eom IH (2014). Data-Mining Methodology for Big Data Analysis, Seoul, Freedom academy.
13. Kim SH, Shin K, Kim HY, Cho YJ, Noh JK, Suh JS, and Yang WI (2014). Postoperative nomogram to predict the probability of metastasis in Enneking stage IIB extremity osteosarcoma. The BioMed Central Cancer, 12, 666.
14. Kwak KH, Do BS, Park SY, and Lee SB (2011). Original articles: risk factors for delirium in elderly patients visiting an emergency department. Journal of the Korean Society of Emergency Medicine, 22, 489-493.
15. Lee JW, Park MR, and Yu HN (2005). Statistical Method for Bioscience Research, Seoul, Freedom Academy.
16. Lee KM, Kim WJ, and Yun SJ (2009). A clinical nomogram construction method using genetic algorithm and naïve bayesian technique. Journal of Korean Institute of Intelligent Systems, 19, 769-801.
17. Lee SC (2015). Development and Validation of Web-Based Nomogram to Predict Postoperative Invasive Component in Ductal Carcinoma in Situ at Core Needle Breast Biopsy, Seoul, Dankook University.
18. Nam BH and D’Agostino RB (2002). Discrimination Index, the Area Under the ROC Curve, Goodness-of-Fit Tests and Model Validity, Boston, Birkhauser.
19. Qiu SQ, et al. (2016). A nomogram to predict the probability of axillary lymph node metastasis in early breast cancer patients with positive axillary ultrasound. Scientific Reports, 6.
20. Skrobik Y, Ouimet S, and Kavanagh BP (2007). Incidence, risk factor and consequences of ICU delirium. Intensive Care Medicine, 33, 66-73.
21. Vuk M and Curk T (2006). ROC curve, lift chart and calibration plot. Metodološki Zvezki, 3, 89-108.
22. Yang D (Array). Build prognostic nomograms for risk assessment using SAS. Proceedings of SAS Global Forum, 264-2013.