In medical research, the risk factors associated with human diseases need to be identified in order to predict the incidence rate and determine the treatment plan. Logistic regression analysis is primarily used to select risk factors. However, individuals who are unfamiliar with statistics have trouble interpreting the results of these methods. In this study, we identify the risk factors for delirium and develop a nomogram that graphically represents the numerical association between the disease and its risk factors, so that the results can be interpreted and used more effectively. Using a logistic regression model, we identify risk factors related to delirium, construct a nomogram, and predict incidence rates. We then validate the nomogram using a receiver operating characteristic (ROC) curve and a calibration plot. Nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics were selected as risk factors. In validation, the AUC of the nomogram was statistically significant for both the training set (0.893) and the test set (0.717), and the coefficient of determination of the calibration plot was 0.820. With the nomogram developed in this paper, health professionals can easily predict the incidence rate of delirium for individual patients, and the nomogram could therefore serve as a useful tool for establishing an individual’s treatment plan.
People today suffer from a variety of diseases brought on by irregular lifestyles and stress. Alongside physical illness, mental illnesses such as depression, impulse control disorders, and dementia are increasing rapidly, and interest in these disorders has grown because they cause various social problems. This study deals with delirium, one such mental disorder. Delirium is a reversible organic psychosis characterized by a confused state and agitated behavior, and it is closely associated with, and underlies, a variety of mental illnesses. In elderly patients in particular, the incidence rate of delirium is 60%. It is clinically significant in terms of the mortality and morbidity of the underlying diseases (Cole and Primeau, 1993; Inouye, 1994; Inouye
The data used in this study were collected from elderly patients over 70 years of age who were admitted to the internal medicine ward via the emergency room at a general hospital during the 24-month period from January 2008 to December 2009. Information on gender, age, medical history, social history, and vital signs was obtained from a survey of medical records. Delirium was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) (Kwak
The logistic regression model is often applied in exploratory analyses that identify risk factors associated with a disease in pathologic studies and important factors in clinical research (Lee
Here, the non-linear equation for
Following from the former equation, the odds of success are,
Taking the logarithm of the equation above gives,
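In the standard formulation of logistic regression, the three expressions described above (the probability of success, the odds, and the log odds) take the following form. The notation $\pi(x)$, $\beta_0, \ldots, \beta_p$, and $x_1, \ldots, x_p$ is assumed here for illustration and may differ from the notation of the original equations.

```latex
\begin{align}
\pi(x) &= \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}
               {1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}, \\
\frac{\pi(x)}{1 - \pi(x)} &= \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p), \\
\log\frac{\pi(x)}{1 - \pi(x)} &= \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p .
\end{align}
```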
The above equation is called a logistic regression model. In this paper, we build a logistic regression model with delirium as the response variable and identify the risk factors. Finally, we develop a nomogram, a statistical tool designed to predict the probability of a particular outcome, and use it to predict the incidence rate of delirium.
In medical research, the risk factors associated with human diseases need to be identified in order to predict the incidence rate and establish a treatment plan. Logistic regression analysis is primarily used to select risk factors. However, the results of these methods are difficult to understand for people who have never studied statistics. A nomogram is a statistical tool that addresses this problem. It is built from a number of variables associated with a specific disease, together with the characteristics of the patient, and is designed to predict the probability of a particular outcome. A nomogram has a points line (with values between 0 and 100) that indicates the score assigned to each level of each risk factor, calculated from the estimated regression coefficients, and a total points line that represents the cumulative sum of the scores assigned to the risk factors. It also has a probability line whose values are obtained from the total points. Each risk factor is scored from its estimated regression coefficient and expressed as an integer from 0 to 100. The most influential factor, that is, the one whose estimated regression coefficient is largest in absolute value, receives 100 points. The score of each remaining factor is the absolute value of its regression coefficient divided by the absolute value of the most influential factor’s regression coefficient, multiplied by 100. The specific development process of a nomogram is as follows (Iasonos
Step 1. Definition of patient population: This study includes 414 elderly patients over 70 years of age who were admitted to the internal medicine ward via the emergency room at a general hospital during the 24-month period from January 2008 to December 2009.
Step 2. Designation of the outcome of interest: What can be inferred with this nomogram? The incidence rate of delirium.
Step 3. Designation of risk factors that are expected to affect the outcome of interest.
Step 4. Construction of the nomogram
Model selection: When the outcome variable is binary, a logistic regression model is generally used, as in this study.
Risk factor selection: Chi-square test and Fisher’s exact test
Analysis using the selected model (the logistic regression model): When a logistic regression model is used, goodness of fit is assessed with the Hosmer-Lemeshow test.
Nomogram construction
1) Calculation of the LP
2) Point calculation for each risk factor using the LP value: In Figure 1, the point is the length of the line that indicates the impact of each risk factor, such as flow, age, clinical size, and sex. The longer this line, the greater the effect of the risk factor on the disease.
In the example, to calculate the point when sex is male, max LP_{*}
3) Calculation of points per unit of linear predictor is,
4)
(a) LP_{for }
(b) LP for
5) Total points = Points per unit of linear predictor × (LP_{for TP>0} − LP_{for TP=0}) : This is a value shown in the total points in Figure 1 (Iasonos
The nomogram can be constructed from the scores calculated using the above formulas, as shown in Figure 1. The developed nomogram can be used to predict the incidence rate or survival rate of an individual patient for a specific disease. For example, for a patient with flow (no), age (75), clinical size (14), and sex (m), the points for each factor are read from the nomogram in Figure 1 and summed to obtain the total points. From this value, an incidence rate of about 50% is predicted.
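The relationship in item 5) can be inverted to go from a total-points value back to a predicted probability. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the numeric values for the points per unit of linear predictor and for the LP at zero total points are invented purely for illustration. They are not taken from Figure 1 or from the fitted delirium model.

```python
import math


def total_points(lp, lp_at_zero_points, points_per_unit):
    """Total points = points per unit of LP x (LP - LP at zero total points)."""
    return points_per_unit * (lp - lp_at_zero_points)


def probability_from_total_points(tp, lp_at_zero_points, points_per_unit):
    """Invert the total-points formula to recover the LP, then apply the logistic function."""
    lp = lp_at_zero_points + tp / points_per_unit
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical illustration values (assumptions, not estimates from the paper).
POINTS_PER_UNIT = 40.0     # points assigned per unit of linear predictor
LP_AT_ZERO_POINTS = -3.0   # LP of a patient whose total points are zero

tp = total_points(0.0, LP_AT_ZERO_POINTS, POINTS_PER_UNIT)                   # 120.0
p = probability_from_total_points(tp, LP_AT_ZERO_POINTS, POINTS_PER_UNIT)   # 0.5
print(tp, p)
```

With these assumed values, a linear predictor of 0 corresponds to 120 total points and a predicted probability of 0.5, which is how the probability line of a nomogram is aligned with its total points line.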
Step 5. Validation
(a) Discrimination: area under the ROC curve (AUC)
The receiver operating characteristic (ROC) curve is one of the methods traditionally used to evaluate the performance of a predictive model in the field of discriminant analysis (Kang
(b) Calibration: A calibration plot checks how well the probability predicted by the nomogram agrees with the observed probability (D’Agostino
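The AUC used for discrimination equals the probability that a randomly chosen patient with the event receives a higher predicted probability than a randomly chosen patient without it, with ties counted as one half. A minimal sketch of that pairwise computation follows; the function name and the toy scores are hypothetical and are not the study data.

```python
def auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: predicted probabilities and observed outcomes (1 = event).
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is acceptable for a data set of a few hundred records; a rank-based formula gives the same value for larger samples.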
Table 2 shows the variables that were statistically significant in the chi-square test or Fisher’s exact test, applied to the training set to assess the association between delirium and each of the 31 risk factors. Of the 31 risk factors, 12 showed significant results. Table 3 shows the result of entering these 12 risk factors into a multiple logistic regression model, with stepwise selection used for variable selection. The Hosmer-Lemeshow goodness-of-fit test suggests a good fit, with a p-value of 0.166. The risk factors finally selected are nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics.
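As an illustration of the screening step, the sketch below computes a two-sided Fisher’s exact test for a single 2×2 table using only the Python standard library. It is not the authors’ procedure; the function name is hypothetical, and the two-sided rule used here (summing the probabilities of all tables no more likely than the observed one) is one common convention, so the result may differ slightly from other software.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Nursing home counts from Table 2: rows are yes/no, columns are non-delirium/delirium.
print(fisher_exact_two_sided(30, 8, 217, 21))
```

The printed value can be compared with the p-value reported for nursing home in Table 2.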
In this study, we constructed a nomogram to predict the incidence rate of delirium using the 5 risk factors finally selected through multiple logistic regression analysis. Figure 2 shows the values required for the development of the nomogram, calculated according to the construction procedure introduced in Section 2.2. First, the estimated regression coefficients obtained from the logistic regression model were used to obtain the LP values for each risk factor, and the point values, which are the lengths of the lines representing the influence of each risk factor, were calculated. In addition, the points per unit of LP, the LP value for zero total points, and the LP value for positive total points were calculated, and the corresponding total points and incidence were determined. Next, a SAS data set containing the points for each risk factor, the total points, and the incidence values was created, and the nomogram was drawn using a SAS program. Figure 3 shows the SAS code used in the development. The developed nomogram is shown in Figure 4.
In Figure 4, the nomogram has a points line that indicates the score assigned to each level of each risk factor, and one line showing the influence of each of the 5 risk factors: nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics. The total points line represents the cumulative sum of the scores assigned to the risk factors, and the risk of event line gives the predicted incidence rate of delirium corresponding to the total points. The line for each risk factor is assigned a score from the estimated regression coefficient of the multiple logistic regression model; the longer the line, the greater the influence of that factor on the onset of delirium. In Figure 4, stroke/epilepsy is assigned 100 points and is the factor with the greatest impact on the incidence of delirium. Metabolic abnormality is second, followed by nursing home, analgesics, and hemodynamic instability. The lines for all factors other than stroke/epilepsy are similar in length, whereas the line for stroke/epilepsy is considerably longer, which indicates that a history of stroke/epilepsy has a very large effect on the onset of delirium. The incidence rate of delirium for an individual patient can be predicted by summing the scores the patient receives for each factor in the nomogram in Figure 4. The higher the total points, the higher the risk of delirium.
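The ordering of the factors can be checked against the odds ratios in Table 3 with the scoring rule described earlier (absolute coefficient divided by the largest absolute coefficient, multiplied by 100). The sketch below assumes that each regression coefficient can be recovered as the natural logarithm of the reported odds ratio. Because the odds ratios are rounded, the resulting points are approximate and need not match Figure 4 exactly. The function name is hypothetical.

```python
import math

# Odds ratios as reported in Table 3.
odds_ratios = {
    "Nursing home": 4.640,
    "Stroke/epilepsy": 11.674,
    "Metabolic abnormality": 4.989,
    "Hemodynamic instability": 3.680,
    "Analgesics": 4.336,
}


def nomogram_points(odds_ratios):
    """Score each factor as 100 * |beta| / max|beta|, with beta assumed equal to ln(odds ratio)."""
    betas = {name: math.log(ratio) for name, ratio in odds_ratios.items()}
    max_abs = max(abs(b) for b in betas.values())
    return {name: 100.0 * abs(b) / max_abs for name, b in betas.items()}


points = nomogram_points(odds_ratios)
ranking = sorted(points, key=points.get, reverse=True)
for name in ranking:
    print(f"{name}: {points[name]:.1f}")
```

Under this assumption the ranking is stroke/epilepsy, metabolic abnormality, nursing home, analgesics, and hemodynamic instability, which agrees with the order described for Figure 4.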
For instance, for a patient with nursing home (no), stroke/epilepsy (yes), metabolic abnormality (no), hemodynamic instability (yes), and analgesics (yes), the total points are obtained by summing the points for each factor as shown in Figure 5, and from this value the incidence rate of delirium is predicted to be greater than 75%. Considering that the incidence rate of delirium in hospitalized elderly patients is 60% (Cole and Primeau, 1993; Inouye, 1994; Inouye
Figure 6 shows the ROC curves drawn using the training data and the test data. The AUCs of the training set and the test set were statistically significant, 0.893 (
Figure 7 is the calibration plot, which compares the mean predicted probability with the observed probability in each group after the patients were divided into six groups according to the probability predicted by the nomogram. The coefficient of determination (
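The grouping behind a calibration plot of this kind can be sketched as follows. This is an illustration under assumptions rather than the authors’ code. Patients are sorted by predicted probability and split into equal-sized groups, and the coefficient of determination is taken to be the squared Pearson correlation between the group means of the predicted probability and the observed event proportions. The function names and the toy data are hypothetical.

```python
def calibration_groups(predicted, observed, n_groups=6):
    """Return (mean predicted probability, observed event proportion) for each group."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    groups = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer patients than groups
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        groups.append((mean_pred, obs_rate))
    return groups


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    return sxy * sxy / (sxx * syy)


# Toy data (not the study data): six patients split into three groups.
groups = calibration_groups([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 0, 0, 1, 1, 1], n_groups=3)
print(groups)
print(r_squared([g[0] for g in groups], [g[1] for g in groups]))
```

Points lying close to the diagonal, together with a coefficient of determination near 1, indicate that the predicted probabilities agree well with the observed proportions.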
The purpose of this study was to select the risk factors for delirium, a reversible organic mental disorder, and to develop a nomogram that predicts the incidence rate of delirium using the selected risk factors. The data were collected from elderly patients over 70 years of age who were admitted to the internal medicine ward via the emergency room at a general hospital during the 24-month period from January 2008 to December 2009. Of these patients, 42 were diagnosed with delirium due to underlying disease through psychiatric treatment. We used 31 risk factors associated with delirium, that is, the 32 factors reported in 2009 with age excluded. In addition, we randomly divided the raw data into training set (
Table 1. Risk factors associated with delirium
Predisposing | Acute illness | Pharmacology |
---|---|---|
Dementia | Infection | Anaesthetics |
Nursing home | Hypoxia | Analgesics |
Alcohol abuse | Metabolic abnormality | Antibiotics |
Smoking | Electrolyte imbalance | Anticholinergic |
Visual impairment | Malnutrition | Antihistamine |
Hearing loss | Hemodynamic instability | Antihypertensives |
BUN/Cre | CNS disorder | Bronchodilator |
Stroke/epilepsy | Head trauma | Cardiac drug |
Congestive heart failure | Seizure | Diuretic |
Depression | Vascular problem | Sedative |
| | | Steroid |
Table 2. Chi-square and Fisher’s exact test results for 12 risk factors
Risk factor | Level | Non-delirium (%) | Delirium (%) | p-value |
---|---|---|---|---|
Nursing home | Yes | 30 (78.9) | 8 (21.1) | 0.040* |
| | No | 217 (91.2) | 21 (8.8) | |
Stroke/Epilepsy | Yes | 20 (64.5) | 11 (35.5) | < 0.0001* |
| | No | 227 (92.7) | 18 (7.3) | |
BUN/Cre | Abnormal | 22 (75.9) | 7 (24.1) | 0.021* |
| | Normal | 225 (91.1) | 22 (8.9) | |
Hypoxia | Abnormal | 78 (83.9) | 15 (16.1) | 0.030 |
| | Normal | 169 (92.3) | 14 (7.7) | |
Metabolic abnormality | Abnormal | 31 (66.0) | 16 (34.0) | < 0.0001* |
| | Normal | 216 (94.3) | 13 (5.7) | |
Electrolyte imbalance | Abnormal | 57 (81.4) | 13 (18.6) | 0.011 |
| | Normal | 190 (92.2) | 16 (7.8) | |
Hemodynamic instability | Abnormal | 24 (72.7) | 9 (27.3) | 0.003* |
| | Normal | 223 (91.8) | 20 (8.2) | |
Analgesics | Yes | 24 (72.7) | 9 (27.3) | 0.003* |
| | No | 223 (91.8) | 20 (8.2) | |
Bronchodilator | Yes | 79 (83.2) | 16 (16.8) | 0.013 |
| | No | 168 (92.8) | 13 (7.2) | |
Cardiac drug | Yes | 20 (76.9) | 6 (23.1) | 0.04* |
| | No | 227 (90.8) | 23 (9.2) | |
Diuretic | Yes | 37 (80.4) | 9 (19.6) | 0.036* |
| | No | 210 (91.3) | 20 (8.7) | |
Sedative | Yes | 25 (75.8) | 8 (24.2) | 0.012* |
| | No | 222 (91.4) | 21 (8.6) | |

*Fisher’s exact test
Table 3. Multiple logistic regression analysis results using the 12 risk factors
Variable | Variable contrast | Odds ratio | 95% CI | p-value |
---|---|---|---|---|
Nursing home | Yes versus no | 4.640 | 1.565–13.76 | 0.006 |
Stroke/Epilepsy | Yes versus no | 11.674 | 3.854–35.359 | < 0.001 |
Metabolic abnormality | Yes versus no | 4.989 | 1.934–12.864 | 0.001 |
Hemodynamic instability | Yes versus no | 3.680 | 1.262–10.73 | 0.017 |
Analgesics | Yes versus no | 4.336 | 1.43–13.145 | 0.010 |

*The p-value of the Hosmer–Lemeshow goodness-of-fit test is 0.166.