Quantile regression (Koenker and Bassett Jr., 1978) is one of the most common statistical techniques for inference on conditional quantile functions. It can be used to build models that estimate the percentiles of a conditional distribution, and it is robust to outliers. Quantile regression has therefore been applied in many fields of study, such as economics, clinical studies, and epidemiology (Eide and Showalter, 1998; Zietz
Just as classical regression estimates conditional mean models by minimizing the sum of squared residuals, quantile regression estimates conditional quantile models by minimizing a sum of weighted absolute residuals. Because quantile regression uses the absolute loss function, which is not continuously differentiable, the optimal solution for the quantile estimators can be difficult to obtain (Newey and Powell, 1987). Newey and Powell (1987) therefore proposed asymmetric least squares (ALS) estimation, which uses the square loss function called
For quantile regression methods to apply to models with a nonlinear relationship between the covariates and the response, Koenker
In this study, we extend the method discussed in Efron (1992) by applying asymmetric maximum likelihood estimation (AMLE) to generalized nonlinear models, and we use the parametric bootstrap method to obtain confidence intervals for the estimated parameters and their smoothing functions.
The paper is organized as follows. In Section 2, we introduce the AMLE method for the generalized nonlinear percentile model and the parametric bootstrap method for computing 95% confidence intervals of each percentile estimate. In Section 3, we conduct simulation studies with three response distributions from the exponential family to evaluate the statistical and computational performance of the proposed method. In Section 4, we apply the method to real-life data: the chlorine data and the Life Span Study (LSS) cohort data. In Section 5, we summarize and discuss our results.
The generalized nonlinear percentile regression uses the asymmetric maximum likelihood estimation (AMLE) method to estimate the generalized nonlinear models. In the next two subsections, we first describe the generalized nonlinear models and then provide the AMLE method to estimate the parameter coefficients for the generalized nonlinear models. In Section 2.2, we present the algorithms of AMLE and the parametric bootstrap methods to compute the variance estimates of the parameter coefficients of interest. In addition, we rewrite the expectile regression as the percentile regression to clarify the meaning of the method in this manuscript.
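To illustrate how an expectile-type fit can be read as a percentile fit, the following is a minimal sketch for the simplest case: a normal response with no covariates. The function names, the weighting convention (weight w on observations above the fit and 1 otherwise), the bisection search, and the synthetic data are all illustrative assumptions and are not taken from the algorithms of this manuscript.

```python
import numpy as np

def expectile(y, w, tol=1e-12, max_iter=200):
    """Location m minimizing sum_i a_i * (y_i - m)^2,
    with a_i = w if y_i > m else 1 (assumed weighting convention)."""
    m = y.mean()
    for _ in range(max_iter):
        a = np.where(y > m, w, 1.0)          # asymmetric weights at the current fit
        m_new = np.sum(a * y) / np.sum(a)    # weighted least-squares update
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

def weight_for_percentile(y, tau, lo=1e-4, hi=1e4, n_steps=80):
    """Bisection on log(w) so that the share of observations at or below
    the asymmetric fit is approximately tau."""
    log_lo, log_hi = np.log(lo), np.log(hi)
    for _ in range(n_steps):
        mid = 0.5 * (log_lo + log_hi)
        share = np.mean(y <= expectile(y, np.exp(mid)))
        if share < tau:
            log_lo = mid                     # fit too low: increase the weight
        else:
            log_hi = mid                     # fit high enough: decrease the weight
    w = np.exp(log_hi)
    return w, expectile(y, w)

# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
y = rng.normal(size=2000)
w75, fit75 = weight_for_percentile(y, 0.75)
print(w75, fit75, np.mean(y <= fit75))
```

The search relies on the fit being nondecreasing in w, so the share of observations below the fit is a nondecreasing step function of the weight. With w = 1 the fit reduces to the sample mean.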
Suppose that data consists of (
where the parameter
where
where (
Table 1 shows the canonical link functions corresponding to the conditional distribution of the response variable given the covariates. Generally, we take the identity, logarithm, logit, and negative inverse link functions for the normal, Poisson, Bernoulli or binomial, and Gamma distributions, respectively. Unlike the other canonical links, the canonical link function for the Gamma distribution does not cover the whole real line: the predictor function can take values that correspond to a negative mean, even though the mean of the Gamma distribution must be positive. In this case, a noncanonical link function such as the logarithm link is used as an alternative.
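The point about the Gamma link can be checked numerically. The sketch below writes each canonical link and its inverse in terms of the mean; the dictionary name and the example predictor value are assumptions made for illustration.

```python
import numpy as np

# Canonical link g(mu) and inverse link for each distribution (standard GLM results).
CANONICAL_LINKS = {
    "normal":    (lambda mu: mu,                    lambda eta: eta),
    "poisson":   (lambda mu: np.log(mu),            lambda eta: np.exp(eta)),
    "bernoulli": (lambda mu: np.log(mu / (1 - mu)), lambda eta: 1 / (1 + np.exp(-eta))),
    "gamma":     (lambda mu: -1.0 / mu,             lambda eta: -1.0 / eta),
}

eta = 0.5  # hypothetical value of the predictor function

# Under the negative inverse link, a positive predictor implies a negative mean,
# which is not a valid Gamma mean.
mean_canonical = CANONICAL_LINKS["gamma"][1](eta)

# Under the logarithm link, the implied mean is positive for any real predictor.
mean_log_link = np.exp(eta)

print(mean_canonical, mean_log_link)
```

The logarithm link avoids the problem without constraining the parameters, at the cost of no longer being canonical.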
We extend generalized linear models to generalized nonlinear models using a nonlinear predictor function
The AML method uses deviance to estimate the parameters of generalized nonlinear percentile models. In general, the deviance is defined as
where the two different parameters
From this deviance function, we compute the weight of the part of
Back to the formula (
where
where
Since
To estimate confidence intervals for all percentile parameters, we use the parametric bootstrap method, because specifying the link function implies an assumed conditional distribution of the response, and the parametric bootstrap requires knowledge of the population distribution. When the distributional assumption is correct, the parametric bootstrap is more powerful and yields smaller variances than the nonparametric bootstrap, particularly when the sample size is relatively small. In addition, when the response variable is sparse, that is, when it contains a large number of zero values, the parametric bootstrap gives more stable results than the nonparametric bootstrap. The parametric bootstrap algorithm is given in Algorithm 2.
Algorithm 1. AMLE for generalized nonlinear percentile model
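As a rough illustration of asymmetric maximum likelihood, the sketch below fits a Poisson model with a log link by iteratively reweighted least squares, weighting the deviance contribution of each observation by w when it lies above the current fit and by 1 otherwise. Several simplifications are assumed: the predictor is linear rather than nonlinear, the weighting convention follows the usual asymmetric form, and the function name, starting values, and synthetic data are hypothetical. It should not be read as the algorithm of this manuscript.

```python
import numpy as np

def aml_poisson_fit(X, y, w=1.0, max_iter=100, tol=1e-10):
    """Asymmetric ML sketch for a Poisson model with log link.

    Minimizes sum_i a_i * D(y_i, mu_i), where D is the Poisson deviance,
    mu_i = exp(x_i' beta), and a_i = w if y_i > mu_i else 1 (assumed convention).
    The first column of X is assumed to be an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                 # start from the intercept-only ML fit
    for _ in range(max_iter):
        eta = X @ beta
        mu = np.exp(eta)
        a = np.where(y > mu, w, 1.0)           # asymmetric weights
        W = a * mu                             # IRLS working weights
        z = eta + (y - mu) / mu                # IRLS working response
        XtW = X.T * W
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 300
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = rng.poisson(np.exp(0.5 + 1.0 * x)).astype(float)

beta_ml = aml_poisson_fit(X, y, w=1.0)         # w = 1 gives ordinary ML
beta_hi = aml_poisson_fit(X, y, w=4.0)         # w > 1 pushes the fit upward
share_ml = np.mean(y <= np.exp(X @ beta_ml))
share_hi = np.mean(y <= np.exp(X @ beta_hi))
print(beta_ml, beta_hi, share_ml, share_hi)
```

The share of observations at or below the fitted curve is what links a given weight w to a percentile rank; in practice w would be tuned until that share matches the target percentile.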
Algorithm 2. Computation of parametric bootstrap confidence intervals for generalized nonlinear percentile model
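A parametric bootstrap of this kind can be sketched as follows for a Poisson model with a log link. The sketch assumes that bootstrap responses are drawn from the Poisson model fitted by ordinary ML (w = 1) and that percentile intervals are taken over the refitted asymmetric estimates; the steps of Algorithm 2 may differ. The function names, the linear predictor, the fixed weight, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def aml_poisson_fit(X, y, w=1.0, n_iter=50):
    """Compact asymmetric-ML fit (Poisson, log link); a_i = w if y_i > mu_i else 1.
    The first column of X is assumed to be an intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(max(y.mean(), 1e-8))
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        W = np.where(y > mu, w, 1.0) * mu      # asymmetric IRLS weights
        z = eta + (y - mu) / mu                # working response
        beta = np.linalg.solve((X.T * W) @ X, (X.T * W) @ z)
    return beta

def parametric_bootstrap_ci(X, y, w, n_boot=200, level=0.95, seed=0):
    """Percentile confidence intervals for the asymmetric estimates."""
    rng = np.random.default_rng(seed)
    beta_w = aml_poisson_fit(X, y, w)
    # Assumed generating model: Poisson with the ordinary ML mean.
    mu_ml = np.exp(X @ aml_poisson_fit(X, y, 1.0))
    boot = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        y_star = rng.poisson(mu_ml).astype(float)   # simulate from the fitted model
        boot[b] = aml_poisson_fit(X, y_star, w)     # refit on the simulated response
    alpha = 1.0 - level
    lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2], axis=0)
    return beta_w, lower, upper

# Synthetic data, for illustration only.
rng = np.random.default_rng(2)
n = 150
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = rng.poisson(np.exp(0.5 + 1.0 * x)).astype(float)

beta_w, lower, upper = parametric_bootstrap_ci(X, y, w=4.0)
print(beta_w, lower, upper)
```

Because every bootstrap response is generated from the assumed distribution rather than resampled from the data, the procedure remains usable when the observed response contains many zeros.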
In this section, we conduct simulation studies for generalized nonlinear percentile regression. We consider the three conditional distributions (normal, exponential, and Poisson) of a response given the covariates and three mean functions (one linear and two nonlinear models). The Monte Carlo runs of 1000 with sample sizes
where
We assume that the error distribution follows the normal distribution with the standard deviation
For
Table 3 and Figure 1 present the estimated percentile models for each mean function. Table 3 shows that the RMSE tends to decrease as the sample size increases. Moreover, the RMSE decreases as the quantile concentrates on
In
In the second numerical example, we consider the conditional distribution of the response
Table 4 and Figure 2 show the numerical results, which demonstrate that the RMSE tends to be smaller for large
Finally, we assume that the response
Table 5 and Figure 3 illustrate that the RMSEs decrease as the sample size increases or as
We apply generalized nonlinear percentile regression using AMLE to two data sets: data from a chemical study of the proportion of available chlorine in a product over time (Draper and Smith, 1981; Smith and Dubey, 1964) in Section 4.1, and data from a radiation-associated epidemiological study, called the Life Span Study (LSS), of health effects in atomic bomb survivors in Japan Preston
Draper and Smith (1981) stated a problem due to Smith and Dubey (1964) about a certain product having 50% available chlorine at the time of manufacturing. When it reached the customer eight weeks later, the level of available chlorine had dropped to 49%. It is known that the level should stabilize at approximately 30%. To predict how long the chemical would last at the customer site, samples were collected at different times. The data includes 44 observations in which the response is the fraction of available chlorine, and the covariate is the length of time between when the product was produced and when it was used. The fraction of available chlorine in the product decreases with time. It was postulated that the following nonlinear model fits the data,
where
Table 6 and Figure 4 show that the MLEs of the parameters of interest have the same values as the parameter estimates of the 50% percentile model. This result is similar to that described in Section 3.1. Figure 4 illustrates that the line corresponding to the MLEs divides the data points in half. For the 25% and 75% percentile models,
We also apply the proposed method to the LSS cohort data from the Radiation Effects Research Foundation (RERF), which has conducted health-related research among atomic bomb survivors in Hiroshima and Nagasaki, Japan, for more than 70 years. The data cover the period 1958–1998 and include 111,952 people with 2,939,361 person-years; the data consist of cases, person-years, city, gender, attained age, age at exposure, total weighted
where
Table 7 and Figure 5 illustrate that when
The percentile rank
We studied generalized nonlinear percentile regression using AMLE. We proposed a method for estimating the parameters of nonlinear percentile models, together with a parametric bootstrap algorithm that is more powerful than the nonparametric bootstrap, which is advantageous for small samples with sparse responses. The simulation results show that the proposed method has comparable asymptotic performance, but the performance along with quantile
The AMLE method estimates percentile models by minimizing an asymmetric version of the deviance function. In this manuscript, we presented an algorithm for finding the appropriate weight corresponding to the percentile rank
A topic for future work on generalized percentile regression, including generalized nonlinear percentile regression, is to investigate approximate methods for the appropriate
Table 1. Canonical link functions for distributions that are members of the exponential family. Note that in the Gamma distribution,
Distribution  Distribution mean  Canonical link function 

Normal( 

Poisson( 
log 

Bernoulli( 
log{ 

Gamma( 
−( 
Table 2. Deviance
Distribution  Deviance 

Normal  ( 
Poisson  2{ 
Bernoulli  −2{ 
Gamma  2 
Table 3. Monte Carlo RMSE when the response is normal.
Linear  Nonlinear 1  Nonlinear 2  

RMSE  RMSE  RMSE  
0.148  0.200  0.166  0.143  0.493  0.167  0.142  0.499  0.203  0.205  
0.479  0.201  0.153  0.477  0.494  0.152  0.475  0.500  0.202  0.188  
0.745  0.201  0.143  0.745  0.495  0.141  0.748  0.500  0.202  0.177  
0.999  0.201  0.139  0.998  0.496  0.138  1.004  0.500  0.201  0.172  
1.272  0.201  0.143  1.268  0.496  0.143  1.273  0.500  0.201  0.178  
1.530  0.200  0.152  1.525  0.497  0.150  1.533  0.500  0.201  0.188  
1.855  0.200  0.163  1.853  0.497  0.165  1.863  0.501  0.201  0.205  
ML  0.999  0.201  0.003  0.997  0.499  0.004  1.000  0.500  0.202  0.159 
0.153  0.199  0.095  0.163  0.496  0.095  0.155  0.500  0.199  0.115  
0.473  0.199  0.085  0.477  0.496  0.086  0.475  0.500  0.199  0.104  
0.744  0.199  0.082  0.750  0.496  0.082  0.748  0.501  0.199  0.098  
1.001  0.199  0.083  1.004  0.496  0.081  1.000  0.501  0.199  0.097  
1.253  0.199  0.083  1.256  0.496  0.082  1.252  0.501  0.199  0.101  
1.524  0.199  0.087  1.526  0.496  0.085  1.522  0.501  0.199  0.106  
1.841  0.198  0.097  1.842  0.496  0.095  1.846  0.501  0.199  0.117  
ML  0.998  0.199  0.003  1.003  0.498  0.003  1.000  0.501  0.199  0.090 
0.152  0.201  0.072  0.149  0.502  0.072  0.156  0.500  0.200  0.091  
0.474  0.201  0.066  0.470  0.502  0.065  0.474  0.500  0.200  0.081  
0.746  0.200  0.065  0.743  0.501  0.063  0.748  0.500  0.200  0.078  
1.001  0.200  0.063  0.997  0.501  0.063  1.002  0.500  0.200  0.076  
1.254  0.200  0.065  1.251  0.501  0.064  1.254  0.500  0.200  0.078  
1.527  0.200  0.069  1.523  0.500  0.068  1.524  0.500  0.200  0.083  
1.846  0.199  0.078  1.842  0.500  0.075  1.846  0.500  0.200  0.092  
ML  1.000  0.200  0.000  0.996  0.502  0.003  0.999  0.500  0.200  0.071 
Table 4. Monte Carlo RMSE when the response is exponential.
Linear  Nonlinear 1  Nonlinear 2  

RMSE  RMSE  RMSE  
−0.478  0.502  0.504  −0.480  0.183  0.473  −0.490  0.572  0.918  0.635  
−0.032  0.501  0.668  −0.029  0.186  0.630  −0.038  0.539  0.474  0.804  
0.338  0.500  0.851  0.334  0.187  0.782  0.329  0.532  0.428  0.998  
0.632  0.500  1.082  0.634  0.189  0.976  0.632  0.526  0.372  1.256  
0.933  0.500  1.422  0.928  0.191  1.201  0.932  0.523  0.356  1.609  
1.199  0.499  1.869  1.198  0.192  1.592  1.202  0.522  0.357  2.129  
1.491  0.500  2.537  1.489  0.192  2.162  1.494  0.524  0.377  2.960  
ML  0.990  0.500  1.366  0.990  0.191  1.112  0.985  0.522  0.355  1.617 
−0.501  0.502  0.297  −0.500  0.194  0.264  −0.505  0.508  0.272  0.338  
−0.032  0.501  0.402  −0.033  0.194  0.345  −0.037  0.505  0.250  0.453  
0.325  0.501  0.507  0.326  0.195  0.437  0.323  0.504  0.242  0.579  
0.632  0.500  0.626  0.632  0.195  0.541  0.628  0.504  0.239  0.730  
0.914  0.500  0.787  0.912  0.195  0.671  0.906  0.504  0.237  0.925  
1.186  0.500  1.025  1.185  0.195  0.860  1.182  0.504  0.238  1.210  
1.476  0.500  1.430  1.475  0.194  1.145  1.476  0.504  0.238  1.689  
ML  0.997  0.500  0.801  0.997  0.195  0.625  0.993  0.504  0.237  0.964 
−0.497  0.501  0.228  −0.498  0.198  0.201  −0.503  0.505  0.237  0.271  
−0.030  0.501  0.304  −0.029  0.199  0.264  −0.034  0.503  0.227  0.362  
0.327  0.501  0.399  0.327  0.200  0.333  0.325  0.503  0.224  0.467  
0.632  0.500  0.504  0.632  0.200  0.412  0.632  0.503  0.223  0.591  
0.910  0.500  0.633  0.911  0.200  0.504  0.909  0.503  0.223  0.754  
1.185  0.500  0.821  1.185  0.200  0.655  1.184  0.504  0.224  0.992  
1.476  0.500  1.135  1.474  0.199  0.863  1.474  0.505  0.227  1.399  
ML  0.998  0.500  0.648  0.997  0.200  0.479  0.995  0.504  0.223  0.788 
Table 5. Monte Carlo RMSE when the response is Poisson.
Linear  Nonlinear 1  Nonlinear 2  

RMSE  RMSE  RMSE  
0.358  0.605  0.768  0.607  0.253  0.512  0.299  0.622  0.349  0.872  
0.611  0.560  0.636  0.760  0.231  0.495  0.587  0.567  0.280  0.727  
0.793  0.530  0.646  0.875  0.216  0.487  0.781  0.533  0.244  0.719  
0.967  0.502  0.598  0.978  0.204  0.484  0.962  0.502  0.220  0.690  
1.114  0.479  0.634  1.078  0.193  0.504  1.115  0.477  0.200  0.711  
1.255  0.458  0.683  1.168  0.183  0.527  1.258  0.454  0.184  0.766  
1.408  0.434  0.777  1.274  0.173  0.597  1.420  0.429  0.171  0.862  
ML  1.003  0.499  0.474  1.000  0.202  0.329  0.998  0.499  0.216  0.583 
0.376  0.591  0.664  0.605  0.248  0.388  0.343  0.603  0.287  0.720  
0.608  0.555  0.534  0.755  0.228  0.378  0.600  0.560  0.256  0.574  
0.792  0.528  0.529  0.873  0.214  0.373  0.781  0.532  0.232  0.564  
0.964  0.503  0.475  0.975  0.203  0.373  0.963  0.504  0.210  0.515  
1.091  0.484  0.505  1.069  0.192  0.386  1.096  0.484  0.195  0.527  
1.244  0.463  0.562  1.163  0.183  0.401  1.249  0.461  0.179  0.586  
1.385  0.444  0.624  1.267  0.173  0.429  1.396  0.441  0.165  0.650  
ML  1.000  0.500  0.344  0.999  0.201  0.199  1.000  0.500  0.207  0.395 
0.388  0.586  0.662  0.604  0.248  0.355  0.358  0.596  0.286  0.684  
0.612  0.553  0.534  0.756  0.228  0.347  0.605  0.557  0.256  0.535  
0.794  0.527  0.515  0.874  0.214  0.342  0.784  0.531  0.232  0.518  
0.965  0.503  0.454  0.975  0.203  0.345  0.965  0.504  0.210  0.469  
1.089  0.486  0.483  1.069  0.193  0.349  1.094  0.485  0.194  0.486  
1.243  0.465  0.543  1.164  0.183  0.360  1.245  0.464  0.177  0.552  
1.382  0.447  0.604  1.268  0.173  0.380  1.390  0.444  0.162  0.612  
ML  1.000  0.500  0.309  1.000  0.201  0.153  0.999  0.501  0.207  0.339 
Table 6. Nonlinear percentile estimates using an identity link function for chlorine data along with
0.25  0.388 (0.375,0.395)  0.128 (0.093,0.161) 
0.50  0.390 (0.378,0.399)  0.102 (0.076,0.132) 
0.75  0.388 (0.373,0.407)  0.075 (0.053,0.115) 
ML  0.390  0.102 
Table 7. Poisson nonlinear percentile estimates using a log-link function for Life Span Study data along with
0.75  0.000 (0.000,0.000)    −0.479 (−0.502, −0.397)  −0.039 (−0.073,0.104)  −1.577 (−1.611, −1.556)  −0.126 (−0.195,0.017) 
0.80  0.274 (0.167,0.39)    −0.509 (−0.733, −0.296)  0.154 (−0.057,0.358)  −1.729 (−2.377, −1.359)  −0.341 (−0.518, −0.16) 
0.85  0.995 (0.897,1.097)    −0.139 (−0.262, −0.069)  0.197 (0.131,0.314)  −1.458 (−1.835, −1.008)  −0.063 (−0.138,0.005) 
0.90  1.521 (1.44,1.612)    0.101 (0.028,0.197)  0.437 (0.334,0.496)  −1.254 (−1.659, −0.894)  −0.015 (−0.077,0.046) 
ML  0.510    −0.371  0.132  −1.621  −0.185 
0.75  0.000 (0.000,0.000)  0.000 (0.000,0.011)  −0.456 (−0.587, −0.305)  0.001 (−0.015,0.271)  −1.582 (−2.05, −1.573)  −0.155 (−1.44, −0.075) 
0.80  0.180 (0.068,0.324)  0.061 (0.004,0.124)  −0.502 (−0.728, −0.283)  0.166 (−0.065,0.364)  −1.723 (−2.271, −1.368)  −0.354 (−0.531, −0.165) 
0.85  0.980 (0.87,1.082)  0.000 (0.000,0.065)  −0.151 (−0.264, −0.061)  0.232 (0.139,0.33)  −1.442 (−1.839, −1.028)  −0.061 (−0.139,0.005) 
0.90  1.509 (1.436,1.624)  0.000 (0.000,0.000)  0.122 (0.017,0.189)  0.436 (0.341,0.505)  −1.292 (−1.648, −0.904)  −0.010 (−0.074,0.049) 
ML  0.451  0.035  −0.362  0.150  −1.617  −0.188 