2.1. Data structure
There are n subjects across G groups, measured at K genes (positions); each subject has a gene expression level at each gene. For the k^{th} gene we consider the linear model Y_{k} = Xβ_{k} + E_{k}, where Y_{k} = (Y_{1k}, …, Y_{nk})^{t} is the vector of gene expression levels collected from the n individuals across the G groups at the k^{th} gene, and A^{t} denotes the transpose of a matrix A. The n × G design matrix is
$$\mathbf{X}=\left[\begin{array}{cccccc}1& 0& 0& 0& 0& \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1& 0& 0& 0& 0& \cdots \\ 1& 1& \cdots & 0& 0& \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1& 1& \cdots & 0& 0& \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1& 0& 0& 0& \cdots & 1\\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1& 0& 0& 0& \cdots & 1\end{array}\right].$$
The G × 1 vector of regression parameters is β_{k} = (μ_{1k}, δ_{2k}, δ_{3k}, …, δ_{Gk})^{t}, where μ_{1k} denotes the average gene expression level at the k^{th} gene in the first group and δ_{jk} denotes the difference between the average expression level at the k^{th} gene in the j^{th} group and μ_{1k}, j = 2, …, G. The n × 1 error vector is E_{k} = (ϵ_{1k}, ϵ_{2k}, …, ϵ_{nk})^{t}. We estimate the parameter of interest β_{k} for each k = 1, …, K with the following M-estimator.
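As a concrete illustration of the design just described, the matrix X can be built as below; the function name and the per-group sample sizes are ours, chosen only for the example.

```python
import numpy as np

def design_matrix(group_sizes):
    """Build the n x G ANOVA design matrix described above:
    a leading column of ones, and for j >= 2 an indicator
    column for membership in group j (group 1 is the baseline).
    `group_sizes` lists the per-group sample sizes, summing to n."""
    G = len(group_sizes)
    n = sum(group_sizes)
    X = np.zeros((n, G))
    X[:, 0] = 1.0                       # intercept column
    row = 0
    for j, size in enumerate(group_sizes):
        if j > 0:
            X[row:row + size, j] = 1.0  # indicator for group j + 1
        row += size
    return X

X = design_matrix([3, 3, 2])            # n = 8 subjects, G = 3 groups
```

The rows of X then match the display above: group 1 rows are (1, 0, 0), group 2 rows are (1, 1, 0), and group 3 rows are (1, 0, 1).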
2.2. M-estimator
Let ρ: ℜ^{G} × 𝒳 → ℜ be a measurable function. We define an M-estimator M_{n} as a solution of the minimization, with respect to t ∈ ℜ^{G}, of
$$\sum _{i=1}^{n}\rho \left({Y}_{i}-{\mathbf{x}}_{i}^{t}\mathbf{t}\right),$$
where Y_{i} is the i^{th} observation of Y_{k} and x_{i} = (x_{i1}, …, x_{iG})^{t}, i = 1, …, n, denotes the i^{th} row of X. M_{n} is said to be regression equivariant if M_{n}(Y + Xb) = M_{n}(Y) + b for all b ∈ ℜ^{G}, and scale equivariant if M_{n}(cY) = cM_{n}(Y) for c > 0. In general, the second condition is not satisfied; fortunately, studentization makes M_{n} both regression and scale equivariant. We therefore consider a studentized M-estimator M_{n} of β_{k} as a solution of the minimization of
$$\sum _{i=1}^{n}\rho \left(\frac{{Y}_{i}-{\mathbf{x}}_{i}^{t}\mathbf{t}}{{S}_{n}}\right)$$
with respect to t (a G × 1 vector), where S_{n} = S_{n}(Y) is a scale statistic. The linear model above is the classical ANOVA model, but the distribution of Y_{1k}, …, Y_{nk} may not be Gaussian. Researchers may be interested in G groups that are stochastically ordered; for example, this applies to dose-response gene expression microarray data, introduced by Peddada et al. (2003), in which gene expression is stochastically increasing over time or dose. We define the null hypothesis H_{0k} to state that the G groups at the k^{th} gene are homogeneous, and the alternative hypothesis H_{1k} to state that the G groups at the k^{th} gene are ordered with increasing dominance. H_{0k} and H_{1k} are constructed as below.
$${H}_{0k}:{\delta}_{2k}={\delta}_{3k}=\cdots ={\delta}_{Gk}=0\quad \text{vs}\quad {H}_{1k}:0\le {\delta}_{2k}\le {\delta}_{3k}\le \cdots \le {\delta}_{Gk}.$$
These can be restated as the following hypotheses.
$${H}_{0k}:{\mathit{\theta}}_{k}=\mathbf{A}{\mathit{\beta}}_{k}=\mathbf{0}\quad \text{vs}\quad {H}_{1k}:{\mathit{\theta}}_{k}=\mathbf{A}{\mathit{\beta}}_{k}\ge \mathbf{0},$$
where the (G − 1) × G matrix
$$\mathbf{A}=\left(\begin{array}{cccccc}0& 1& 0& 0& 0& \cdots \\ 0& -1& 1& 0& 0& \cdots \\ 0& 0& -1& 1& 0& \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0& 0& 0& \cdots & -1& 1\end{array}\right).$$
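Under a successive-differences reading of A (one consistent reconstruction of the display above; the helper name is ours), Aβ_{k} stacks δ_{2k} and the increments δ_{j+1,k} − δ_{jk}, so Aβ_{k} ≥ 0 componentwise recovers the ordered alternative. A minimal sketch:

```python
import numpy as np

def contrast_matrix(G):
    """(G-1) x G matrix A with
    A @ beta = (delta_2, delta_3 - delta_2, ..., delta_G - delta_{G-1})
    for beta = (mu_1, delta_2, ..., delta_G).  Componentwise
    A @ beta >= 0 is equivalent to 0 <= delta_2 <= ... <= delta_G."""
    A = np.zeros((G - 1, G))
    A[0, 1] = 1.0                # first row extracts delta_2
    for r in range(1, G - 1):
        A[r, r] = -1.0           # successive-difference rows
        A[r, r + 1] = 1.0
    return A
```

For example, with G = 4 and β = (μ_1, δ_2, δ_3, δ_4) = (5, 1, 2, 4), Aβ = (1, 1, 2), all nonnegative, matching the ordering 0 ≤ 1 ≤ 2 ≤ 4.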
To test the null hypothesis, we consider alternatives in which the vector θ_{k} belongs to the positive orthant ℜ^{+(G−1)}. In the univariate case an optimal UMP test exists, but not in the multivariate case; for instance, the Hotelling T^{2} test produces a large confidence region and involves a loss of efficiency. It is therefore of interest to consider statistical inference under a restricted setup, and the union-intersection principle (UIP; Roy, 1953) can be used for inference under the positive-orthant multivariate alternative. We denote the first derivative of the ρ function by ψ. M_{n} should be a median-unbiased estimator of β_{k}; symmetry of F together with skew symmetry of ψ (that is, ψ(−t) = −ψ(t)) ensures the median-unbiasedness of M_{n}. We select the Huber loss function as a good candidate for ψ, and the resulting studentized minimization makes the estimator regression and scale equivariant. We define the Huber function as follows.
$$\rho (t)=\{\begin{array}{ll}c\mid t\mid -\frac{1}{2}\times {c}^{2},\hfill & \text{if\hspace{0.17em}}\mid t\mid \hspace{0.17em}>c,\hfill \\ \frac{1}{2}\times {t}^{2},\hfill & \text{if\hspace{0.17em}}\mid t\mid \hspace{0.17em}\le c.\hfill \end{array}$$
The derivative of the Huber function ψ is defined as follows.
$$\psi (t)=\{\begin{array}{ll}c\times \text{sign}(t),\hfill & \text{if\hspace{0.17em}}\mid t\mid \hspace{0.17em}>c,\hfill \\ t,\hfill & \text{if\hspace{0.17em}}\mid t\mid \hspace{0.17em}\le c.\hfill \end{array}$$
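Vectorized implementations of ρ and ψ are straightforward; the cutoff c = 1.345 used as a default here is a conventional tuning constant for the Huber function, not a value taken from the text.

```python
import numpy as np

def huber_rho(t, c=1.345):
    """Huber loss: quadratic on [-c, c], linear beyond."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= c,
                    0.5 * t ** 2,
                    c * np.abs(t) - 0.5 * c ** 2)

def huber_psi(t, c=1.345):
    """Derivative of the Huber loss: identity on [-c, c],
    clipped to +/- c beyond (i.e., c * sign(t))."""
    return np.clip(np.asarray(t, dtype=float), -c, c)
```

Note that the two branches of ρ agree at |t| = c, so ρ is continuously differentiable with derivative ψ.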
The ψ function can be decomposed into the sum
$$\psi ={\psi}_{a}+{\psi}_{b}+{\psi}_{c}.$$
Here ψ_{a} is an absolutely continuous function with an absolutely continuous derivative, ψ_{b} is an increasing step function, and ψ_{c} is a continuous, piecewise linear function. For the Huber loss, ψ itself is continuous and piecewise linear, so ψ_{a} = ψ_{b} = 0. This function satisfies the following conditions (Jurečková and Sen, 1996).

M1: S_{n} should be regression invariant and scale equivariant, S_{n} > 0 a.s., and n^{1/2}(S_{n} − S) = O_{p}(1).

M2: H(t) = ∫ ρ((z − t)/S)dF(z) has a unique minimum at t = 0.

M3: For some δ > 0 and η > 1,
$${\int}_{-\infty}^{\infty}{\left[\mid z\mid \underset{\mid u\mid \le \delta ,\mid v\mid \le \delta}{\text{sup}}\left|{\psi}_{a}^{\prime \prime}\left(\frac{{e}^{v}(z+u)}{S}\right)\right|\right]}^{\eta}dF(z)<\infty $$
and
$${\int}_{-\infty}^{\infty}{\left[{\mid z\mid}^{2}\underset{\mid u\mid \le \delta ,\mid v\mid \le \delta}{\text{sup}}\left|{\psi}_{a}^{\prime \prime}\left(\frac{{e}^{v}(z+u)}{S}\right)\right|\right]}^{\eta}dF(z)<\infty ,$$
where ${\psi}_{a}^{\prime}(z)=(d/dz){\psi}_{a}(z)$ and ${\psi}_{a}^{\prime \prime}(z)=({d}^{2}/d{z}^{2}){\psi}_{a}(z)$.

M4: ψ_{c}(z) should be a continuous and piecewise linear function with knots at μ_{1}, …, μ_{r}, so that the derivative ${\psi}_{c}^{\prime}(z)$ is a step function
$${\psi}_{c}^{\prime}(z)={\alpha}_{\nu},\mathrm{\hspace{0.17em}\u200a\u200a}\mathrm{\hspace{0.17em}\u200a\u200a}\mathrm{\hspace{0.17em}\u200a\u200a}{\mu}_{\nu}<z<{\mu}_{\nu +1},\mathrm{\hspace{0.17em}\u200a\u200a}\nu =0,1,\dots ,r,$$
where α_{0}, α_{1}, …, α_{r} ∈ ℜ, α_{0} = α_{r} = 0, and −∞ = μ_{0} < μ_{1} < ··· < μ_{r} < μ_{r+1} = ∞. f(z) is assumed to be bounded in a neighborhood of Sμ_{1}, …, Sμ_{r}.

M5: ψ_{b}(z) = λ_{ν} for q_{ν} < z ≤ q_{ν+1}, ν = 0, 1, …, m, where −∞ = q_{0} < q_{1} < ··· < q_{m+1} = ∞ and −∞ < λ_{0} < λ_{1} < ··· < λ_{m} < ∞. f(z) and f′(z) are assumed to be bounded in neighborhoods of Sq_{1}, …, Sq_{m}. M_{n} can be represented asymptotically through the following functionals:
$$\begin{array}{l}{\gamma}_{1}={S}^{-1}{\int}_{-\infty}^{\infty}\left({\psi}_{a}^{\prime}\left(\frac{z}{S}\right)+{\psi}_{c}^{\prime}\left(\frac{z}{S}\right)\right)dF(z),\\ {\gamma}_{2}={S}^{-1}{\int}_{-\infty}^{\infty}z\left({\psi}_{a}^{\prime}\left(\frac{z}{S}\right)+{\psi}_{c}^{\prime}\left(\frac{z}{S}\right)\right)dF(z).\end{array}$$
Moreover, the following conditions are satisfied.
$$\begin{array}{l}\text{X}1.\hspace{0.17em}{x}_{i1}=1,\hspace{0.17em}i=1,\dots ,n,\\ \text{X}2.\hspace{0.17em}{n}^{-1}\sum _{i=1}^{n}{\Vert {\mathbf{x}}_{i}\Vert}^{4}={O}_{p}(1),\\ \text{X}3.\hspace{0.17em}\underset{n\to \infty}{\text{lim}}\hspace{0.17em}{\mathbf{Q}}_{n}=\mathbf{Q},\end{array}$$
where Q_{n} = n^{−1}X^{t}X and Q is a positive definite G × G matrix.
Under these conditions, M_{n} is a solution of the equation
$$\sum _{i=1}^{n}{\mathbf{x}}_{i}\psi \left(\frac{{Y}_{i}-{\mathbf{x}}_{i}^{t}\mathbf{t}}{{S}_{n}}\right)=\mathbf{0}.$$
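This estimating equation can be solved by iteratively reweighted least squares (IRLS) with Huber weights w(r) = ψ(r)/r. The sketch below is ours, not the authors' algorithm: it assumes a fixed scale S_n supplied by the caller, and the default c = 1.345 is a conventional choice.

```python
import numpy as np

def m_estimate(X, Y, S_n, c=1.345, tol=1e-10, max_iter=200):
    """Solve sum_i x_i * psi((Y_i - x_i^t t)/S_n) = 0 for t by IRLS.

    At each step the weights w_i = psi(r_i)/r_i turn the equation
    into a weighted least-squares problem X^t W X t = X^t W Y."""
    t = np.linalg.lstsq(X, Y, rcond=None)[0]       # OLS starting value
    for _ in range(max_iter):
        r = (Y - X @ t) / S_n
        w = np.ones_like(r)
        big = np.abs(r) > c
        w[big] = c / np.abs(r[big])                # psi(r)/r for |r| > c
        XtW = X.T * w                              # X^t diag(w)
        t_new = np.linalg.solve(XtW @ X, XtW @ Y)
        if np.max(np.abs(t_new - t)) < tol:
            break
        t = t_new
    return t_new
```

On clean data the weights are all one and the solution coincides with ordinary least squares; an observation with a large residual is downweighted by c/|r|, which is what gives the estimator its robustness.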
We calculate S_{n} as follows in order to make it scale equivariant and regression invariant, using the regression scores defined below.
For α ∈ (0, 1), let â_{n}(α) = (â_{n1}(α), …, â_{nn}(α))^{t} be the unique solution that maximizes
$$\sum _{i=1}^{n}{Y}_{i}{\widehat{a}}_{ni}(\alpha )$$
with the constraint
$$\sum _{i=1}^{n}{x}_{ij}{\widehat{a}}_{ni}(\alpha )=(1-\alpha )\sum _{i=1}^{n}{x}_{ij},\quad j=1,\dots ,G.$$
Hájek (1965) proposed the scores
$${a}_{n}^{*}({R}_{i},\alpha )=\{\begin{array}{ll}0,\hfill & \text{if\hspace{0.28em}}\frac{{R}_{i}}{n}<\alpha ,\hfill \\ {R}_{i}-n\alpha ,\hfill & \text{if\hspace{0.28em}}\frac{{R}_{i}-1}{n}<\alpha <\frac{{R}_{i}}{n},\hfill \\ 1,\hfill & \text{if\hspace{0.28em}}\alpha <\frac{{R}_{i}-1}{n}.\hfill \end{array}$$
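These scores decrease linearly from 1 to 0 as α crosses the interval ((R_i − 1)/n, R_i/n). A direct transcription (the boundary values follow from continuity; the function name is ours):

```python
def hajek_score(R_i, alpha, n):
    """Hajek score a*_n(R_i, alpha): 1 below (R_i - 1)/n,
    0 above R_i / n, and linear (R_i - n * alpha) in between."""
    if alpha <= (R_i - 1) / n:
        return 1.0
    if alpha >= R_i / n:
        return 0.0
    return R_i - n * alpha
```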
We choose an increasing and square-integrable function ϕ: (0, 1) → ℜ with ϕ(α) = −ϕ(1 − α), 0 < α < 1. For a number α_{0} (0 < α_{0} < 1/2), ϕ is assumed to be standardized so that
$${\int}_{{\alpha}_{0}}^{1-{\alpha}_{0}}{\varphi}^{2}(\alpha )d\alpha =1.$$
We define regression scores by ϕ as
$${\widehat{b}}_{ni}=-{\int}_{{\alpha}_{0}}^{1-{\alpha}_{0}}\varphi (\alpha )\,d{\widehat{a}}_{ni}(\alpha ),\quad i=1,\dots ,n.$$
That is,
$${\widehat{b}}_{ni}=\{\begin{array}{ll}n{\int}_{({R}_{i}-1)/n}^{{R}_{i}/n}\varphi (\alpha )d\alpha ,\hfill & \text{if\hspace{0.28em}}{\alpha}_{0}\le \frac{{R}_{i}-1}{n},\hspace{0.17em}\frac{{R}_{i}}{n}\le 1-{\alpha}_{0},\hfill \\ n{\int}_{({R}_{i}-1)/n}^{1-{\alpha}_{0}}\varphi (\alpha )d\alpha ,\hfill & \text{if\hspace{0.28em}}{\alpha}_{0}\le \frac{{R}_{i}-1}{n}\le 1-{\alpha}_{0}<\frac{{R}_{i}}{n},\hfill \\ n{\int}_{{\alpha}_{0}}^{{R}_{i}/n}\varphi (\alpha )d\alpha ,\hfill & \text{if\hspace{0.28em}}\frac{{R}_{i}-1}{n}<{\alpha}_{0},\hspace{0.17em}\frac{{R}_{i}}{n}\le 1-{\alpha}_{0},\hfill \\ 0,\hfill & \text{if\hspace{0.28em}}1-{\alpha}_{0}<\frac{{R}_{i}-1}{n},\hfill \\ n{\int}_{{\alpha}_{0}}^{1-{\alpha}_{0}}\varphi (\alpha )d\alpha ,\hfill & \text{else.}\hfill \end{array}$$
We define S_{n} as
$${S}_{n}={n}^{-1}\sum _{i=1}^{n}{Y}_{i}{\widehat{b}}_{ni}={n}^{-1}{\mathbf{Y}}^{t}{\widehat{\mathbf{b}}}_{n}.$$
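To make the construction concrete, here is a sketch of S_n in the simplest intercept-only case, where the regression scores â_{ni}(α) reduce to Hájek's rank scores and each b̂_{ni} becomes n times the integral of ϕ over the overlap of ((R_i − 1)/n, R_i/n) with (α_0, 1 − α_0). The Wilcoxon-type score ϕ(α) = √12(α − 1/2) is our illustrative choice (its square integrates to 1 over (0, 1), only approximately over the trimmed interval), and the function name is hypothetical, not the general regression-score computation.

```python
import numpy as np

def scale_statistic(Y, alpha0=0.05):
    """S_n = n^{-1} * sum_i Y_i * b_hat_ni in the intercept-only case,
    using Wilcoxon-type phi(a) = sqrt(12) * (a - 1/2), whose
    antiderivative is Phi(a) = sqrt(12) * (a - 1/2)**2 / 2."""
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    R = np.argsort(np.argsort(Y)) + 1          # ranks of Y (ties not handled)
    lo = np.clip((R - 1) / n, alpha0, 1 - alpha0)
    hi = np.clip(R / n, alpha0, 1 - alpha0)
    Phi = lambda a: np.sqrt(12.0) * (a - 0.5) ** 2 / 2.0
    b = n * (Phi(hi) - Phi(lo))                # b_hat_ni, integrated exactly
    return float(np.mean(Y * b)), b
```

Because ϕ is odd about 1/2, the scores b̂_{ni} sum to (numerically) zero, and S_n is positive for a sample with genuine spread.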