Repair policies of failure detection equipments and system availability

Seongryong Na $^{1,a}$, Sung-Hwan Bang $^{a}$

$^{a}$ Division of Data Science, Yonsei University, Korea
Correspondence to: $^{1}$ Division of Data Science, Yonsei University, 1 Yonseidai-Gil, Wonju, Gangwon-Do 26493, Korea. E-mail: nasr@yonsei.ac.kr
Received June 11, 2021; Revised August 20, 2021; Accepted August 24, 2021.
Abstract
The total system is composed of the main system (MS) and the failure detection equipment (FDE) which detects failures of MS. The analysis of system reliability is performed when the failure of FDE is possible. Several repair policies are considered to determine the order of repair of failed systems, which are sequential repair (SQ), priority repair (PR), independent repair (ID), and simultaneous repair (SM). The states of MS-FDE systems are represented by Markov models according to repair policies and the main purpose of this paper is to derive the system availabilities of the Markov models. Analytical solutions of the stationary equations are derived for the Markov models and the system availabilities are immediately determined using the stationary solutions. A simple illustrative example is discussed for the comparison of availability values of the repair policies considered in this paper.
Keywords : system availability, failure detection equipment, repair policy, reliability, Markov process
1. Introduction

A repairable system can conceptually consist of a main system (MS) and failure detection equipment (FDE). The MS performs the system functions and the FDE detects the failures of the MS. This structure can be found in telecommunication systems and in automated plant facilities under remote monitoring and control. Failures of the FDE delay the detection of MS failures and thus affect system availability. Reliability analysis that accounts for failure detection in safety systems has been carried out in Zhang et al. (2003) and Guo and Yang (2008). Recently, Na (2011) and Na and Bang (2013) have studied the effect of the FDE on system availability using Markov modeling.

We consider various situations regarding the occurrence of failures and the repair policies. For example, it can be assumed that the FDE does not break down while the MS is under repair (Na, 2011). However, if the FDE can fail after the MS has failed, several repair policies can be considered. In Na and Bang (2013), the system availability was derived under the scheme of sequential repair. In this paper, we consider further repair policies, namely priority repair, independent repair and simultaneous repair, and compare the resulting system availabilities with the previous ones.

This paper is organized as follows. In Section 2, the assumptions for the MS-FDE system are stated and the results of previous studies are provided. The system availability results for the various repair policies are given in Section 3. Concluding remarks are given in Section 4.

2. Markov modeling of main system (MS)-failure detection equipment (FDE) system

The total system consists of the main system (MS) and the failure detection equipment (FDE). The Markov modeling of the system is possible using the following assumptions.

• (A1) MS stays in normal state during an exponential time of mean $\lambda_M^{-1}$ before failure.

• (A2) The repair time of MS in failure is exponentially distributed with mean $\mu_M^{-1}$.

• (A3) The sojourn times of MS in normal state and the repair times are independent.

• (A4) The repair of MS starts immediately when the failure of MS is detected.

• (A5) The failure of MS is immediately detected when FDE operates.

If the FDE is assumed to operate without failure, the system availability Av (RN) follows from the elementary property of alternating renewal processes (Ross, 1996):

$Av(\mathrm{RN}) = \frac{\lambda_M^{-1}}{\lambda_M^{-1}+\mu_M^{-1}} = \frac{\mu_M}{\lambda_M+\mu_M}. \qquad (2.1)$

In practice the FDE itself can fail, and the following Markovian assumptions describe its failure and repair.

• (B1) FDE operates normally during an exponential time of mean $\lambda_D^{-1}$ before failure.

• (B2) The repair time for FDE is exponentially distributed with mean $\mu_D^{-1}$.

• (B3) The operation times and repair times of FDE are independent.

• (B4) The failure of FDE is immediately detected.

To specify the MS-FDE system completely, assumptions on the repair policy and on the failure mechanism of the FDE are also required. Suppose first that the FDE is shut down, and hence cannot fail, while a detected MS failure is being repaired. This situation is described by the following assumption.

• (NF) The failure of FDE does not occur when the failure of MS is detected and MS is in repair.

If we represent the normal state as 1 and the state in failure as 0, the state of MS and FDE can be represented as (1, 1), (0, 1), (1, 0), or (0, 0). The MS-FDE system assuming the conditions (A1)–(A5), (B1)–(B4) and (NF) can be defined as a continuous-time Markov process with the state space {(1, 1), (0, 1), (1, 0), (0, 0)} (Na, 2011).

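The transition diagram is shown in Figure 1, and the system availability Av (NF) is given by

$Av(\mathrm{NF}) = \frac{\mu_M}{\lambda_M+\mu_M}\cdot\frac{\mu_D(\lambda_M+\mu_M)(\lambda_M+\lambda_D+\mu_D)}{\mu_D(\lambda_M+\mu_M)(\lambda_M+\lambda_D+\mu_D)+\lambda_M\lambda_D\mu_M}. \qquad (2.2)$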

Note that since the failure of FDE is assumed, the system availability (2.2) is less than (2.1).
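As a numerical cross-check of (2.2), the stationary distribution of the (NF) chain can be computed exactly with rational arithmetic. The sketch below is only illustrative: the transition list is our reading of assumptions (A1)–(A5), (B1)–(B4) and (NF), since Figure 1 is not reproduced here, and the function names (`stationary`, `availability_nf`, `closed_form_nf`) are hypothetical.

```python
from fractions import Fraction

def stationary(rates, states):
    """Solve the balance equations plus normalization exactly (Gauss-Jordan)."""
    n = len(states)
    idx = {s: k for k, s in enumerate(states)}
    # Row j encodes "inflow into state j minus outflow from state j = 0".
    a = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for (src, dst), r in rates.items():
        i, j = idx[src], idx[dst]
        a[j][i] += Fraction(r)   # inflow to dst from src
        a[i][i] -= Fraction(r)   # outflow from src
    # The balance equations are linearly dependent, so the last one is
    # replaced by the normalization sum(p) = 1.
    a[n - 1] = [Fraction(1)] * (n + 1)
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        a[col] = [x / a[col][col] for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return {s: a[idx[s]][n] for s in states}

def availability_nf(lm, mm, ld, md):
    """Availability p(11) + p(10) of the (NF) chain (assumed transitions)."""
    states = ["11", "01", "10", "00"]
    rates = {
        ("11", "01"): lm,  # MS fails and is detected at once (A5)
        ("11", "10"): ld,  # FDE fails
        ("01", "11"): mm,  # MS repaired; FDE cannot fail meanwhile (NF)
        ("10", "11"): md,  # FDE repaired
        ("10", "00"): lm,  # MS fails while FDE is down (undetected)
        ("00", "01"): md,  # FDE repaired, MS failure now detected
    }
    p = stationary(rates, states)
    return p["11"] + p["10"]

def closed_form_nf(lm, mm, ld, md):
    """Equation (2.2) after cancelling the common factor (lm + mm)."""
    s = lm + ld + md
    return Fraction(mm * md * s, md * (lm + mm) * s + lm * ld * mm)

print(availability_nf(1, 1, 1, 1))  # 3/7
```

Passing integers or `Fraction` values keeps every step exact, so the solver output can be compared with the closed form by equality rather than within a tolerance.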

It is more practical to assume that the FDE can fail during the repair of MS if shutting down the FDE is costly. The state (0, 0) is then divided into (0, 0)A, where MS fails after FDE, and (0, 0)B, where FDE fails after MS. Sequential repair is the policy under which MS and FDE are repaired in the order in which their failures occur and are detected. The following is assumed for the sequential repair.

• (SQ) FDE fails with rate $\lambda_D'$ when MS is in repair and the sequential repair is performed.

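The MS-FDE system with the assumptions (A1)–(A5), (B1)–(B4) and (SQ) can be represented as a continuous-time Markov process with the state space {(1, 1), (0, 1), (1, 0), (0, 0)A, (0, 0)B}, and the transition diagram is given by Figure 2.

The system of Figure 2 has been studied in Na and Bang (2013), and its system availability is given by

$Av(\mathrm{SQ}) = \mu_M\mu_D(\lambda_M+\lambda_D+\mu_D)(\lambda_D'+\mu_M)\cdot C^{-1}, \qquad (2.3)$

where the normalizing constant is

$\begin{aligned} C ={}& \mu_M\mu_D\bigl(\lambda_D'(\lambda_M+\lambda_D)+\lambda_D\mu_M\bigr) + \lambda_M\mu_M\mu_D(\lambda_M+\lambda_D+\mu_D) + \mu_M\mu_D(\lambda_M\mu_M+\mu_M\mu_D+\lambda_D'\mu_D) \\ &+ \lambda_M\lambda_D'\mu_D(\lambda_M+\lambda_D+\mu_D) + \lambda_M\mu_M\bigl(\lambda_D'(\lambda_M+\lambda_D)+\lambda_D\mu_M\bigr). \end{aligned}$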

It can be seen that the system availability (2.3) of sequential repair is less than (2.2). Meanwhile, the mean time between failures (MTBF) is given by the following equation regardless of the repair policy (NF) or (SQ).

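$\mathrm{MTBF} = \lambda_M^{-1}.$

Another important reliability measure, the mean time to repair (MTTR), is equal to

$\mathrm{MTTR} = \mu_M^{-1} + \mu_D^{-1}\cdot\frac{\lambda_D}{\lambda_M+\lambda_D+\mu_D},$

under the condition (NF) and

$\mathrm{MTTR} = \mu_M^{-1} + \mu_D^{-1}\cdot\frac{\lambda_D(\mu_M+\lambda_D')+\lambda_M\lambda_D'}{(\mu_M+\lambda_D')(\lambda_M+\lambda_D+\mu_D)},$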

under the condition (SQ). It is obvious that the MTTR increases as the failure of FDE is assumed to occur more frequently. Refer to Na (2011) and Na and Bang (2013) for details.
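The availability and MTTR formulas can be checked against each other. The sketch below assumes the usual relation Av = MTBF / (MTBF + MTTR) for a system alternating between up and down periods; this relation is not stated in the text and is used here only as a consistency check. The function names and the parameter name `ldp` (standing for $\lambda_D'$) are illustrative choices.

```python
from fractions import Fraction

def av_sq(lm, mm, ld, md, ldp):
    """Closed form (2.3) for sequential repair; ldp stands for lambda_D'."""
    s = lm + ld + md
    t = ldp * (lm + ld) + ld * mm
    num = mm * md * s * (ldp + mm)
    c = (mm * md * t
         + lm * mm * md * s
         + mm * md * (lm * mm + mm * md + ldp * md)
         + lm * ldp * md * s
         + lm * mm * t)
    return Fraction(num, c)

def mttr_nf(lm, mm, ld, md):
    """MTTR under (NF)."""
    return Fraction(1, mm) + Fraction(1, md) * Fraction(ld, lm + ld + md)

def mttr_sq(lm, mm, ld, md, ldp):
    """MTTR under (SQ), written with mu_M in the first numerator term,
    which is the form consistent with the closed form (2.3)."""
    s = lm + ld + md
    frac = Fraction(ld * (mm + ldp) + lm * ldp, (mm + ldp) * s)
    return Fraction(1, mm) + Fraction(1, md) * frac

def av_from_means(mtbf, mttr):
    """Assumed relation Av = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

# All rates equal to 1: MTBF = 1 for both policies.
print(av_from_means(Fraction(1), mttr_nf(1, 1, 1, 1)))     # 3/7
print(av_from_means(Fraction(1), mttr_sq(1, 1, 1, 1, 1)))  # 2/5
```

Because the arithmetic is exact, agreement with (2.3) can also be tested at unequal rates, where a mistake in a single subscript would show up.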

3. Repair policies for system reliability analysis

We can consider various policies for the order of repair when both MS and FDE have failed. In practice, the repair policy is determined flexibly in view of system characteristics, operation mechanisms, the number of repairmen, and so on. In this section, the reliability results under priority repair, independent repair and simultaneous repair are examined in addition to sequential repair.

First, we consider priority repair, under which the failed MS is repaired preferentially. Under this policy it is assumed that the failure of MS is immediately detected and that the failed MS is repaired before the FDE. That is, if MS fails while FDE is in repair, the repairman repairs MS instead of FDE. This can be a practical policy when the failed MS is specially monitored and repairmen are scarce. The following assumption describes the priority repair.

• (PR) FDE fails with rate $\lambda_D'$ when MS is in repair and the priority repair is performed.

The MS-FDE system with the conditions (A1)–(A5), (B1)–(B4) and (PR) can be represented as a continuous-time Markov process with the state space {(1, 1), (1, 0), (0, 1), (0, 0)A, (0, 0)B}, and the transition diagram is given by Figure 3. The Markov process of Figure 3 can also be reduced to the process of Figure 4, where the two states (0, 0)A and (0, 0)B are combined into (0, 0). The system availability is provided in the following theorem by analyzing the Markov process of Figure 4.

### Theorem 1

If we assume the MS-FDE system with the conditions (A1)–(A5), (B1)–(B4) and (PR), then the system availability is equal to

$Av(\mathrm{PR}) = \frac{\mu_M}{\lambda_M+\mu_M}. \qquad (3.1)$
Proof

We obtain the following stationary equations for the Markov process of Figure 3.

• $(1,1): p(11)(\lambda_M+\lambda_D) = p(10)\mu_D + p(01)\mu_M$

• $(1,0): p(10)(\lambda_M+\mu_D) = p(11)\lambda_D + p(00A)\mu_M + p(00B)\mu_M$

• $(0,1): p(01)(\mu_M+\lambda_D') = p(11)\lambda_M$

• $(0,0)A: p(00A)\mu_M = p(10)\lambda_M$

• $(0,0)B: p(00B)\mu_M = p(01)\lambda_D'$

Combining the two states (0, 0)A and (0, 0)B into (0, 0), we obtain the following stationary equations, which correspond to Figure 4.

• $(1,1): p(11)(\lambda_M+\lambda_D) = p(10)\mu_D + p(01)\mu_M$

• $(1,0): p(10)(\lambda_M+\mu_D) = p(11)\lambda_D + p(00)\mu_M$

• $(0,1): p(01)(\mu_M+\lambda_D') = p(11)\lambda_M$

• $(0,0): p(00)\mu_M = p(01)\lambda_D' + p(10)\lambda_M$

The following equation is obtained by summing the stationary equations for states (1,1) and (1,0).

$p(11)\lambda_M + p(10)\lambda_M = p(01)\mu_M + p(00)\mu_M.$

Finally, the system availability is equal to

$Av(\mathrm{PR}) = p(11)+p(10) = 1-\bigl(p(01)+p(00)\bigr) = \frac{\mu_M}{\lambda_M+\mu_M}.$

Note that the availability (3.1) is equal to (2.1). The state of MS behaves as in the alternating renewal process under the assumption (PR) that MS is repaired preferentially.
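The last step of the proof of Theorem 1 can be written out explicitly. Writing $A = p(11)+p(10)$ for the availability (the symbol $A$ is introduced here only for illustration), the summed balance equation together with the normalization $p(11)+p(10)+p(01)+p(00)=1$ gives:

```latex
\begin{aligned}
\lambda_M \bigl(p(11)+p(10)\bigr) &= \mu_M \bigl(p(01)+p(00)\bigr), \\
\lambda_M A &= \mu_M (1 - A), \\
A &= \frac{\mu_M}{\lambda_M+\mu_M}.
\end{aligned}
```

None of $\lambda_D$, $\mu_D$ or $\lambda_D'$ enters this calculation, which is why the FDE parameters drop out of the availability under priority repair.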

Under the independent repair policy, MS and FDE are repaired independently whenever their failures are detected. If FDE fails in the course of the repair of MS, the repair of FDE starts in parallel. This policy is feasible when enough repairmen are available.

• (ID) FDE fails with rate $\lambda_D'$ when MS is in repair and MS and FDE are repaired independently.

The MS-FDE system with conditions (A1)–(A5), (B1)–(B4) and (ID) can be represented as a continuous-time Markov process with the state space {(1, 1), (1, 0), (0, 1), (0, 0)A, (0, 0)B}, and the transition diagram is given by Figure 5. The system availability under the independent repair policy is given in the following theorem.

### Theorem 2

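If we assume the MS-FDE system with the conditions (A1)–(A5), (B1)–(B4) and (ID), then the system availability is equal to

$Av(\mathrm{ID}) = C^{-1}\mu_M\mu_D(\lambda_M+\lambda_D+\mu_D)(\lambda_D'+\mu_M+\mu_D), \qquad (3.2)$

where the normalizing constant is given by

$\begin{aligned} C ={}& \mu_M\mu_D\bigl((\lambda_M+\mu_D)(\mu_M+\mu_D)+\lambda_D'\mu_D\bigr) + \mu_M\mu_D\bigl(\lambda_D(\mu_M+\mu_D)+\lambda_D'(\lambda_M+\lambda_D)\bigr) \\ &+ \lambda_M\mu_D(\lambda_M+\lambda_D+\mu_D)(\mu_M+\mu_D) + \lambda_M\mu_M\bigl(\lambda_D(\mu_M+\mu_D)+\lambda_D'(\lambda_M+\lambda_D)\bigr) + \lambda_M\lambda_D'\mu_D(\lambda_M+\lambda_D+\mu_D). \end{aligned}$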
Proof

The following stationary equations can be obtained for the Markov process of Figure 5.

• $(1,1): p(11)(\lambda_M+\lambda_D) = p(10)\mu_D + p(01)\mu_M$

• $(1,0): p(10)(\lambda_M+\mu_D) = p(11)\lambda_D + p(00B)\mu_M$

• $(0,1): p(01)(\mu_M+\lambda_D') = p(11)\lambda_M + p(00A)\mu_D + p(00B)\mu_D$

• $(0,0)A: p(00A)\mu_D = p(10)\lambda_M$

• $(0,0)B: p(00B)(\mu_M+\mu_D) = p(01)\lambda_D'$

We obtain the following equations from the above stationary equations.

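• $p(10) = p(11)\cdot\frac{\lambda_D(\mu_M+\mu_D)+\lambda_D'(\lambda_M+\lambda_D)}{(\lambda_M+\mu_D)(\mu_M+\mu_D)+\lambda_D'\mu_D}$

• $p(01) = p(11)\cdot\frac{\lambda_M(\lambda_M+\lambda_D+\mu_D)(\mu_M+\mu_D)}{\mu_M\bigl((\lambda_M+\mu_D)(\mu_M+\mu_D)+\lambda_D'\mu_D\bigr)}$

• $p(00A) = p(11)\cdot\frac{\lambda_M\bigl(\lambda_D(\mu_M+\mu_D)+\lambda_D'(\lambda_M+\lambda_D)\bigr)}{\mu_D\bigl((\lambda_M+\mu_D)(\mu_M+\mu_D)+\lambda_D'\mu_D\bigr)}$

• $p(00B) = p(11)\cdot\frac{\lambda_M\lambda_D'(\lambda_M+\lambda_D+\mu_D)}{\mu_M\bigl((\lambda_M+\mu_D)(\mu_M+\mu_D)+\lambda_D'\mu_D\bigr)}$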

Using that the probability sum is equal to 1, we have the following results.

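• $p(11) = C^{-1}\mu_M\mu_D\bigl((\lambda_M+\mu_D)(\mu_M+\mu_D)+\lambda_D'\mu_D\bigr)$

• $p(10) = C^{-1}\mu_M\mu_D\bigl(\lambda_D(\mu_M+\mu_D)+\lambda_D'(\lambda_M+\lambda_D)\bigr)$

• $p(01) = C^{-1}\lambda_M\mu_D(\lambda_M+\lambda_D+\mu_D)(\mu_M+\mu_D)$

• $p(00A) = C^{-1}\lambda_M\mu_M\bigl(\lambda_D(\mu_M+\mu_D)+\lambda_D'(\lambda_M+\lambda_D)\bigr)$

• $p(00B) = C^{-1}\lambda_M\lambda_D'\mu_D(\lambda_M+\lambda_D+\mu_D)$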

Finally, the system availability for the policy of independent repair becomes

$Av(\mathrm{ID}) = p(11)+p(10) = C^{-1}\mu_M\mu_D(\lambda_M+\lambda_D+\mu_D)(\lambda_D'+\mu_M+\mu_D).$

We observe that the system availability (3.2) of the independent repair policy has a form similar to (2.3) of sequential repair, and it is natural to expect that (3.2) is larger than (2.3). Note that the Markov process of Figure 5 for the independent repair policy is not reversible, which is apparent from the stationary equations of (0, 0)A and (0, 0)B.
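One way to gain confidence in the stationary probabilities of Theorem 2 is to substitute them back into the five balance equations and confirm that every residual vanishes. The following sketch does this with exact rational arithmetic; the function names and the parameter name `ldp` (for $\lambda_D'$) are illustrative, and integer or `Fraction` rates are assumed.

```python
from fractions import Fraction

def id_stationary(lm, mm, ld, md, ldp):
    """Stationary probabilities of the (ID) chain from the proof of Theorem 2."""
    s = lm + ld + md
    d = (lm + md) * (mm + md) + ldp * md
    n10 = ld * (mm + md) + ldp * (lm + ld)
    # Unnormalized weights; their sum is the normalizing constant C.
    w = {
        "11": mm * md * d,
        "10": mm * md * n10,
        "01": lm * md * s * (mm + md),
        "00A": lm * mm * n10,
        "00B": lm * ldp * md * s,
    }
    c = sum(w.values())
    return {k: Fraction(v, c) for k, v in w.items()}

def balance_residuals(p, lm, mm, ld, md, ldp):
    """Left side minus right side of each stationary equation of the (ID) chain."""
    return [
        p["11"] * (lm + ld) - (p["10"] * md + p["01"] * mm),
        p["10"] * (lm + md) - (p["11"] * ld + p["00B"] * mm),
        p["01"] * (mm + ldp) - (p["11"] * lm + p["00A"] * md + p["00B"] * md),
        p["00A"] * md - p["10"] * lm,
        p["00B"] * (mm + md) - p["01"] * ldp,
    ]

rates = (2, 5, 1, 3, 4)  # arbitrary illustrative values of lm, mm, ld, md, ldp
p = id_stationary(*rates)
print(all(r == 0 for r in balance_residuals(p, *rates)))  # True

q = id_stationary(1, 1, 1, 1, 1)
print(q["11"] + q["10"])  # 9/22
```

The substitution check avoids solving a linear system, so it verifies the algebra of the proof independently of any solver.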

We can also consider the policy of simultaneous repair, under which failed MS and FDE are repaired simultaneously. In particular, if FDE fails in the course of the repair of MS, both MS and FDE are repaired together. This policy seems appropriate when both systems are restarted at the same time after repair. The transition diagram of the simultaneous repair policy is given by Figure 6.

• (SM) FDE fails with rate $\lambda_D'$ when MS is in repair and MS and FDE are repaired simultaneously with rate $\mu_T$.

The system availability is provided in the following theorem.

### Theorem 3

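The system availability of the MS-FDE system under the policy of simultaneous repair satisfying the conditions (A1)–(A5), (B1)–(B4) and (SM) is given by

$Av(\mathrm{SM}) = C^{-1}\mu_D\mu_T(\lambda_M+\lambda_D+\mu_D)(\lambda_D'+\mu_M),$

where the normalizing constant is equal to

$\begin{aligned} C ={}& \mu_D\mu_T(\lambda_M+\mu_D)(\lambda_D'+\mu_M) + \lambda_D\mu_D\mu_T(\lambda_D'+\mu_M) + \lambda_M\mu_D\mu_T(\lambda_M+\lambda_D+\mu_D) \\ &+ \lambda_M\lambda_D\mu_T(\lambda_D'+\mu_M) + \lambda_M\lambda_D'\mu_D(\lambda_M+\lambda_D+\mu_D). \end{aligned}$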
Proof

The following stationary equations are valid for the Markov process of Figure 6.

• $(1,1): p(11)(\lambda_M+\lambda_D) = p(10)\mu_D + p(01)\mu_M + p(00B)\mu_T$

• $(1,0): p(10)(\lambda_M+\mu_D) = p(11)\lambda_D$

• $(0,1): p(01)(\mu_M+\lambda_D') = p(11)\lambda_M + p(00A)\mu_D$

• $(0,0)A: p(00A)\mu_D = p(10)\lambda_M$

• $(0,0)B: p(00B)\mu_T = p(01)\lambda_D'$

The following equations are obtained from the above stationary equations.

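• $p(10) = p(11)\cdot\frac{\lambda_D}{\lambda_M+\mu_D}$

• $p(01) = p(11)\cdot\frac{\lambda_M(\lambda_M+\lambda_D+\mu_D)}{(\lambda_M+\mu_D)(\lambda_D'+\mu_M)}$

• $p(00A) = p(11)\cdot\frac{\lambda_M\lambda_D}{\mu_D(\lambda_M+\mu_D)}$

• $p(00B) = p(11)\cdot\frac{\lambda_M\lambda_D'(\lambda_M+\lambda_D+\mu_D)}{\mu_T(\lambda_M+\mu_D)(\lambda_D'+\mu_M)}$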

Here, we derive the following stationary distribution.

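• $p(11) = C^{-1}\mu_D\mu_T(\lambda_M+\mu_D)(\lambda_D'+\mu_M)$

• $p(10) = C^{-1}\lambda_D\mu_D\mu_T(\lambda_D'+\mu_M)$

• $p(01) = C^{-1}\lambda_M\mu_D\mu_T(\lambda_M+\lambda_D+\mu_D)$

• $p(00A) = C^{-1}\lambda_M\lambda_D\mu_T(\lambda_D'+\mu_M)$

• $p(00B) = C^{-1}\lambda_M\lambda_D'\mu_D(\lambda_M+\lambda_D+\mu_D)$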

Finally, the system availability is obtained as follows.

$Av(\mathrm{SM}) = p(11)+p(10) = C^{-1}\mu_D\mu_T(\lambda_M+\lambda_D+\mu_D)(\lambda_D'+\mu_M).$
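The proof of Theorem 3 proceeds by expressing every stationary probability as a multiple of p(11) and then normalizing. The sketch below mirrors that substitution order step by step and compares the result with the closed form of the theorem. Function and parameter names (`ldp` for $\lambda_D'$, `mt` for $\mu_T$) are illustrative, and integer or `Fraction` rates are assumed.

```python
from fractions import Fraction

def sm_availability(lm, mm, ld, md, ldp, mt):
    """Availability under (SM), obtained by back-substitution as in the proof."""
    # Ratios p(state) / p(11), in the order the balance equations allow.
    r11 = Fraction(1)
    r10 = Fraction(ld, lm + md)                # from the (1,0) equation
    r00a = r10 * Fraction(lm, md)              # from the (0,0)A equation
    r01 = (r11 * lm + r00a * md) / (mm + ldp)  # from the (0,1) equation
    r00b = r01 * Fraction(ldp, mt)             # from the (0,0)B equation
    total = r11 + r10 + r01 + r00a + r00b      # normalization
    return (r11 + r10) / total

def sm_closed_form(lm, mm, ld, md, ldp, mt):
    """Closed form of Theorem 3."""
    s = lm + ld + md
    num = md * mt * s * (ldp + mm)
    c = (md * mt * (lm + md) * (ldp + mm)
         + ld * md * mt * (ldp + mm)
         + lm * md * mt * s
         + lm * ld * mt * (ldp + mm)
         + lm * ldp * md * s)
    return Fraction(num, c)

print(sm_availability(1, 1, 1, 1, 1, 1))  # 3/7
```

The (1,1) equation is not used in the substitution because it is implied by the other four balance equations.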

The system availabilities under the various repair policies can now be compared. It has been verified that Av (RN) > Av (NF) > Av (SQ). It can also be expected that Av (PR) > Av (SM) > Av (ID), but a direct verification seems difficult. Under the simple condition that all failure rates and repair rates are equal, the system availabilities take the following values:

$Av(\mathrm{RN}) = Av(\mathrm{PR}) = \frac{1}{2}, \quad Av(\mathrm{NF}) = Av(\mathrm{SM}) = \frac{3}{7}, \quad Av(\mathrm{ID}) = \frac{9}{22}, \quad Av(\mathrm{SQ}) = \frac{2}{5}.$

It can be seen that the above values are consistent with the intuitive expectation. A detailed comparison under more realistic parameter values is left for further study.
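The values above can be reproduced by evaluating the closed-form availabilities with exact fractions. The helper below is a minimal sketch that collects the formulas of Sections 2 and 3 in one place; the function name `availabilities` and the parameter names (`ldp` for $\lambda_D'$, `mt` for $\mu_T$) are our own, and integer or `Fraction` rates are assumed.

```python
from fractions import Fraction

def availabilities(lm, mm, ld, md, ldp, mt):
    """Closed-form system availability for each policy discussed in the paper."""
    s = lm + ld + md
    t = ldp * (lm + ld) + ld * mm            # recurring factor in C of (2.3)
    n10 = ld * (mm + md) + ldp * (lm + ld)   # recurring factor in C of (3.2)
    d = (lm + md) * (mm + md) + ldp * md

    rn = Fraction(mm, lm + mm)               # (2.1); also Av(PR) by Theorem 1
    nf = Fraction(mm * md * s, md * (lm + mm) * s + lm * ld * mm)  # (2.2)

    c_sq = (mm * md * t + lm * mm * md * s
            + mm * md * (lm * mm + mm * md + ldp * md)
            + lm * ldp * md * s + lm * mm * t)
    sq = Fraction(mm * md * s * (ldp + mm), c_sq)                  # (2.3)

    c_id = (mm * md * d + mm * md * n10 + lm * md * s * (mm + md)
            + lm * mm * n10 + lm * ldp * md * s)
    id_ = Fraction(mm * md * s * (ldp + mm + md), c_id)            # (3.2)

    c_sm = (md * mt * (lm + md) * (ldp + mm) + ld * md * mt * (ldp + mm)
            + lm * md * mt * s + lm * ld * mt * (ldp + mm)
            + lm * ldp * md * s)
    sm = Fraction(md * mt * s * (ldp + mm), c_sm)                  # Theorem 3

    return {"RN": rn, "PR": rn, "NF": nf, "SM": sm, "ID": id_, "SQ": sq}

# All failure and repair rates equal to 1, as in the example of the text.
for policy, value in availabilities(1, 1, 1, 1, 1, 1).items():
    print(policy, value)
```

Calling `availabilities` with other rate values gives a quick way to explore how the ordering of the policies behaves away from the equal-rate case, without claiming anything about the general ordering.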

4. Conclusion

Various repair policies can be considered for the analysis of system reliability when the FDE can fail. System characteristics, the operation method and the number of repairmen affect the choice of repair policy. In this paper, the system availability has been analyzed under priority repair, independent repair and simultaneous repair. The state of the MS-FDE system under each repair policy can be represented by a Markov process. We solved the stationary equations of each Markov process and derived the system availability of the MS-FDE system under each repair policy from the stationary probabilities. The system availability takes a different value for each repair policy.

The analysis of system reliability can be performed more effectively by using the system availability of an MS-FDE model that reflects the real situation exactly. From this viewpoint we have considered various repair policies in this paper. Further work could address reliability systems with non-Markovian properties, which may result from non-exponential working times and repair times; a numerical approach may be helpful there. For complex systems, new reliability models containing additional subsystems beyond the MS and FDE are also possible.

Figures
Fig. 1. Transition diagram of MS-FDE system where no failure (NF) of FDE is assumed when MS is in repair. The state (i, j) denotes that MS is in state i and FDE is in state j for i, j = 1 (normal) or 0 (failed).
Fig. 2. Transition diagram of MS-FDE system where sequential repair (SQ) is assumed. The state (i, j) denotes that MS is in state i and FDE is in state j for i, j = 1 (normal) or 0 (failed).
Fig. 3. Transition diagram of MS-FDE system where priority repair (PR) is assumed. The state (i, j) denotes that MS is in state i and FDE is in state j for i, j = 1 (normal) or 0 (failed).
Fig. 4. Transition diagram of MS-FDE system where priority repair (PR) is assumed (reduced states). The state (i, j) denotes that MS is in state i and FDE is in state j for i, j = 1 (normal) or 0 (failed).
Fig. 5. Transition diagram of MS-FDE system where independent repair (ID) is assumed. The state (i, j) denotes that MS is in state i and FDE is in state j for i, j = 1 (normal) or 0 (failed).
Fig. 6. Transition diagram of MS-FDE system where simultaneous repair (SM) is assumed. The state (i, j) denotes that MS is in state i and FDE is in state j for i, j = 1 (normal) or 0 (failed).
References
1. Guo H and Yang X (2008). Automatic creation of Markov models for reliability assessment of safety instrumented systems. Reliability Engineering and System Safety, 93, 807-815.
2. Na S (2011). Reliability analysis of repairable systems considering failure detection equipments. The Korean Journal of Applied Statistics, 24, 515-521.
3. Na S and Bang S (2013). The effect of failure detection equipment on system reliability. The Korean Journal of Applied Statistics, 26, 111-118.
4. Ross SM (1996). Stochastic Processes (2nd Ed), New York, Wiley.
5. Zhang T, Long W, and Sato Y (2003). Availability of systems with self-diagnostic components - applying Markov model to IEC 61508-6. Reliability Engineering and System Safety, 80, 133-141.