Metrology_StatisticalProperties

Stability

It’s the change in bias over time. Without assessing stability it's not possible to assure reliable evaluation of the other statistical properties.

Test Elements

• 1 reference unit
• 1 gauge
• 1 operator
• Measurements over time

Example:
You are asked to evaluate a gauge used to measure the height of a certain aerospace component. One such component is selected and its height is measured with the master gauge [A measuring device of a standard size that is used to calibrate other measuring instruments], giving 2.48 cms. Next, a single operator performs 100 measurements over 4 weeks: 5 measurements per day, 5 days per week (20 subgroups of size 5).

 Sample           Height (cms)                     Average   Range
 Week 1, Day 1    2.51  2.51  2.43  2.57  2.58     2.520     0.150
 Week 1, Day 2    2.61  2.54  2.42  2.45  2.46     2.496     0.190
 Week 1, Day 3    2.43  2.57  2.44  2.49  2.55     2.496     0.140
 Week 1, Day 4    2.51  2.52  2.52  2.48  2.47     2.500     0.050
 Week 1, Day 5    2.36  2.58  2.40  2.31  2.36     2.402     0.270
 Week 2, Day 1    2.56  2.53  2.42  2.38  2.50     2.478     0.180
 Week 2, Day 2    2.43  2.54  2.44  2.36  2.43     2.440     0.180
 Week 2, Day 3    2.48  2.49  2.55  2.48  2.49     2.498     0.070
 Week 2, Day 4    2.41  2.41  2.50  2.46  2.49     2.454     0.090
 Week 2, Day 5    2.40  2.53  2.53  2.43  2.59     2.496     0.190
 Week 3, Day 1    2.47  2.54  2.44  2.48  2.50     2.486     0.100
 Week 3, Day 2    2.42  2.51  2.48  2.44  2.47     2.464     0.090
 Week 3, Day 3    2.42  2.55  2.48  2.56  2.42     2.486     0.140
 Week 3, Day 4    2.46  2.53  2.51  2.45  2.45     2.480     0.080
 Week 3, Day 5    2.36  2.57  2.43  2.41  2.53     2.460     0.210
 Week 4, Day 1    2.55  2.47  2.50  2.44  2.61     2.514     0.170
 Week 4, Day 2    2.48  2.57  2.50  2.48  2.56     2.518     0.090
 Week 4, Day 3    2.45  2.50  2.54  2.59  2.42     2.500     0.170
 Week 4, Day 4    2.61  2.45  2.47  2.47  2.41     2.482     0.200
 Week 4, Day 5    2.49  2.42  2.53  2.38  2.46     2.456     0.150

The control chart of the data is given below:
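As a rough sketch of how the chart limits could be obtained from the table, the snippet below computes X-bar and R chart limits from the 20 subgroup averages and ranges. The constants A2, D3 and D4 are the usual Shewhart constants for subgroups of size 5; they are not quoted in this text and are an assumption here, as are all variable names.

```python
# Subgroup averages and ranges copied from the stability table (20 days, 5 readings each).
averages = [2.520, 2.496, 2.496, 2.500, 2.402, 2.478, 2.440, 2.498, 2.454, 2.496,
            2.486, 2.464, 2.486, 2.480, 2.460, 2.514, 2.518, 2.500, 2.482, 2.456]
ranges = [0.150, 0.190, 0.140, 0.050, 0.270, 0.180, 0.180, 0.070, 0.090, 0.190,
          0.100, 0.090, 0.140, 0.080, 0.210, 0.170, 0.090, 0.170, 0.200, 0.150]

# Assumed Shewhart control chart constants for subgroup size 5 (not given in the text).
A2, D3, D4 = 0.577, 0.0, 2.114

x_bar_bar = sum(averages) / len(averages)   # grand average
r_bar = sum(ranges) / len(ranges)           # average range

xbar_limits = (x_bar_bar - A2 * r_bar, x_bar_bar + A2 * r_bar)
range_limits = (D3 * r_bar, D4 * r_bar)

# Days (1-based) whose average or range falls outside the limits.
out_of_control = [
    day + 1
    for day, (avg, rng) in enumerate(zip(averages, ranges))
    if not xbar_limits[0] <= avg <= xbar_limits[1]
    or not range_limits[0] <= rng <= range_limits[1]
]

print(f"X-bar chart: center={x_bar_bar:.4f}, limits={xbar_limits}")
print(f"R chart: center={r_bar:.4f}, limits={range_limits}")
print("Out-of-control days:", out_of_control)
```

An empty list of out-of-control days would support treating the measurement process as stable over the four weeks.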

Bias
Bias is the difference between the average of the measurements made by an operator (MA) and a reference value (RV) obtained with a master gauge.

Test Elements

• 1 operator
• 1 gauge
• 1 unit measured several times
• 1 master’s measurement

Control Chart Method for Assessing Bias

From the stability example there are 20 subgroups of size 5 (g=20, m=5). The reference value (master gauge) is 2.48.

Bias Formula:

\begin{align} Bias &= \bar{X} - \text{Reference Value} \\ Bias &= 2.4813 - 2.48 = 0.0013 \end{align}

\begin{align} \sigma_{repeatability} &= \frac{\bar{R}}{d_2^*} = \frac{0.1455}{2.3339} = 0.062342 \\ \sigma_{b} &= \frac{\sigma_{repeatability}}{\sqrt{g}} = \frac{0.062342}{\sqrt{20}} = 0.01394 \end{align}

d2 and d2* table

95% Confidence Interval (CI) for Bias:

\begin{align} Bias \pm \frac{d_2 \sigma_b (t_{v,\alpha/2})}{d_2^*} \end{align}
where d2 and d2* are constants and t(v, α/2) is obtained from the t distribution. Here d2 (m=5) = 2.326, d2* (g=20, m=5) = 2.3339, and t(v, α/2) = t(72.7, 0.05/2) = 1.9931 (from Minitab [A statistical package particularly designed for teaching purposes developed at the Pennsylvania State University]). Then,
\begin{align} Bias \pm \frac{d_2 \sigma_b (t_{v,\alpha/2})}{d_2^*} &= 0.0013 \pm \frac{2.326(0.01394)(1.9931)}{2.3339} \\ &= 0.0013 \pm 0.0277 \\ CI &= (-0.0264, 0.0290) \end{align}
Since “0” is included in the CI the bias is acceptable. Statistically speaking the hypothesis that the bias is zero is not rejected.
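The bias calculation can be retraced with a few lines of Python. This is only a sketch: d2, d2* and the t value are taken as quoted in the text rather than computed, and the variable names are illustrative.

```python
import math

x_bar_bar = 2.4813             # grand average of the 20 subgroup averages
reference = 2.48               # master gauge value
r_bar = 0.1455                 # average subgroup range
g = 20                         # number of subgroups
d2, d2_star = 2.326, 2.3339    # constants quoted in the text (m=5, g=20)
t_crit = 1.9931                # t value quoted in the text (from Minitab)

bias = x_bar_bar - reference
sigma_repeatability = r_bar / d2_star
sigma_b = sigma_repeatability / math.sqrt(g)

# Half-width of the 95% confidence interval for the bias.
half_width = d2 * sigma_b * t_crit / d2_star
ci = (bias - half_width, bias + half_width)

# The bias is judged acceptable when zero lies inside the interval.
zero_in_ci = ci[0] <= 0.0 <= ci[1]
print(f"bias={bias:.4f}, sigma_b={sigma_b:.5f}, CI=({ci[0]:.4f}, {ci[1]:.4f})")
print("Bias acceptable:", zero_in_ci)
```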

Linearity

It’s the difference in bias between a master gauge and the observed average over the complete operating range of the testing gauge. It’s the change in bias with respect to the units’ variation of the characteristic being measured.

Possible Causes for Lack of Linearity:

• 1. The gauge isn’t calibrated properly on the extremes of its operating range.
• 2. The master gauge produces faulty maximum and minimum measurements.
• 3. Worn out gauge.

Steps for Assessing Linearity:

• 1. Take at least 5 units (g ≥ 5) that cover the process operating range and measure them with the master gauge.
• 2. Have a single operator measure every unit several times (m ≥ 10) in random order.
• 3. Fit a regression line [a smooth curve fitted to the set of paired data in regression analysis; for linear regression the curve is a straight line] y=b0+b1x, where b0 is the intercept with the y-axis and b1 is the slope. x represents the master gauge measurements and y the testing gauge measurements (value).

Example

A sample of seven aerospace components of the same type was selected to cover the process’ operating range (variation). One operator measured the height of each component 10 times with the testing gauge, in random order, and once with the master gauge to obtain the reference value (see table).

 Unit   Ref.    Measurements
 1      2.35    2.37  2.36  2.35  2.35  2.34  2.33  2.35  2.35  2.34  2.34
 2      2.40    2.42  2.39  2.39  2.39  2.39  2.39  2.42  2.40  2.40  2.40
 3      2.45    2.45  2.45  2.44  2.45  2.47  2.47  2.44  2.44  2.45  2.44
 4      2.50    2.50  2.50  2.48  2.51  2.48  2.51  2.50  2.50  2.50  2.51
 5      2.55    2.55  2.56  2.56  2.57  2.57  2.53  2.54  2.55  2.55  2.55
 6      2.60    2.61  2.62  2.62  2.62  2.60  2.60  2.61  2.61  2.60  2.60
 7      2.65    2.65  2.65  2.64  2.67  2.64  2.64  2.67  2.64  2.66  2.65

Height specification: 2.4-2.6 cms. g=7, m=10 (n=70=7x10).

Rearrange the information as follows:

 Unit   Ref(x)   Value(y)   x²       y²       xy
 1      2.35     2.37       5.5225   5.6169   5.5695
 1      2.35     2.36       5.5225   5.5696   5.5460
 1      2.35     2.35       5.5225   5.5225   5.5225
 1      2.35     2.35       5.5225   5.5225   5.5225
 1      2.35     2.34       5.5225   5.4756   5.4990
 1      2.35     2.33       5.5225   5.4289   5.4755
 1      2.35     2.35       5.5225   5.5225   5.5225
 1      2.35     2.35       5.5225   5.5225   5.5225
 1      2.35     2.34       5.5225   5.4756   5.4990
 1      2.35     2.34       5.5225   5.4756   5.4990
 2      2.40     2.42       5.7600   5.8564   5.8080
 2      2.40     2.39       5.7600   5.7121   5.7360
 2      2.40     2.39       5.7600   5.7121   5.7360
 2      2.40     2.39       5.7600   5.7121   5.7360
 2      2.40     2.39       5.7600   5.7121   5.7360

(…up to unit No. 7...)

 Unit   Ref(x)    Value(y)   x²         y²         xy
 6      2.60      2.61       6.7600     6.8121     6.7860
 6      2.60      2.60       6.7600     6.7600     6.7600
 6      2.60      2.60       6.7600     6.7600     6.7600
 7      2.65      2.65       7.0225     7.0225     7.0225
 7      2.65      2.65       7.0225     7.0225     7.0225
 7      2.65      2.64       7.0225     6.9696     6.9960
 7      2.65      2.67       7.0225     7.1289     7.0755
 7      2.65      2.64       7.0225     6.9696     6.9960
 7      2.65      2.64       7.0225     6.9696     6.9960
 7      2.65      2.67       7.0225     7.1289     7.0755
 7      2.65      2.64       7.0225     6.9696     6.9960
 7      2.65      2.66       7.0225     7.0756     7.0490
 7      2.65      2.65       7.0225     7.0225     7.0225
 Sum    175.000   175.080    438.2000   438.6388   438.4150

To obtain the regression line coefficients(b0 and b1):
\begin{align} b_1 &= \frac{\sum xy - \frac{\sum x \sum y}{n} } {\sum x^2 - \frac{(\sum x)^2}{n} } = \frac{438.415 - \frac{175 (175.08) }{70} } {438.2 -\frac{175^2}{70} } = 1.0214\\ b_0 &= \frac{\sum y - b_1 \sum x}{n} = \frac{175.08 - 1.0214(175) } {70} = -0.0524 \end{align}
then, Value (y)= -0.0524+1.0214*Ref(x)

The coefficient of determination R2(R-square) represents the fraction of variation in y explained by x (between 0 and 1). The higher, the better.
\begin{align} R^2 &= \frac{(S_{xy})^2}{S_{xx}S_{yy}} \\ S_{xy} &= \sum xy - \frac{\sum x \sum y}{n} = 438.415 - \frac{175(175.08)}{70}=0.715 \\ S_{xx} &= \sum {x^2} - \frac{(\sum x)^2}{n} = 438.2 - \frac{175^2}{70}= 0.7\\ S_{yy} &= \sum {y^2} - \frac{(\sum y)^2}{n} = 438.6388 - \frac{175.08^2}{70}= 0.7387\\ R^2 &= \frac{(0.715)^2}{(0.7)(0.7387)} = 0.9887 \end{align}
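The coefficient and R-square formulas translate directly into code. The sketch below starts from the column sums of the rearranged table (rather than the raw readings); the variable names are illustrative only.

```python
n = 70                                   # 7 units x 10 measurements
sum_x, sum_y = 175.0, 175.08             # column sums from the rearranged table
sum_x2, sum_y2, sum_xy = 438.2, 438.6388, 438.415

# Corrected sums of squares and cross products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n

b1 = s_xy / s_xx                         # slope
b0 = (sum_y - b1 * sum_x) / n            # intercept
r_squared = s_xy ** 2 / (s_xx * s_yy)    # coefficient of determination

print(f"Value(y) = {b0:.4f} + {b1:.4f} * Ref(x), R^2 = {r_squared:.4f}")
```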

This regression model satisfies all the residual tests.

95% Confidence Interval (CI) for b1

\begin{align} b_1 \pm (t_{\alpha/2,n-2})se(b_1) &= b_1 \pm (t_{\alpha/2,n-2}) \sqrt{\frac{MSE}{S_{xx} } } \\ & = 1.0214 \pm 1.9954 \sqrt{\frac{0.0001235}{0.7 } } = (0.995,1.048) \end{align}
where,
\begin{align} SSE &= S_{yy} - b_1S_{xy} = 0.7387 - 1.0214(0.715) = 0.0084 \\ MSE &= \frac{SSE}{n-2} = \frac{0.0084}{68} = 0.0001235 \end{align}

Since "1" is included in the CI, the hypothesis that the slope is one is not rejected.

95% Confidence Interval (CI) for b0

\begin{align} b_0 \pm (t_{\alpha/2,n-2})se(b_0) &= b_0 \pm (t_{\alpha/2,n-2}) \sqrt{MSE \left( \frac{1}{n}+\frac{ \bar{X}^2}{S_{xx}} \right) } \\ & = -0.0524 \pm 1.9954 \sqrt{0.0001235 \left( \frac{1}{70 } + \frac{2.5^2}{0.7} \right) } = (-0.1187,0.014) \end{align}
Since "0" is included in the CI, the hypothesis that the regression line passes through the origin (0,0) isn’t rejected.
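Both confidence intervals can be checked with a short script. This is a sketch under the text's own numbers: the t value 1.9954 is taken as quoted, MSE multiplies the whole bracket in the standard error of b0, and the names are illustrative.

```python
import math

n = 70
s_xx, s_yy, s_xy = 0.7, 0.7387, 0.715
b1, b0 = 1.0214, -0.0524
x_mean = 2.5          # 175 / 70
t_crit = 1.9954       # t(0.025, 68), as quoted in the text

sse = s_yy - b1 * s_xy        # residual sum of squares
mse = sse / (n - 2)           # residual mean square

se_b1 = math.sqrt(mse / s_xx)
se_b0 = math.sqrt(mse * (1 / n + x_mean ** 2 / s_xx))

ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
ci_b0 = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)

print(f"CI for b1: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f}), contains 1: {ci_b1[0] <= 1 <= ci_b1[1]}")
print(f"CI for b0: ({ci_b0[0]:.4f}, {ci_b0[1]:.4f}), contains 0: {ci_b0[0] <= 0 <= ci_b0[1]}")
```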

Graphical representation of Linearity

Linearity is acceptable when b1 is close to 1 and b0 is close to 0.

Linearity Exercise

A sample of five nails of the same type was selected to cover the process’ operating range. One operator measured the length of each nail 10 times with the testing gauge, in random order, and once with the master gauge to obtain the reference value (see table).

 Nail   Ref     Measurements
 1      1.96    1.98  1.95  1.96  1.97  1.96  1.95  1.97  1.96  1.98  1.96
 2      1.98    1.98  1.99  1.99  1.97  1.97  1.97  1.97  1.97  1.98  1.98
 3      2.00    2.02  2.01  2.01  2.00  2.00  2.00  1.99  2.00  1.99  2.00
 4      2.02    2.02  2.03  2.02  2.02  2.02  2.02  2.01  2.03  2.01  2.01
 5      2.04    2.03  2.05  2.03  2.03  2.05  2.05  2.04  2.04  2.02  2.02

Specification 1.96-2.04 inches. g=5, m=10 (n=50).

Rearrange the information as follows and fill in the blanks:

 Nail   Ref(x)   Value(y)   x²       y²       xy
 1      1.96     1.98       3.8416   3.9204   3.8808
 1      1.96     1.95       3.8416   3.8025   3.8220
 1      1.96     1.96       3.8416   3.8416   3.8416
 1      1.96     1.97       3.8416   3.8809   3.8612
 1      1.96     1.96       3.8416   3.8416   3.8416
 1      1.96     1.95       3.8416   3.8025   3.8220
 1      1.96     1.97       3.8416   3.8809   3.8612
 1      1.96     1.96       3.8416   3.8416   3.8416
 1      1.96     1.98       3.8416   3.9204   3.8808
 1      1.96     1.96       3.8416   3.8416   3.8416
 2      1.98     1.98       3.9204   3.9204   3.9204

(…up to nail No. 5...)

 Nail   Ref(x)   Value(y)   x²         y²         xy
 4      2.02     2.01       4.0804     4.0401     4.0602
 5      2.04     2.03       4.1616     4.1209     4.1412
 5      2.04     2.05       4.1616     4.2025     4.1820
 5      2.04     2.03       4.1616     4.1209     4.1412
 5      2.04     2.03       4.1616     4.1209     4.1412
 5      2.04     2.05       4.1616     4.2025     4.1820
 5      2.04     2.05       4.1616     4.2025     4.1820
 5      2.04     2.04       4.1616     4.1616     4.1616
 5      2.04     2.04       4.1616     4.1616     4.1616
 5      2.04     2.02       4.1616     4.0804     4.1208
 5      2.04     2.02       4.1616     4.0804     4.1208
 Sum    100.00   100.00     200.0400   200.0382   200.0368

To obtain the regression line coefficients(b0 and b1):

 $b_1 = \frac { \sum xy - \frac{\sum x \sum y}{n} } { \sum x^2 - \frac {(\sum x)^2}{n} } =$

 $b_0 = \frac { \sum y - b_1 \sum x } { n } =$

 Then Value(y) = b0 + b1*Ref(x)

 $R^2 = \frac{ (S_{xy})^2 } { (S_{xx})(S_{yy}) } =$

 $S_{xy} = \sum xy - \frac {\sum x \sum y } { n } =$

 $S_{xx} = \sum x^2 - \frac {(\sum x)^2} { n } =$

 $S_{yy} = \sum y^2 - \frac {(\sum y)^2} { n } =$

 $R^2 =$

95% Confidence Interval (CI) for b1

\begin{align} b_1 \pm (t_{\alpha/2,n-2})se(b_1) &= b_1 \pm (t_{\alpha/2,n-2}) \sqrt{\frac{MSE}{S_{xx} } } \\ & = 0.92 \pm 2.0106 \sqrt{\frac{0.0000905}{0.04 } } = (0.824,1.015) \\ \end{align} where, \begin{align} SSE &= S_{yy} - b_1 S_{xy} = 0.0382 - 0.92(0.0368) = 0.004344 \\ MSE &= \frac{SSE}{n-2} = \frac{0.004344}{48} = 0.0000905 \\ \end{align}

Since “1” is included in the CI, the hypothesis that the slope is one isn’t rejected.

95% Confidence Interval (CI) for b0

\begin{align} b_0 \pm (t_{\alpha/2,n-2})se(b_0) &= b_0 \pm (t_{\alpha/2,n-2}) \sqrt{MSE \left( \frac{1}{n}+\frac{ \bar{X}^2}{S_{xx}} \right) } \\ & = 0.16 \pm 2.0106 \sqrt{0.0000905 \left( \frac{1}{50 } + \frac{2^2}{0.04} \right) } = (-0.0313,0.351) \end{align}

Since “0” is included in the CI, the hypothesis that the regression line passes through the origin (0,0) isn’t rejected.

Repeatability and Reproducibility

Repeatability

Variation among repeated measurements performed by one operator to a single unit and with the same measurement instrument.

Test Elements

• 1 gauge
• 1 operator
• 1 unit measured several times

RV: reference value

Possible causes for lack of repeatability (measurement equipment): Dirtiness, friction, not adjusted, worn out

Reproducibility

Variation among the average measurements performed by several operators using the same units and the same measurement instrument

Test Elements

• 1 gauge
• 2 or 3 operators
• 10 units

Possible causes for lack of reproducibility: Variability among operators, statistical variance around the average measurements, and possible equipment calibration issues

Repeatability and Reproducibility (Gauge R&R, GR&R)

This is a combination of the repeatability and reproducibility studies. It’s the classic or ($\bar{X}-R$) method.
Steps:

• Calibrate the gauge (if this is part of the normal operation of the gauge).
• Select 2 or 3 operators to measure the same 10 units at least twice, in random order.
• Select the units so that they cover the entire specification range.
• Fill in the GR&R form or use software (Minitab® was used here).

Conclusions from the GR&R Study

1. If repeatability is poor compared to reproducibility, the possible causes are:

• The gauge needs maintenance.
• The gauge should be redesigned to be more rigid.
• Improve the location of the units.
• There exists a large internal variation in the samples.

2. If reproducibility is poor with respect to repeatability, the possible causes are:

• The operator needs training in the use of the gauge.
• The scale readings are not clear enough.
• Possibly a fixture is needed.

Analysis of Variance (ANOVA)
• An alternate method to the classic Gauge R&R is the Analysis of variance or ANOVA (see Montgomery and Runger, 1993, and MSA, 2002).
• The advantages of the ANOVA are: a) The variances can be estimated with greater precision, and b) More information is obtained such as the interaction between the units and the operators.

The analyzed variation components are:

\begin{align} \sigma^2_{Gauge(R \& R)} &= \sigma^2_{Reproducibility} + \sigma^2_{Repeatability} \\ \sigma^2_{Reproducibility} &= \sigma^2_{op} + \sigma^2_{u,op} \\ \sigma^2_{Repeatability} &= \sigma^2\\ \sigma^2_T &= \sigma^2_u + \sigma^2_{op} + \sigma^2_{u,op} + \sigma^2\\ \end{align}

Total Variance = Units variance + Operators variance+ Units-operators(interaction) variance + Repeatability variance(error)

Example

The nails GR&R example revisited using the ANOVA method.

                    NAIL
 OPER   REPL   1       2       3       4       5       6       7       SUM
 A      1      2.67    2.45    2.50    2.61    2.35    2.55    2.40    52.56
 A      2      2.67    2.44    2.50    2.61    2.35    2.56    2.39
 A      3      2.66    2.44    2.49    2.61    2.35    2.56    2.39
 B      1      2.65    2.45    2.51    2.60    2.35    2.56    2.39    52.48
 B      2      2.64    2.46    2.49    2.60    2.33    2.55    2.39
 B      3      2.64    2.46    2.51    2.61    2.34    2.54    2.41
 C      1      2.65    2.44    2.50    2.60    2.34    2.54    2.40    52.47
 C      2      2.67    2.44    2.50    2.60    2.34    2.55    2.40
 C      3      2.66    2.45    2.50    2.60    2.34    2.55    2.40
 Sum           23.91   22.04   22.50   23.44   21.09   22.96   21.57   157.51
               P1      P2      P3      P4      P5      P6      P7      T

The individual hypotheses are:

\begin{align} \sigma_u^2 = 0, \sigma^2_{op} = 0, \sigma^2_{u,op} =0 \end{align}
\begin{align} SS_{unit} &= \frac{P_1^2 + P_2^2 + ... + P_7^2} {ro} - \frac{T^2} {nor} = \frac{23.91^2 + 22.04^2 + ... + 21.57^2}{3(3)} - \frac{157.51^2}{7(3)(3)}\\ &= 0.6831 \end{align}
r=number of replicates=3 | o=number of operators=3 | n=number of units=7 | SS=sum of squares
\begin{align} SS_{op} &= \frac{A^2 + B^2 + C^2} {rn} - \frac{T^2} {nor} = \frac{52.56^2 + 52.48^2 + 52.47^2}{3(7)} - \frac{157.51^2}{63}\\ &= 0.000232 \\ SS_{u,op} &= \frac{A_1^2 +...+A_7^2 + B_1^2 +...+B_7^2 + C_1^2 +...+C_7^2} {r} - \frac{T^2} {nor} - SS_{unit} - SS_{op} \\ &= \frac{(2.67 + 2.67 + 2.66)^2 + ... + (2.40 + 2.40 + 2.40)^2 }{3} - \frac{157.51^2}{63} - 0.6831 - 0.000232 \\ &= 0.001567 \\ SST &= \sum(every \text { } data \text { } value)^2 - \frac{T^2}{nor} = 2.67^2 +...+2.40^2 - \frac{157.51^2}{63} \\ &= 0.686698\\ SS_{Repeatability} = SSE &= SST - SS_{unit} - SS_{op} - SS_{u,op} \\ &= 0.686698 - 0.6831 - 0.000232 - 0.001567 = 0.0018 \end{align}

 ANOVA TABLE
 Sources of Variation     SS         df   MS          F
 Units (Nails)            0.6831     6    0.11385     871.7*
 Operators                0.000232   2    0.000116    0.888
 Units x Operators        0.001567   12   0.0001306   3.04*
 Repeatability (Error)    0.0018     42   0.0000429
 TOTAL                    0.686698   62

MS=SS/df
df=degrees of freedom
df(units)=n-1
df(operators)=o-1
df(u.op)=df(units)df(op)
df(T)=ron-1
df(Repeat.)=by difference
(*) statistically significant effects

\begin{align} F_p &= \frac{MS_p}{MS_{p,op}} \text{, } F_o = \frac{MS_{op}}{MS_{p,op}} \text{, } F_{p,op} = \frac{MS_{p,op}}{MSE} \\ F_{\alpha,df1,df2} &= F_{0.05,6,12} = 3 \\ F_{0.05,2,12} &= 3.89 \text{, } F_{0.05,12,42} \approx 2 \\ \end{align}
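To make the construction of the ANOVA table concrete, here is a minimal sketch that rebuilds the degrees of freedom, mean squares and F ratios from the sums of squares above. The critical values are the approximate table values quoted in the text; dictionary keys and variable names are illustrative.

```python
n_units, n_ops, n_reps = 7, 3, 3

# Sums of squares from the worked example.
ss = {"unit": 0.6831, "op": 0.000232, "unit_x_op": 0.001567, "total": 0.686698}
ss["error"] = ss["total"] - ss["unit"] - ss["op"] - ss["unit_x_op"]

# Degrees of freedom; the error term is obtained by difference.
df = {"unit": n_units - 1, "op": n_ops - 1}
df["unit_x_op"] = df["unit"] * df["op"]
df["total"] = n_units * n_ops * n_reps - 1
df["error"] = df["total"] - df["unit"] - df["op"] - df["unit_x_op"]

ms = {k: ss[k] / df[k] for k in ("unit", "op", "unit_x_op", "error")}

# Main effects are tested against the interaction, the interaction against the error.
f_ratio = {
    "unit": ms["unit"] / ms["unit_x_op"],
    "op": ms["op"] / ms["unit_x_op"],
    "unit_x_op": ms["unit_x_op"] / ms["error"],
}

# Approximate F-table values quoted in the text.
f_critical = {"unit": 3.0, "op": 3.89, "unit_x_op": 2.0}
significant = {k: f_ratio[k] > f_critical[k] for k in f_ratio}

for source in f_ratio:
    print(f"{source:10s} F={f_ratio[source]:8.3f} significant={significant[source]}")
```

Small differences from the table arise because the text rounds the mean squares before dividing.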

Interpretation

By comparing ANOVA’s F-values versus F-table values (see F-Table), it is observed that the statistically significant factors are the units, and the interaction between them and the operators. This means that there exists a statistically significant difference between the sampled nails and the interaction is also important. The meaning of the latter is that how an operator measures a nail depends on the nail.

Study of Variation

\begin{align} \hat{\sigma}^2_{unit} &= \frac{MS_p - MS_{p,op}} {or} = \frac{0.11385 - 0.0001306} {3(3)} = 0.0126355 \\ \hat{\sigma}^2_{op} &= \frac{MS_{op} - MS_{p,op}} {nr} = \frac{0.000116 - 0.0001306} {7(3)} \approx 0\\ \hat{\sigma}^2_{u,op} &= \frac{MS_{u,op} - MSE} {r} = \frac{0.0001306 - 0.0000429} {3} = 0.00002923\\ \hat{\sigma}_{Repeatability}^2 & = MSE = 0.0000429\\ \end{align}

 Sources of Variation   Variance     Std. Dev.   Study Var.   %Study Var.   %Contribution   %Tol
 Total GR&R             0.00007213   0.0084929   0.0509574    7.534         0.5676          25.48
 *Repeatability         0.0000429    0.0065498   0.0392988    5.810         0.3376          19.65
 *Reproducibility       0.00002923   0.0054065   0.0324390    4.796         0.2300          16.22
 **Operator             0.00000      0.00000     0.00000      0.000         0.0000          0.00
 **Unit Oper.           0.00002923   0.0054065   0.0324390    4.796         0.2300          16.22
 Unit                   0.0126355    0.1124077   0.6744462    99.72         99.44           337.22
 Total Variation        0.0127076    0.1127279   0.6763674    100.00        100.00          338.18

\begin{align} Variance &= \hat{\sigma}^2_i \text{, } \%Contrib=\frac{(\%Study \text{ } Var.)^2}{100} \text{, } Std. \text{ } dev. = \hat{\sigma} = \sqrt{\hat{\sigma}^2_i}\\ Study \text{ } Var. &= 6\hat{\sigma} \text{, } \%Study \text{ } Var.=\frac{(100(Study \text{ } Var.))}{Total \text{ } Var._{(Study \text{ } Var.)} } \text{, } \%Tol. = \frac{100(Study \text{ } Var.)}{Tolerance}\\ \end{align}
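The study-of-variation columns follow mechanically from the mean squares. The sketch below is one way to reproduce them; the tolerance of 0.2 is an assumption inferred from the %Tol column (it is not stated explicitly for this example), negative variance estimates are set to zero as the text recommends, and all names are illustrative.

```python
import math

n_units, n_ops, n_reps = 7, 3, 3
ms_unit, ms_op, ms_inter, mse = 0.11385, 0.000116, 0.0001306, 0.0000429
tolerance = 0.2   # assumed: inferred from the %Tol column, not stated in the text

# Variance components; negative estimates are reported as zero.
var = {
    "unit": max(0.0, (ms_unit - ms_inter) / (n_ops * n_reps)),
    "operator": max(0.0, (ms_op - ms_inter) / (n_units * n_reps)),
    "unit_x_op": max(0.0, (ms_inter - mse) / n_reps),
    "repeatability": mse,
}
var["reproducibility"] = var["operator"] + var["unit_x_op"]
var["gauge_rr"] = var["repeatability"] + var["reproducibility"]
var["total"] = var["gauge_rr"] + var["unit"]

study_var = {k: 6 * math.sqrt(v) for k, v in var.items()}            # 6 sigma
pct_study = {k: 100 * sv / study_var["total"] for k, sv in study_var.items()}
pct_contrib = {k: 100 * v / var["total"] for k, v in var.items()}
pct_tol = {k: 100 * sv / tolerance for k, sv in study_var.items()}

for k in ("gauge_rr", "repeatability", "reproducibility", "unit", "total"):
    print(f"{k:16s} var={var[k]:.8f} %study={pct_study[k]:7.3f} "
          f"%contrib={pct_contrib[k]:7.3f} %tol={pct_tol[k]:7.2f}")
```

Note that 100 times the variance ratio is the same quantity as (%Study Var.)²/100, which is how the text defines %Contribution.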

Conclusion

• The percent of R&R variation is marginal with respect to the specifications (%Tol.), and adequate with respect to the percent of study variation (7.534)
• Note: In case of having negative variance components, report these values as zero. In the case where the interaction effect is not statistically significant, those terms can be eliminated and the new computed F-values are obtained by dividing the corresponding MS by MSE. The formulas are as follows.

\begin{align} \hat{\sigma}^2_{unit} &= \frac{MS_u - MSE}{or} \text{, }\\ \hat{\sigma}^2_{op} &= \frac{MS_{op} - MSE}{nr} \text{, }\\ \hat{\sigma}^2_{Repeatability} &= MSE \\ \end{align}

Comparison between the classic GR&R and ANOVA

                        %Study Var.
 Source of Variation    ANOVA    GR&R
 Total GR&R             7.534    5.35
 *Repeatability         5.81     5.07
 *Reproducibility       4.796    1.69
 **Operator             0.0      1.69
 **Unit Oper.           4.796    N/A
 Unit                   99.72    99.86

According to this comparison, the classic GR&R is underestimating the variation because it doesn’t include the interaction between units and operators.

Attribute Analysis

The analysis of attributes is the assessment of a measurement system in which the data are attributes of the following types:

a) Nominal scale – Observations classified into two or more non-ordered categories to assess a characteristic such as a preferred political party, supermarket, and so on. A specific case is based on the binary scale: defective, non-defective, success or failure, etc.

b) Ordinal scale – Observations classified into three or more ordered categories to assess a characteristic like annual income, a service evaluation, whether the scale is numeric or not.

Attribute Agreement Analysis (AAA)

Take a minimum of three operators and thirty units or events, some of which are slightly outside their specification limits.

Example

The AAA will be applied to a measurement system to evaluate a certain characteristic using a binary scale. Three operators and a sample of thirty units were selected. Each operator measured each unit three times in random order. Additionally an expert’s evaluation was also included as a reference.

 Unit   Operator 1    Operator 2    Operator 3    Expert
 1      ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 2      ND  D   D     D   ND  D     D   ND  ND    D
 3      ND  ND  ND    D   ND  ND    ND  ND  ND    ND
 4      ND  ND  ND    ND  ND  ND    D   ND  ND    ND
 5      D   D   D     D   D   D     D   D   D     D
 6      D   D   D     D   D   D     D   D   D     D
 7      D   ND  ND    ND  ND  ND    ND  ND  ND    ND
 8      D   D   D     D   D   D     D   D   ND    D
 9      ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 10     D   D   D     D   D   D     D   D   D     D
 11     ND  ND  D     ND  ND  ND    ND  ND  ND    ND
 12     ND  ND  ND    ND  ND  D     ND  ND  ND    ND
 13     D   D   D     D   D   D     D   D   ND    D
 14     ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 15     D   ND  D     D   ND  ND    ND  D   D     ND
 16     D   ND  ND    ND  ND  ND    ND  ND  ND    ND
 17     ND  ND  ND    D   ND  ND    ND  ND  ND    ND
 18     ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 19     D   D   D     ND  D   D     ND  ND  ND    ND
 20     ND  ND  D     ND  ND  ND    ND  ND  ND    ND
 21     ND  ND  ND    ND  ND  D     ND  ND  ND    ND
 22     ND  ND  ND    D   ND  ND    ND  ND  ND    ND
 23     ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 24     D   D   D     ND  D   D     ND  ND  ND    D
 25     ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 26     D   ND  D     ND  ND  ND    ND  D   D     ND
 27     D   ND  ND    ND  ND  ND    ND  ND  ND    ND
 28     ND  ND  ND    D   ND  ND    ND  ND  ND    ND
 29     ND  ND  ND    ND  ND  ND    ND  ND  ND    ND
 30     D   D   D     ND  D   D     ND  ND  ND    D

The 95% confidence interval (CI) used by Minitab® is: (LL=lower limit of the CI, UL=upper limit of the CI)

\begin{align} LL &= \frac{v_{1i} F_{0.025,v_{1i},v_{2i}}} {v_{2i} +v_{1i}F_{0.025,v_{1i},v_{2i}} }\\ UL &= \frac{v_{1s} F_{0.975,v_{1s},v_{2s}}} {v_{2s} +v_{1s} F_{0.975,v_{1s},v_{2s}}} \\ v_{1i} &= 2m \text{, } v_{2i} = 2(N-m+1) \\ v_{1s} &= 2(m+1) \text{, } v_{2s} = 2(N-m) \end{align}
m = number of successes, N = total number of tests

If the % is zero, LL = 0. If the % is 1, use alpha instead of alpha/2

F transformation from high to low alpha values or vice versa
\begin{align} F_{1-\alpha,n_1-1,n_2-1} = \frac{1} { F_{\alpha,n_2-1,n_1-1} } \end{align}
(note that the degrees of freedom swap places)
1. Internal agreement (within operators)
-Operator 1: 22 of 30, 73.3%
\begin{align} v_{1i} &= 2(22)=44 \text{, } v_{2i} = 2(30-22+1)=18 \\ v_{1s} &= 2(22+1)=46 \text{, } v_{2s} = 2(30-22)=16 \\ LL &= \frac{44 F_{0.025,44,18}} {18 +44F_{0.025,44,18} } = \frac{44 (0.4824)} {18 +44 (0.4824) } = 0.5411\\ UL &= \frac{46 F_{0.975,46,16}} {16 +46 F_{0.975,46,16}} = \frac{46 (2.485)} {16 +46 (2.485)} = 0.8772 \end{align}

then, CI=(54.11, 87.72)%
Similarly,
-Operator 2: 18 of 30, 60%, CI=(40.6, 77.34)%
-Operator 3: 24 of 30, 80%, CI=(61.43, 92.29)%
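The F-based limits above are the exact (Clopper-Pearson) binomial limits, so they can also be found without an F table by solving the binomial tail equations numerically. The sketch below does this with bisection using only the standard library; it is an equivalent route, not the way Minitab computes the interval, and the function names are illustrative. The special cases m = 0 and m = N are deliberately left out.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def bisect_unit_interval(func, target, increasing):
    """Solve func(p) = target for p in [0, 1], with func monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(m, n, alpha=0.05):
    """Two-sided exact confidence interval for a proportion, 0 < m < n."""
    if not 0 < m < n:
        raise ValueError("this sketch covers only 0 < m < n")
    # Lower limit: P(X >= m | p) = alpha/2, which increases with p.
    lower = bisect_unit_interval(lambda p: 1 - binom_cdf(m - 1, n, p), alpha / 2, True)
    # Upper limit: P(X <= m | p) = alpha/2, which decreases with p.
    upper = bisect_unit_interval(lambda p: binom_cdf(m, n, p), alpha / 2, False)
    return lower, upper

for operator, matched in ((1, 22), (2, 18), (3, 24)):
    ll, ul = exact_ci(matched, 30)
    print(f"Operator {operator}: {matched} of 30, CI=({100 * ll:.2f}, {100 * ul:.2f})%")
```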

2. Each operator versus the expert (%AOE):
-Operator 1: 21 of 30, 70%, CI=(50.6, 85.27)%
-Operator 2: 18 of 30, 60%, CI=(40.6, 77.34)%
-Operator 3: 22 of 30, 73.3%, CI=(54.11, 87.72)%

Detailed errors’ analysis:

-Operator 1 (9 mistakes): Mixture=8, D-ND=1, ND-D=0
-Operator 2 (12 mistakes): Mixture=12, D-ND=0, ND-D=0
-Operator 3 (8 mistakes): Mixture=6, D-ND=0, ND-D=2

 Op   D-D   ND-D   Total   D-ND   ND-ND   %ND-D    %D-ND
 1    23    1      24      12     54      4.17%    18.2%
 2    21    3      24      10     56      12.5%    15.2%
 3    14    10     24      5      61      41.7%    7.6%

(Table based on the 90 individual evaluations from each operator)
D-ND means the operator said the unit was D when it actually was ND (expert)
3. Agreement Between Operators
10 of 30, 33.3%, CI=(17.29, 52.81)%

4. All Operators Versus the Expert
10 of 30, 33.3%, CI=(17.29, 52.81)%
Decision table(MSA,2002):

 Decision       %AOE   %ND-D   %D-ND
 Acceptable     ≥90    ≤2      ≤5
 Marginal       ≥80    ≤5      ≤10
 Unacceptable   <80    >5      >10

For this example,

 Op   %AOE   %ND-D   %D-ND   Conclusion
 1    70     4.17    18.2    Unacceptable
 2    60     12.5    15.2    Unacceptable
 3    73.3   41.7    7.6     Unacceptable
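The decision table can be applied programmatically. The sketch below assumes that an operator must satisfy all three criteria of a row to earn that rating (the text does not spell out how mixed cases are resolved); the function name is illustrative.

```python
def classify(pct_aoe, pct_nd_d, pct_d_nd):
    """Rate an operator with the MSA decision table.

    Assumption: all three thresholds of a row must be met for that rating.
    """
    if pct_aoe >= 90 and pct_nd_d <= 2 and pct_d_nd <= 5:
        return "Acceptable"
    if pct_aoe >= 80 and pct_nd_d <= 5 and pct_d_nd <= 10:
        return "Marginal"
    return "Unacceptable"

# (%AOE, %ND-D, %D-ND) for each operator, from the example.
operators = {1: (70.0, 4.17, 18.2), 2: (60.0, 12.5, 15.2), 3: (73.3, 41.7, 7.6)}
conclusions = {op: classify(*values) for op, values in operators.items()}

for op, verdict in conclusions.items():
    print(f"Operator {op}: {verdict}")
```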

USING MINITAB®
Attribute Agreement Analysis for 1, 1_1, 1_2, 2, 2_1, 2_2, 3, 3_1, 3_2
Within Appraisers

Assessment Agreement

 Appraiser   #Inspected   #Matched   Percent   95% CI
 1           30           22         73.33     (54.11, 87.72)
 2           30           18         60.00     (40.60, 77.34)
 3           30           24         80.00     (61.43, 92.29)

# Matched: Appraiser agrees with him/herself across trials.

Each Appraiser vs Standard
Assessment Agreement

 Appraiser   #Inspected   #Matched   Percent   95% CI
 1           30           21         70.00     (50.60, 85.27)
 2           30           18         60.00     (40.60, 77.34)
 3           30           22         73.33     (54.11, 87.72)

# Matched: Appraiser's assessment across trials agrees with the known standard.
Assessment Disagreement
This is a different table than the “detailed errors’ analysis” of the example

 Appraiser   #ND / D   Percent   #D / ND   Percent   #Mixed   Percent
 1           0         0.00      1         4.55      8        26.67
 2           0         0.00      0         0.00      12       40.00
 3           2         25.00     0         0.00      6        20.00

# ND / D: Assessments across trials = ND / standard = D.
# D / ND: Assessments across trials = D / standard = ND.
# Mixed: Assessments across trials are not identical.
Between Appraisers

 #Inspected   #Matched   Percent   95% CI
 30           10         33.33     (17.29, 52.81)

# Matched: All appraisers' assessments agree with each other.

All Appraisers vs Standard

 #Inspected   #Matched   Percent   95% CI
 30           10         33.33     (17.29, 52.81)

# Matched: All appraisers' assessments agree with the known standard.

Uncertainty

Uncertainty of Measurement

Uncertainty of measurement is defined as “a parameter associated with the result of a measurement that characterizes the dispersion of the values that could reasonably be attributed to the measurand” (VIM 1993; GUM 1993). Uncertainty is an important factor of the measurement process and is present in every step involved in metrology. It leaves room for analysts to prove or disprove results that are currently believed to be true, rather than accepting the data as unbiased facts.

VIM: International Vocabulary of Basic and General Terms in Metrology, 2nd ed. 1993. ISO Technical Advisory Group.

GUM: ISO Guide to the Expression of Uncertainty in Measurements, 1993. ISO Technical Advisory Group and the International Committee on Weights and Measures (CIPM) (corrected and reprinted in 1995).

See also the Guide for Evaluating and Expressing the Uncertainty of NIST Measurement Results. NIST Technical Note 1297, 1994 Edition by Taylor B. and Kuyatt C.

Classification of Uncertainty Components

Type A evaluation: Uses statistical methods for a series of observations.

Type B evaluation: Other than statistical methods of a series of observations.

Standard Uncertainty

It’s defined as the positive square root of the estimated variance.
Essentials of expressing measurement uncertainty (NIST)
http://physics.nist.gov/cuu/Uncertainty/basic.html

Type A standard uncertainty

It’s defined as the standard error of the mean of a series of measurements.
\begin{align} U_a = \frac{s}{ \sqrt{n} } \text{, } s= \sqrt{ \frac{\sum (X_i - \bar{X})^2 } {n-1} }\\ \end{align}

Type B standard uncertainty

It’s obtained for instance from a manufacturer’s handbook or from a calibration certificate (outside sources) or from an assumed distribution.

Example (re-visited)

A sample of seven aerospace components of the same type was selected to cover the process’ operating range (variation). One operator randomly measured its height 10 times each with the testing gauge, and once with the master gauge as the reference value (see table).

 Unit   Ref.    Measurements
 1      2.35    2.37  2.36  2.35  2.35  2.34  2.33  2.35  2.35  2.34  2.34
 2      2.40    2.42  2.39  2.39  2.39  2.39  2.39  2.42  2.40  2.40  2.40
 3      2.45    2.45  2.45  2.44  2.45  2.47  2.47  2.44  2.44  2.45  2.44
 4      2.50    2.50  2.50  2.48  2.51  2.48  2.51  2.50  2.50  2.50  2.51
 5      2.55    2.55  2.56  2.56  2.57  2.57  2.53  2.54  2.55  2.55  2.55
 6      2.60    2.61  2.62  2.62  2.62  2.60  2.60  2.61  2.61  2.60  2.60
 7      2.65    2.65  2.65  2.64  2.67  2.64  2.64  2.67  2.64  2.66  2.65

Height specification: 2.4-2.6 cms. g=7, m=10 (n=70=7x10).

Evaluation of Type A Standard uncertainty:

 Unit   Ref    Average   s           Ua
 1      2.35   2.348     0.0113529   0.003590
 2      2.40   2.399     0.0119722   0.003786
 3      2.45   2.450     0.011547    0.003651
 4      2.50   2.499     0.011005    0.003480
 5      2.55   2.553     0.0125167   0.003958
 6      2.60   2.608     0.0078881   0.002494
 7      2.65   2.651     0.0119722   0.003786

\begin{align} U_a &= \frac{s}{ \sqrt{n} } \\ U_a(Unit \text{ } 1) &= \frac{0.0113529}{ \sqrt{10} } = 0.00359\\ \end{align}

For instance, for Unit 1, its reported value including uncertainty type A (Ua) is 2.348 $\pm$ 0.00359 or (2.3444, 2.3516).
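As a quick check, the Type A evaluation for Unit 1 can be reproduced from its ten readings with the standard library. This is a minimal sketch; the variable names are illustrative.

```python
import math
import statistics

# Ten testing-gauge readings for Unit 1 (reference value 2.35).
unit_1 = [2.37, 2.36, 2.35, 2.35, 2.34, 2.33, 2.35, 2.35, 2.34, 2.34]

average = statistics.mean(unit_1)
s = statistics.stdev(unit_1)            # sample standard deviation (n - 1 divisor)
u_a = s / math.sqrt(len(unit_1))        # standard error of the mean

print(f"average={average:.3f}, s={s:.7f}, Ua={u_a:.5f}")
print(f"reported interval: ({average - u_a:.4f}, {average + u_a:.4f})")
```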

Evaluation of Type B Standard Uncertainty:

Since no manufacturer's information, such as its uncertainty, is given about the measurement instrument, an assumed distribution will be used.

Note: The TUR or TAR can be computed because of the above assumption.

TUR: Test Uncertainty Ratio is the ratio of the accuracy tolerance of the unit under calibration to the accuracy tolerance of the calibration standard used (NCSL, 1999). It can be calculated by dividing the accuracy tolerance from the assumed distribution by the accuracy tolerance of the selected standard.

TAR: Test Accuracy Ratio is the ratio of the accuracy tolerance of the unit under calibration to the uncertainty of the calibration standard used (NCSL, 1999). It can be calculated by dividing the accuracy tolerance from the assumed distribution by the uncertainty of the selected standard.

Several distributions can be assumed in computing the standard type B uncertainty (Ub).

For the normal distribution

\begin{align} U_b = \frac{ |Average - Ref.| } { 3 }\\ \end{align}
For the uniform distribution

\begin{align} U_b = \frac{ |Average - Ref.| } { \sqrt{3} }\\ \end{align}
For the triangular distribution

\begin{align} U_b = \frac{ |Average - Ref.| } { \sqrt{6} }\\ \end{align}

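The denominators follow from the assumed distribution. For the normal distribution the deviation |Average - Ref.| is treated as a $\pm 3 \sigma$ bound, so dividing by 3 approximates the standard deviation. For the uniform and triangular distributions with half-width a, the standard deviations are $a/\sqrt{3}$ and $a/\sqrt{6}$, respectively. In each case the Type B standard uncertainty is the positive square root of a quantity that may be considered an approximation to the corresponding variance, obtained from the assumed probability distribution.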

As an illustration, assume the measurements can be modelled by a normal distribution, then

 Unit   Ref    Average   s           Ub
 1      2.35   2.348     0.0113529   0.000667
 2      2.40   2.399     0.0119722   0.000333
 3      2.45   2.450     0.011547    0.000000
 4      2.50   2.499     0.011005    0.000333
 5      2.55   2.553     0.0125167   0.001000
 6      2.60   2.608     0.0078881   0.002667
 7      2.65   2.651     0.0119722   0.000333

For unit 1,
\begin{align} U_b = \frac{ |Average - Ref.| } { 3 } = \frac{ |2.348 - 2.35| } { 3 } = 0.000667 \end{align}

For instance, for Unit 1, its reported value including uncertainty type B (Ub) is 2.348 $\pm$ 0.0006667 or (2.3473, 2.3487).
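The three Type B formulas differ only in the divisor, which the following sketch makes explicit. The function and dictionary names are hypothetical, chosen for illustration.

```python
import math

# Divisor applied to |Average - Ref.| for each assumed distribution.
DIVISORS = {
    "normal": 3.0,
    "uniform": math.sqrt(3.0),
    "triangular": math.sqrt(6.0),
}

def type_b_uncertainty(average, reference, distribution="normal"):
    """Type B standard uncertainty from the deviation to the reference value."""
    return abs(average - reference) / DIVISORS[distribution]

# Unit 1 of the example under each assumption.
for name in DIVISORS:
    print(f"{name:10s} Ub = {type_b_uncertainty(2.348, 2.35, name):.6f}")

u_b_unit_1 = type_b_uncertainty(2.348, 2.35)   # normal distribution, as in the text
```

Note that for the same deviation the normal assumption gives the smallest Ub and the uniform assumption the largest.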

Combined Uncertainty

Ua and Ub can be combined to have a single evaluation of both types of uncertainties:
\begin{align} U_c = \sqrt{ U_a^2 + U_b^2 } \end{align}

For Unit 1,

\begin{align} U_c = \sqrt{ U_a^2 + U_b^2 } = \sqrt{ 0.003590^2 + 0.0006667^2 } = 0.003651 \end{align}

Similarly

 Unit   Ref    Average   s           Ua         Ub         Uc
 1      2.35   2.348     0.0113529   0.003590   0.000667   0.0036515
 2      2.40   2.399     0.0119722   0.003786   0.000333   0.0038006
 3      2.45   2.450     0.011547    0.003651   0.000000   0.0036515
 4      2.50   2.499     0.011005    0.003480   0.000333   0.003496
 5      2.55   2.553     0.0125167   0.003958   0.001000   0.0040825
 6      2.60   2.608     0.0078881   0.002494   0.002667   0.0036515
 7      2.65   2.651     0.0119722   0.003786   0.000333   0.0038006

Expanded Uncertainty

It is customary to include a coverage (or security) factor, k, to the combined uncertainty to define a range of values in which the measurand will lie with a certain confidence level. If 2 is used, the confidence level will be 95.44% assuming a normal distribution.
\begin{align} U_e = 2\sqrt{ U_a^2 + U_b^2 }\\ \end{align}

For Unit 1,

\begin{align} U_e = 2 \sqrt{ U_a^2 + U_b^2 } = 2 \sqrt{ 0.003590^2 + 0.0006667^2 } = 2(0.003651) = 0.007303\\ \end{align}
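Combining and expanding the uncertainties is a two-line calculation, sketched below for Unit 1 with a coverage factor k = 2. The function names are illustrative only.

```python
import math

def combined_uncertainty(u_a, u_b):
    """Root-sum-of-squares combination of Type A and Type B uncertainties."""
    return math.hypot(u_a, u_b)

def expanded_uncertainty(u_a, u_b, k=2.0):
    """Combined uncertainty multiplied by the coverage factor k."""
    return k * combined_uncertainty(u_a, u_b)

# Unit 1 of the example.
u_a, u_b = 0.003590, 0.0006667
u_c = combined_uncertainty(u_a, u_b)
u_e = expanded_uncertainty(u_a, u_b)

print(f"combined={u_c:.6f}, expanded (k=2)={u_e:.6f}")
```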

Similarly,

 Unit   Ref    Average   s           Ua         Ub         Uc          Ue
 1      2.35   2.348     0.0113529   0.003590   0.000667   0.0036515   0.007303
 2      2.40   2.399     0.0119722   0.003786   0.000333   0.0038006   0.007601
 3      2.45   2.450     0.011547    0.003651   0.000000   0.0036515   0.007303
 4      2.50   2.499     0.011005    0.003480   0.000333   0.003496    0.006992
 5      2.55   2.553     0.0125167   0.003958   0.001000   0.0040825   0.008165
 6      2.60   2.608     0.0078881   0.002494   0.002667   0.0036515   0.007303
 7      2.65   2.651     0.0119722   0.003786   0.000333   0.0038006   0.007601

Decision Rule

To assess if the expanded uncertainty is acceptable, a rule-of-thumb is that the ratio of the tolerance of the measured unit to the maximum Ue be at least 4.
\begin{align} D = \frac{ Tolerance } { Maximum(U_e) } = \frac{ 2.6 - 2.4 }{ 0.008165 } = 24.49 \end{align}

Therefore the expanded uncertainty level is appropriate for these measurements.

Height specification: 2.4-2.6 cms.

Exercise

Apply the decision rule to the measurements of the linearity exercise for nails. The specification is 1.96-2.04”. Assume a normal distribution is appropriate.

 Nail   Ref     Measurements
 1      1.96    1.98  1.95  1.96  1.97  1.96  1.95  1.97  1.96  1.98  1.96
 2      1.98    1.98  1.99  1.99  1.99  1.97  1.97  1.97  1.97  1.98  1.98
 3      2.00    2.02  2.01  2.01  2.00  2.00  2.00  1.99  2.00  1.99  2.00
 4      2.02    2.02  2.03  2.02  2.02  2.02  2.02  2.01  2.03  2.01  2.01
 5      2.04    2.03  2.05  2.03  2.03  2.05  2.05  2.04  2.04  2.02  2.02

Uncertainty-Decision Rule Solution

 Nail   Ref    Average   s           Ua          Ub         Ue
 1      1.96   1.964     0.0107497   0.0033993   0.001333   0.00730
 2      1.98   1.979     0.0087560   0.0027689   0.000333   0.00558
 3      2.00   2.002     0.0091894   0.0029059   0.000667   0.00596
 4      2.02   2.019     0.0073786   0.0023333   0.000333   0.00471
 5      2.04   2.036     0.0117379   0.0037118   0.001333   0.00789

Tolerance=2.04-1.96=0.08

\begin{align} D = \frac{ Tolerance } { Maximum(U_e) } = \frac{ 2.04 - 1.96 }{ 0.00789 } = 10.14\\ \end{align}
Since D ≥4, the expanded uncertainty level is appropriate for these measurements.
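The decision rule for the nails can be written as a short check. The sketch below uses the expanded uncertainties from the solution table and the rule-of-thumb threshold of 4; the names are illustrative.

```python
spec_low, spec_high = 1.96, 2.04   # nail length specification (inches)

# Expanded uncertainty (k=2) per nail, from the solution table.
expanded = {1: 0.00730, 2: 0.00558, 3: 0.00596, 4: 0.00471, 5: 0.00789}

tolerance = spec_high - spec_low
ratio = tolerance / max(expanded.values())   # D in the text
acceptable = ratio >= 4                      # rule-of-thumb threshold

print(f"tolerance={tolerance:.2f}, D={ratio:.2f}, acceptable={acceptable}")
```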