Six Sigma – Jason Walker Consulting LLC

Pp and Ppk Indices

The process capability indices Pp and Ppk measure how well your data vehicle fits inside your customer specification garage. Pp is a measurement of how well the process data fits within the customer specifications (the difference between the width of the vehicle and the width of the garage). Ppk measures the distance to the nearest specification limit (how close the vehicle is to one wall or another).

One of the great things about using Pp and Ppk is, because they are indices, they allow you to compare one process’ performance against another regardless of unit of measure. A process with a Pp index of 1.43 is performing better relative to customer specifications than a process with a Pp index of 1.21…every time. So, all else being equal, Pp and Ppk can help a team decide which process requires attention first.

Calculating Pp and Ppk

You need a few pieces of information to calculate Pp and Ppk. Sorry, a little lightweight statistics is involved, but fortunately Excel can do all the heavy lifting.

Mean (Excel: =AVERAGE(X1, X2, X3, X4…))

Standard Deviation (Excel: =STDEV.S(X1, X2, X3, X4…))

Upper Specification Limit (USL provided by customer)

Lower Specification Limit (LSL provided by customer)

Pp (Fit)

Pp = (USL – LSL)/(6 * Standard Deviation)

So, let's break this down a tad, starting with the numerator (USL – LSL). This is the distance between the upper specification limit and the lower specification limit or, in other words, the width of the garage.

The denominator, (6 * Standard Deviation), covers almost the full range of your data (99.7%). This is based on the empirical rule, which states that normally distributed data will fall within +/-1, 2 and 3 standard deviations of the mean as follows:

68% within +/-1 standard deviation

95% within +/-2 standard deviations

99.7% within +/-3 standard deviations
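If you would like to see where those percentages come from, here is a minimal sketch in Python. It assumes perfectly normal data and uses the standard library's error function; the helper name fraction_within is just an illustrative choice.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the coverage for 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(f"+/-{k} standard deviations: {fraction_within(k):.1%}")
```

The printed values round to the 68%, 95% and 99.7% figures in the list above. Real process data is rarely perfectly normal, so treat these as approximations.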

So, for our purposes, the vehicle we are parking in the garage is 6 standard deviations wide (provided you don’t have a 2×4 sticking out the window). A Pp index of 1.5 and above means the data fits well within the specification limits with room to spare. In other words, the garage is 1.5 times the size of the car, so we’re good. Any lower and you have to take your groceries out of the back seat before parking in the garage.
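If you would rather not use Excel, the Pp formula can be sketched in a few lines of Python. The measurements and specification limits below are made up purely for illustration, and statistics.stdev is the sample standard deviation, the same calculation as Excel's STDEV.S.

```python
import statistics

def pp(data, usl, lsl):
    """Pp = (USL - LSL) / (6 * standard deviation): garage width over vehicle width."""
    sd = statistics.stdev(data)  # sample standard deviation, like Excel's STDEV.S
    return (usl - lsl) / (6 * sd)

# Hypothetical measurements and customer limits, for illustration only.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1]
usl, lsl = 10.6, 9.4

print(f"Pp = {pp(measurements, usl, lsl):.2f}")
```

With these made-up numbers the result comes out a little above 1.5, so this particular vehicle fits in the garage with room to spare.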

Ppk (Centering)

Ppk (upper) = (USL – Mean)/(3 * Standard Deviation)

Ppk (lower) = (Mean – LSL)/(3 * Standard Deviation)

Because Ppk measures the distance to the closest limit, there are two formulas, and the smaller of the two results is your Ppk. Since the closest wall of the garage is the one that is going to scratch the paint of our new car, that is the only one we are concerned with. The numerators, (USL – Mean) and (Mean – LSL), tell us how far each specification limit is from the center of the data (the distance between the “H” emblem on the hood of the Humvee and the closest wall).

Since we are only talking about measuring from the midpoint of the data for Ppk, we are only concerned with half of the distribution curve (so 3 standard deviations instead of 6). If the Pp index is good (>1.5) but the Ppk is low, say 1.2, that tells us that the data could fit within the specification limits but the data is not centered and is therefore in danger of exceeding one of the specification limits. Grandpa parked too close to one side of the garage.
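Here is a minimal Python sketch of the two Ppk formulas, taking the smaller result as the Ppk. The data and limits are hypothetical and chosen so that the process is off-center; the function name ppk is just for illustration.

```python
import statistics

def ppk(data, usl, lsl):
    """Ppk: distance from the mean to the nearest spec limit, in units of 3 standard deviations."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation, like Excel's STDEV.S
    upper = (usl - mean) / (3 * sd)  # Ppk (upper)
    lower = (mean - lsl) / (3 * sd)  # Ppk (lower)
    return min(upper, lower)         # the closest wall is the one that matters

# Hypothetical measurements with limits that sit off-center around the data.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1]
usl, lsl = 10.4, 9.2

print(f"Ppk = {ppk(measurements, usl, lsl):.2f}")
```

In this made-up case the garage is the same width as one that would give a Pp of about 1.53, but the Ppk comes out at about 1.02 because the data sits much closer to the upper limit than the lower one.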

Pp and Ppk are really valuable metrics for gauging process performance. They are often misunderstood and so get ignored despite their obvious value. Dust them off and give ’em a shot in your process metrics. You may never park a car the same way again.

In Six Sigma we talk a lot about X and Y variables. We have to talk this way because, after all, Six Sigma is statistics. Then you hear about ‘lead’ and ‘lag’ measures. What are these and how do they relate to X and Y variables? And what about ‘dependent’ and ‘independent’ variables? Where do they fit in? How do I know what to measure and when? Relax. It’s really not that complicated, so let’s step away and pull the covers back a little on some of the mystery.

X and Y and Independent and Dependent Variables

Ok…maybe you already knew this but these are the same thing. X is the algebraic representation for the independent variable and Y represents the dependent variable. Dependent or independent of what? Well…of each other. Let’s use a simple example to illustrate:

Let’s say it’s the New Year and you just made a resolution to lose weight (the most common of all resolutions and the one least likely to be kept, by the way). But your weight loss or gain doesn’t just happen in a vacuum. There are things that you decide to do to help you lose weight. You work out, control your calories and watch your eating patterns, for instance. I have just described one dependent variable (Y) and three independent variables (X’s).

Dependent (Y) Variable – Your weight. It is dependent on the other variables of how often and long you work out, how many calories you consume and whether you eat half a pint of Ben and Jerry’s Urban Bourbon ice cream at 11pm like I just did (I feel shameful but it was really gooood).
Independent (X) Variables – These are the variables that drive the Y variable. The more you work out, the more you control your calories and the less you eat a large meal right before bed, the more you move the needle on the scale.

What is Y=f(X)?

Do you suffer from algebraphobia? Same here, but this is really easy. This equation is used commonly in Six Sigma to represent the relationship between the X and Y variables. Read ‘Y=f(X)’ as ‘Y equals f of X’ or ‘Y is a function of X’. Think of a function as a process. X’s go into your function and a Y comes out. The type, magnitude or other characteristic of the X’s determines the output of the function, Y. My weight loss (Y) is a function of my exercise habits (X1), the extent to which I control my calories (X2) and how often I eat a big dinner late at night (X3). Got it? Cool.
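To make the 'X’s go in, Y comes out' idea concrete, here is a toy Python sketch. Everything in it is an assumption made up for illustration: the function name, the maintenance level and every coefficient. It is not a real model of weight loss, just a picture of what a function of three X’s looks like.

```python
def weekly_weight_change(workouts, daily_calories, late_meals):
    """Toy f(X): estimated weekly weight change in lbs (negative means weight lost).

    All numbers below are invented for illustration only.
    """
    maintenance_calories = 2500  # assumed break-even intake
    change = (daily_calories - maintenance_calories) * 7 / 3500  # X2: calories
    change -= 0.2 * workouts     # X1: each workout nudges the scale down
    change += 0.1 * late_meals   # X3: each late meal nudges it up
    return change

# Same function, different X's, different Y.
print(weekly_weight_change(workouts=1, daily_calories=3000, late_meals=4))
print(weekly_weight_change(workouts=4, daily_calories=2200, late_meals=0))
```

The first call returns a positive number (weight gained) and the second a negative one (weight lost). The point is only that changing the X’s changes the Y.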

Lead Versus Lag Measures

Ok…so we went from algebra to this lead and lag thing. Again, this is easy peasy. IT’S THE SAME THING! Lead measures are the X values and lag measures are the Y values. The reason we use lead and lag in this way is to describe their temporal relationship. A lead measure is something you measure that is predictive of what will come after: the measurement that lags behind. If I track my calories and keep them under a certain threshold every day, I can reasonably expect that the effect that lags behind will be weight loss. The lag measure (weight loss) depends on the extent to which the lead measure (controlling calories daily) is met.

Think of lead measures as the lever that moves a big rock. Consistent pressure on the lead measure will eventually lead to the desired lag measure. We act on and measure the lead measures as we are trying to effect change. Measuring the lag measure doesn’t have any effect. It only tells us at the end whether we met our goal or not. If I woke up every day and all I did was weigh myself and say: “Boy, I’m really going to lose some weight today”, I probably would not be very successful. But if I woke up every day and focused on the things in my control, my three X variables, it’s likely I’d drop some lbs.

When to Measure What

In your improvement efforts, there are a lot of things that you will find that can be measured. The key is to find the primary Y that you want to affect first and then find the levers to pull that are most likely to affect it…the X’s. Once you have identified these key relationships, you first measure the Y (185lbs) before making any changes to the X’s. This is your baseline measurement taken in the…yep…you guessed it…Measure phase of the DMAIC process. Once you establish the baseline, then measure the X’s.

How many calories am I currently consuming daily? (3,000…yikes)
How often do I exercise each week? (2 times…er…ok 1 time)
How often do I eat after 9:00pm each week? (4 times…not including Ben & Jerry’s)

These are your X baselines: your levers for changing the Y. Adjust them up or down to effect a change from the baseline in the target Y variable. Then, measure the Y again to see the effect.

These concepts are cornerstones of process improvement and quality management. Try to think of some other examples in everyday life of lead and lag measures, X and Y variables and instances where Y=f(X). Once you start to play with these concepts a bit, they get easier and easier to see. You begin to see causal relationships as levers to be pushed and pulled to deliver an outcome.