Properly collecting data involves Data Collection Plans, Check Sheets and Spreadsheets, but how do they all work together? Data collection is critical to every phase of DMAIC: creating a measurable problem statement, establishing the baseline, searching for causal data, measuring improvement and, finally, monitoring to maintain the gains. There are some key steps to gathering the right data in the best way.

This 1-hour Introductory Webinar provides a few easy-to-follow guidelines to help you make the most of your improvement effort.

Webinar Level

  • Introductory


  • Building the Data Collection Plan
  • Designing a Data Check Sheet
  • Collecting the Data
  • Populating the Data Spreadsheet
  • Using the Tips and Tricks of Data Collection

Tools & Templates

Webinar Transcript

Tracy O’Rourke: Hi, everyone. Welcome to’s webinar. Thanks for spending some quality time with us today. Hundreds of people have registered for this webinar and we are very excited that you’re all here.

Lean and Six Sigma are the go-to improvement methods used by leading organizations all over the world to delight customers, minimize costs, maximize profits, and develop better teams. Every month, we craft webinars just for you, our global learner community that simplify concepts and tools of Lean Six Sigma so you can understand and apply them more easily and be more successful.

Today’s webinar is titled How to Successfully Collect Data for Your Lean Six Sigma Project.

I’m Tracy O’Rourke, managing partner at And today’s presenter, also a managing partner, my colleague, the wonderfully talented, innovative and consummately passionate about learning, Elisabeth Swan. Hi, Elisabeth. How are you today?

Elisabeth Swan: Hi, Tracy. I’m good. I’m better now hearing you describe me.

Our Expert: Elisabeth

Tracy O’Rourke: So Elisabeth is an executive advisor, a Master Black Belt, consultant, coach, and trainer. For over 25 years, she has helped leading organizations like, Charles Schwab, Target, Volvo, Alberta Health Services, Starwood Hotels, and many others successfully apply Lean Six Sigma to achieve their goals. This girl is an expert and she knows how to live. She lives on Cape Cod with her husband and just recently had a pig roast. Tell us a little bit about the pig roast, Elisabeth.

Elisabeth Swan: Well, it was a delicious pig roast. But what was most striking to me and my husband who actually ran the pig roast was we were supposed to get a 75-pound pig, something happened. We had to get another farmer. It turned into a 120-pound pig. And apparently, that makes it a lot harder to lift, turn, you name it. So it was a very big pig roast. Thanks for asking.

Tracy O’Rourke: So, it sounds to me like you burned a lot of calories, the calories you were about to eat, while preparing the meal.

Elisabeth Swan: Yeah. It was a beautiful system.

Tracy O’Rourke: Wow! That’s efficient. Good job.

Elisabeth Swan: Thank you.

How to Interact

Tracy O’Rourke: OK. Just a few housekeeping notes before we begin. During this webinar, all attendees will be in listen-only mode. At the end of the presentation, we will have a question and answer session but you can feel free to ask questions at any time by entering the question in the question area.

We will also ask you to participate in some polls. And we also have an open-ended question we’re going to be asking you. We’d love it if you could type an answer in the question area as well.

Let’s Interact!

So, let’s do our first activity. Let’s share where you are from. Hundreds of people have registered for this webinar and we’d love to share where everyone is located. So go ahead and click on Ask a Question and type in where you’re joining us from today. All right. Looks like we’ve got some people joining us. We know your time is valuable so let’s just see where some of these people are.

So Jurija is joining us from Oregon. Hello, Jurija! And we have Miriam from Florida, Lyn from Arizona, Ibrahim from South Africa. We’ve got people from Nigeria. And a lot of people are joining us today from San Diego, from what I can tell, as well as Washington. So, welcome everyone. We know your time is valuable. So I am going to hand it over to Elisabeth.

Who Is

Elisabeth Swan: Thanks again, Tracy, for the incredibly warm introduction. And I can’t see all of you but I’m really happy you joined us. As Tracy said earlier, we’ve both been with since its inception and our mission is to make it easy for you to build your problem-solving muscles. So that means we simplify the complex, we’ve made our training both practical and fun. We provide a running case study at the Bahama Bistro, where our restaurant team applies all the tools.

Aside from this webinar series, we put out blogs, there is a podcast series, there are book reviews, there’s lots of other information to help you get where you need to go.

We’ve used and taught Lean Six Sigma for decades because it’s the best toolkit for problem-solving. And thankfully, there’s a growing list of companies who agree with us. Here are some of the organizations that we’ve helped.

We’ve Helped People From…

So you can see we’ve got bricks and mortar as well as online companies. There are diverse industries. It goes from healthcare to financial services, manufacturing, and of course cities, states, and local governments.

And why? It’s because Lean Six Sigma is about problem-solving. And once you have an organization, you’ve got problems. So like all of you, these companies want to be the best at problem-solving. So you’re in good company.

More on benefits later, but let’s review our agenda for today’s webinar.

Today’s Agenda

We’re going to work through what goes into data collection starting with a plan. We’ll talk about a classic tool, the Data Collection Sheet to actually go get data, collecting data, what’s the people’s roles in it, and then how to populate a data spreadsheet. We’ll end with some tips and tricks of data collection. And we’ll hear a little bit from you and your tips and tricks.

Starting With a Problem

Start with a problem. We don’t often know the extent of the problems we’ve been handed or we’ve discovered. We often have to guess or estimate how bad it is. Right here, they’ve estimated about 6 pieces of the dishware are breaking per day. And that’s a lot for the Bistro. So they don’t exactly know what’s going on but that’s where they’re going to collect data. That’s what they’re here to learn.

What to Do?

So here’s what we often see happens. Waitstaff are breaking dishes, time for some re-training. So there’s a first go-to. Then hey, it’s time to replace the dishes. They are too breakable. And hopefully there’s somebody around saying, “Wait a second. Shouldn’t we find out the real cause?” And this is the way a lot of quick wins start out. We think we know what’s wrong. We might be basing it on our own observations. It might be from previous experience.

And one of the things that we see is that people are often told, “Don’t bring me a problem. Bring me a solution.” And it sounds great. It sounds very action-oriented: “I’ll do that for you. I will bring you a solution.” But what it does is it just shortcuts analysis. It means that people bypass any kind of critical thinking and come up with some kind of a solution that’s usually a quick fix. It might make the problem worse. It might increase variation. It generally does not get to the root cause.

Why Collect Data?

So, what do we do instead? We collect data. Lots of reasons to collect data. Measures do a lot for us. They reveal our values. If you are collecting data around customer satisfaction, it’s because you value it. If you value the speed of your operation, you measure it. So it shows your values.

It also drives behaviors. This can be good and bad. There’s an example of something you might experience yourselves: people in call centers. They can often be measured on how long they’re on the call, and the goal is to keep the call short. So if you’re a customer service rep and this is a long call, seemingly complicated, you have an incentive to keep that call short, which means you might not solve the customer’s question. You might not get to the root of a problem.

Or in some cases, you may experience this yourself, sometimes you get disconnected. And that may mean that a customer service rep has decided, “This is going to look too bad for me in terms of how I’m being measured so I’m just going to hang up now. They’ll call back. They’ll get somebody else. They’ll get it solved. But this is looking bad for me.” So it drives behaviors.

Measures can also inspire us. And recently, we’ve had a number of hurricanes. And one of the measures is people who have donated. That’s an inspiration. It’s heartwarming to see that. You see that with blood drives. They let you know how well they’re doing with the blood drive. It’s inspiring.

But the reason we’re here today to measure is to help us learn. This is the true purpose for us. Help us answer important questions. How long does this process take? Is this process working? Are we having the impact we want? What happened in the process if things have gone wrong?

So as long as these questions don’t start with who, and I’m not going for blame, they can transform the intelligence of an organization.

And some of these examples and some of this comes from a book called “We Don’t Make Widgets” by Ken Miller, a great treatise on how to improve government processes. It’s a short book. It doesn’t just apply to government. It’s a great simple book.

We’ll have a book report coming out or a book review coming out about that later next month.

So measures are going to help us learn. So what do we do? OK. Everybody collect data on the broken dishes. I can do it in the kitchen. You can get it in the dining room. Shouldn’t we have a plan?

We’ve seen this too. Tracy and I both worked with a financial services firm and they were trying to get a head start collecting data on cycle time to open new accounts. Customers complained. From the time they gave an application to the time they actually had an open account, it was taking over two weeks, which seemed too long to them. It seemed too long to the organization as well.

And to get a jumpstart, this team just said, “Hey, let’s do it. Go out there. Get cycle time data.”

But the team members went to two different systems based on what they were used to. And the two different systems measured cycle time differently. So they had apples to oranges. They had to go dump the data and start all over again. And starting all over again is never fun with data collection. And if you’re asking people to help you, they might not want to help you the next time.

Poll #1: What is your biggest data collection issue?

So, those are some of the issues we ran into right from the start. So one thing we want to ask you upfront is what are your biggest issues with data collection? And the options are:

  • A. Deciding what to measure
  • B. Trying to define measures
  • C. Figuring out how to get the right data
  • D. Deciding how much data to get
  • E. Combination of the above

And Tracy, you have the results of the poll that we did on this. Do you want to let us in on that?

Tracy O’Rourke: Yes. So the poll results are that the highest selected letter was E, combination of the above, which was obviously any of A, B, C or D, followed by C, figuring out how to get the right data.

In my experience, I’ve seen this as well. That people do really struggle with this all-in – all together and I think some of the most frustrating situations are when we jump the gun. We ask IT for data and then we discover it’s not the data we needed or the data that we wanted was incomplete. Now, we have to ask again.

So now, we’re actually burdening the IT group, which typically is already overburdened. And now, they don’t want to help us because we don’t know what we want. So this is really important and hopefully this can help you be very specific about what you’re going to be asking for.

Elisabeth Swan: Great. Great input, Tracy. Thanks for answering the poll, folks.

What’s the Plan?

Now, let’s take a look at how do we start planning? So one of the ways to deal with some of these questions is to sit down and make a plan. This is the operational – excuse me. This is the Data Collection Plan. It’s a template that you can download from the website. All the templates are free. They all have one or two tabs beyond the template that give you examples of how to fill them out. So we also show you here’s what we mean in this field. Here’s what we’re asking for.

We’re going to focus today on just our main measure, the broken dishes. But you could put all your measures here. There’s more than one thing you’re going to measure for any given project. But the main measure will illuminate a lot about the process and certainly help you get the baseline. So we’re going to start there.

The first thing you fill in is what’s the name of the measure? And I’ve seen all kinds: it could be way too simple or it can be they’re packing the whole operational definition in here. But what’s the name? What are we going after?

The next is, is it continuous or discrete data? That’s going to have an impact on the charts we use. We like to get a mix. If we see that all we’re collecting is discrete data, is there some continuous data we can get? It might not be the case but you want to think about that upfront.

And then a key piece of this is how would you define this measure? If it’s time data, what are the start/stop points? Where does the clock start? Where does the clock end? And does it involve some kind of calculation? So we’re trying to get as much information in this operational definition as we can so data collectors understand what it is we’re collecting. And we’ll dive into that more.

This next one is stratification factors. So this is your main measure by who or by what, by where, by when. This really helps with analysis. This is an important thing to think through. Just getting stratification factors on your main measure can open all kinds of windows on what’s happening in the process. So spend some time here.

Sampling notes: we’ll do a whole separate webinar on sampling and sample sizes. But on a simple level, what are we looking at? Are we looking at last month’s data? Are we looking at three months’ worth of data? What’s the timeframe we’re looking at? What have you pulled together?

And the last one is who is collecting this data and how are they going to do it? And this may involve something called a check sheet. So we’ll dive into that too and show you that.

Just a little example. I had a team working on the percent of waste in a printing facility. So they had to come up with an operational definition. And it sounds – it sounded fairly simple to them. We’re just going to measure the percent of wasted paper. We do print jobs.

Now, if we’re going to do a percent, you need a numerator and a denominator. Well, the numerator is easy. Well, it seemed easy. It was just, what are we throwing away? Well, they looked at print jobs and they said, “Well, how are we going to define that?” And they ended up using a ruler to measure the stack of paper so they could base the amount of paper, the cost of paper, just on how tall that stack was.

OK. Now, they’ve got the denominator and that’s all the paper for a specific time period. So if the numerator was the amount of paper tossed for the month, the denominator had to be the total amount of paper for the month. Well now, that got trickier because some of their paper was what they call cut sheet, so individual sheets of paper, and some of it was on rolls. But the main metric in their system was a job, like how many jobs.

So they had to work through those elements to get at their operational definition. So it’s really interesting in terms of what this brings up. Planning will force you to get into these kinds of specifics.
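As a rough sketch of the arithmetic behind that operational definition: the numerator and denominator have to cover the same time period and be expressed in the same unit. The transcript says the team measured stack height with a ruler; the sheet-equivalent conversion for roll paper, the function names, and every number below are assumptions made up for illustration only.

```python
def roll_to_sheets(roll_length_m: float, sheet_length_m: float) -> float:
    """Express roll paper as sheet-equivalents (hypothetical conversion)."""
    if sheet_length_m <= 0:
        raise ValueError("sheet_length_m must be positive")
    return roll_length_m / sheet_length_m


def percent_waste(wasted_sheets: float, total_sheets: float) -> float:
    """Percent of paper wasted; both inputs must cover the same time period."""
    if total_sheets <= 0:
        raise ValueError("total_sheets must be positive")
    return 100.0 * wasted_sheets / total_sheets


# Made-up monthly figures, not data from the webinar.
cut_sheets_used = 20_000
roll_sheets_used = roll_to_sheets(roll_length_m=2_500, sheet_length_m=0.25)
wasted = 1_200

total = cut_sheets_used + roll_sheets_used
print(f"{percent_waste(wasted, total):.1f}% of paper wasted this month")
```

The point of the conversion step is that cut sheet and roll paper cannot be added together until they share a unit, which is exactly the kind of detail the planning exercise surfaces.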

What’s the Measure Name?

So let’s get back to the Bahama Bistro and their broken dishware. So what’s the measure name? Well, we see this a lot. That’s broken dishes. Seems simple. We’re just measuring broken dishes. Now, anyone looking at this plan would say, “What? We measure the cost of broken dishes, the percent of broken dishes, the number of broken dishes? Give me a clue.” So getting some kind of a unit of measure is critical here. So what is the measure name?

What’s the Measure and Data Type?

All right. So we’re going to get a little more specific. They’re just going to measure the number of broken dishes. Now, they could track the percent but the team figured out that since their volumes don’t change much throughout the year in terms of diners and number of meals ordered, the variation is pretty consistent. So just getting the number of broken dishes would be good for them. And they felt like trying to track the total number of dishes would be a lot of work.

You’ve got to balance effort with reward. How much effort is this data collection going to take you for the reward of the information you’re going to get and what it’s going to teach you? They’ve also added that this is discrete. You can see and count these dishes. That’s our discrete data.

What’s the Definition?

So we’re going to look at the next field, which is that critical operational definition. Again, people think this is just really a simple thing. It’s the number of broken dishes in a day. Like what could be so difficult? Well, what constitutes a dish? What do you mean by broken? And those questions have to get answered so that people can get your data right.

So if you just left it as broken, would it matter how broken it was? So passing this to the team and letting them look at the operational definition, they’ll give you some feedback. So you take that feedback, critical piece of this process, taking in feedback.

Test the Operational Definition

And you can refine your definition. Now, they said the amount of dishware including plates, bowls, glasses, cups, salt and pepper shakers that are broken into two or more pieces in a single day. And then they defined the piece. The piece is defined as anything greater than half an inch.

OK. So that’s helpful. But then the team went out and tested. They said, “OK. Let’s try collecting some of this.” And they said, “You know what? We kind of get it when it’s broken. But how do you count the pieces? Like if a dish breaks in two pieces, does that count as two or is that one?”

And also, what about cracked or chipped? Do we really want to be serving cracked and chipped dishware to our customers? It’s less than half an inch but it doesn’t look good. So OK, let’s test and refine this again.

You can have multiple rounds here. It’s important to get it right because you want people to get good data.

Refined Operational Definition

So now, we’ve got a much more expanded definition. And they said, “You know what? It counts if it’s broken, chipped, cracked, or in two or more pieces in a single day. And a broken piece of dishware only counts as one regardless of how many pieces, and hey, here are some handy photos. That’s what a chip looks like, that’s what a crack looks like. If you see that then you’re going to count it.”

Now, this is really helpful and that’s what you want. You want the operational definition right there for people to use when they’re collecting your data. You want photos if it’s a physical characteristic, or it could be lists.

If anyone has been to a new company and someone said, “Hey, come to this new outing. Come to this outing for new employees. Just dress business casual.” OK. Every corporate culture is different. What do they mean by business casual? Am I wearing shorts? Is it khaki? Is it jeans? What exactly is business casual?

So pictures might help. Lists might help. Whatever you think is some kind of an aid for the data collector, that’s great.

How Will You “Slice” the Data?

So we have refined the definition. Now, we’ve got to talk about how are we going to “slice” the data? So as you recall, it’s what, when, who, where. Those are stratification factors. And we will look at number of broken dishes by type of dish. We’re going to look at it by shift. We’ve got breakfast, lunch, and dinner, so that’s a when. They want to look at it by server. That’s a who. And then the team said, “Well, what about where? Doesn’t it make a difference where these things break in the Bistro?” And also, you said by server, but what if somebody in the kitchen broke it? What if it was a prep cook or the bus boy? They’re not servers. Shouldn’t it just be employee?

Plan for Stratification

Great. Great input. Let’s come back and let’s refine it. So broken dishes by type of dish, by shift, by day. So that’s the when. You want to have people mark down the day because we’re going to look at broken dishes by day, so we’re going to have to be able to pull together all the data we get in a day. And then employee, instead of just server, and by area in the bistro. Very helpful.

How Much Data?

OK. How much data? So sampling we’ll get to later but for this group, they’re going to get the breakage for the month of November. And that’s helpful. So one month’s worth of data.

Who Will Collect the Data – How?

Now, who is doing it? How are they doing it? All the servers are going to track their own breakage on a weekly check sheet. And we’re going to attach that right to this plan. It’s posted by the manager at the wait station, so they know where it is and who is responsible for posting it, and it’s a weekly check sheet. So every week, they’re going to enter in the data and then they’re going to replace the sheet the next week.

Potential Charts?

So now, we have a full plan. Let’s take a look at our whole plan. We got the measure type. We got discrete data. We got a really robust, tested, refined operational definition. We have a very good set of stratification factors. You always want to get those upfront. You cannot go back and get those later.

Sampling notes, getting it for November. We know who is doing it and we know how they’re going to do it with the check sheet.
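For readers who keep their plan in code rather than in the downloadable template, here is a minimal sketch of one row of the Data Collection Plan. The field names mirror the columns described in the webinar; the class name `MeasurePlan` and the `missing_fields` helper are hypothetical, not part of the template.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MeasurePlan:
    """One row of a data collection plan (field names are assumptions)."""
    name: str
    data_type: str                      # "discrete" or "continuous"
    operational_definition: str
    stratification_factors: List[str]   # the who / what / where / when
    sampling_notes: str
    who_and_how: str

    def missing_fields(self) -> List[str]:
        """Return the names of any fields that were left blank."""
        return [key for key, value in vars(self).items() if not value]


broken_dishes = MeasurePlan(
    name="Number of broken dishes per day",
    data_type="discrete",
    operational_definition=(
        "Dishware that is broken, chipped, cracked, or in two or more "
        "pieces in a single day; each item counts once."
    ),
    stratification_factors=["type of dish", "shift", "day", "employee", "area"],
    sampling_notes="All breakage for the month of November",
    who_and_how="Staff record breakage on the weekly check sheet at the wait station",
)

print(broken_dishes.missing_fields())
```

A blank-field check like this is one way to catch a plan that skipped its stratification factors before any data is collected.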

The other thing you’re thinking about upfront with data collection is what kind of charts am I going to get? I could do a run chart because I’m going to look at breakage per day over time. I can get a histogram: what’s the spread, how much you could actually break in a day, and what’s the average breakage. What’s the minimum breakage?

And then Paretos, there are tons of them here. We could do what’s the most broken type of dishware? What’s most breakage happening – most breakage happens on which shift in which area? So this is all helpful information.
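To make the link between stratification factors and charts concrete, the sketch below tallies check sheet rows into Pareto tables and into daily totals for a run chart. The rows and counts are invented for illustration, and the function names `pareto` and `daily_totals` are hypothetical.

```python
from collections import Counter

# Hypothetical check sheet rows; the counts are made up for illustration.
records = [
    {"day": "Nov 1", "dishware": "Plate", "shift": "Lunch", "count": 3},
    {"day": "Nov 1", "dishware": "Glass", "shift": "Dinner", "count": 2},
    {"day": "Nov 2", "dishware": "Plate", "shift": "Breakfast", "count": 1},
    {"day": "Nov 2", "dishware": "Bowl", "shift": "Lunch", "count": 1},
    {"day": "Nov 3", "dishware": "Glass", "shift": "Lunch", "count": 1},
    {"day": "Nov 3", "dishware": "Plate", "shift": "Dinner", "count": 2},
]


def pareto(rows, factor):
    """Return (category, count, cumulative percent), largest category first."""
    totals = Counter()
    for row in rows:
        totals[row[factor]] += row["count"]
    grand_total = sum(totals.values())
    table, running = [], 0
    for category, count in totals.most_common():
        running += count
        table.append((category, count, round(100 * running / grand_total, 1)))
    return table


def daily_totals(rows):
    """Sum breakage per day, in the order days first appear (run chart input)."""
    totals = {}
    for row in rows:
        totals[row["day"]] = totals.get(row["day"], 0) + row["count"]
    return totals


for factor in ("dishware", "shift"):
    print(factor, pareto(records, factor))
print(daily_totals(records))
```

Each Pareto is only possible because its factor was recorded on every row, which is why the stratification factors have to be chosen before collection starts.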

We’re probably not going to go after employee although we’ll probably look at it. But that’s more so we can go back to the person who reported it to ask them what happened there, unless somebody truly is butterfingered. Then we want to figure out, can we give you some gripper gloves or something?

So, not all shown here: box plots. We can also get some of those. And we’ve got a whole other webinar for charting. We’ll remind you of that later.

What’s a Check Sheet?

Now, we mentioned the check sheet. So this one is called a Standard Event Occurrence Check Sheet. This is also downloadable. There are really kind of 5 or 6 standard types of check sheet. This is a very commonly used one. And the name Standard Event Occurrence tells you, you only record events. You’re not looking at every single point of data. You’re not looking at every unit. You’re looking when something goes wrong. And that’s what we’re looking at. We want to look at when things break.

So this one is the example that’s on the second tab of the template, like I mentioned. And this one is telling you which vendor had an issue, what the issue was, when it happened. And then they’ve got defect categories that have been predefined. So ripeness, low count, wrong order. And these are critical because if you don’t tell people what the categories are then you’re going to come up with stuff like, well, it could be low count, it could be too few, it could be under order, it could be a short shipment. And then you’re left with the job of translating: OK, what do they mean? Which category is it in? How do I make this fit in the form?

And you don’t want to have extra work. You want to keep this lean. You keep it fast, predefined categories. They predefined the follow-ups too and they gave a follow-up date in this one. But this one was specific for a Bahama Bistro issue around vendor problems.

So let’s look at what we would do. So that’s just telling you it’s good to always have room for detail. We’ll come back to that, and it’s just a reminder of what has been predefined.

Build a Check Sheet

Building our own check sheet, so we got the options right in the title. Type of dishware. There’s your list. Shift. There’s your list. Area. There’s your list. So we have given people all the clues to the predefined categories. It makes it easier to use. It gives you better data. And again, we’ve got to be open to feedback. We’ve built this check sheet but we’ve got to test it with folks. So let’s see how they filled it out when we tested it.

Test the Data Collection

All right. So we haven’t rolled it out. We’ve just asked the staff, OK, give it a shot. How is this thing going to work? So we looked at it and we said, “All right. Wait a second. Where did platter come from?” That is not an option. But it’s important. Platters are big. Platters actually cost more if you break those than anything else. So let’s adjust our check sheet to include them.

And what about this? We got seven. And then the person said all day. Not one of the shifts. And there’s no notes. That’s the highest amount broken. So maybe we’d like to know what happened. It would help us with our analysis.

And all day, we’ve got to get that fixed. And then somebody left their name out. And this is something that you’ll probably experience, when people don’t put their name down. And that means, well, it could mean just forgetfulness. But it often means they fear retribution. When people think that by putting their name next to data someone may come back and it’s going to haunt them and somehow will reflect poorly on them, any one of those, then you’re going to get not-great data or you’re going to get incomplete data. You want to make sure upfront you let people know this is not about assigning blame. This is about problem-solving. And make sure management also sends that message. You’ve got to get this message out early. Get it out often. Let people know why you’re doing this, what’s the point of it. It certainly isn’t to target people.
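The three problems found in the pilot (an unlisted dish type, a shift that is not one of the options, and a missing name) are the kind of thing a simple check can flag once entries are typed in. This is a minimal sketch; the area list, the dish types attached to each entry, and the function name `check_entry` are assumptions rather than the contents of the actual check sheet.

```python
DISHWARE = {"Plate", "Bowl", "Glass", "Cup", "Shaker"}
SHIFTS = {"Breakfast", "Lunch", "Dinner"}
AREAS = {"Kitchen", "Dining Room", "Wait Station"}  # assumed list of areas


def check_entry(entry: dict) -> list:
    """Return a list of problems found in one check sheet entry."""
    problems = []
    for field_name, allowed in (
        ("dishware", DISHWARE),
        ("shift", SHIFTS),
        ("area", AREAS),
    ):
        value = entry.get(field_name)
        if value not in allowed:
            problems.append(f"{field_name}: {value!r} is not a predefined option")
    if not str(entry.get("employee") or "").strip():
        problems.append("employee: name is missing")
    return problems


# Entries modeled on the pilot described above; details are illustrative.
pilot = [
    {"employee": "Sean", "dishware": "Plate", "shift": "All Day",
     "area": "Kitchen", "count": 7},
    {"employee": "", "dishware": "Glass", "shift": "Lunch",
     "area": "Dining Room", "count": 1},
    {"employee": "Ana", "dishware": "Platter", "shift": "Dinner",
     "area": "Dining Room", "count": 1},
]

for entry in pilot:
    print(entry["count"], check_entry(entry))
```

A flagged value is a prompt for a conversation, not an error to discard: "Platter" turned out to be a category the sheet was missing.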

Refine the Data Collection

So, let’s take a look at how we fix this. Well, we updated the categories. We separated that first line of data that had seven into the two shifts. We went back to Sean and we said, “What was it?” He said, “Well, it was two at breakfast and five at lunch.” Well, what happened with those five? That’s still big. He said, “Well, I slipped. There was some water coming in the kitchen. I reported it to the chef. He’s going to make sure that doesn’t happen again.”

And then when we talked to Julius to make sure he felt more comfortable with putting his name down, he said, “Honestly, I just forgot. I know we’re doing good stuff.” So it’s OK. But this is about studying the process and helping us understand it.

Number of Categories

Here is a note of caution: the number of categories. Now, what if we said, “Well, you could have chipped, scraped, a little broken, smashed dishes, you have striated dishes, you have scratched dishes, blemished, maybe they are grooved. What if they are marred or scoured?” This might be a great brainstorming session but narrow this down to 5 to 7.

We have found that the human brain, and just the energy people have to try to understand what happened in a situation, really pretty much extends to 5 to 7. And you can see this when you collect data. You might give a huge amount of options but you’ll find they’ll select fewer.

A great example of this: I worked with health insurance claims processors and we were trying to understand why claims got kicked out. Most of them were auto-adjudicated, so systematically processed. But if they got kicked out of the system, that meant a person had to adjust the claim. That cost money. That took time. That was not as helpful. So we wanted to understand when that happened, why it happened.

And there were 27 reason codes. And the main category that came back was other. So really not helpful, didn’t help with the analysis at all. It didn’t help them understand what was happening in the process. So this became useless. But they got pushback.

We went back to the people in charge of the dropdown list in the system, and they said, “Well, customers require it.” And we tried to work with them a little bit, but honestly, they are not getting anything because people are picking other as the main category. That doesn’t help them understand either.

We had trouble there. We had less trouble with the project I just mentioned to you. That’s the group that was tracking waste of paper in the print facility. There, they had an even worse situation. They had 84 predefined reason codes for quality issues. And they worked with the teams because they knew nobody was ever going to look through 84 reason codes every time. They tried to figure out with the team which ones apply strictly to paper waste. And they made a subcategory. They at least got it down to 15, which helped. But honestly, they also found that people really picked five to seven codes in the bulk of cases in the beginning.

So in both cases, there’s pushback. So if you can try to reduce those, it’s important, but I know people run into this. So let’s find out how many of you have run into this.

Poll #2: Have you ever had to work with too many categories?

So Tracy, this poll is: have you ever had to work with too many categories? And there are just two options; it’s a yes/no. So what did you get from the polling on this one?

Tracy O’Rourke: So 80% said they have been working with too many categories, option A. And I could see that as well. Too many options and that could create a lot of confusion. So as you’ve mentioned in your examples, Elisabeth, that sometimes people just – I mean I was just looking at the things you mentioned on the prior slide. What’s the difference between marred and blemished and striated or grooved?

And so, you don’t want people spending too much time trying to figure out what those definitions are. And then some people just start checking them all. So is this as many as apply or is this just one? And so, the more categories you have, the more confusion it can create and that means that your data is less reliable. So I love this idea that you’re not only creating categories but obviously, testing the sheet to make sure people will have a good understanding because too many categories can create havoc in your data collection.

Just as a short example of something I experienced is I worked with an organization. We were working with their inside sales team. And the inside sales team would receive calls from customers that wanted to cancel the orders. There were 35 choices for why was the order cancelled. And guess which ones were used the most? The ones at the top of the list because they didn’t want to scroll, so the first five were the top ones being used even if that wasn’t the problem.

Elisabeth Swan: Yup. No, it’s so true, Tracy, that you’ve got to make it easy for people to collect data. And if you make them scroll, you’re not making it easy.

Tracy O’Rourke: Exactly. It’s crazy how much we don’t want to scroll. We just pick the one – one of the ones that are in the screen that we could see.

Elisabeth Swan: Yeah. No, a great example. Thank you. It’s different from the two that I had.

Put the Data in a Spreadsheet

So then you’ve got to put your data in a spreadsheet. So, most people are going to be using Google Sheets or Excel. Spreadsheets can resemble the check sheets you built on paper. Now, you can put in the dropdown menus to make it easy for you to enter the data. We set this up in columns. That’s what most charting and graphing packages require. Make it easy to transcribe from the check sheets into the spreadsheet. Keep the order that you use so you can just make it a logical flow. So that one’s pretty simple. I can see exactly the data we just collected from the team.
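As a rough illustration of that column layout, here is a minimal Python sketch. The column names, the allowed values (standing in for the dropdown menus), and the sample rows are all made up for this example and are not taken from the webinar's actual spreadsheet.

```python
import csv
import io

# Hypothetical columns, kept in the same order as the paper check sheet.
COLUMNS = ["day", "shift", "dish_type", "location", "broken_count"]

# Hypothetical allowed values, playing the role of spreadsheet dropdowns.
ALLOWED = {
    "shift": {"breakfast", "lunch", "dinner"},
    "dish_type": {"plate", "bowl", "cup"},
}

def validate_row(row):
    """Return a list of problems found in one check-sheet row."""
    problems = []
    for column, allowed in ALLOWED.items():
        if row[column] not in allowed:
            problems.append(f"{column}: unexpected value {row[column]!r}")
    if not row["broken_count"].isdigit():
        problems.append("broken_count: must be a whole number")
    return problems

def to_csv(rows):
    """Write the valid rows as CSV: one record per row, one field per column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if not validate_row(row):
            writer.writerow(row)
    return buffer.getvalue()

# Made-up rows transcribed from a check sheet; the second has an invalid shift.
rows = [
    {"day": "Mon", "shift": "lunch", "dish_type": "plate",
     "location": "dish room", "broken_count": "2"},
    {"day": "Mon", "shift": "brunch", "dish_type": "cup",
     "location": "kitchen", "broken_count": "1"},
]
csv_text = to_csv(rows)
print(csv_text)
```

Keeping one record per row and one field per column is what lets a charting tool pick the data up without reshaping it.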

Tips and Tricks

So let’s recap.

Number one, plan. Don’t wing it. Create a data collection plan. Create operational definitions for all the measures and test it.

Define. Define everything. Define the measure. Define how you’re going to get it. Define who gets it.

And stratify. So stratify, as I said, get that upfront. It’s real easy. Is there a who, a what, a where, a when? Can I get my main – my measure by any of these stratifications? If you don’t get it upfront, you can’t go back and get it. And these are great clues to analysis.
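To make the stratification idea concrete, here is a small Python sketch, assuming each record was collected with who, what, where and when factors alongside the measure. The field names and the records are invented for illustration.

```python
from collections import defaultdict

# Made-up records: the measure ("broken") plus four stratification factors.
records = [
    {"who": "Ana", "what": "plate", "where": "dish room", "when": "lunch", "broken": 3},
    {"who": "Ben", "what": "cup", "where": "kitchen", "when": "dinner", "broken": 1},
    {"who": "Ana", "what": "plate", "where": "dish room", "when": "dinner", "broken": 2},
    {"who": "Ben", "what": "bowl", "where": "dish room", "when": "lunch", "broken": 4},
]

def total_by(records, factor, measure="broken"):
    """Sum the measure for each value of one stratification factor."""
    totals = defaultdict(int)
    for record in records:
        totals[record[factor]] += record[measure]
    return dict(totals)

for factor in ("who", "what", "where", "when"):
    print(factor, total_by(records, factor))
```

The same records can be sliced four ways only because all four factors were captured at collection time, which is the point about getting them upfront.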

Next. Clarify. Why are we getting this data? Where is it going to go? What’s the purpose? What’s going to happen with it? Clarify and then test it. Involve others. Testing involves others so you can get those two done at once. Make your data collection better and give people respect by involving them. That’s all – that makes a huge difference in any of these project efforts. When people feel like their opinion matters, they’ve been part of it somehow, they’re more inclined to help you out with whatever problem you’re trying to solve. Then it becomes their problem as well. So these are really key.

Question for You

But we’d also like to hear from all of you. So what rules of thumb or data collection best practices would you like to share? So enter your suggestions into the question box. And Tracy, what did you get from folks?

Tracy O’Rourke: So, Michael says one of his rules of thumb for best practices is to make sure that data collection is relevant to the problem.

I really like this one because I have seen lots of data collected and it never answered the question for the hypothesis or wasn’t actually relevant to the problem. We tend to be – we could be very data-rich but information-poor. And if you’re not collecting data that is relevant to the problem, you’re just checking that box. Right?

Elisabeth Swan: Yeah. No, that’s great.

Tracy O’Rourke: Another one from Steven is, “Use radio button selection where possible to simplify the data collection.”

So this is when there’s a forcing function to select only one of the options. So as you have put on your data sheet example earlier, Elisabeth, breakfast, lunch, and dinner was an option or just like lunch and dinner. And instead of having them write it, if it’s electronic, you just have them select and those are forcing functions. They can’t put anything else in there. So that’s a great idea, Steven. Thank you.

Elisabeth Swan: That’s great. I love that, radio buttons.

Tracy O’Rourke: Let’s see. Daniel writes, “Run the collection plan by someone who is unfamiliar with the process to ensure it makes sense to anyone.”

I love this too. We tend to be very much in the weeds in our own processes and I see it all the time. Any time I go to a client site, sometimes there are so many acronyms, I think they are talking in a different language besides English because I can’t follow what they’re saying because there are too many acronyms because they’re so in their own world and it’s not making sense to somebody that’s not in their world. So I think this applies absolutely to data collection as well.

Elisabeth Swan: That’s great because sometimes people don’t ask you about acronyms or things that aren’t clear because they don’t want to appear as if they don’t know the process well. So that kind of test would get rid of that stuff and that’s great. Thank you, Daniel.

Tracy O’Rourke: From Lyn, “I like to use Excel’s data validation feature and then use comments at the top to include the operational definition.” I love that. That way, if people are wondering at that moment what the definition is, it’s in there in the comments.

Elisabeth Swan: That’s lovely. I use data validation a lot. I hadn’t used it for operational definitions but I’m going to incorporate that one. It’s a great one, Lyn. Thank you.

Tracy O’Rourke: And Seth says, “I was trying to get data for onboarding projects. And since those don’t occur too often, I had to rely on a heavy amount of documentation from previous events. This is the importance of putting detailed documentation over a long span of time.”

Yeah, I agree. I think making sure that our data collection effort is not only well-documented but well-understood. Often, I have seen projects where new people are collecting data that’s ongoing and they misinterpret the categories or the definitions and now the data isn’t usable because it’s not reliable. So that’s a really great point.

Elisabeth Swan: Yeah, great. You want stability. That’s really great.

Tracy O’Rourke: And then Kim also sort of put a suggestion in there that you had mentioned as well, Elisabeth, have a small group try to perform the measurement. And I think that’s really important because again, you get to see from their eyes what they’re interpreting the instructions to be, what the measurements are going to be. They’re going to tell you what’s missing. I think that could be really helpful.

I did a data collection plan with a bunch of truck drivers. So we were trying to figure out why the date – the deliveries were not complete. And the best people to answer that were the drivers. And we started out with having them fill out a data sheet and they hated it. And we had to make a couple of adjustments based on their feedback.

And so, a lot of these best practices, we actually incorporated into the new data sheet and they were much happier and we had less variation in how they were interpreting the columns.

Elisabeth Swan: Nice, Tracy. Good job.

Tracy O’Rourke: Yes. So those are some of the best practices.

Elisabeth Swan: Yeah, these are great. Thanks so much for adding those into this webinar. Those were all great and useful.

How Do We Make Charts?

So from here, the question would be, “Well, we’ve got all this data, how do we make charts?” And the good news is we’ve got a webinar for that. So this webinar will also give you the solutions to the mystery of the broken dishware. So there’s a little bit of a payoff at the end of this when you tune in and find out.

Today We Covered

All right. We talked today about how do you build your data collection plan? How to design a data check sheet? What happens when you collect data, how you refine your operational definitions, and your check sheets and how you populate a data spreadsheet. And we also had some wonderful tips and tricks not only from us but from our learner community. Again, thank you all.

Q & A

And that takes us to some questions. Folks have been adding questions in throughout the webinar. So now, we are going to address some of those questions. So Tracy, what have you got from the learners?

Tracy O’Rourke: Well, we have a couple of questions. But before we do that, we should probably review a couple of things just to give people a little bit more time.


Upcoming Webinar – October 11th, 11AM PT

Elisabeth Swan: OK. Next up, we’ve got a webinar coming up. Tracy, you are going to talk about 5 Ways that SIPOC Helps You Understand and Improve Your Process. Can you give me – give the group a little bit of an overview of what that’s going to be about?

Tracy O’Rourke: Yes. So SIPOC is actually one of my favorite tools in the toolkit of Lean and Six Sigma. And SIPOC could do so many things that really help you understand your process and improve it. And I’m going to talk about the 5 ways that you can use a SIPOC because right now, I feel like most people only use a SIPOC for one or two of the ways and they’re underutilizing this tool. So we’re going to talk about how you can optimize your SIPOC in this webinar.

Elisabeth Swan: Great, Tracy. I am looking forward to your webinar.

Just-In-Time Podcast

Then we’ve also got a series of the Just-In-Time Café Podcast. And this month’s episode, the featured guest is John Guaspari. He has written his most recent book about employee engagement. It’s called “Otherwise Engaged”. And it’s just a great exploration of what pulls people into process improvement efforts, what makes them want to be part of what’s happening in a culture. It’s a great conversation. It’s a great podcast with lots of fun. There are apps, the usual book reviews and Six Sigma industry news. It’s a good one. So tune in to the Just-In-Time Podcast.

And now we’re back to our questions, Tracy.

Q & A

Tracy O’Rourke: All right. So we’ve got a few questions coming in for you, Elisabeth. So the first question from Sue is, what would be a good number of individuals for collecting data?

Elisabeth Swan: That’s a good question, Sue. And it depends on the data collection task. So if you can get it done with minimal effort then you would not need a lot of people. If data collection is going to take some time, you’re probably going to need more people.

An example would be, if you had to spend a few minutes reviewing an application to determine if there were any errors on it and you had to review hundreds to have a decent sample size then it’s a good idea to enlist more people.

Another example that might require more folks to collect data is if you have different locations as stratification factors. In order to get, once again, a decent sample, you have to sample from different areas. Now, you’re going to need operators at each location to ensure it’s a balanced sample. That’s going to up the amount of data collection.

So it really depends on the data being collected, the effort required for each unit when you’re assessing it and how to accommodate those stratification factors. Again, great question, Sue.
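One way to picture a balanced sample across locations is the Python sketch below. It assumes the units to review (for example, application IDs) are already listed per location; the location names, IDs and sample size are hypothetical.

```python
import random

def balanced_sample(units_by_location, per_location, seed=0):
    """Draw the same number of units from every location.

    If a location has fewer units than requested, all of its units are taken.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    sample = {}
    for location, units in units_by_location.items():
        k = min(per_location, len(units))
        sample[location] = rng.sample(units, k)
    return sample

# Made-up unit IDs for three hypothetical locations.
units_by_location = {
    "north": list(range(100, 140)),
    "south": list(range(200, 225)),
    "east": list(range(300, 303)),
}
sample = balanced_sample(units_by_location, per_location=5)
for location, units in sample.items():
    print(location, len(units), units)
```

Counting the units drawn per location also gives a quick sense of how much review effort, and so how many collectors, each site needs.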

Tracy O’Rourke: Nice response too, Elisabeth. So here’s another one for you. Is there a series of previously recorded webinars? And if so, where are they available?

Elisabeth Swan: Well, this is a great question. So please check out the library of Concepts and Tools webinars right on our website. We could review one – we record one every month and develop them based on user demand. So you let us know if there’s a topic you’re interested in and we’ll work to add it in there.

There’s also a series of success story webinars where project leads walk us through their storyboards and describe how they overcame obstacles and reached their goals. Those are great as educational pieces, as examples to help you if you have a similar process. So right on the main website, you’ll see one of the menus is webinars. And there are two types. So you’ll see it right there.

Tracy O’Rourke: And I love them too. I always check them out. OK. Another question from Lyn, is there research behind the 5 to 7 categories? And I believe she’s speaking to the 5 to 7 categories that you said don’t have too many categories. I’d love to share that with our external regulator/auditor. It sounds like she might be having a problem there.

Elisabeth Swan: Yeah, and I’m not surprised. This often gets a lot of pushback. People are very adamant. They have to have those categories. So this is a great question, Lyn. I’m working to find the original source for you. But in the meantime, one way to prove it internally is just to show which categories people are choosing.

Tracy gave that example where they’re really going for the top 5 that are on the first part of the list. The same thing happened with the group trying to get a handle on paper waste. People were only using 5 to 7 of them. So the data itself can show you what’s being selected and what’s never selected or only once in a blue moon. So that’s going to help you.

In the meantime, I often find it’s the 80-20 rule, 80% of the responders choose 20% of the categories and then it just trails off. But I’m not done. I’m going to root this out because it has been a while but I know there’s a source and I’ll get back to it.
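Showing which categories people actually choose can be as simple as a tally with a running percentage. The Python sketch below assumes you have a list of the codes that were selected and the full list of available codes; the code names and counts are invented, not data from either project mentioned here.

```python
from collections import Counter

def category_usage(selections, all_categories):
    """Rank categories by use, with cumulative percent, and list unused ones."""
    counts = Counter(selections)
    total = len(selections)
    if total == 0:
        return [], sorted(all_categories)
    report = []
    cumulative = 0
    for category, count in counts.most_common():
        cumulative += count
        report.append((category, count, round(100 * cumulative / total, 1)))
    never_used = sorted(set(all_categories) - set(counts))
    return report, never_used

# Made-up reason codes and selections.
all_categories = ["jam", "tear", "smudge", "curl", "fade"]
selections = ["jam"] * 5 + ["tear"] * 3 + ["smudge"] * 2

report, never_used = category_usage(selections, all_categories)
for category, count, cumulative_pct in report:
    print(f"{category:8} {count:3} {cumulative_pct:6}%")
print("never selected:", never_used)
```

A table like this, built from your own data, is the kind of internal evidence that can be shared while the original source for the 5 to 7 guideline is still being tracked down.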

Tracy O’Rourke: Wonderful. Thank you. Maria Victoria wants to know, is the Green Belt and  recognized by other organizations? So, this isn’t directly related to your data webinar but probably a good question. She is probably curious.

Elisabeth Swan: Yeah, it’s a great question. And the answer is yes. Green Belt, Black Belt and for that matter, Yellow Belt, Master Black Belt, they’re all recognized levels of competence around problem-solving skills. These terms have been in use since the mid-‘80s. This is back with Bill Smith at Motorola and Michael Harry. And the recognition is very broad across industries at this point.

And that’s one of the advantages of the belt level terminology. It allows leadership from different industries to recognize and hire the level of expertise that they need. Do they want people to all have Yellow Belt level understanding? People generally want more Green Belts than Black Belts but they want to have a few Black Belts in the organization to make sure they can handle the bigger, tougher, more statistically analytical projects. So I hope that helps but absolutely the simple answer is yes.

Tracy O’Rourke: Yes. And I actually do like it as well. I think we had – one of our first podcasts ever, Elisabeth, we were talking about the benefits of Lean and we were comparing Lean and Six Sigma. And ultimately, we really did like the belt level terminology because it really does help people understand how much training people have had versus just what do you know, all of these tools in Lean. And I think actually they’ve even created some levels of Lean training too that mirror the belt level terminology that originally came from Six Sigma. So, thank you for that response.

Elisabeth Swan: Yeah, it’s helpful.

Tracy O’Rourke: OK. Another question for you, this is a – what is a good way to figure out what data to collect? I think you just covered that. But I think this is just a good clarification question.

Elisabeth Swan: Yeah. So the first step is to clarify what issue you’re trying to understand. So in our example, the problem of the breaking dishware. And the next step is to list all the questions you want answered in order to conduct proper process analysis.

So, where are the dishes breaking? What types of dishes are breaking? And those questions are going to drive your data collection efforts and help you determine the root cause. So that’s why I really pushed stratification because that’s often how you’re going to get information and analysis to answer those questions. But that’s a great question. Thank you.

Tracy O’Rourke: I love that response too because I agree, figuring out what questions you’re trying to answer, that helps you figure out what data to collect and it really hones in on collecting the right data. I think someone had mentioned earlier, one of the questions was something about well, how do you know what to go collect? And so I think that’s a great, great response. Thank you, Elisabeth.

How do you keep people motivated to collect data over a time period of a year? People get tired of collecting data and then data analysis becomes difficult.

So there are a few things that I would recommend trying when you’re dealing with long term data collection efforts. First, if you can automate it, go for it. Not always an option but it never hurts to ask. So if you know it’s going to be long term and automation is an option, that’s great.

Next step is to ask the data collectors if they’ve got ideas for shortcuts or time-saving methods. Sometimes people get the impression that the data collection method is set in stone. So you should reach out if you haven’t and find out if people had ideas that they just haven’t relayed. So get ideas from the data collectors themselves.

And lastly, engage the data collectors in the results of the data collection. Arrange to share the results with them so they can appreciate the fruits of their labor. It’s one thing to tell them what’s happening upfront, why you’re doing it, but it’s I think even more respectful to share it with them afterwards and say, “Here’s what we found. Here’s what we’re getting and it’s leading us toward a goal. It’s going to help us with the impact we want to see.” So really engage them in the reason for doing it.

Tracy O’Rourke: Wonderful. Thank you, Elisabeth. And this is the last question we have for you. How do you gain the support of a project leader who feels that data collection is tedious and unnecessary?

Elisabeth Swan: I know. That’s a tough one. You know when a person in a leadership position is not supportive, there may not be a lot of options. But being unfamiliar with the situation, I will make a few suggestions and see if they can help.

One idea is to understand exactly what the project leader believes will help. Is there any data collection going on at all? What does this person believe is valuable for the project? So I don’t understand what the project is about but are they getting some information and how are they getting it?

Another idea is to find easier ways to collect the data. We just talked about that. Is there some way to make this not a tedious task? Is there a simpler way to get it or simpler questions to ask? Have you described the potential benefit? Are they really clear with what you would get from the data you would like to get? What’s that going to do? If the team collects the data, maybe it helps to point to useful solutions. Would that benefit the team and other stakeholders? So make that – making that clear might help your effort and I can only say good luck.

Tracy O’Rourke: And I would just add. And you said this indirectly. The team collects the data and it helps them figure out what’s beneficial for the team. I would also say, do the what’s in it for this stakeholder or this leader, the WII-FM. Everyone loves to tune in to WII-FM radio, right? And maybe they are – point out, “Well, I mean do you really want someone else to prove us wrong because we didn’t collect that data? I mean wouldn’t that be embarrassing?” I mean whatever it is. I mean I think sometimes pointing out what it does for this leader in particular can be helpful in getting them to be more supportive.

So good luck with that as well. Elisabeth, thank you for all of your answers on these questions. You are definitely an expert in this. And I want to thank everyone for joining us on today’s webinar. We hope you’ve enjoyed your time with us and you have found this webinar to be very helpful.

Please share your feedback with us by completing this survey presented at the – when the webinar ends. And we do definitely use your feedback to design additional webinars on Lean Six Sigma topics. So if you want to hear about a specific webinar, please let us know. Thank you everyone and I want to also thank the whole team here at that you’ve joined us. Thank you so much and goodbye.

Elisabeth Swan: Bye everybody.

View our upcoming webinars and join live so you can ask questions and let us know what you’d like us to cover next. We’re busy building new webinars all the time. And we’re happy to know you’re busy too – building your problem-solving muscles – keep it up!

Get Full Lean Six Sigma Training & Certification

You can also register for our full Yellow Belt, Green Belt, Black Belt and Lean Training & Certification courses to give you a solid foundation for applying Lean Six Sigma.

Elisabeth Swan

Elisabeth is a Master Black Belt at, the co-author of The Problem-Solver’s Toolkit and co-host of the Just-in-Time Cafe. For over 30 years, she's helped leading organizations like Amazon, Charles Schwab and Marriott International, Inc. build problem-solving muscles with Lean Six Sigma to achieve their goals.
