Debugging: 9 Indispensable Rules

The official title of this book is quite a mouthful: Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems, by David J. Agans.

Agans distills debugging techniques down to nine essential rules and includes engineering war stories that demonstrate these principles. This book helped me crystallize my debugging strategy and provided the language I now use to describe my techniques. It also reinforced the importance of techniques that I had only loosely employed. I'm certain many of you are similarly guilty of violating the principles of "Change One Thing at a Time" and "Keep an Audit Trail"!

Here's the full list of rules:

  • Understand the System
  • Make It Fail
  • Quit Thinking and Look
  • Divide and Conquer
  • Change One Thing at a Time
  • Keep an Audit Trail
  • Check the Plug
  • Get a Fresh View
  • If You Didn't Fix It, It Ain't Fixed

I highly recommend reading this book if you want to improve your debugging chops and engineering skills. Even veteran debuggers will likely learn a thing or two.

You can find out more at Agans's website, Debugging Rules!

My Highlights

When it took us a long time to find a bug, it was because we had neglected some essential, fundamental rule; once we applied the rule, we quickly found the problem.

People who excelled at quick debugging inherently understood and applied these rules. Those who struggled to understand or use these rules struggled to find bugs.

Debuggers who naturally use these rules are hard to find. I like to ask job applicants, “What rules of thumb do you use when debugging?” It’s amazing how many say, “It’s an art.” Great—we’re going to have Picasso debugging our image-processing algorithm. The easy way and the artistic way do not find problems quickly.

Quality process techniques are valuable, but they’re often not implemented; even when they are, they leave some bugs in the system.

Once you have bugs, you have to detect them; this takes place in your quality assurance (QA) department or, if you don’t have one of those, at your customer site. This book doesn’t deal with this stage either—test coverage analysis, test automation, and other QA techniques are well handled by other resources.

Debugging usually means figuring out why a design doesn’t work as planned. Troubleshooting usually means figuring out what’s broken in a particular copy of a product when the product’s design is known to be good—there’s a deleted file, a broken wire, or a bad part.

You need a working knowledge of what the system is supposed to do, how it’s designed, and, in some cases, why it was designed that way. If you don’t understand some part of the system, that always seems to be where the problem is. (This is not just Murphy’s Law; if you don’t understand it when you design it, you’re more likely to mess up.)

The essence of “Understand the System” is, “Read the manual.” Contrary to my dad’s comment, read it first—before all else fails. When you buy something, the manual tells you what you’re supposed to do to it, and how it’s supposed to act as a result. You need to read this from cover to cover and understand it in order to get the results you want. Sometimes you’ll find that it can’t do what you want—you bought the wrong thing.

A caution here: Don’t necessarily trust this information. Manuals (and engineers with Beemers in their eyes) can be wrong, and many a difficult bug arises from this. But you still need to know what they thought they built, even if you have to take that information with a bag of salt.

There’s a side benefit to understanding your own systems, too. When you do find the bugs, you’ll need to fix them without breaking anything else. Understanding what the system is supposed to do is the first step toward not breaking it.

The function that you assume you understand is the one that bites you. The parts of the schematic that you ignore are where the noise is coming from. That little line on the data sheet that specifies an obscure timing parameter can be the one that matters.

Reference designs and sample programs tell you one way to use a product, and sometimes this is all the documentation you get. Be careful with such designs, however; they are often created by people who know their product but don’t follow good design practices, or don’t design for real-world applications. (Lack of error recovery is the most popular shortcut.) Don’t just lift the design; you’ll find the bugs in it later if you don’t find them at first.

When there are parts of the system that are “black boxes,” meaning that you don’t know what’s inside them, knowing how they’re supposed to interact with other parts allows you to at least locate the problem as being inside the box or outside the box.

You also have to know the limitations of your tools. Stepping through source code shows logic errors but not timing or multithread problems; profiling tools can expose timing problems but not logic flaws.
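
As a concrete (and invented) illustration of the kind of bug a source-level debugger tends to hide, here is a minimal Python sketch of a data race: two threads do an unsynchronized read-modify-write on a shared counter, and the lost updates only appear when the threads actually interleave, which single-stepping prevents.

    import sys
    import threading

    sys.setswitchinterval(1e-6)   # force frequent thread switches to widen the race window

    counter = 0

    def worker(iterations: int) -> None:
        global counter
        for _ in range(iterations):
            # Unsynchronized read-modify-write: another thread can run between the
            # read and the write. Single-stepping one thread in a debugger
            # serializes execution and usually hides this entirely.
            current = counter
            counter = current + 1

    threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Lost updates make the total come up short, but only some of the time.
    print(f"counter = {counter} (expected 200000)")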

Now, if Charlie were working at my company, and you asked him, “What do you do when you find a failure?” he would answer, “Try to make it fail again.” (Charlie is well trained.) There are three reasons for trying to make it fail:

  1. So you can look at it.
  2. So you can focus on the cause. Knowing under exactly what conditions it will fail helps you focus on probable causes.
  3. So you can tell if you’ve fixed it. Once you think you’ve fixed the problem, having a surefire way to make it fail gives you a surefire test of whether you fixed it.

If without the fix it fails 100 percent of the time when you do X, and with the fix it fails zero times when you do X, you know you’ve really fixed the bug. (This is not silly. Many times an engineer will change the software to fix a bug, then test the new software under different conditions from those that exposed the bug. It would have worked even if he had typed limericks into the code, but he goes home happy. And weeks later, in testing or, worse, at the customer site, it fails again. More on this later, too.)
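
A minimal, self-contained sketch of what that surefire test can look like in software, using an invented clamp function with an equally invented off-by-one history: the regression test encodes the exact "do X" input, so it fails every time with the old code and never with the fix.

    def clamp(value: int, low: int, high: int) -> int:
        # Fixed version; the buggy version compared against high - 1 and
        # mangled the boundary case exercised by the test below.
        if value < low:
            return low
        if value > high:
            return high
        return value

    def test_clamp_at_upper_boundary():
        # The exact "do X" that exposed the bug: a value equal to the upper limit.
        assert clamp(255, 0, 255) == 255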

You have enough bugs already; don’t try to create new ones. Use instrumentation to look at what’s going wrong (see Rule 3: Quit Thinking and Look), but don’t change the mechanism; that’s what’s causing the failure.

How many engineers does it take to fix a lightbulb? A: None; they all say, “I can’t reproduce it—the lightbulb in my office works fine.”

Automation can make an intermittent problem happen much more quickly, as in the TV game story. Amplification can make a subtle problem much more obvious, as in the leaky window example, where I could locate the leak better with a hose than with the occasional rainstorm. Both of these techniques help stimulate the failure, without simulating the mechanism that’s failing. Make your changes at a high enough level that they don’t affect how the system fails, just how often.
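
A sketch of what that kind of high-level automation might look like in software, assuming the failure can be provoked by re-running some scenario script (./scenario_x.sh is a placeholder): the harness repeats the operation and saves the evidence from each failing run, without touching the mechanism that is failing.

    import subprocess

    def hammer(runs: int = 1000) -> None:
        failures = 0
        for i in range(runs):
            result = subprocess.run(["./scenario_x.sh"], capture_output=True, text=True)
            if result.returncode != 0:
                failures += 1
                # Save the evidence from every failing run (see "Quit Thinking and Look").
                with open(f"failure_{i:04d}.log", "w") as f:
                    f.write(result.stdout)
                    f.write(result.stderr)
        print(f"{failures}/{runs} runs failed")

    if __name__ == "__main__":
        hammer()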

What can you do to control these other conditions? First of all, figure out what they are. In software, look for uninitialized data (tsk, tsk!), random data input, timing variations, multithread synchronization, and outside devices (like the phone network or the six thousand kids clicking on your Web site). In hardware, look for noise, vibration, temperature, timing, and parts variations (type or vendor). In my all-wheel-drive example, the problem would have seemed intermittent if I hadn’t noticed the temperature and the speed.

Once you have an idea of what conditions might be affecting the system, you simply have to try a lot of variations. Initialize those arrays and put a known pattern into the inputs of your erratic software. Try to control the timing and then vary it to see if you can get the system to fail at a particular setting. Shake, heat, chill, inject noise into, and tweak the clock speed and the power supply voltage of that unreliable circuit board until you see some change in the frequency of failure.
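
A sketch of controlling those conditions in software, with process_batch standing in for whatever you are actually testing: seed any randomness, feed a known input pattern, and then vary one stand-in timing parameter while everything else stays pinned.

    import random
    import time

    def process_batch(data):
        """Stand-in for the real code under test; replace with your entry point."""
        return sum(data) >= 0

    def run_trial(seed: int, delay_s: float) -> bool:
        random.seed(seed)            # make any randomness in the run reproducible
        data = [0xAA] * 1024         # a known pattern instead of whatever was in memory
        time.sleep(delay_s)          # a controlled stand-in for timing variation
        return process_batch(data)

    # Hold the seed and input pattern fixed; sweep only the timing.
    for delay in (0.0, 0.001, 0.01, 0.1):
        ok = run_trial(seed=1234, delay_s=delay)
        print(f"delay={delay:>5}s: {'PASS' if ok else 'FAIL'}")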

Sometimes you’ll find that controlling a condition makes the problem go away. You’ve discovered something—what condition, when random, is causing the failure. If this happens, of course, you want to try every possible value of that condition until you hit the one that causes the system to fail. Try every possible input data pattern if a random one fails occasionally and a controlled one doesn’t.

One caution: Watch out that the amplified condition isn’t just causing a new error. If a board has a temperature-sensitive error and you decide to vibrate it until all the chips come loose, you’ll get more errors, but they won’t have anything to do with the original problem.

You have to be able to look at the failure. If it doesn’t happen every time, you have to look at it each time it fails, while ignoring the many times it doesn’t fail. The key is to capture information on every run so you can look at it after you know that it’s failed. Do this by having the system output as much information as possible while it’s running and recording this information in a “debug log” file.
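
A minimal debug-log sketch using Python's standard logging module: every run writes timestamped events to its own file, so after a bad run you can diff it against a good one. The event names here are illustrative.

    import logging
    import time

    log_path = f"debug_{int(time.time())}.log"
    logging.basicConfig(
        filename=log_path,
        level=logging.DEBUG,
        format="%(asctime)s.%(msecs)03d %(levelname)s %(message)s",
        datefmt="%H:%M:%S",
    )

    logging.debug("run started")
    logging.debug("request queued: id=%d size=%d", 42, 1024)      # illustrative events
    logging.debug("response received: id=%d status=%s", 42, "OK")
    logging.debug("run finished")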

By looking at captured information, you can easily compare a bad run to a good one (see Rule 5: Change One Thing at a Time). If you capture the right information, you will be able to see some difference between a good case and a failure case. Note carefully the things that happen only in the failure cases. This is what you look at when you actually start to debug.

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” —SHERLOCK HOLMES, A SCANDAL IN BOHEMIA

We lost several months chasing the wrong thing because we guessed at the failure instead of looking at it.

Actually seeing the low-level failure is crucial. If you guess at how something is failing, you often fix something that isn’t the bug. Not only does the fix not work, but it takes time and money and may even break something else. Don’t do it.

“Quit thinking and look.” I make this statement to engineers more often than any other piece of debugging advice. We sometimes tease engineers who come up with an idea that seems pretty good on the surface, but on further examination isn’t very good at all, by saying, “Well, he’s a thinker.” All engineers are thinkers. Engineers like to think.

So why do we imagine we can find the problem by thinking about it? Because we’re engineers, and thinking is easier than looking.

What we see when we note the bug is the result of the failure: I turned on the switch and the light didn’t come on. But what was the actual failure? Was it that the electricity couldn’t get through the broken switch, or that it couldn’t get through the broken bulb filament? (Or did I flip the wrong switch?) You have to look closely to see the failure in enough detail to debug it.

Many problems are easily misinterpreted if you can’t see all the way to what’s actually happening. You end up fixing something that you’ve guessed is the problem, but in fact it was something completely different that failed.

How deep should you go before you stop looking and start thinking again? The simple answer is, “Keep looking until the failure you can see has a limited number of possible causes to examine.”

Experience helps here, as does understanding your system. As you make and chase bad guesses, you’ll get a feel for how deep you have to see in a given case. You’ll know when the failure you see implicates a small enough piece of the design. And you’ll understand that the measure of a good debugger is not how soon you come up with a guess or how good your guesses are, but how few bad guesses you actually act on.

Seeing the failure in low-level detail has another advantage in dealing with intermittent bugs, which we’ve discussed before and will discuss again: Once you have this view of the failure, when you think you’ve fixed the bug, it’s easy to prove that you did fix the bug. You don’t have to rely on statistics; you can see that the error doesn’t happen anymore. When our senior engineer fixed the noise problem on our slave microprocessor, he could see that the glitch in the write pulse was gone.

In the world of electronic hardware, this means test points, test points, and more test points. Add a test connector to allow easy access to buses and important signals. These days, with programmable gate arrays and application-specific integrated circuits, the problem is often buried inside a chunk of logic that you can’t get into with external instruments, so the more signals you can bring out of the chip, the better off you’ll be.

As mentioned in “Make It Fail,” even minor changes can affect the system enough to hide the bug completely. Instrumentation is one of those changes, so after you’ve added instrumentation to a failing system, make it fail again to prove that Heisenberg isn’t biting you.

“Quit Thinking and Look” doesn’t mean that you don’t ever make any guesses about what might be wrong. Guessing is a pretty good thing, especially if you understand the system. Your guesses may even be pretty close, but you should guess only to focus the search. You still have to confirm that your guess is correct by seeing the failure before you go about trying to fix the failure.

So don’t trust your guesses too much; often they’re way off and will lead you down the wrong path. If it turns out that careful instrumentation doesn’t confirm a particular guess, then it’s time to back up and guess again. (Or reconsult your Ouija board, or throw another dart at your “bug cause” dartboard—whatever your methodology may be. I recommend the use of Rule 4: “Divide and Conquer.”)

An exception: One reason for guessing in a particular way is that some problems are more likely than others or easier to fix than others, so you check those out first. In fact, when you make a particular guess because that problem is both very likely and easy to fix, that’s the one time you should try a fix without actually seeing the details of the failure.

You can think up thousands of possible reasons for a failure. You can see only the actual cause.

  • See the failure. The senior engineer saw the real failure and was able to find the cause. The junior guys thought they knew what the failure was and fixed something that wasn’t broken.
  • See the details. Don’t stop when you hear the pump. Go down to the basement and find out which pump.
  • Build instrumentation in. Use source code debuggers, debug logs, status messages, flashing lights, and rotten egg odors.
  • Add instrumentation on. Use analyzers, scopes, meters, metal detectors, electrocardiography machines, and soap bubbles.
  • Don’t be afraid to dive in. So it’s production software. It’s broken, and you’ll have to open it up to fix it.
  • Watch out for Heisenberg. Don’t let your instruments overwhelm your system.
  • Guess only to focus the search. Go ahead and guess that the memory timing is bad, but look at it before you build a timing fixer.

After he did this with all the pins, he plugged in the lines and proved that the terminals worked. Then he unplugged everything, reassembled the box, plugged everything back in, and reconfirmed that the terminals worked. This was to accommodate Goldberg’s Corollary to Murphy’s Law, which states that reassembling any more than is absolutely necessary before testing makes it probable that you have not fixed the problem and will have to disassemble everything again, with a probability that increases in proportion to the amount of reassembly effort involved.

But the rule that our technician demonstrated particularly well with this debugging session was “Divide and Conquer.” He narrowed the search by repeatedly splitting up the search space into a good half and a bad half, then looking further into the bad half for the problem.

Narrow the search. Home in on the problem. Find the range of the target. The common technique used in any efficient target search is called successive approximation—you want to find something within a range of possibilities, so you start at one end of the range, then go halfway to the other end and see if you’re past it or not. If you’re past it, you go to one-fourth and try again. If you’re not past it, you go to three-fourths and try again. Each try, you figure out which direction the target is from where you are and move half the distance of the previous move toward the target. After a small number of attempts, you home right in on it.
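
In software terms, that successive approximation is just a binary search over an ordered search space. Here is a small sketch that finds the first failing point, assuming that once things start failing they keep failing for everything beyond it; fails_at is a placeholder predicate standing in for "run the test at this point."

    def first_failure(low: int, high: int, fails_at) -> int:
        """Return the smallest index in [low, high] for which fails_at(index) is True,
        assuming everything at or beyond the real culprit also fails."""
        while low < high:
            mid = (low + high) // 2
            if fails_at(mid):
                high = mid          # culprit is at mid or before it
            else:
                low = mid + 1       # culprit is after mid
        return low

    # Example with a fake predicate: pretend things start failing at index 1337.
    print(first_failure(0, 10_000, lambda i: i >= 1337))   # prints 1337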

When you inject known input patterns, of course, you should be careful that you don’t change the bug by setting up new conditions. If the bug is pattern dependent, putting in an artificial pattern may hide the problem. “Make It Fail” before proceeding.

So when you do figure out one of several simultaneous problems, fix it right away, before you look for the others.

I’ve often heard someone say, “Well, that’s broken, but it couldn’t possibly affect the problem we’re trying to find.” Guess what—it can, and it often does. If you fix something that you know is wrong, you get a clean look at the other issues. Our hotel technician was able to see the direction of the bad wiring on the really slow terminal only after he fixed the high resistances in both directions in the breakout box.

A corollary to the previous rule is that certain kinds of bugs are likely to cause other bugs, so you should look for and fix them first. In hardware, noisy signals cause all kinds of hard-to-find, intermittent problems. Glitches and ringing on clocks, noise on analog signals, jittery timing, and bad voltage levels need to be taken care of before you look at other problems; the other problems are often very unpredictable and go away when you fix the noise. In software, bad multithread synchronization, accidentally reentrant routines, and uninitialized variables inject that extra shot of randomness that can make your job hell.

It’s also easy to become a perfectionist and start “fixing” every instance of bad design practice you find, in the interest of general quality. You can eliminate the GOTOs in your predecessor’s code simply because you consider them nasty, but if they aren’t actually causing problems, you’re usually better off leaving them alone.

When his change didn’t fix the problem, he should have backed it out immediately.

Change one thing at a time. You’ve heard of the shotgun approach? Forget it. Get yourself a good rifle. You’ll be a lot better at fixing bugs.

If you’re working on a mortgage calculation program that seems to be messing up on occasional loans, pin down the loan amount and the term, and vary the interest rate to see that the program does the right thing. If that works, pin down the term and interest rate, and vary the loan amount. If that works, vary just the term. You’ll either find the problem in your calculation or discover something really surprising like a math error in your Pentium processor. (“But that can’t happen!”) This kind of surprise happens with the more complex bugs; that’s what makes them complex. Isolating and controlling variables is kind of like putting known data into the system: It helps you see the surprises.
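
A sketch of that sweep, using the standard amortized-payment formula as a stand-in for the program under test: the loan amount and term are pinned, only the interest rate varies, and the check is deliberately crude since the real oracle depends on your program.

    def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
        if annual_rate == 0:
            return principal / months
        r = annual_rate / 12.0
        return principal * r * (1 + r) ** months / ((1 + r) ** months - 1)

    principal, months = 250_000.0, 360           # pinned
    for rate in (0.02, 0.03, 0.04, 0.05, 0.06):  # the one thing we vary
        p = monthly_payment(principal, rate, months)
        assert 0 < p < principal, f"suspicious payment {p} at rate {rate}"
        print(f"rate={rate:.2%} payment={p:,.2f}")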

Sometimes, changing the test sequence or some operating parameter makes a problem occur more regularly; this helps you see the failure and may be a great clue to what’s going on. But you should still change only one thing at a time so that you can tell exactly which parameter had the effect. And if a change doesn’t seem to have an effect, back it out right away!

What you’re looking for is never the same as last time. And it takes a fair amount of knowledge and intelligence to sift through the irrelevant differences, differences that were caused by timing or other factors. This knowledge is beyond what a beginner has, and this intelligence is beyond what software can do. (A.I. waved good-bye, remember?) The most that software can do is help you to format and filter logs so that when you apply your superior human brain (you do have a superior one, don’t you?) to analyzing the logs, the differences (and possibly the cause of the differences) will jump right out at that brain.

Sometimes the difference between a working system and a broken one is that a design change was made. When this happens, a good system starts failing. It’s very helpful to figure out which version first caused the system to fail, even if this involves going back and testing successively older versions until the failure goes away.
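
Git automates exactly this search with git bisect, which binary-searches the revision history using a test script you supply: exit status 0 marks a revision good, 125 means "cannot test, skip it," and any other nonzero status marks it bad. Here is a sketch of such a script, with the build and test commands as placeholders for your own project.

    import subprocess
    import sys

    def main() -> int:
        # Rebuild this revision; if it won't even build, tell bisect to skip it.
        if subprocess.run(["make", "-j"], capture_output=True).returncode != 0:
            return 125
        # Run the failure scenario from "Make It Fail": 0 = good revision, 1 = bad.
        result = subprocess.run(["./scenario_x.sh"], capture_output=True)
        return 0 if result.returncode == 0 else 1

    if __name__ == "__main__":
        sys.exit(main())

With a script like this saved as, say, bisect_check.py, you would run git bisect start, mark the current revision bad and the last known-good one good, and then let git bisect run python bisect_check.py walk the history to the first bad version.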

The point of this story is that sometimes it’s the most insignificant-seeming thing that’s actually the key to making a bug happen. What seems insignificant to the person doing the testing (the plaid shirt) may be important to the person trying to fix the problem. And what seems obvious to the tester (the chip had to be restarted) may be completely missed by the fixer. So you have to take note of everything—on the off chance that it might be important and nonobvious.

Keep an audit trail. As you investigate a problem, write down what you did, what order you did it in, and what happened as a result. Do this every time. It’s just like instrumenting the software or the hardware—you’re instrumenting the test sequence. You have to be able to see what each step was, and what the result was, to determine which step to focus on during debugging.

Unfortunately, while the value of an audit trail is often accepted, the level of detail required is not, so a lot of important information gets left out. What kind of system was running? What was the sequence of events leading up to the failure? And sometimes even, what was the actual failure? (Duh!) Sometimes the report just says, “It’s broken.” It doesn’t say that the graphics were completely garbled or that all the red areas came up green or that the third number was wrong. It just says it failed.

Tool control is critical to the accurate re-creation of a version, and you should make sure you have it. As discussed later, unrecognized tool variations can cause some very strange effects.

Never trust your memory with a detail—write it down. If you trust your memory, a number of things will happen. You’ll forget the details that you didn’t think were important at the time, and those, of course, will prove to be the critical ones. You’ll forget the details that actually weren’t important to you, but might be important to someone else working on a different problem later.

You won’t be able to transmit information to anyone else except verbally, which wastes everybody’s time, assuming you’re still around to talk about it. And you won’t be able to remember exactly how things happened and in what order and how events related to one another, all of which is crucial information.

Write it down. It’s better to write it electronically so you can make backup copies, attach it to bug reports, distribute it to others easily, and maybe even filter it with automated analysis tools later. Write down what you did and what happened as a result. Save your debug logs and traces, and annotate them with related events and effects that they don’t inherently record themselves. Write down your theories and your fixes. Write it all down.
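
A minimal audit-trail sketch: an append-only, timestamped record of what you did and what you saw, kept in a plain text file so it can be attached to bug reports or diffed later. The entries are illustrative.

    from datetime import datetime

    def note(action: str, result: str, path: str = "debug_audit.log") -> None:
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(path, "a") as f:
            f.write(f"{stamp}  DID: {action}  SAW: {result}\n")

    note("power-cycled the unit with the new firmware loaded",
         "boots, LED solid green")
    note("re-ran scenario X with debug logging enabled",
         "failed on run 3 of 50; log saved as failure_0003.log")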

This is more likely to happen with what I call “overhead” or “foundation” factors. Because they’re general requirements (electricity, heat, clock), they get overlooked when you debug the details.

It’s classic to say, “Hmm, this new code works just like the old code” and then find out that, in fact, you didn’t actually load the new code. You loaded the old code, or you loaded the new code but it’s still executing the old code because you didn’t reboot your computer or you left an easier-to-find copy of the old code on your system.

Speaking of starting your car, another aspect to consider is whether the start-up conditions are correct. You may have the power plugged in, but did you hit the start button? Has the graphics driver been initialized? Has the chip been reset? Have the registers been programmed correctly? Did you push the primer button on your weed whacker three times? Did you set the choke? Did you set the on/off switch to on? (I usually notice this one after six or seven fruitless pulls.)

If you depend on memory being initialized before your program runs, but you don’t do it explicitly, it’s even worse—sometimes startup conditions will be correct. But not when you demo the program to the investors.

Your bad assumptions may not be about the product you’re building, but rather about the tools you’re using to build it, as in the consultant story.

Default settings are a common problem. Building for the wrong environment is another—if you use a Macintosh compiler, you obviously can’t run the program on an Intel PC, but what about your libraries and other common code resources?

It may not be just your assumptions about the tools that are bad—the tools may have bugs, too. (Actually, even this is just your bad assumption that the tool is bug-free. It was built by engineers; why would it be any more trustworthy than what you’re building?)

“Nothing clears up a case so much as stating it to another person.” —SHERLOCK HOLMES, SILVER BLAZE

There are at least three reasons to ask for help, not counting the desire to dump the whole problem into someone else’s lap: a fresh view, expertise, and experience. And people are usually willing to help because it gives them a chance to demonstrate how clever they are.

It’s hard to see the big picture from the bottom of a rut. We’re all human. We all have our biases about everything, including where a bug is hiding. Those biases can keep us from seeing what’s really going on. Someone who comes at the problem from an unbiased (actually, differently biased) viewpoint can give us great insights and trigger new approaches. If nothing else, that person can at least tell you it looks like you’ve got a nasty problem there and offer you a shoulder to cry on.

In fact, sometimes explaining the problem to someone else gives you a fresh view, and you solve the problem yourself. Just organizing the facts forces you out of the rut you were in. I’ve even heard of a company that has a room with a mannequin in it—you go explain your problems to the mannequin first. I imagine the mannequin is quite useful and contributes to the quick solution of a number of problems. (It’s probably more interactive than some people you work with. I bet it’s also a very forgiving listener, and none of your regrettable misconceptions will show up in next year’s salary review.)

There are occasions where a part of the system is a mystery to us; rather than go to school for a year to learn about it, we can ask an expert and learn what we need to know quickly. But be sure that your expert really is an expert on the subject—if he gives you vague, buzzword-laden theories, he’s a technical charlatan and won’t be helpful. If he tells you it’ll take thirty hours to research and prepare a report, he’s a consultant and may be helpful, but at a price.

In any case, experts “Understand the System” better than we do, so they know the road map and can give us great search hints. And when we’ve found the bug, they can help us design a proper fix that won’t mess up the rest of the system.

You may not have a whole lot of experience, but there may be people around you who have seen this situation before and, given a quick description of what’s going on, can tell you exactly what’s wrong, like the dome light short in the story. Like experts, people with experience in a specific area can be hard to find, and thus expensive. It may be worth the money:

If you’re dealing with equipment or software from a third-party vendor, e-mail or call that vendor. (If nothing else, the vendor will appreciate the bug report.) Usually, you’ll be told about some common misunderstanding you’ve got; remember, the vendor has both expertise and experience with its product.

This is a good time to reiterate “Read the Manual.” In fact, this is where the advice becomes “When All Else Fails, Read the Manual Again.” Yes, you already read it because you followed the first rule faithfully. Look at it again, with your newfound focus on your particular problem—you may see or understand something that you didn’t before.

You may be afraid to ask for help; you may think it’s a sign of incompetence. On the contrary, it’s a sign of true eagerness to get the bug fixed. If you bring in the right insight, expertise, and/or experience, you get the bug fixed faster. That doesn’t reflect poorly on you; if anything, it means you chose your help wisely.

The opposite is also true: Don’t assume that you’re an idiot and the expert is a god. Sometimes experts screw up, and you can go crazy if you assume it’s your mistake.

No matter what kind of help you bring in, when you describe the problem, keep one thing in mind: Report symptoms, not theories. The reason you went to someone else for fresh insight is that your theories aren’t getting you anywhere. If you go to somebody fresh and lay a theory on her, you drag her right down into the same rut you’re in.

I’ve seen this mistake a lot; the help is brought in and immediately poisoned with the old, nonworking theories. (If the theories were any good, there’d be no need to bring in help.) In some cases, a good helper can plow through all that garbage and still get to the facts, but more often you just end up with a big crowd down in that rut.

This rule works both ways. If you’re the helper, cut off the helpee who starts to tell you theories. Cover your ears and chant “La-La-La-La-La.” Run away. Don’t be poisoned.

If you follow the “Make It Fail” rule, you’ll know how to prove that you’ve fixed the problem. Do it! Don’t assume that the fix works; test it. No matter how obvious the problem and the fix seem to be, you can’t be sure until you test it. You might be able to charge some young sucker seventy-five bucks, but he won’t be happy about it when it still fails.

When you think you’ve fixed an engineering design, take the fix out. Make sure it’s broken again. Put the fix back in. Make sure it’s fixed again. Until you’ve cycled from fixed to broken and back to fixed again, changing only the intended fix, you haven’t proved that you fixed it.

If you didn’t fix it, it ain’t fixed. Everyone wants to believe that the bug just went away. “We can’t seem to make it fail anymore.” “It happened that couple of times, but gee, then something happened and it stopped failing.” And of course the logical conclusion is, “Maybe it won’t happen again.” Guess what? It will.

When they report the problem, it’s better to say, “Thanks! We’ve been trying for months to capture that incredibly rare occurrence; please e-mail us the log file,” than to say, “Wow, incredible. That’s never happened here.”

While we’re on the subject of design quality, I’ve often thought of ISO-9000 as a method for keeping an audit trail of the design process. In this case, the bugs you’re trying to find are in the process (neglecting vibration) rather than in the product (leaky fitting), but the audit trail works the same way. As advertised in the introduction, the methods presented in this book are truly general purpose.

“That can’t happen” is a statement made by someone who has only thought about why something can’t happen and hasn’t looked at it actually happening.

New hardware designs with intermittent errors are often the victims of noise (stray voltages on a wire that turn a 1 into a 0 or vice versa).

By introducing additional noise into the system, I was able to make it fail more often. This was a bit lucky, though. For some circuits, touching a finger is like testing that leaky window with a fire hose—the noise will cause a perfectly good circuit to fail. Sometimes the opposite will occur; the capacitance of the finger will eliminate the noise and quiet the system down.

I’ve established a Web site at http://www.debuggingrules.com that’s dedicated to the advancement of debugging skills everywhere. You should visit, if for no other reason than to download the fancy Debugging Rules poster—you can print it yourself and adorn your office wall as recommended in the book. There are also links to various other resources that you may find useful in your debugging education. And I’ll always be interested in hearing your interesting, humorous, or instructive (preferably all three) war stories; the Web site will tell you how to send them. Check it out.

You can appeal to their sense of team communication. Several of the people who reviewed drafts of the book were team leaders who consider themselves excellent debuggers, but they found that the rules crystallized what they do in terms that they could more easily communicate to their teams. They found it easy to guide their engineers by saying, “Quit Thinking and Look,” the way I’ve been doing for twenty years.