4 Steps to Stop Fighting Fires: Step 2 Find Fragile Artifacts

by becki on June 22, 2009

In last week’s installment, we learned to stop shooting ourselves in the foot. You know the old joke. Patient: “Doctor, it hurts when I move my arm like this.” Doctor: “Well, stop moving your arm like that.” In our case, it hurts when we continually make mistakes that result in service interruptions, so we stopped doing that.

Step 2 Goal: Learn What’s Causing the Pain

In step 1, we got a handle on our change management and started spending less time on unplanned work. In this step we do what the Visible Ops Handbook calls “catch and release” and find “fragile artifacts.” I’m not going to kid you, this step is probably the most painful and time-consuming, but don’t let that stop you. Just remember your goal is to move from a reactive force that constantly fights fires to one that can repeatably deploy new services and successfully make changes.

The good news is you probably have a good idea of which devices are your fragile artifacts because those are the ones that break each time you touch them. “Infrastructure is considered fragile when it has a low change success rate and high MTTR” (Visible Ops Handbook p. 43). If you need confirmation, review the root causes of your service interruptions and focus on the top 1 or 2 causes.
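If your change and incident records are in any kind of structured form, the two measures in that definition are easy to compute. Here is a minimal Python sketch, not a method from the Handbook. The `ChangeRecord` fields, the device names, and the numbers are all made up for illustration, and I’m assuming you log whether each change succeeded and how many minutes it took to restore service when it didn’t.

```python
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    device: str            # hypothetical device name
    succeeded: bool        # did the change go in without an incident?
    repair_minutes: float  # time to restore service; 0 if the change succeeded


def fragility_report(records):
    """Return (device, change success rate, MTTR in minutes), most fragile first."""
    by_device = {}
    for record in records:
        by_device.setdefault(record.device, []).append(record)

    report = []
    for device, recs in by_device.items():
        failures = [r for r in recs if not r.succeeded]
        success_rate = (len(recs) - len(failures)) / len(recs)
        # MTTR is averaged over failed changes only
        mttr = sum(r.repair_minutes for r in failures) / len(failures) if failures else 0.0
        report.append((device, success_rate, mttr))

    # Lowest success rate first; ties broken by highest MTTR
    report.sort(key=lambda row: (row[1], -row[2]))
    return report


# Invented sample data
records = [
    ChangeRecord("mail01", False, 60),
    ChangeRecord("mail01", False, 120),
    ChangeRecord("mail01", True, 0),
    ChangeRecord("mail01", False, 90),
    ChangeRecord("web01", True, 0),
    ChangeRecord("web01", True, 0),
]

report = fragility_report(records)
for device, success_rate, mttr in report:
    print(f"{device}: success rate {success_rate:.0%}, MTTR {mttr:.0f} min")
```

Whatever lands at the top of the list is a good candidate for your first one or two fragile artifacts.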

Learn What We Have

Performing the catch and release is more time-consuming than spotting your fragile artifacts. You need to audit all of your equipment and document its hardware and software configurations as well as the services it provides. This is neither fun nor easy, but it is important. What you are likely to discover is that things are not the way you think they are. You’ll probably find that you have many unique configurations, as well as hardware or services you didn’t know you had.

I love this quote from page 44 of the Handbook, “… what you have never matches what you think you have. The best organizations merely keep that gap small, safe and manageable.” Joe Judge, Former ISO, Adero, Inc
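One way to see that gap is to compare what your documentation says against what the audit actually finds. The following Python sketch assumes you can get both into simple dictionaries keyed by device name. The function name, device names, and settings are hypothetical, and a real audit would pull this data from whatever inventory or discovery tools you already use.

```python
def inventory_gap(documented, audited):
    """Report devices and settings where the audit disagrees with the documentation."""
    gap = {}
    for device in sorted(set(documented) | set(audited)):
        if device not in documented:
            gap[device] = "undocumented device"
        elif device not in audited:
            gap[device] = "documented but not found"
        else:
            doc, actual = documented[device], audited[device]
            # Map each differing setting to a (documented, audited) pair
            diffs = {
                key: (doc.get(key), actual.get(key))
                for key in sorted(set(doc) | set(actual))
                if doc.get(key) != actual.get(key)
            }
            if diffs:
                gap[device] = diffs
    return gap


# Invented sample data
documented = {
    "web01": {"os": "RHEL 5.2", "services": "httpd"},
    "db01": {"os": "RHEL 5.2", "services": "mysqld"},
}
audited = {
    "web01": {"os": "RHEL 5.3", "services": "httpd"},
    "db01": {"os": "RHEL 5.2", "services": "mysqld"},
    "ftp02": {"os": "unknown", "services": "vsftpd"},
}

gap = inventory_gap(documented, audited)
for device, finding in gap.items():
    print(device, finding)
```

An empty result would mean the documentation matches the audit. More likely you’ll get a list of surprises to work through.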

Benefits

When you complete this step you’ll:

  • Have an inventory of your equipment, software and services
  • Have documentation that elevates the knowledge of your group
  • Have identified your most fragile assets

Now you can use this information to make more informed decisions about your risk. For example, don’t make changes to your most fragile assets unless you absolutely must.

You’re now ready for the next step, Create a Repeatable Build Library, which we’ll cover next week.

What do you think? Are you ready to tackle this project and stop fighting fires?
