There are some points I’ve been making to engineers lately that I feel would be valuable to share more widely.
When you work in engineering, you are given different types of tasks. Some are urgent or short-term. We sometimes call this “putting out fires,” especially when it means taking care of something that is broken or needed right now.
Other tasks are strategic in nature. We gather information from our users about what they need and want, design a solution, and work towards it methodically and intelligently.
It’s important to know which type of work you’re doing at any given time, and to think differently about each.
Fires
When you put out a fire, your goal is to put it out. Essentially, you want to do the bare minimum required to put out the fire so you can get back to long-term strategic work. You don’t want to build huge, complex systems that will last forever just to put out a fire. In an emergency, you want to work quickly. That doesn’t mean you should do bad work. However, you should not build a long-term, high-maintenance system. Put out the fire.
There are different types of fires. In some cases, an executive or another team may come to you with an urgent request that must be completed within the next few weeks. What you want to do is figure out how to get that task done and out of the way so you can get back to your long-term strategic work.
There are also actual emergencies, such as an outage. In that case, it is clear that you should just fix the outage without worrying about other tasks. An outage is not the time to say, “I’ll hold off until I can write a design doc and review it with senior engineers next week.” But the same applies to any fire. A fire is not the time to apply long-term software design methods and systems.
Example
Let’s look at a more concrete example to show what I’m talking about. Let’s say an executive comes to you and says, “I have a customer who wants to give us $1 million next week, but first I need a graph that shows how our servers hold up under high load.” Let’s say you don’t even have a system that tracks load.
If you’re thinking about this long-term and strategically, you might think, “Oh, I need a system to track server load. I need to consider in detail how its storage works, how to ensure it is accurate, how to monitor it, and how to test it. Then I need to work with a user experience designer to conduct standard user research and create a UI design from there, so that the graphs will be well understood by our users.”
That won’t get done in a week. It’s also a waste of time. In fact, you don’t know whether this fire will ever happen again. Just because someone comes to you with an urgent need one time does not mean that you will need this in the long term. It may seem like you will, and you may guess that you will, but why would you guess when doing long-term strategic design? There is no need to guess about long-term work. When you’re doing long-term work, you have the luxury of doing research to find out the needs and requirements of real users. So do that, and build things based on the research and not on guesswork.
Instead, you should say something like: “Okay, tomorrow I’m going to do a very basic load test that I can run manually from a script on my machine. I’ll deploy a new version of the server that just writes information about the load to a log file, and then manually create a graph by parsing that log.” All of this is basically the minimum amount of work needed to solve the problem.
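To make that concrete, here is a minimal sketch of the log-parsing step in Python. The log format (`<timestamp> load=<number>`), the function names, and the text-bar output are all assumptions I made up for illustration, not something from a real server. It is deliberately a throwaway script, not a design for a load-tracking system.

```python
import re

# Hypothetical log format, e.g. "2024-05-01T12:00:00 load=0.75".
# A real server's log lines will almost certainly look different.
LOAD_LINE = re.compile(r"^(\S+)\s+load=([0-9]*\.?[0-9]+)$")


def parse_load_lines(lines):
    """Return (timestamp, load) pairs, skipping lines that don't match."""
    samples = []
    for line in lines:
        match = LOAD_LINE.match(line.strip())
        if match:
            samples.append((match.group(1), float(match.group(2))))
    return samples


def text_graph(samples, width=40):
    """Render one bar of '#' characters per sample, scaled to the peak load."""
    if not samples:
        return []
    # Fall back to 1.0 so an all-zero log doesn't divide by zero.
    peak = max(load for _, load in samples) or 1.0
    rows = []
    for timestamp, load in samples:
        bar = "#" * round(load / peak * width)
        rows.append(f"{timestamp} {bar} {load:.2f}")
    return rows
```

You could run it by hand with something like `print("\n".join(text_graph(parse_load_lines(open("server.log")))))`, where `server.log` is a placeholder file name, and paste the result (or a spreadsheet chart built from the same numbers) into whatever the executive needs.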
However, that solution also comes with a risk: you’ve now configured your server to log load-related information. Someone may come along later, assume that this is a supported long-term mechanism for tracking system load, and depend on it as if it had been properly designed and thought through. It isn’t. This highlights a very important point.
Never make long-term decisions or implement long-term solutions during a fire.
In fact, intentionally undo any work you did during the fire, such as deleting the log line, so that nobody thinks you made a long-term decision.
This rule applies not only to technical implementation details, but also to organizational changes, and to any decision at all. For example, let’s say there is an outage under way. During the outage is not the time to discuss how to prevent this from happening in the future or how you might need to change your normal daily processes.
Long-term decisions based on a fire can only be made safely in a “post-mortem,” where you rationally consider the situation after the fire has been put out. Then you can sit down and ask, “What strategic work would we want to do to prevent a fire like this from happening again?” or “What can we take from this to change the way we work?”
This rule is very important. Violating it leads to an accumulation of craziness that can destroy a group. If you base all of your company’s policies and work patterns solely on decisions made during emergencies, you will end up as a thoroughly crazy company, and one that is likely to fail.
Strategic Work
The opposite of “putting out fires” (this is a spectrum, not black and white) is doing strategic work. Essentially, you have a known goal, and you’re working towards it by applying all the basic principles of software design, thinking long-term, and working smartly with your group to create something sustainable.
Similarly, applying the methods of “putting out fires” to strategic work is a recipe for disaster. Treating every project as if it were an emergency and simply getting it done “quickly” because “it has to be done tomorrow” (even though it really doesn’t) ultimately leads to chaos. What actually happens is that you create fires! Poorly designed systems tend to fall over, have problems, and be difficult to maintain, and eventually the whole team ends up just putting out fires around the poorly designed mess.
Applying fire-fighting principles to strategic work also doesn’t actually get the strategic work done. When you see an engineering organization that can’t seem to get anything done over the long term, this is often the reason: it treats everything as though the whole world is on fire, so it can never really move forward.
Strategic work requires saying, over and over again, “Okay, I understand your requirements. Thank you for letting us know about your problem. We are building a solution for you and doing it the right way, which will take some time. It won’t take forever, but it will take some time to get it done.”
I think managers sometimes worry that if they tell their engineers, “Take the time you need,” the engineers will become lazy and won’t finish the work. This may be a legitimate concern in some companies, and certainly executives have an interest in keeping things moving so that the company can deliver the product. However, a balance must be struck between encouraging people to deliver on time and ensuring that long-term software development processes and procedures are followed. In general, when doing strategic work, it’s better to err on the side of a little too much design, a little too much review, and so on. I’m not saying go overboard and stop building things, or subject everyone to unnecessary reviews just because you “might” need something. I’m just saying that if you’re in doubt, this is the direction to err in.
Doing Both
As long as you apply the general principles above, it is possible for one team (or one person) to handle both strategic work and fires at the same time (or at least within the same week or month). The key is to do the bare minimum needed to deal with the fire and keep your business on track, and then refocus on strategic work once the fire is out.
After all, if you’re doing it right, strategic work should be the most important thing to your business. It’s the work that your research tells you will have the greatest impact in the long run. So put out the fire and get back to what actually matters in the long run.
-Max