(2024-07-02) Chin The Amazon Weekly Business Review

Cedric Chin: The Amazon Weekly Business Review. You may need to read the previous essay — titled Becoming Data Driven, From First Principles — to get the most out of this one. That essay lays out the worldview that backs the WBR. Do that first. 2024-01-08-ChinBecomingDataDrivenFromFirstPrinciples

Note: I learnt my practice of the Amazon-style WBR from early Amazon exec (and Jeff Bezos’s second shadow) Colin Bryar; over the course of 2022 I helped him explicate the practice of the WBR for his clients. Everything good here is attributable to him; all mistakes are mine alone.

Every Wednesday morning, Amazon’s executive team gets together and goes through 400-500 metrics that represent the current state of Amazon’s various businesses. The meeting lasts 60 minutes.

Earlier this year I wrote that one way to look at the Amazon-style WBR is to call it a ‘highly prescriptive mechanism to present and absorb a large amount of business data in a short amount of time.’ But we now know, of course, that this isn’t entirely accurate: a better way of thinking about the WBR is that it is a process control tool.

It is a tool designed to uncover and disseminate the causal structure (model) of a business, so that you may execute with knowledge.

The Goals of the WBR

The Amazon-style WBR is designed to answer three questions:
What did our customers experience last week?
How did our business do last week?
Are we on track to hit targets?

Early on in Amazon’s practice of the WBR, Jeff Bezos observed that customers who visited Amazon.com were very likely to do comparison shopping... If Amazon’s customers knew the prices of items in other stores, so too should Amazon’s executives... so Bezos ordered that the WBR include the prices of the top 10 selling items in each major e-commerce category, along with prices of the same product if bought from Target or Walmart or some other competitor

Most businesses ask “how did the business do last week?” instead of “what did the customer experience last week?” It’s important to ask both questions, and to ask the ‘customer’ question first — it’s not for nothing that Amazon aims to be ‘Earth’s most customer-centric company.’

The third question is about targets.

Amazon sets targets for its most important metrics in an annual planning cycle it calls ‘OP1/OP2’

Amazon’s planning for the calendar year begins in the summer. It’s a painstaking process that requires four to eight weeks of intensive work

The S-Team (read: Senior Team) begins by creating a set of high-level expectations or objectives for the entire company

Once these high-level expectations are established, each group begins work on its own more granular operating plan—known as OP1

The OP1 process runs through the fall and is completed before the fourth-quarter holiday rush begins.

In January, after the holiday season ends, OP1 is adjusted as necessary to reflect the fourth-quarter results, as well as to update the trajectory of the business. This shorter process is called OP2, and it generates the plan of record for the calendar year.

OP2 makes it crystal clear what each group has committed to do, how they intend to achieve those goals, and what resources they need to get the work done.

During OP1, as the S-Team reads and reviews the various operating plans, they select the initiatives and goals from each team that they consider to be the most important to achieve. These selected goals are called, unsurprisingly, S-Team goals. In other words, my (Bill’s) team working on Amazon Music might have had 23 goals and initiatives in our 2012 operating plan. After reviewing our plan with us, the S-Team might have chosen six of the 23 to become S-Team goals. The music team would still have worked to achieve all 23 goals, but it would be sure to make resource allocation decisions throughout the year to prioritize the six S-Team goals ahead of the remaining 17.

Three notably Amazonian features of S-Team goals are their unusually large number, their level of detail, and their aggressiveness. S-Team goals once numbered in the dozens, but these have expanded to many hundreds every year

S-Team goals are mainly input-focused metrics that measure the specific activities teams need to perform during the year that, if achieved, will yield the desired business results.

S-Team goals must be Specific, Measurable, Attainable, Relevant, and Timely (SMART).

S-Team goals are aggressive enough that Amazon only expects about three-quarters of them to be fully achieved during the year.

S-Team goals for the entire company are aggregated and their metrics are tracked with centralised tools by the finance team.

Reviews are conducted in multi-hour S-Team meetings scheduled on a rolling basis over the quarter rather than all at once.

the Finance department is empowered to independently audit metrics, to ensure distortions aren’t happening at a lower level.

The finance team tracks the S-Team goals throughout the year with a status of green, yellow, and red. Green means you are on track, yellow means there is some risk of missing the goal, and red means you are not likely to hit the goal

Controllable Input Metrics and Output Metrics

directly actionable (hence ‘controllable’), and they impact some output metric you care about (hence the name ‘input’).

The output metric for one process may be a controllable input metric for another process in your business. Such causal links should emerge slowly

You are not allowed to discuss output metrics during the WBR.

A small number of metrics in the WBR are neither controllable input metrics nor output metrics; they are there purely for reporting or context-setting purposes. For example, ‘percentage of mobile traffic vs desktop traffic’ is still a useful metric to look at each week

The metrics in a WBR are not static — you are expected to add or discard controllable input metrics as they stop working

the overarching goal here is that your metrics must be laid out in a logical, causal way: controllable input metrics at the start of the deck, corresponding output metrics immediately after; and then rinse and repeat your input and output metrics for each major initiative or department, ending with financial metrics at the end

As an example of what this might look like: for publication or e-commerce companies, typically the deck opens with web + mobile traffic, flows into customer engagement (or various other metrics, depending on the specifics of your business), and then ends with financials on the other end.
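The ordering rule above can be sketched as a small data structure. This is a minimal illustration, not Amazon's tooling: the names `Kind`, `Metric`, `Section` and `lay_out_deck` are hypothetical, the sample metrics are made up, and placing context-only metrics at the start of their section is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    CONTEXT = "context"  # reporting-only metrics (assumed to lead their section)
    INPUT = "controllable input"
    OUTPUT = "output"
    FINANCIAL = "financial"


@dataclass
class Metric:
    name: str
    kind: Kind


@dataclass
class Section:
    title: str  # one major initiative or department
    metrics: list[Metric] = field(default_factory=list)


# Position of each kind inside a section: inputs first, outputs right after.
_ORDER = {Kind.CONTEXT: 0, Kind.INPUT: 1, Kind.OUTPUT: 2, Kind.FINANCIAL: 3}


def lay_out_deck(sections: list[Section], financials: list[Metric]) -> list[str]:
    """Return numbered graph titles in causal order, with financials last."""
    ordered: list[Metric] = []
    for section in sections:
        # sorted() is stable, so metrics of the same kind keep their given order.
        ordered.extend(sorted(section.metrics, key=lambda m: _ORDER[m.kind]))
    ordered.extend(financials)
    return [f"{i}. {m.name} ({m.kind.value})" for i, m in enumerate(ordered, start=1)]


# Made-up example for a publication business.
deck = lay_out_deck(
    [
        Section("Traffic", [Metric("Weekly sessions", Kind.OUTPUT),
                            Metric("Articles published", Kind.INPUT)]),
        Section("Engagement", [Metric("Newsletter sends", Kind.INPUT),
                               Metric("Returning readers", Kind.OUTPUT)]),
    ],
    financials=[Metric("Revenue", Kind.FINANCIAL)],
)
print("\n".join(deck))
```

Numbering the graphs in deck order also matches the practice of calling out graphs by number during the meeting.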

The Shape of the WBR

What is sitting in on a WBR like? The flow of events goes roughly as follows:

After midnight on Sunday, the metrics for the prior week are generated. On Monday morning each metric owner comes into work and reviews their numbers for the past week. To be clear, metric owners are expected to review their numbers daily. Is the same system generating the same metrics daily?

On Tuesday morning, departmental WBRs are held throughout the company. Not every department in Amazon runs a WBR; sometimes WBRs are conducted at the level of the business unit (which may cut across a few departments)....metric owners are expected to have investigated and then present causes for exceptional variation during these meetings.

The company-level WBR is held on Wednesday morning. Amazon regards this WBR as the single most expensive meeting in the company; every executive of note attends. This in turn means that the WBR is conducted like a tightly scripted, tightly choreographed affair.

The meeting opens with a facilitator (typically from Finance), who goes through follow-ups from the prior week’s WBR. Then, everyone turns to the WBR deck and looks at their first graph.

The handoff from facilitator to metric owner, and then from metric owner to metric owner, should be seamless

If the metric shows only routine variation, the owner will say “nothing to see here”; everyone stares at the graph for one second, and then the entire meeting moves on to the next graph. You are not allowed to skip a metric, even if it displays just routine variation

You really only discuss exceptional variation: if there is an exception for a given metric (or if the metric has a target and the graph shows you have missed the target for a few weeks), then the owner will have to explain the issue. Alternatively, they may say “we don’t know and are still investigating” — and the facilitator makes a note to follow up. At no point are you allowed to pull an explanation out from your ass. (Really? See some culture notes below, cf psychological safety)
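The notes above do not say how routine and exceptional variation are told apart. One common process-control method is the XmR (process behaviour) chart, and the sketch below assumes that method: limits sit at the mean plus or minus 2.66 times the average moving range. The function names and sample numbers are made up, and only the simplest detection rule (a point outside the limits) is shown; run-based rules are omitted.

```python
from statistics import mean


def xmr_limits(baseline: list[float]) -> tuple[float, float, float]:
    """Return (lower limit, centre line, upper limit) for an XmR chart."""
    if len(baseline) < 2:
        raise ValueError("need at least two points to compute a moving range")
    centre = mean(baseline)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)  # 2.66 is the standard XmR scaling constant
    return centre - spread, centre, centre + spread


def is_exceptional(value: float, baseline: list[float]) -> bool:
    """True when a new weekly value falls outside the natural process limits."""
    lower, _, upper = xmr_limits(baseline)
    return value < lower or value > upper


# Made-up weekly values for one metric.
baseline = [100, 104, 98, 101, 103, 99, 102, 97]
print(xmr_limits(baseline))
print(is_exceptional(103, baseline))  # inside the limits: routine variation
print(is_exceptional(118, baseline))  # outside the limits: the owner must explain it
```

A metric owner could run a check like this on Monday morning to decide which graphs need an investigation before the departmental WBR.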

Handoffs between metric owners should take under two seconds. The WBR deck is static... waiting precious seconds for dashboards to load or for a presenter to switch tabs in a browser is simply unacceptable.

The only exception to this rule is the ‘Voice of the Customer’ program, which is typically presented as a slide

The customer service department routinely collects and summarises customer feedback and presents it during the WBR, though not necessarily every week. The chosen feedback does not always reflect the most commonly received complaint, and the CS department has wide latitude on what to present. (customer satisfaction)

One Voice of the Customer story was about an incident when our software barraged a few credit cards with repeated $1.00 pre-authorizations that normally happen only once per order.

while they were pending, they counted against credit limits

one customer wrote to say that just after buying an item on Amazon, she went to buy medicine for her child, and her card was declined. She asked that we help resolve the issue so she could purchase the medicine her child needed

An investigation into her complaint revealed an edge-case bug (another way of saying a rare occurrence). At Amazon, such cases were regularly attended to because they would happen again and because the investigation often revealed adjacent problems that needed to be solved. What at first looked to be just an edge case turned out to be more significant. The bug had caused problems in other areas that we did not initially notice. We quickly fixed the problem for her and for all other impacted customers.

Discussions on business strategy or on problem solving are not allowed during the WBR.

If a protracted discussion breaks out, the facilitator of the WBR is allowed to cut in to suggest taking the discussion offline

Weekly strategic meetings do exist at Amazon — these are typically referred to as ‘S-team meetings’ and are held separately from the WBR.

even if the CEO or CFO are not present, the WBR runs as expected.

At this point, I’ve spent a lot of time talking about the WBR process itself, not on the tools or the contents of the WBR deck. This is intentional. The most important thing I’ve learnt from Colin is that good data practices stem from culture.

With that context in mind, let’s talk about the contents of the WBR deck.

The WBR Deck

Amazon WBR decks have three visualisation types. This is not a hard constraint, but also not an accident: Amazon believes that you’re only able to build fingertip-feel for your data if you receive the data in exactly the same colours, in exactly the same graph format, using the exact same fonts each week. You also get the sense that each visualisation type was chosen to communicate the largest amount of information in the smallest amount of time.

The 6-12 Graph

By far the most common visualisation is something called a ‘6-12 Graph’.

two graphs on the same X-axis. The graph on the left plots the trailing six weeks of data. The graph on the right plots the entire trailing year month by month. (Not XmR!)

Numbers are presented on the line graph itself, following a data visualisation principle that is best explained by early Amazonian Eugene Wei.

Each graph is numbered (top left) — which makes it easier to call out specific graphs during the WBR.

Colin liked to say that at Amazon, absolute values were often not as important as rates.
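The data behind a 6-12 Graph can be prepared with a few lines of code. The sketch below is an assumption-laden illustration, not Amazon's pipeline: it assumes a count-like metric that is summed per week and per calendar month, the function names are hypothetical, and the daily figures are made up. It also computes a week-over-week rate, in the spirit of preferring rates to absolute values.

```python
from datetime import date, timedelta
from typing import Optional


def trailing_weeks(daily: dict[date, float], week_end: date, n: int = 6) -> list[tuple[date, float]]:
    """Sum daily values into the n seven-day weeks ending on week_end, oldest first."""
    weeks = []
    for k in range(n - 1, -1, -1):
        end = week_end - timedelta(weeks=k)
        total = sum(daily.get(end - timedelta(days=d), 0.0) for d in range(7))
        weeks.append((end, total))
    return weeks


def trailing_months(daily: dict[date, float], last_month: date, n: int = 12) -> list[tuple[str, float]]:
    """Sum daily values into n calendar months ending with last_month's month, oldest first."""
    months = []
    year, month = last_month.year, last_month.month
    for _ in range(n):
        total = sum(v for d, v in daily.items() if (d.year, d.month) == (year, month))
        months.append((f"{year}-{month:02d}", total))
        year, month = (year - 1, 12) if month == 1 else (year, month - 1)
    return months[::-1]


def pct_change(current: float, previous: float) -> Optional[float]:
    """Rate of change in percent, or None when the previous value is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


# Made-up daily counts: 10 per day through 2024, 11 per day in the final week.
start = date(2024, 1, 1)
daily = {start + timedelta(days=i): 10.0 for i in range(366)}
for i in range(7):
    daily[date(2024, 12, 29) - timedelta(days=i)] = 11.0

weeks = trailing_weeks(daily, week_end=date(2024, 12, 29))     # left-hand graph
months = trailing_months(daily, last_month=date(2024, 12, 1))  # right-hand graph
wow = pct_change(weeks[-1][1], weeks[-2][1])                   # week-over-week rate
print(weeks, months, wow, sep="\n")
```

Generating both series from the same daily source keeps the two halves of the graph consistent with each other.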

The other two visualization types are tables:

The 6-12 Table

Occasionally you may need to visualise more than two metrics on the same 6-12 Graph. Let’s suppose you run a cloud services business, and you want to track the latency of various cloud services for your top 10 customers

Plain Tables

The final visualisation type in an Amazon WBR is just a simple, plain table

Recall the earlier story about tracking how well Amazon’s prices stacked up against its competitors
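A plain table for that price comparison might be generated as follows. This is only a sketch: the function name `price_table`, the item names, the competitor labels and every price are made up for illustration, and the "Cheapest" column is an added assumption about what a reader would want to see at a glance.

```python
def price_table(rows: list[dict], competitors: list[str]) -> str:
    """Render one line per item with our price, competitor prices and the cheapest seller."""
    header = ["Item", "Ours"] + competitors + ["Cheapest"]
    lines = [" | ".join(header)]
    for row in rows:
        prices = {"Ours": row["ours"], **{c: row[c] for c in competitors}}
        # min() returns the first key with the lowest price, so ties go to "Ours".
        cheapest = min(prices, key=prices.get)
        cells = [row["item"], f"{row['ours']:.2f}"]
        cells += [f"{row[c]:.2f}" for c in competitors]
        cells.append(cheapest)
        lines.append(" | ".join(cells))
    return "\n".join(lines)


# Hypothetical top-selling items and prices.
rows = [
    {"item": "Widget A", "ours": 19.99, "Competitor X": 21.50, "Competitor Y": 19.49},
    {"item": "Widget B", "ours": 5.00, "Competitor X": 5.00, "Competitor Y": 6.25},
]
table = price_table(rows, competitors=["Competitor X", "Competitor Y"])
print(table)
```

Keeping the column order fixed from week to week follows the same reasoning as fixing colours and fonts: the reader learns where to look.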

If you need to come up with a new visualisation type, feel free to do so — but know that the fewer visualisation types you use, the easier it will be for you to build fingertip-feel for your data.

Metrics Composition of a WBR Deck

WBR decks are quite dense: when printed, you typically get four graphs to a landscape-orientation page. [Figure: example page from the author’s own WBR deck]

you will eventually have dozens if not hundreds of metrics in your deck.

Every output metric you care about will likely have multiple controllable input metrics.

We’ve discussed how process control forces teams to execute better, because it pushes operators to search for controllable inputs. This leads to an interesting use case for the WBR: if you want a particular business function to step up and shape up, you throw the WBR at it.

As a result, various parts of Amazon’s businesses are subjected to the WBR as-and-when they become relevant to top level strategy. In other words, the decision to include (or exclude) metrics in the top-level WBR is itself a mechanism to improve company execution.

In Amazon’s early days, the Amazon.com website was built around a central ‘catalog’. Changing the price of a single product demanded that you regenerate the entire catalog — a process that took a full day

Moving away from the catalog was clearly a top priority, and yet week after week dragged on with no change.

The S-team realised that recruiting was the root problem: Amazon was simply not able to hire new software engineers at a rate necessary for the catalog upgrade. At the time, every software engineer in the company was too busy fighting fires to keep the site up. (Bear in mind that this was in the late 90s.) Since recruiting was the root cause of this problem, Bezos decided to subject recruiting to process control: for months afterwards, recruiting metrics were presented at the WBR, and the hiring manager responsible for this team had to attend and present.

The Core Loop

You add an output metric to your WBR. This output metric may or may not have a target attached.

The team responsible for that output metric is incentivised to seek out changes that actually drive the output metric.

The team goes through an iterative, trial and error process to figure out the right process improvements to influence that output metric

You bring this area of business under control and move on to the next output metric you care about, and everyone who attends the WBR builds better intuition about what works

How to Come Up With Controllable Input Metrics?

If you’ve been running a business successfully, it’s likely that you already have a good sense for what affects your business. You should go measure those things, and then drive them to see if the relationships (to the corresponding output metrics) are real.

there are a few other situations that give you controllable input metrics:

  • If you observe positive exceptional variation in your metrics, you should investigate to see what caused the spike, and then start an initiative to try and replicate it. If that activity is repeatable, you should come up with a set of controllable input metrics you can track to drive it
  • Sometimes, when asked “How is this measured?” a metrics owner will say that a particular metric consists of some other set of metrics. To which an Amazon exec might say: “well, why don’t we add those to the WBR then?”
  • If you observe negative exceptional variation in your metrics, you’ll want to investigate to see if this is a process failure you can avoid in the future. And if there’s something measurable there, you’ll want to add that to your WBR.
  • Colin told me that whenever a blowup occurred, Bezos would often ask: “What is something that we could have been tracking that would have allowed us to avoid this blowup in the future?” That metric, more often than not, would be added to the top-level WBR.

The Most Common Questions During a WBR

  • Is this metric worth discussing? If it’s an output metric, we’re just wasting time discussing it. We need to figure out what the input metrics are first.
  • If this is a controllable input metric, is this the right input metric?
  • How is it measured?

I want to draw your attention to the kind of thinking necessary to ask these three questions on the regular

‘Is this metric worth discussing?’ stems directly from the process control worldview.

‘Is this the right input metric?’ assumes that a measurement is always a compression (model) of reality... In an ideal world, you would use telepathy to read your customers’ minds. But that clearly isn’t possible! Instead, you’ll have to use other (imperfect) measures like NPS surveys or % repeat orders to get at ‘true’ customer satisfaction. This means that for any given phenomenon you wish to measure, there are typically multiple possible metrics to select. This selection process should be discussed within the WBR itself

‘How is this measured?’ gets at something fundamental about the nature of measurement. You can only tell if you are getting at some facet of reality if you know how you are instrumenting that phenomenon

The question is so important that an entire essay in the Becoming Data Driven series is devoted to defining your measurements

there is no way to claim that one method is more accurate than another method. The number you get will be the result of the method you use.

The Importance of Epistemology

there seems to be a remarkable amount of good epistemology at the very top of Amazon.

Epistemology is a fancy word that means “how do you know that what you believe is true?”

There are limits to the knowledge you will gain from your metrics. For instance, one question that nearly everyone asks is “how do I know if such-and-such a marketing campaign has worked — given all the other campaigns that are running concurrently?”

The WBR will not settle that question definitively. What’s most likely to happen in the context of the WBR is that you will say something like this: “When we ran X marketing campaign, we wrote a doc... Mary is confident that this growth is the result of the X campaign, and is willing to go ahead without running more tests. Mary tends to get things right; we’re willing to go with her judgment.”

Notice that this can only happen if you’re looking at metrics often enough to know what a ‘normal’ growth rate looks like. And notice, also, that ‘Mary’ has had enough good calls under her belt to justify trust in her judgment.

anything quantitative needed to stem from a ‘deep qualitative understanding of the customer’... “You don’t want to draw naive conclusions from your data,” he would say, “You’ll want to use your understanding of the customer to test hypotheses using data.”... To Colin, drawing ‘naive conclusions’ included things like taking correlations in your data at face value, or naively using customer surveys (in isolation) to conclude things about your customer... “scientific method 101.” Just because a relationship exists in some existing data doesn’t mean the relationship is real. It could be a spurious correlation. An accident of the sample. You need to run a new experiment, collecting new data, to verify that the relationship is real.

Invisible Benefits of the WBR

The WBR can be used as a mechanism to get certain parts of your org to shape up

The WBR is a mechanism to distribute data culture throughout your organisation

The WBR is (yet another) mechanism to enforce the ‘Dive Deep’ leadership principle

Because the WBR examines the business end-to-end, it gives your executive team a shared mental model of the entire business. This helps with breaking down silos, but it also gives your executives better context

it wasn’t clear to me why the leader of Amazon’s e-commerce business, say, should need to know the state of Amazon’s cloud business (or vice versa). But there are intangible benefits to having leaders across a company keep track of what’s happening elsewhere in the business

The WBR is a mechanism to defend against Goodhart’s Law.

if you incentivise controllable input metrics, you can catch when an input metric stops affecting the output metric you care about, or if it starts producing negative effects elsewhere in the company

The WBR enables innovation: Since that data captures the customer experience, you can't help but feel the pain points of your customers and subsequently can't help but think about innovative ways to improve.

The WBR allows you to communicate a complex causal model of the business to your leaders.

Commoncog’s WBR currently has 47 metrics (including Xmrit’s metrics — which we launched only a few months ago); during our meetings we finish our metrics review in less than 15 minutes. We don’t use all of Amazon’s WBR practices. If we did, we would probably finish in ~5 minutes.

Amazon does not have 400-500 metrics in their WBR because they want to show off. They have 400-500 metrics because this represents the number of levers available to them to run their business.

Most companies do not have the capability to present more than a handful of metrics to their executives. This is why you will hear bromides like “don’t look at too many metrics” or “follow the North Star metric framework: find the ONE metric that matters.” Having read this essay, you should ask yourself: is it really true that the causal model of most businesses is representable using just one metric? Or even just a handful of metrics?

The format of the WBR matters, I think, because it enables you to examine more metrics in the same amount of time. This quantity is not a minor thing. It is central to what makes the WBR powerful: because you can examine more metrics, you have the capability to manage a richer causal model of your business.

perhaps it is ok to present a simplified model of your business to your board, or to your rank-and-file. Your leaders, on the other hand, should know the full causal model of the business. They’re flying the damn plane, after all. You should give them the richest instrument panel possible.

There is a lot that can go wrong with the WBR

What is Hard About All This?

One large software group, run by a senior leader who is no longer at Amazon, had memorably rough WBR meetings. Learning and taking ownership of problems and their solutions were two important goals of the WBR process, and on that front these meetings were a huge missed opportunity. They wasted a lot of everyone’s time.... The meetings were also just really unpleasant. There was a lack of ground rules and decorum, with quite a bit of interruption and sniping. Any anomaly was blood in the water, with accusatory questions fired at the presenter.

Bryar refuses to teach any executive team that isn’t fully committed to the practice of the WBR. To get the WBR to work, you need to have a prevailing culture of truth-seeking, which in most organisations means you must make an effort to build such a culture.

Collectively, we should have recognized that many of the areas being measured were not yet operationally under control and predictable. Many of the teams had skipped the first three DMAIC steps—define, measure, analyze—in an attempt to operate at the Improve stage

Last, we should have recognized that implementing a WBR for this new group for the first time was bound to be messy, requiring trial and error. In the end, we should have ensured that attendees felt free to talk about their mistakes and were actively encouraged to do so, allowing others to learn from them. The key to these meetings is to create a balance between extremely high standards and an atmosphere where people feel comfortable talking about mistakes.

you need total buy-in from company leadership to get a WBR process to take. Anything less will not succeed. And you need to set the tone right from the beginning, so it doesn’t turn into an adversarial thing.

If you’d like to view a full WBR deck, click here: Sample B2C WBR Deck.

