storytelling with data

Helping rid the world of ineffective graphs, one 3D pie at a time!

we’re hiring: join the storytelling with data team

2026-01-11 03:00:00

At storytelling with data, our work is centered on helping people communicate more clearly with data—so they can build understanding, influence decisions, and drive positive change.

As our workshops and teaching continue to reach organizations around the world, we’re looking to add data storytellers to the SWD team.

This role is a great fit for someone who enjoys facilitating and teaching, cares deeply about craft and clarity, and is excited to help others build confidence communicating with data—using an established framework and real-world examples. In addition to workshop delivery, data storytellers contribute to shared teaching content (such as blog posts and videos) that extends our work beyond the classroom.

If this sounds like you—or someone you know—you can learn more and apply, or visit our careers page to find out what it’s like to work at SWD.

As always, thank you for inspiring the work we do and for helping us spread the word!

don’t let your axis scales distort the story

2026-01-07 22:00:00

One of the most common pitfalls in data visualization is manipulating axis scales in ways that distort the story. A frequent example is the use of logarithmic scales where they are not appropriate.

Let’s walk through a case where this choice can mislead, even if unintentionally.

At first glance, a strong relationship appears

Below is a scatterplot showing student participation in after-school sports programs across a school district. Each dot represents one school.

[Figure: scatterplot of after-school sports participation, one dot per school]

The horizontal axis shows the percentage of students participating in competitive (interscholastic) athletics. The vertical axis shows participation in casual (intramural) programs. At first glance, the data set seems to show a clear and linear upward trend: schools with more competitive participation tend to have more casual participation as well.

Let’s highlight a few individual schools for comparison.

[Figure: the same scatterplot with Coos Bay and Jeffersonville labeled in orange and Cortez in blue]

Imagine we’re trying to compare schools’ rates of casual participation, which is on the vertical axis. Coos Bay and Jeffersonville (marked and labeled in orange) appear to be quite different in their casual participation rates, while Jeffersonville and Cortez (which is in blue) seem more similar. This impression is shaped by the vertical axis being on a logarithmic scale.

How a log scale changes perception

On a standard axis, values increase by equal increments (for example, 10%, 20%, 30%). You can see that on the horizontal axis in the above graph, and in the vast majority of data visualizations we commonly see and use in business.

A logarithmic scale, on the other hand, increases geometrically (such as 1%, 10%, 100%). From a visual perception standpoint, this stretches out gaps between smaller values and compresses larger ones.

That difference matters. It alters how we perceive the distances between points.
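To make that concrete, here is a small plain-Python sketch (assuming a hypothetical axis range of 1% to 100%) that computes where values land along each kind of axis:

```python
import math

def linear_pos(v, lo=1, hi=100):
    """Fraction of the axis length at which value v is drawn on a linear axis."""
    return (v - lo) / (hi - lo)

def log_pos(v, lo=1, hi=100):
    """Fraction of the axis length at which value v is drawn on a log axis."""
    return (math.log10(v) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))

# On a linear axis, the gap from 1% to 2% is tiny;
# on a log axis, it is as wide as the gap from 10% to 20%.
print(round(linear_pos(2) - linear_pos(1), 4))   # ~0.0101 of the axis length
print(round(log_pos(2) - log_pos(1), 4))         # ~0.1505
print(round(log_pos(20) - log_pos(10), 4))       # ~0.1505
```

On the log axis, the 1-to-2 gap occupies roughly fifteen times the space it would on a linear axis, which is exactly the stretching of small values described above.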

Here is the exact same data, but with a standard, arithmetic scale on the vertical axis.

[Figure: the same scatterplot with an arithmetic scale on the vertical axis]

Now we can see that Cortez and Jeffersonville are not as close vertically as they initially appeared, and Elk City stands out as significantly higher than all the others. This version more accurately reflects the data's true distribution.

When is a log scale appropriate?

To be clear, logarithmic scales are not inherently wrong. In fact, they are essential in some situations.

A log scale is appropriate when:

  • The data spans several orders of magnitude (e.g., from tens to millions)

  • The values grow exponentially, as with compound interest or viral spread

  • The focus is on relative rather than absolute change

For example, a chart showing the growth of confirmed cases in the early stages of a pandemic might use a log scale to better compare countries with vastly different totals. In financial data, log scales can help normalize trends across portfolios with different starting values.
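Exponential growth is the canonical case: because each doubling adds a constant increment in log space, a quantity that doubles every period plots as a straight line on a log axis. A quick sketch with hypothetical case counts:

```python
import math

# Hypothetical case counts that double every period -- exponential growth
cases = [100 * 2 ** d for d in range(6)]   # 100, 200, 400, 800, 1600, 3200

# On a log axis, each doubling moves the point up by the same amount:
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(cases, cases[1:])]
print([round(s, 3) for s in log_steps])    # every step equals log10(2), about 0.301
```

Constant steps in log space are what make the log scale the honest choice here; the same property is what distorts data that does not grow this way.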

However, when working with percentages that fall within a consistent range—such as 0% to 100%—a logarithmic scale can be misleading unless the data has a clear exponential pattern.

A better way to show small differences

So why use a log scale in the original example? Most often, it’s an attempt to make small values near the origin easier to distinguish. But manipulating the axis scale to make those differences more visible can distort the bigger picture.

If small values are important, a better approach is to zoom in intentionally.

[Figure: the full scatterplot with its lower-left corner highlighted]

In this version, we highlight the lower-left corner of the plot and then show that region in its own focused view.

[Figure: zoomed-in view of the lower-left region]

Now, the differences between Coos Bay, Jeffersonville, and Cortez are clearly visible, without changing the underlying scale. More importantly, we maintain an accurate sense of proportion across the full dataset.
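A paired full-plus-zoom view like this can be built by drawing the same data twice with different axis limits. Here is a matplotlib sketch; the school names echo the example, but every value is invented:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Invented (% interscholastic, % intramural) participation rates per school
schools = {"Coos Bay": (2.0, 1.2), "Jeffersonville": (3.0, 2.6),
           "Cortez": (4.5, 4.8), "Elk City": (55.0, 82.0),
           "Midvale": (28.0, 33.0), "Harper": (41.0, 47.0)}

fig, (full, zoom) = plt.subplots(1, 2, figsize=(10, 4))
for ax, title, limit in [(full, "All schools", 100), (zoom, "Lower-left detail", 6)]:
    xs = [x for x, _ in schools.values()]
    ys = [y for _, y in schools.values()]
    ax.scatter(xs, ys, color="tab:blue")
    ax.set_title(title)
    ax.set_xlim(0, limit)   # both panels keep arithmetic scales;
    ax.set_ylim(0, limit)   # the zoom only narrows the limits
    ax.set_xlabel("% interscholastic participation")
    ax.set_ylabel("% intramural participation")
fig.savefig("participation_zoom.png")
```

Because neither panel changes the scale type, proportions stay honest; the zoom panel simply grants the small values more room.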

Let the scale match the story

If your data does not have a logarithmic structure, avoid using a log scale as a visual shortcut. It can introduce confusion or mislead the viewer, even if the chart is technically accurate.

When small values matter, consider breaking them out into a secondary chart or zoomed-in view. Use your axes to reflect the nature of the data, not to force a certain look.

In short: let the data guide the scale, not the other way around.

looking ahead with intention

2026-01-05 08:24:37

The new year is a natural time to look ahead. Even if nothing magical happened the moment the clock struck midnight on December 31st, January brings with it a sense of possibility. It’s a perfect point to pause, take stock, and decide what you want to be intentional about in the months ahead.

One habit I’ve had for years—personally and professionally—is setting OKRs (Objectives & Key Results) at the beginning of each quarter. It’s a practice I learned at Google and is one of the rituals I credit most for helping us make steady progress on ambitious, meaningful goals at SWD.

We use OKRs not as a rigid checklist, but as a way to clarify what really matters right now, what success looks like, and where to focus limited time and energy. If you’re curious to learn more, check out the Goals like Google episode of the SWD podcast.

The start of a new year is also an especially good time to think about skill development. I’m a fan of framing it in a focused, realistic way that fits into the rest of your life and work. This can even be one of your OKRs: invest in how I communicate with data.

If that’s an objective you share, here are a few ways to learn with us:

  • Participate in the monthly SWD challenge: it’s a low-risk way to experiment and practice. The current challenge focuses on communicating with partial data.

  • Attend a virtual workshop (the next takes place February 24th) or check out our on-demand courses to move at your own pace.

  • Want to dive deep? Join our 8-week online course to learn and practice the art of planning, creating, and delivering a data story; the next cohort begins January 26th.

  • Organize a training for your team: we offer mini, core, and immersive sessions tailored for your needs and customized with your examples.

However you choose to approach the year ahead, my hope is that you give yourself both permission to start fresh and a plan to make and measure progress.

Here’s to a fantastic year ahead!

at year’s end, a fermata

2025-12-19 03:38:00

The SWD team making chocolate in London after a public masterclass, April 2025.
(L–R: Simon, Amy, Jody, Kaitlin, Mike, Alex, Cole, Randy)

A fermata over a quarter-rest.

In music, a fermata signals an extension—a moment to hold a note, or a rest, longer than expected. It doesn’t mean the piece is over, but rather that we sit in anticipation, or we take a breath...letting something resonate a little longer before moving on.

This time of year feels like that: a natural but longer-than-normal pause in the rhythm of work and life. At SWD, we formally put work on hold (for two weeks, from close of business on Friday the 19th until Monday, January 5th), and use this time to reflect on what’s happened, what’s changed, and what lies ahead.

This year brought the publication of storytelling with data: before & after. Working on the book gave us a reason to look back through years of client work, and remember those folks who had invited us into their world, trusted us with their data, and asked thoughtful questions about how to make it better. As we selected which examples to include and revisited the visual makeovers themselves, we found ourselves remembering the unique challenges they were trying to solve, as well as the conversations and energy in the room when we revealed an "after" version. From Toronto to Trinidad, from the Bay Area to Boston to Belgium, and dozens of places in between, each session brought something different. And revisiting those moments reminded us just how lucky we’ve been to work with so many smart, curious, and generous people, all striving to make their data clearer and their stories stronger.

We also marked a milestone this year with the release of the 10th anniversary edition of storytelling with data. Revisiting the original book a decade after it first came out reminded us how much has changed (tools, terminology, ways of working), but how much more has held steady. The need for clear, thoughtful communication with data is just as relevant now as it was then, and in many ways, even more critical. What has evolved is how we teach and apply those ideas, shaped by years of continued learning and the experiences we've had working with so many of you.

And of course, this time of year inevitably inspires us to reflect on the changes we’ve seen over the past year. Change, after all, is both inevitable and unrelenting. It can be exciting, or difficult, or both. Maybe most often, it’s both. This year, like every year, brought transitions to SWD, both big and small. What we’ve come to believe is that even when change feels uncertain or uncomfortable, it creates the conditions for growth. Letting go of what was can make room for what could be.

The core of what we do is incredibly rewarding. We get to help people see their data more clearly, build confidence in their skills, and communicate in ways that make a difference. That’s work we’re proud of, and deeply grateful to do. At the same time, we recognize that doing the work we love often means putting other things on hold. Time with family. Space for ourselves. The personal projects or quiet moments that don’t always make the calendar. A pause like this gives us the chance to return to those parts of life that matter just as much.

We feel incredibly fortunate that our work is meaningful and fulfilling, but it’s not the only meaningful part of our lives. Taking time off creates space for everything else that matters, too. We hope you get that chance during this season, as we make a point of doing each year—and we’re also trying to build in more of these pauses throughout the year, not just when the calendar gives us permission.

So as we hold here at year’s end in this moment of reflection, we’re feeling grateful for the challenges, the connections, the people, and the progress we’ve made. Thanks for being part of this journey with us. Whether you’ve read our books, joined a workshop, participated in our community, or followed along with the videos, podcasts, or this very blog: thank you.

We don’t take your engagement or your trust for granted. When the fermata ends and the next movement begins, we look forward to making beautiful music along with you in 2026.

interesting data is probably wrong

2025-12-08 21:59:00

There’s a particular kind of excitement that comes from spotting something unusual in a dataset. A sharp spike, an unexpected dip, or a strange gap appears, and your brain starts reaching for explanations. Maybe it’s the result of a behavioral shift, or a deliberate strategy, or an operational quirk that reveals something important. Maybe it’s the thing that makes the whole story come together. But more often…it isn’t.

Tony Twyman made his name as a pioneer in the field of audience research for television and radio in the UK. For our discussion today, though, he’s best remembered for a single, enduring quotation, which is now known as Twyman’s Law:

“Any figure that looks interesting or different is usually wrong.”

It’s not meant to discourage curiosity or keep you from finding insights. It’s meant to remind you that the most exciting observations are often the ones most likely to be errors—either in the data, or in the way you’re interpreting it.

When I first heard of Twyman’s Law, it immediately reminded me of a personal, but very public, project I undertook back in my pre-SWD days. My experience of working on it stuck with me, as public mistakes often do, and I’m sharing it here simply to show how easy it is to get drawn in by the appeal of a dramatic discovery.

The strange time gap in a public figure’s Tweets

In 2017, I was exploring the Twitter behavior of a well-known public figure, a man in his late 60s. (You could probably guess who it was, but that’s not important to this story.) He was frequently in the news, and over the last couple of years had grown more and more active online. I wasn’t looking at the content of the tweets. I was more interested in the timing. What hours of the day did he tend to post? Were there obvious rhythms or gaps in his online behavior?

I pulled the dataset into a radial graph and almost immediately noticed a pattern that seemed too clean to ignore. 

There were no tweets—none at all—between noon and 1 PM Eastern Time. This gap showed up across multiple months of data, and because it was so consistent, it looked meaningful. I started trying to make sense of it. Was that when he typically stepped away from his phone? Did he have a daily briefing or meeting? Had I found some hidden quirk in his routine?
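A radial view of hour-of-day activity like the one described can be sketched as a polar bar chart. The counts below are invented, with the suspicious noon hole included:

```python
import math

import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Invented tweet counts per hour of day (0-23); note the hole at noon (index 12)
counts = [5, 3, 1, 0, 0, 1, 8, 15, 20, 18, 14, 12,
          0, 13, 16, 14, 12, 10, 11, 13, 12, 10, 8, 6]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
theta = [2 * math.pi * hour / 24 for hour in range(24)]
ax.bar(theta, counts, width=2 * math.pi / 24, color="tab:blue")
ax.set_theta_zero_location("N")  # midnight at the top
ax.set_theta_direction(-1)       # clockwise, like a clock face
ax.set_xticks(theta)
ax.set_xticklabels([str(h) for h in range(24)])
fig.savefig("tweet_hours_radial.png")
```

An hour with zero activity shows up as a missing wedge, which is exactly the kind of too-clean pattern that drew my eye.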

I almost shared it. But something about it felt a little too perfect.

So I went back to the raw data and took a closer look at the timestamps. That’s when I found the first issue. The tweets that were supposed to be time-stamped between 12 and 1 PM had been incorrectly encoded as happening between 12 and 1 AM. Somewhere along the way, either the AM/PM flag had been dropped, or the data had been pulled in a 12-hour format and misread. The result was a tidy, hour-long void in the middle of the day that never actually existed.
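This failure mode is easy to reproduce. Python's strptime, for example, treats a 12-hour format with no AM/PM directive as AM, so 12 o'clock silently becomes midnight (the timestamp here is made up):

```python
from datetime import datetime

raw = "2017-03-15 12:47"  # actually posted at 12:47 PM, but the AM/PM flag was lost

# %I is the 12-hour clock field; with no %p directive, parsing defaults to AM,
# and 12 o'clock maps to hour 0 (midnight):
parsed = datetime.strptime(raw, "%Y-%m-%d %I:%M")
print(parsed.hour)  # 0 -- the tweet lands between 12 and 1 AM
```

Every midday post shifts to the small hours this way, carving a tidy hour-long void out of the noon slot.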

After correcting the AM/PM problem, I re-engaged with the data and iterated a bit on the specific design of my radial chart, using what I now believed to be accurate timestamps:

[Figure: radial chart of tweet timing after the AM/PM correction]

However, I wasn’t quite finished being carelessly wrong about things yet. 

At the time, I was living in the Washington D.C. area. While the U.S. spans multiple time zones, nearly half the population runs on Eastern Time. My work, my family, and pretty much everything around me operated on that same clock, so I tended to treat Eastern as the default. With that mindset, I assumed the timestamps in the dataset were either recorded in the public figure’s local time zone (Eastern), or in what I considered the “default” time zone—also Eastern.

Unfortunately for me, this was not the case.

The timestamps in the dataset were in Coordinated Universal Time (UTC), not local time. Which meant that everything I was now confidently interpreting—early morning activity, late-night bursts, midday gaps—was five hours off. (Which explained why my analysis suggested a nearly 70-year-old man was posting on The Socials until 4 AM daily.)
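The second mistake is just as easy to reproduce with the standard library (again with a made-up timestamp):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A timestamp recorded in UTC but naively read as Eastern: "posting at 4 AM!"
stamp_utc = datetime(2017, 1, 15, 4, 0, tzinfo=timezone.utc)

# Converted properly, it is 11 PM the previous evening (Eastern is UTC-5 in winter):
local = stamp_utc.astimezone(ZoneInfo("America/New_York"))
print(local.strftime("%Y-%m-%d %H:%M"))  # 2017-01-14 23:00
```

A five-hour shift applied uniformly across every record moves the whole daily rhythm, which is why the late-night bursts evaporated once the conversion was done correctly.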

Once the tweets were correctly placed on the radial diagram, the pattern looked much more reasonable:

[Figure: radial chart with timestamps converted to local time]

What initially felt like a meaningful visual insight was actually the result of two separate issues: a formatting error and a faulty assumption. Neither was especially complicated, and both could have been avoided if I had slowed down and asked a few more questions early on. By the time I corrected everything and was finally working with an accurate version of the data, the story I had imagined was no longer there. What remained was the real story…unfortunately, one far less exciting than what I thought I had discovered initially.

Lessons (re)learned

If this had been a personal experiment that I never shared, it might have been nothing more than an annoying detour. But if this had been a client project, or a published post, or part of a public dashboard, I would have been in trouble. Not because the chart was unattractive or unclear, but because it would have told a false story.

There are a few basic principles I try to follow in situations like this, though clearly I don’t always follow them perfectly.

First, don’t rush to interpret the surprising thing.
When something jumps out of the data and feels unusually clean, it’s more likely to be a sign of a problem than a breakthrough. The more exciting it is, the more carefully you need to verify it. That doesn’t mean it can’t be true. But it does mean you should treat it as suspicious until you’ve ruled out the obvious errors. If the choice is between “major behavioral insight no one has ever noticed before” and “simple encoding bug,” assume the bug until proven otherwise.

Second, double-check the details you think you understand.
In this case, it wasn’t that the data was messy. It was that I thought I knew what the timestamps meant. I thought I understood the format. I thought I knew the time zone. And because of those unchecked assumptions, I built a clean-looking, wrong interpretation on top of a misaligned foundation.

Third, acknowledge the role your ego plays in all of this.
I liked what I had found. I was proud of the chart. I wanted it to be real. That made me slower to question it, and quicker to explain it. Fortunately the stakes weren’t very high for me, in that instance…but if they had been, things could have played out much worse.

So slow down. Stay calm. Take the time to validate your assumptions, and be honest with yourself about the limits of what your data can actually tell you. Because once you publish or share something, it doesn’t belong to you anymore—it becomes part of how people understand the world. That’s a real responsibility, and your own credibility, once damaged through carelessness, is difficult to rebuild.


Twyman’s Law isn’t about distrusting your data or being a kneejerk naysayer. It’s about remembering that excitement and accuracy don’t always arrive together. When something looks especially compelling, it’s worth taking a step back to make sure it’s real. 

In the end, the most valuable stories we find in data aren’t the ones that surprise us the most. They’re the ones we’ve taken the time to get right.

mind the gap: how to represent partial data

2025-11-13 22:30:09

When we’re reporting the latest information, it can be challenging to know how to handle data that is still in progress. For example, if we're reporting annual performance trends with only three quarters completed in the latest year, the numbers can appear misleadingly low. Excluding the latest data points, on the other hand, could hide crucial details from stakeholders. Audiences often want timely updates, but partial data can cause confusion if not clearly communicated.

This was a challenge for a client I was recently working with. Take a look at the slide below, which displays new subscription revenue by year. 

Upon initial inspection, the 2025 drop-off appears concerning, as it's markedly below that of 2024. However, the 2018 to 2024 data covers the entire year, while the 2025 figure only includes results through September.

In this particular case, the slide was presented live, and the creator explained that the 2025 data ran only through the third quarter. The slides were shared after the meeting, though, which means anyone who didn’t attend the presentation would lack the crucial context needed to avoid unnecessary alarm. Let’s explore four alternative solutions for sharing partial data clearly and effectively.

Option 1: Differentiate between complete & incomplete data

One of the easiest ways to ensure the 2025 data isn’t mistaken for full-year information is to distinguish visually between full-year and partial-year data. For this, we could add a label on the horizontal axis and give the 2025 bar a pattern fill. (Similarly, if we had a line chart, we could use a dashed line to indicate the difference.) These visual cues make it clear what’s finalized and what’s ongoing, reducing the chance of misinterpretation.

[Figure: bar chart with the partial 2025 bar shown in a pattern fill]
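As a sketch of the pattern-fill idea, here is how the partial-year bar might be flagged in matplotlib (all revenue figures are invented):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

years = ["2021", "2022", "2023", "2024", "2025*"]
revenue = [31, 38, 46, 55, 39]  # hypothetical $M; 2025 covers Jan-Sep only

fig, ax = plt.subplots()
bars = ax.bar(years, revenue, color="tab:blue")
partial = bars[-1]
partial.set_facecolor("white")
partial.set_edgecolor("tab:blue")
partial.set_hatch("//")  # pattern fill marks the incomplete year
ax.set_ylabel("New subscription revenue ($M)")
ax.set_title("New subscription revenue by year (*2025 through Q3)")
fig.savefig("partial_year_bars.png")
```

The axis label ("2025*") and the hatched fill work together, so the caveat survives even if the chart is copied out of context.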

Additionally, we could go further to provide a sense of where the full year might end up, for comparison.

Option 2: Provide a full year estimate for reference

Estimating where the final numbers might land for 2025 is another possible approach. Using a stacked bar that fills in the actual data through September in blue and a pattern-filled segment projecting the fourth quarter could convey this message. This visual sets realistic expectations and helps viewers compare years more accurately, particularly when the visual is accompanied by clear and explicit data labels.

[Figure: stacked bar with actuals through September plus a pattern-filled Q4 projection]
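One simple way to produce such a projection is a straight-line run rate: scale the year-to-date actuals up to a full year. This sketch uses made-up numbers and deliberately ignores seasonality, which a real projection should account for:

```python
ytd_actual = 39.0       # hypothetical revenue through Q3, in $M
quarters_elapsed = 3

# Naive straight-line run rate: assume Q4 matches the average quarter so far
full_year_estimate = ytd_actual / quarters_elapsed * 4
q4_projection = full_year_estimate - ytd_actual

print(full_year_estimate)  # 52.0 -> height of the full stacked bar
print(q4_projection)       # 13.0 -> height of the pattern-filled Q4 segment
```

The projected segment should always be labeled as an estimate; the visual distinction (pattern fill) and the label together keep the audience from reading it as actuals.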

Since the data for 2025 only covers a subset of the year, we could consider providing a sense of how the current year compares to the prior years for a similar time period.

Option 3: Explore using comparable time periods

Our final two options assume that we have access to the data at a more granular level than just the year summary. To compare apples to apples, try showing year-to-date (YTD) results for each time period. For instance, use a solid blue color to represent the data from January to September in each year, and use a lighter fill to indicate the remaining fourth-quarter data. Take care to differentiate 2025 data with an open fill to signal that the Q4 2025 data is not yet available. This approach helps the audience make fair comparisons across equivalent periods.

[Figure: YTD (January–September) comparison across years, with Q4 shown in a lighter fill]
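Given monthly data, the year-to-date totals behind such a comparison are a one-liner. The figures below are invented for illustration:

```python
# Hypothetical monthly revenue ($M) per year; 2025 has only nine months of data
monthly = {
    2023: [3.2, 3.5, 3.8, 3.6, 4.0, 4.1, 3.9, 4.2, 4.4, 4.6, 4.8, 5.0],
    2024: [4.1, 4.3, 4.5, 4.4, 4.7, 4.9, 4.6, 5.0, 5.2, 5.4, 5.6, 5.9],
    2025: [4.8, 5.0, 5.1, 5.2, 5.4, 5.3, 5.5, 5.7, 5.8],
}

# Sum January-September only, so every year covers the same window
ytd = {year: round(sum(values[:9]), 1) for year, values in monthly.items()}
print(ytd)  # each total now reflects an equivalent nine-month period
```

Truncating every year to the months available in the current year is what makes the comparison fair; plotting full years against a partial one is what created the misleading drop in the first place.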

Option 4: Show more granular details 

A final option is to show even more detail. For example, we might plot monthly subscription revenue for each year, excluding the outstanding months for 2025, to highlight the incompleteness of the current year. An added footnote also clarifies when the numbers were extracted, leaving no room for ambiguity.

[Figure: monthly subscription revenue by year, through September for 2025]

This more granular view looks quite busy. It is worth considering how much data is required. Depending on the message we want to convey, we might consider keeping only the most recent years. Displaying just the relevant information reduces the cognitive effort needed to understand the trends.

Take precautions when sharing partial information

When sharing data for an incomplete reporting period, be deliberate about context to avoid confusion. Take care to differentiate between complete and incomplete information with helpful labels and annotations, and adjust the marks to ensure they stand out clearly. Consider what your audience needs to know: do they need the latest snapshot, a full-year forecast, or a fair period-over-period comparison? Choose a strategy that provides the most applicable comparison and always makes the completeness of the data unmistakably clear.

To practice implementing these strategies, explore this related community exercise.