CARSTEN HOHNKE, PHD

The Metric of the Long Now

“The Metric of the Long Now” is what I thought of while reading all the latest excitement about semantic layers in analytics. The semantic layer (aka metric layer) is meant to provide unified definitions (i.e., business logic) on top of data warehouses.

For example, the data warehouse may provide sales, pricing, and marketing expense information. Now imagine that Finance, Marketing and Ops are each reporting on “revenue”. Does “revenue” include temporary, marketing-related discounts? Or are the discounts included as a marketing expense? How do we ensure each business unit is using the same definition of “revenue”?
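To make the ambiguity concrete, here is a minimal sketch of the two competing readings of “revenue” and of what a single shared definition might look like. Everything in it is hypothetical and for illustration only: the `Order` fields, the sample numbers, the `METRICS` registry and the `report` helper. Choosing the net-of-discount definition as the shared one is an assumption too, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    list_price: float          # price before any discount
    marketing_discount: float  # temporary, marketing-related discount


# Hypothetical sample data, for illustration only.
orders = [Order(100.0, 10.0), Order(250.0, 0.0), Order(80.0, 20.0)]


def revenue_gross(orders):
    """Reading 1: discounts are booked as a marketing expense, not against revenue."""
    return sum(o.list_price for o in orders)


def revenue_net(orders):
    """Reading 2: discounts reduce revenue directly."""
    return sum(o.list_price - o.marketing_discount for o in orders)


# One shared definition: every team looks up "revenue" here instead of
# re-deriving it locally. Picking revenue_net is an assumption of this sketch.
METRICS = {"revenue": revenue_net}


def report(team, metric, orders):
    """Format a metric value for a team using the shared definition."""
    return f"{team}: {metric} = {METRICS[metric](orders):.2f}"


for team in ("Finance", "Marketing", "Ops"):
    print(report(team, "revenue", orders))
```

On the same three orders the two readings differ by the total discount, which is exactly the kind of gap that shows up when each business unit computes “revenue” on its own.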

Most of the discussion these days is about where in the data stack to insert this layer. Should it be deep in the SQL code that transforms raw data and loads it into the warehouse (i.e., the ETL layer)? Or should it be in the business intelligence tools that drive dashboards and reporting (i.e., the visualization layer)?

In his book “The Clock of the Long Now,” the brilliant Stewart Brand describes the long-term resilience of complex systems. He notes the ability of components of these systems to adapt at different time scales. An entire forest is stable for thousands of years, a copse for hundreds, a tree for tens, branches for … you get the idea.

It takes a great deal of energy to knock a forest off its footing. Which is both good and bad. Good for stability. The forest will withstand fires, hurricanes, mudslides; it’ll be bruised, but it’ll bounce back. Bad for change. If you want the forest to be fundamentally different (a different shape, a different mix of trees), it will take a great deal of energy. On the other end of the spectrum, it’s easy to prune a branch the way you’d like, but it doesn’t take a lot of wind to damage one irreparably.

Perhaps the right way to think about where to place “the” semantic layer is to acknowledge that it’s OK to have a few, each resilient at a different time scale. The GAAP-defined financial metrics are forest-level: they can be defined in the SQL transform. On the other end of the spectrum, exploratory what-if analyses of novel metrics can be constructed in the GUI visualization layer.

All that being said, I’m as excited as the next analyst about all of the activity around the new semantic layer. We’ve been missing a middle layer for sure. A layer that doesn’t require a senior ETL engineer to make changes but has more stability than business analysts’ calculations in individual Tableau worksheets. A layer that serves to align the organization around a strategic time scale of three to five years; that requires a moderate amount of energy to modify but can adapt at the speed required by the competitive landscape.

We have the forest and the branches; it’s great to see dbt, LookML and others adding the trees.