Sometimes, the best way to figure things out is to try to explain them. It can be the Feynman Technique at work, or it can just be when somebody reacts to something that seemed obvious. Maybe it wasn’t?
“If you can’t explain it simply, you don’t understand it well enough.”
— attributed to Einstein, but it’s probably a paraphrase, or maybe it’s derived from this Feynman story
Defining the range of possible connections
The whole premise of the Data Market Study is that the path from data source to end user is more interesting than a simple line implying that the data just appears when needed. At least in some markets, a supply network of collectors, aggregators, and processors handles the work of assembling data for further work by data scientists, software developers, and the tools they build for their eventual customers.
In my last meeting before leaving New York last month, I explained what I was working on to a professor I’ve known a while. He said some really useful things, cautioned me about one avenue I was considering, and suggested that I write down one thing I had just improvised.
It was an off-the-cuff list of the available options for a company engaging in a data supply network or information business. Setting aside the “passive monetization” that amounts to using your own data to improve an existing business (clearly a high priority), there are only so many ways for data to enter or exit an organization in an information market, and they don’t depend on its position in the network.
As usual, we’re looking at the connections between organizations, not the work that happens inside the box. In this view of the world, it’s the data science and analytical work that just happens. ;-)
Data acquisition
On the data acquisition side, the basic options for an organization are creating data, collecting data, and buying data. All of this is an area for testing assumptions—and it’s not my intention to impose a lexicon on anyone—but here’s what I mean by these today:
Creating data is about data that originates inside the organization. It may be the result of routine operations in a different business (such as operational datasets or data exhaust), or it may be the organization’s primary output (such as market research). “Active” data monetization is based on this category of data: your company sells widgets, but there’s potential revenue in packaging your internal data for sale.
Collecting data is about data that originated somewhere else and was collected by technical means. Companies scraping web sites or gathering radio signals are collecting data. There’s a fine line between creating and collecting in some of these cases, so we’ll see how the distinction holds up. I think remote observation and sensor data belong in this category, but it’s a grey area.
Buying data is obvious, except that I include barter transactions and freely available data as special cases. The distinction I want to make here is that the organization is acquiring data that is already in at least a somewhat usable form, so we’re talking about datasets and streams that have been prepared for external delivery.
Selling data and its derivatives
On the product-to-market side, organizations have more options, which are distinguished by how much work the seller has done, and how much remains for the buyer. In general, I expect to see more data for sale in the early steps, and more finished products in later steps. In between, I expect to see companies whose inputs and outputs are both data.
Reports present a moment in time from the data, wrapped up in the analysis and insight of the report author (who may increasingly be a computer system). Technologically, they demand little of the customer. The analytical work is already done.
Data sales are the obvious case here, except that they’re not limited to the beginning of the supply network. Subject to license terms, companies later in the network may have the option of doing some work on a dataset or stream and delivering a value-added version of the same dataset or stream to their own customers. I group freely available data (such as from government sources) in this category as a special case where the price is zero.
Software products add functions and user interfaces to the data, turning it into something useful for less-technical or non-technical staff. There’s a big lift involved in extracting the value from demanding data sets. Software developers fill the gap between data providers who aren’t in that business and end users who don’t have the capability to do it.
Services carry the load even farther for clients who want more than a data drop, a report, or another software tool to learn. From advice to outsourcing, providers and customers can define services as narrowly or broadly as needed. In our context, what they have in common is adding value on top of the data at the core of the work.
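For readers who think in code, here is one rough way to write down the two lists above. It is only a sketch: the class names, the resells_data helper, and the three example organizations are hypothetical and made up for illustration, and the rule I use for "inputs and outputs are both data" (buys data, sells data) is my simplifying assumption, not a settled definition.

```python
from dataclasses import dataclass
from enum import Enum


class Acquisition(Enum):
    """How data enters an organization (the three options listed above)."""
    CREATE = "create"    # originates inside the organization
    COLLECT = "collect"  # gathered from elsewhere by technical means
    BUY = "buy"          # prepared datasets or streams, incl. barter and free data


class Offering(Enum):
    """How data or its derivatives leave an organization."""
    REPORT = "report"
    DATA = "data"
    SOFTWARE = "software"
    SERVICE = "service"


@dataclass(frozen=True)
class Organization:
    # Hypothetical model: an organization is just a name plus its
    # inbound and outbound connections. Nothing inside the box is modeled.
    name: str
    inputs: frozenset
    outputs: frozenset

    def resells_data(self) -> bool:
        # Assumption: "inputs and outputs are both data" is read here as
        # buying prepared data and selling data onward.
        return Acquisition.BUY in self.inputs and Offering.DATA in self.outputs


# Made-up examples, one for each rough position in a supply network.
widget_maker = Organization(
    "widget maker",
    frozenset({Acquisition.CREATE}),
    frozenset({Offering.DATA}),  # "active" monetization of internal data
)
aggregator = Organization(
    "aggregator",
    frozenset({Acquisition.COLLECT, Acquisition.BUY}),
    frozenset({Offering.DATA}),
)
analytics_vendor = Organization(
    "analytics vendor",
    frozenset({Acquisition.BUY}),
    frozenset({Offering.SOFTWARE, Offering.REPORT}),
)

middle_of_network = [
    org.name
    for org in (widget_maker, aggregator, analytics_vendor)
    if org.resells_data()
]
```

The same two enums describe all three organizations, which is the point of the list: the options don't change with position in the network, only the combination each company picks.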
Managing the workload
My overall plan for this project was to float the original view of how data markets are structured, and then talk to people in some specialized markets. The goals are to test and improve the overall framework, and then understand how it fits these specialist markets—or doesn’t. I plan to organize the work around monthly themes to simplify things and give the newsletter some structure. If you squint, you can see the beginnings of a book outline, too.
Here’s the current view of what’s coming:
March: Alternative data
April: Transportation & shipping
May: Space
June: Web & social media
July: News
August: Consumer data
September: Sensor data
October: Agriculture
November: Health
Whew. A month isn’t really going to get any of these topics done, but this should at least spread out the initial exploration workload and keep things interesting in the newsletter.
Now might be a good time to point out that there’s a comments section on this newsletter. I’m working in public, and explaining what I’m thinking about is partly about learning it myself. If any of this is inspiring a reaction, I’d like to know. If you would prefer to comment in private, just reply to this email.
Inspiration
Discovery
The chatbot that wrote a check the company had to cash.
Too much virtual/digital everything? Revisit the physical world with Scope of Work.
TBR
Data market adjacent? Capitalism without Capital: The Rise of the Intangible Economy
Squirrel!
Maybe talking about the weather isn’t so bad, after all.