Why AI Demands Investment in Public Data Infrastructure and XBRL

Imagine an AI chatbot confidently telling an investor that a company’s revenue jumped 40% last quarter when, in fact, it fell. The correct data existed on the internet, but the model retrieved its answer from an unreliable source, misread inconsistent labels, or grabbed the wrong reporting period. This isn’t a thought experiment. As AI becomes embedded in investment decisions, the quality of the underlying data matters more than ever.

For nearly two decades, I have focused on open public datasets and data standardisation, joining the XBRL community in 2006 to pursue the promise that structured, digital reporting would finally deliver high-quality data that users could trust and analyse at scale. That promise matters even more now that natural language AI interfaces let investors ask any question they want, such as the revenue question above.

Working at Oracle on data warehousing and business intelligence, I saw first-hand how difficult it was to turn inconsistent data into meaningful insight. The classic promises of Business Intelligence (BI), ‘turning data into information’ and ‘helping users make better decisions’, consistently fell apart when the underlying data was fragmented and used different labels and definitions. I came to realise that standardising data at source and validating its quality are prerequisites for BI systems to deliver real value.

For AI chatbots, data quality, data structure, and the authority of the original publisher are even more critical to producing reliable analysis than they are for traditional BI systems built on tightly defined datasets.

Unfortunately, investment in data collection systems is being questioned like never before, while the benefits of data standardisation remain poorly understood. The result is too many fragmented implementations, driven by local interests insisting on exceptions that ultimately erode trust in the data itself.

The rest of this article can be read for free on Medium.