Understanding the real-world adoption of structured data is essential for the continued evolution of our shared vocabulary. To support this, we are pleased to share a new dataset providing aggregate usage statistics for Schema.org terms across the public web.
This initiative, a collaboration between Google and the Schema.org community, aims to provide greater transparency into how developers and publishers around the world use different Types and Properties.
The dataset, updated monthly, offers a high-level view of term usage across millions of domains. To maintain stability and respect privacy, counts are aggregated at the domain level and presented in popularity range buckets. This approach helps filter daily noise while highlighting meaningful adoption trends for researchers and toolmakers.
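As a rough illustration of what domain-level aggregation and range buckets mean in practice, the Python sketch below counts each term once per domain and then reports only the range the count falls into. The bucket boundaries, the function names, and the use of a URL's hostname as its "domain" are all assumptions made for this example; they are not taken from the published dataset or from how it is actually produced.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical bucket boundaries (powers of ten), given as [low, high) pairs.
# The ranges used in the real dataset may differ.
BUCKETS = [(1, 10), (10, 100), (100, 1000), (1000, 10000)]


def domains_per_term(observations):
    """Map each term to the set of distinct hostnames it was seen on.

    `observations` is an iterable of (page_url, term) pairs. Using the
    hostname as the "domain" is a simplification for this sketch.
    """
    seen = defaultdict(set)
    for url, term in observations:
        host = urlparse(url).hostname
        if host:
            seen[term].add(host.lower())
    return seen


def bucket_label(count):
    """Return a range label such as '10-99' instead of the exact count."""
    for low, high in BUCKETS:
        if low <= count < high:
            return f"{low}-{high - 1}"
    top = BUCKETS[-1][1]
    return f"{top}+" if count >= top else "0"


def bucketed_usage(observations):
    """Aggregate per domain, then replace each count with its bucket."""
    return {
        term: bucket_label(len(domains))
        for term, domains in domains_per_term(observations).items()
    }


sample = [
    ("https://example.com/a", "Product"),
    ("https://example.com/b", "Product"),  # same domain, counted once
    ("https://shop.example.org/", "Product"),
    ("https://example.com/a", "Recipe"),
]
usage = bucketed_usage(sample)
```

Counting a term once per domain keeps a single large site from dominating the numbers, and reporting a range rather than an exact figure means small month-to-month fluctuations do not change the published value.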
The files are now available on the official Schema.org GitHub repository in CSV and JSON formats. We hope these statistics serve as a useful resource for the community’s own analysis and experimentation. We are also showing the statistics directly on the Schema.org term pages, so that the usage of each term is visible alongside its definition.
While this initial contribution comes from Google, we recognize that a truly comprehensive view of the web requires multiple perspectives. We invite other crawlers and indexers to contribute their own statistics using this same open format. By pooling our data, we can build a more transparent and resilient semantic web.
To find out more about this data, see About Usage Statistics.