MCP Scorecard

Government Data Finds MCP

A Canadian university student built an MCP server for Statistics Canada's open data API — and it's the highest-provenance government data server in the registry. Eight government and open-data servers now exist. The public sector is arriving.
io.github.Aryan-Jhaveri

Most MCP servers wrap developer tools or SaaS APIs. A quieter category is forming underneath: government open data. Eight servers in the registry now expose public-sector datasets through MCP, and the one with the strongest trust signals was built by a university student in Ontario.

The Server

io.github.Aryan-Jhaveri/mcp-statcan connects AI agents to Statistics Canada's Web Data Service — the API behind one of the world's most comprehensive national statistical agencies. Canada publishes everything from CPI to immigration to labor force data through this API, and now an MCP server makes it conversational.

The tool set covers three areas: cube operations (search, list, metadata, download links), vector operations (series data, bulk fetches by date range), and a SQLite persistence layer so agents can store and query retrieved data locally. The README is refreshingly honest about the limitations:

"LLMs tend to default to get_data_from_cube_pid_coord_and_latest_n_periods in a loop (one API call per data point) instead of using bulk vector tools... This is slower, wastes API calls, and increases the risk of the LLM fabricating numbers when it loses patience mid-loop."

mcp-statcan README

The recommended pattern: bulk vector fetch, store to SQLite, then SQL queries. That's not a workaround — it's good data engineering advice that happens to also be the right way to use MCP for analytical workloads.
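As a rough illustration of that fetch-store-query pattern, here is a minimal Python sketch. It does not call the real server: fetch_vectors_bulk is a hypothetical stand-in for a single bulk vector request, the table layout is an assumption, and the numbers are made up rather than StatCan figures.

```python
import sqlite3

def fetch_vectors_bulk(vector_ids, start, end):
    """Hypothetical stand-in for ONE bulk vector fetch over a date range.

    A real agent would call the server's bulk vector tool here instead of
    looping once per data point. The values below are invented sample data.
    """
    sample = {
        1: [("2024-01", 10.0), ("2024-02", 11.0), ("2024-03", 12.0)],
        2: [("2024-01", 5.0), ("2024-02", 5.5), ("2024-03", 6.0)],
    }
    return [
        (vid, period, value)
        for vid in vector_ids
        for period, value in sample.get(vid, [])
        if start <= period <= end
    ]

def store(conn, rows):
    """Persist fetched observations so later questions need no new API calls."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "vector_id INTEGER, ref_period TEXT, value REAL, "
        "PRIMARY KEY (vector_id, ref_period))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO observations VALUES (?, ?, ?)", rows
    )
    conn.commit()

def average_by_vector(conn):
    """Answer an analytical question with SQL instead of more fetches."""
    cur = conn.execute(
        "SELECT vector_id, AVG(value) FROM observations "
        "GROUP BY vector_id ORDER BY vector_id"
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
rows = fetch_vectors_bulk([1, 2], "2024-01", "2024-03")  # one call, six points
store(conn, rows)
print(average_by_vector(conn))
```

The point of the design is that the numbers the agent reports come out of a SQL query over stored rows, not out of the model's memory of a long loop of tool calls.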

The Builder

Aryan Jhaveri is a BSc student at Brock University in St. Catharines, Ontario, with interests in biomedical science, data analysis, and creative coding. His GitHub shows a pattern: three MCP servers, all wrapping Canadian public data (StatCan, Canada's Food Guide, and Brock University events), plus a Google Earth Engine code generator. This is someone who sees public data APIs and thinks "this should be an agent tool."

The StatCan server has been in development since April 2025 — 69 commits over 10 months, recently refactored from FastMCP to the standard MCP SDK with a custom ToolRegistry. MIT-licensed, published on PyPI, zero flags. Score: 63, with the highest provenance rating (80) of any government data server in the registry.

The Landscape

StatCan isn't alone. Eight government and open-data MCP servers now exist in the registry:

Server                 Domain                       Score
StatCan                Canadian statistics          63
ClinicalTrials.gov     US clinical trials           63
UN Data Commons        UN statistical data          57
Italy OpenData         Italian open data            56
Malaysian Law          Malaysian legislation        54
Fulcrum Governance     Governance data              42
Google Data Commons    Google's data aggregation    41
Tork Governance        Governance tools             39

The pattern is global but early: Canadian statistics, American clinical trials, Italian open data, UN development indicators, Malaysian law. Each was built independently, mostly by solo developers, with no coordination. Google's Data Commons MCP, which wraps Google's aggregation of World Bank, CDC, and Eurostat data, sits second from the bottom with a score of 41 because it lacks a license and has minimal maintenance signals. The student project outscores the Google one by 22 points.

Why This Matters

Government open data has a distribution problem. The data is public, often high-quality, and frequently ignored because the APIs are arcane, the documentation is bureaucratic, and the schemas require domain knowledge to navigate. StatCan's Web Data Service is powerful but not intuitive — cube IDs, vector IDs, coordinate strings, reference periods. An MCP server turns that into a conversation: "What was Canada's unemployment rate last quarter?"
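To make "not intuitive" concrete, here is a hedged Python sketch of what a raw request for the latest few periods of one series might look like. The base URL, method name, and payload field names follow the public Web Data Service guide as best understood and should be treated as assumptions to verify; the product ID and member IDs are placeholders, and the helper names are hypothetical. Nothing is sent over the network.

```python
import json

# Assumed base URL for the Web Data Service REST methods; verify against the docs.
WDS_BASE = "https://www150.statcan.gc.ca/t1/wds/rest"

def make_coordinate(member_ids, dimensions=10):
    """Build a coordinate string: one member ID per dimension, zero-padded.

    Assumes coordinates always carry a fixed number of dot-separated
    positions (10 here), with unused dimensions set to 0.
    """
    if len(member_ids) > dimensions:
        raise ValueError("more member IDs than dimensions")
    padded = list(member_ids) + [0] * (dimensions - len(member_ids))
    return ".".join(str(m) for m in padded)

def latest_n_request(product_id, member_ids, latest_n):
    """Return (url, json_body) for a cube + coordinate + latest-N query."""
    url = f"{WDS_BASE}/getDataFromCubePidCoordAndLatestNPeriods"
    body = [{
        "productId": product_id,                      # cube (table) ID
        "coordinate": make_coordinate(member_ids),    # which series in the cube
        "latestN": latest_n,                          # how many recent periods
    }]
    return url, json.dumps(body)

# Placeholder IDs for illustration only, not tied to a specific table here.
url, body = latest_n_request(35100003, [1, 12], 3)
print(url)
print(body)
```

A user has to know the cube ID and which member IDs to place in which position before any data comes back, which is exactly the lookup work a conversational layer can absorb.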

This is the same pattern that made MCP work for developer tools — take a powerful but hostile interface and put a natural language layer on top. The difference is that government data serves a broader audience. Journalists, researchers, policy analysts, students — people who need the data but shouldn't need to learn an API to get it.

Eight servers is a signal, not a movement. But the ingredients are here: public APIs with no authentication barriers, data that's explicitly meant to be accessible, and a protocol that turns technical interfaces into conversational ones. If MCP is going to matter beyond developer productivity, public-sector data is one of the most natural fits. A Canadian undergrad just showed everyone the playbook.

Sources: Aryan Jhaveri — GitHub · LinkedIn · mcp-statcan — repo · Statistics Canada Web Data Service — API docs · Scorecard: io.github.Aryan-Jhaveri (score 63)
