Three OC4IDS portal integrations I've worked on failed not because the data standards were wrong, but because the integration pattern didn't match what the source system could deliver. Most teams make the architecture decision after the build has started. By then, changing it costs real money. This is the decision that determines whether a portal gets maintained for five years or abandoned after eighteen months.
Infrastructure portals collect data from government systems: e-procurement platforms, integrated financial management systems (IFMIS), project management databases. Five patterns cover most real-world implementations. Each has different maintenance requirements, different failure modes, and different suitability depending on what the source system can actually expose.
- **3 of 5:** Portal integrations that failed because the integration pattern did not match what the source system could deliver, not because the OC4IDS standard was wrong.
- **2 silent:** ETL extraction failures during fiscal year transitions, both caused by schema changes in the source system and both undetected until data stopped updating.
- **1 steward:** The minimum institutional commitment a file-drop integration requires. Without a named, backed-up data steward role, this pattern degrades within months of launch.
Pattern 1: Direct Database Query
The portal queries the source database via a read-only connection, transforming records to OC4IDS schema at query time. This is the simplest pattern when the source database has a stable, documented schema and a DBA who will grant and maintain a read-only service account. The Uganda PPDA portal (gpp.ppda.go.ug) uses a variant: a scheduled ETL job pulls contract award records from IFMIS nightly, maps them to OCDS fields, and feeds the broader OC4IDS project record.
The failure mode is schema changes. When the source system upgrades and field names or types shift, extraction breaks. I've seen this happen twice during fiscal year transitions, both times silently. If the source system has an active development team and no API, this pattern requires a DBA relationship that survives personnel changes. That relationship is harder to maintain than the code.
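One way to make that failure mode loud instead of silent is to assert the expected source schema before mapping anything. The sketch below assumes a hypothetical `contract_awards` table and column names (the real IFMIS schema will differ), and uses `sqlite3` as a stand-in for the actual database driver:

```python
import sqlite3  # stand-in for the real driver (e.g. psycopg2 for PostgreSQL)

# Columns the extraction depends on. Names here are hypothetical.
EXPECTED_COLUMNS = {"contract_id", "award_date", "supplier_name", "amount"}

def extract_awards(conn):
    """Pull contract award rows and map them to OC4IDS-style records,
    failing loudly if the source schema has drifted."""
    cur = conn.execute("SELECT * FROM contract_awards")
    actual = {d[0] for d in cur.description}
    missing = EXPECTED_COLUMNS - actual
    if missing:
        # Schema drift: raise instead of silently emitting empty data.
        raise RuntimeError(f"source schema changed, missing: {sorted(missing)}")
    idx = {d[0]: i for i, d in enumerate(cur.description)}
    return [
        {
            "id": row[idx["contract_id"]],
            "date": row[idx["award_date"]],
            "suppliers": [{"name": row[idx["supplier_name"]]}],
            "value": {"amount": row[idx["amount"]], "currency": "UGX"},
        }
        for row in cur.fetchall()
    ]
```

The check costs one set comparison per run and converts a fiscal-year schema change from weeks of stale data into an alert on the first broken extraction.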
| Pattern | Technical Complexity | Maintenance Risk | Best For |
|---|---|---|---|
| 1. Direct database query (ETL) | Low to medium | Medium: schema changes break it | Stable IFMIS with documented schema |
| 2. API-to-API (source system API) | Medium | Low: API versioning protects consumer | Modern e-procurement systems |
| 3. CSV/file drop (scheduled export) | Low | High: manual step in the chain | Legacy systems with no API or DB access |
| 4. Middleware broker | High | Low: decoupled from source system | Multiple heterogeneous source systems |
| 5. Manual data entry (last resort) | None | Very high: human compliance required | Only where no other option exists |
Pattern 2: API-to-API Integration
The portal polls a REST API on a schedule, transforms responses to OC4IDS schema, and persists records. When the source system has a versioned API, this is the most maintainable pattern. The Kaduna State Infrastructure Data Portal (ipdata.kdsg.gov.ng) uses API polling from the state's procurement system to populate its OC4IDS project records. The Open Contracting Partnership's OCDS tools (standard.open-contracting.org/latest/en/guidance/build/) document reference implementations for consuming procurement APIs and mapping to OCDS format.
The failure mode is authentication drift. API keys expire. OAuth tokens need rotation. I've seen portals fail silently for weeks because a service account password was rotated during a security audit and nobody updated the portal configuration. Build monitoring before you build the integration, not after.
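A minimal sketch of that monitoring, assuming a hypothetical endpoint and payload shape: surface authentication failures as their own alert category, and track freshness separately so a quietly stalled sync also gets noticed. The `fetch` parameter is injectable purely so the logic can be tested without a live API.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timedelta, timezone

STALENESS_LIMIT = timedelta(days=2)  # alert if no successful sync in 2 days

def poll_projects(url, token, fetch=None):
    """Poll a procurement API (URL and payload are hypothetical) and
    surface auth failures explicitly instead of swallowing them."""
    fetch = fetch or (lambda req: urllib.request.urlopen(req).read())
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        body = fetch(req)
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            # Auth drift: expired key or rotated password. Page a human.
            raise RuntimeError("credentials rejected; rotate the API key") from err
        raise
    return json.loads(body)

def is_stale(last_success: datetime) -> bool:
    """True when the last successful sync is older than the limit."""
    return datetime.now(timezone.utc) - last_success > STALENESS_LIMIT
```

In production the `RuntimeError` and the `is_stale` check would feed whatever alerting channel the team actually reads, which is the point: the alert path has to exist before the first key rotation, not after.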
Pattern 3: Scheduled File Drop
When the source system cannot expose a database connection or API, a scheduled CSV export dropped to an SFTP endpoint or shared drive is the fallback. The portal picks up the file on a schedule, validates the structure, and transforms it.
This pattern introduces a manual step: a civil servant runs the export. CoST Uganda's 2019 Synthesis Report (infrastructuretransparency.org/programmes/uganda/) documented what happens when that dependency exists: file drops became irregular when the responsible officer changed. That is not a data problem. It is a staffing problem that looks like a data problem. File drop integrations require a named, backed-up data steward role with a budget line, not a project position. Without that commitment before you build, this pattern will degrade.
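The technical half of that commitment can at least be made visible in code. This sketch (column names and the weekly cadence are assumptions, not a standard) validates the dropped file's structure and treats an old file as a staffing alert rather than a parsing error:

```python
import csv
from datetime import datetime, timedelta, timezone
from pathlib import Path

REQUIRED_FIELDS = {"project_id", "title", "budget"}  # hypothetical export columns
MAX_FILE_AGE = timedelta(days=8)  # weekly drop plus a day of slack

def ingest_drop(path: Path):
    """Validate a dropped CSV before ingesting it: reject files with
    missing columns, and flag files that are suspiciously old."""
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    if datetime.now(timezone.utc) - mtime > MAX_FILE_AGE:
        # The export stopped being run. This is a staffing signal,
        # not a data error; route it to the data steward's manager.
        raise RuntimeError(f"stale drop: {path.name} last updated {mtime:%Y-%m-%d}")
    with path.open(newline="") as f:
        reader = csv.DictReader(f)
        missing = REQUIRED_FIELDS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"export structure changed, missing: {sorted(missing)}")
        return list(reader)
```

Neither check replaces the named steward role; they only guarantee that when the human process lapses, somebody finds out within days instead of months.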
Pattern 4: Middleware Broker
A middleware layer sits between source systems and the portal, normalising data from multiple heterogeneous sources into a common schema. The middleware accepts whatever format the source system produces: database dump, CSV, XML, proprietary API. It emits standardised OC4IDS-compatible records. When source systems upgrade or are replaced, only the middleware adapter changes. The World Bank GovTech Maturity Index 2022 (worldbank.org/en/programs/govtech) documents this as the common pattern in low-income country implementations with multiple incompatible source systems.
The failure mode is team dependency. Middleware requires ongoing development capacity. I've watched middleware brokers that worked perfectly go unmaintained after the implementing partner left. The government team didn't understand the code. The original developer was no longer on contract. The broker accumulated technical debt until it stopped working. Build middleware only if the maintaining team has the capacity to own it after handover. That conversation needs to happen during implementation, not at the handover ceremony.
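The core of a broker is small enough to sketch, which is worth doing because it shows exactly what the maintaining team must own: one adapter per source, and a routing function that never changes. The field names below are illustrative, not the full OC4IDS schema.

```python
from typing import Callable, Dict, Iterable, List

# Each adapter maps one source system's native records to a common shape.
Adapter = Callable[[dict], dict]

def ifmis_adapter(row: dict) -> dict:
    """Adapter for a hypothetical IFMIS database dump."""
    return {"id": row["proj_no"], "title": row["proj_name"],
            "budget": {"amount": float(row["approved_amt"])}}

def eproc_adapter(rec: dict) -> dict:
    """Adapter for a hypothetical e-procurement API payload."""
    return {"id": rec["projectId"], "title": rec["name"],
            "budget": {"amount": rec["budget"]["value"]}}

ADAPTERS: Dict[str, Adapter] = {"ifmis": ifmis_adapter, "eproc": eproc_adapter}

def normalise(source: str, records: Iterable[dict]) -> List[dict]:
    """Broker core: route records through the right adapter. When a source
    system is upgraded or replaced, only its adapter changes."""
    return [ADAPTERS[source](r) for r in records]
```

Handover readiness has a concrete test here: can the government team write and deploy a new adapter on their own? If not, the broker is a liability waiting to accumulate.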
Pattern 5: Manual Entry (Last Resort)
When automated integration is not technically feasible, a manual entry form is the only option. OC4IDS portals built on this pattern fail. CoST Uganda's assurance reports documented the cycle across multiple programmes: compliance dropped within months of launch, recovered during audit periods, and declined again between audits. The data produced is unreliable because it reflects compliance effort, not project reality.
If manual entry is unavoidable, document why automation is not yet possible and set a deadline for it. Build the manual form as a temporary measure with a sunset date, not as the permanent solution. A portal that acknowledges its data limitations is more credible than one that publishes unreliable data without caveat.
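That caveat can be carried in the data itself. A small sketch, with a hypothetical justification string and sunset date: every manually entered record is tagged with its provenance, and the caveat escalates once the agreed automation deadline has passed.

```python
from datetime import date
from typing import Optional

# Hypothetical provenance metadata, agreed when the manual form was approved.
MANUAL_ENTRY = {
    "reason": "source system exposes no API or database access yet",
    "sunset": date(2026, 6, 30),
}

def tag_manual_record(record: dict, today: Optional[date] = None) -> dict:
    """Attach the provenance caveat to a manually entered record and
    escalate the caveat once the sunset date has passed."""
    today = today or date.today()
    tagged = dict(record, dataSource="manual-entry", caveat=MANUAL_ENTRY["reason"])
    if today > MANUAL_ENTRY["sunset"]:
        tagged["caveat"] += " (sunset passed; automation overdue)"
    return tagged
```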
The Case Against This Framework
The argument against pattern-first thinking is that it attributes portal failure to the wrong cause. Critics who have watched more portals fail than I have will tell you that the pattern never mattered. What killed those portals was underfunding, political will that evaporated after the launch ceremony, and government IT units that had no ownership over a system built for them by an external contractor. If the political commitment is not there, no integration pattern will keep a portal running. A well-architected ETL job sitting on an abandoned server does not produce transparency data.
That critique is correct, and it does not contradict the argument here. Pattern selection is necessary but not sufficient. An unsuitable pattern will kill a portal even when political commitment exists. A sustainable pattern will not save a portal when commitment does not. The two problems require different interventions. Arguing about pattern choice when the funding model is not resolved is the wrong conversation. Choosing the wrong pattern when everything else is in place is a preventable failure. This framework addresses the second problem, not the first.
Choosing the Right Pattern
Three factors determine the right choice: the source system's technical capabilities, the IT capacity of the team that will maintain the integration after handover, and the data stewardship model. A direct database query with one dedicated DBA is more reliable in practice than a middleware broker maintained by a rotating project team.
Automation solves the technical problem. Institutional design solves the maintenance problem. Both are required for any pattern to function beyond the initial project cycle. The pattern selection conversation should happen before architecture is decided. The failure mode conversation should happen before implementation begins.
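The selection logic above can be encoded as a rough first-pass helper. The inputs and their ordering are my own simplification of the comparison table, not a formal rule; treat the output as a starting point for the conversation, not a verdict.

```python
def recommend_pattern(has_api: bool, db_access: bool, schema_stable: bool,
                      can_export_files: bool, n_sources: int,
                      dev_capacity: bool) -> str:
    """First-pass pattern recommendation. Thresholds are judgment calls:
    institutional factors (stewardship, funding) are deliberately out of
    scope here and must be assessed separately."""
    if n_sources > 1 and dev_capacity:
        return "4: middleware broker"
    if has_api:
        return "2: API-to-API"
    if db_access and schema_stable:
        return "1: direct database query"
    if can_export_files:
        return "3: scheduled file drop"
    return "5: manual entry (set a sunset date)"
```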
Playbook
Decision Table
| Option | When to Use | Tradeoff |
|---|---|---|
| Adopt immediately | Low-risk process and clear team ownership | Fast progress, limited validation runway |
| Pilot first | Uncertain data quality or mixed institutional capacity | Slower scale-up, higher confidence |
| Defer pending controls | Missing governance, QA, or monitoring guardrails | Lower short-term output, better long-term durability |
Failure Modes
- Schema changes in the source database breaking ETL extraction silently, often during fiscal year transitions (Pattern 1).
- Authentication drift: expired API keys or service-account passwords rotated without updating portal configuration (Pattern 2).
- File drops becoming irregular when the responsible officer changes and no backup steward exists (Pattern 3).
- Middleware going unmaintained after the implementing partner leaves and no government team owns the code (Pattern 4).
- Manual-entry compliance declining between audit periods, producing data that reflects compliance effort rather than project reality (Pattern 5).