Custom Software
Data Pipelines & Integration
Ingest, transform, and serve hydrological, meteorological, and telemetry data. Pipelines that feed models, dashboards, and decisions — engineered for the messiness of real-world water data.
Examples
- Ingesting SAWS rainfall stations and SA-Flood-Radar products into an analytics database
- Pulling DWS river-level telemetry into a reservoir-management dashboard
- Normalising climate-model outputs into scenarios usable by an internal modelling pipeline
- Integrating SCADA historian data into regulatory reporting flows
Who it’s for
Organisations with more data than they can put to use. Telemetry sitting in a historian that nobody can easily query. Rainfall feeds that the modelling team re-ingests by hand for every project. Gauged flow data that takes a fortnight to reach a report.
What you get
- An ingestion pipeline that pulls from your sources on the schedule they actually publish.
- Cleaned, versioned data in a store your team can query — typically PostgreSQL/TimescaleDB, Parquet files in object storage, or a light data warehouse.
- Monitoring so you know when a source is late or broken, rather than discovering it two weeks later. A sketch of the kind of freshness check we mean follows this list.
- Documentation: where data comes from, how it is transformed, how to add a new source.
- Optional API layer if downstream apps need it.
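To make the monitoring point concrete, here is a minimal sketch of a freshness check: compare each source's newest row against its expected publishing cadence and alert when it falls behind. The table name (`observations`), the source identifiers, the cadences, and the connection string are all illustrative, not a schema we impose.

```python
# Minimal freshness check: flag any source whose latest observation is older
# than twice its expected publishing interval. Table, columns, and source
# identifiers below are hypothetical placeholders.
from datetime import datetime, timedelta, timezone

import psycopg2

# Expected publishing cadence per source (illustrative identifiers).
EXPECTED_INTERVAL = {
    "saws_rainfall": timedelta(hours=1),
    "dws_river_levels": timedelta(minutes=30),
}

def stale_sources(conn) -> list[str]:
    """Return sources whose newest row is older than twice their cadence."""
    stale = []
    now = datetime.now(timezone.utc)
    with conn.cursor() as cur:
        for source, interval in EXPECTED_INTERVAL.items():
            cur.execute(
                "SELECT max(observed_at) FROM observations WHERE source = %s",
                (source,),
            )
            (latest,) = cur.fetchone()
            # No rows at all, or nothing within two publishing intervals.
            if latest is None or now - latest > 2 * interval:
                stale.append(source)
    return stale

if __name__ == "__main__":
    conn = psycopg2.connect("dbname=hydro")  # connection string is illustrative
    for source in stale_sources(conn):
        print(f"ALERT: {source} has not published recently")
```

In practice this sort of check runs on the same scheduler as the pipeline itself and routes alerts to wherever your team already looks (email, chat, an ops dashboard).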
How we work
Most data-pipeline work is a retainer arrangement rather than a one-off scoped project, because sources change, formats change, and a pipeline without ongoing attention rots. Retainers are capped and transparent; the alternative is a project that looks finished at go-live and breaks two quarters later.
- Source inventory. What you have, what you want, what is actually available from each source (they are rarely the same thing).
- Prototype pipeline on a narrow slice. One source to one destination, end-to-end, in weeks not months; a sketch of such a slice follows this list.
- Expand coverage and add monitoring. As confidence grows, add the rest of the sources. Monitoring and alerting are first-class, not afterthoughts.
- Retainer or handover. Handover is possible if you have the in-house team to run this. Otherwise a small retainer covers source changes, monitoring response, and new-source onboarding.
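As an illustration of what a narrow prototype slice can look like, here is a one-source, one-destination sketch: fetch, normalise, idempotent load. The endpoint URL, field names, table, and DSN are hypothetical stand-ins for whatever the prototype actually covers.

```python
# One source to one destination, end-to-end. The feed URL, field names,
# and table are placeholders, not a real integration.
import pandas as pd
import requests
from sqlalchemy import create_engine, text

SOURCE_URL = "https://example.org/api/river-levels"  # hypothetical feed

def fetch() -> pd.DataFrame:
    """Pull the raw feed and normalise it into a tidy frame."""
    raw = requests.get(SOURCE_URL, timeout=30).json()
    df = pd.DataFrame(raw["readings"])
    df["observed_at"] = pd.to_datetime(df["observed_at"], utc=True)
    df["level_m"] = pd.to_numeric(df["level_m"], errors="coerce")
    return df.dropna(subset=["level_m"])

def load(df: pd.DataFrame, engine) -> None:
    """Idempotent load: re-running the pipeline must not duplicate rows."""
    insert = text(
        "INSERT INTO river_levels (station_id, observed_at, level_m) "
        "VALUES (:station_id, :observed_at, :level_m) "
        "ON CONFLICT (station_id, observed_at) DO NOTHING"
    )
    with engine.begin() as conn:
        conn.execute(insert, df.to_dict(orient="records"))

if __name__ == "__main__":
    engine = create_engine("postgresql://localhost/hydro")  # illustrative DSN
    load(fetch(), engine)
```

The idempotent load is the design choice that matters: real feeds get re-published, backfilled, and re-run, so the pipeline has to tolerate seeing the same reading twice.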
Stack & interoperability
- Orchestration: Airflow, Prefect, or plain cron — whichever matches your ops model.
- Transformation: Python (pandas, polars, xarray for raster data), SQL. An xarray example follows this list.
- Storage: PostgreSQL + TimescaleDB, Parquet in object storage, netCDF for raster time series.
- Sources we have integrated with before: SAWS, DWS hydrological telemetry, SA-Flood-Radar, ERA5, CMIP climate data, SCADA historians (OPC-UA or vendor APIs), vendor telemetry hardware.
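As one concrete example from the transformation side, here is a sketch of normalising gridded climate-model output onto shared conventions: one dimension naming, one unit, one time step. The file name is illustrative, and the variable name and units follow common CMIP conventions but are assumptions about any particular model's output.

```python
# Sketch of normalising gridded climate output: open a netCDF file,
# standardise dimension names and units, and aggregate to a common monthly
# step. File name, variable names, and conventions here are illustrative.
import xarray as xr

def normalise(path: str) -> xr.Dataset:
    ds = xr.open_dataset(path)
    # Different models name the same dimensions differently; map to one convention.
    rename = {k: v for k, v in {"latitude": "lat", "longitude": "lon"}.items()
              if k in ds.dims}
    ds = ds.rename(rename)
    # Precipitation flux (kg m-2 s-1) to a daily depth in mm.
    if "pr" in ds and ds["pr"].attrs.get("units") == "kg m-2 s-1":
        ds["pr"] = ds["pr"] * 86400.0
        ds["pr"].attrs["units"] = "mm/day"
    # Aggregate to monthly means so downstream scenarios share one time step.
    return ds.resample(time="1MS").mean()

if __name__ == "__main__":
    ds = normalise("cmip_model_output.nc")  # illustrative file name
    ds.to_netcdf("normalised_monthly.nc")
```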
What this is not
- Not a data lake in a box. We build specific pipelines for specific purposes. If you want a generic platform, there are vendors for that.
- Not a substitute for owning your data. We build the pipelines; your team owns the data.