São Paulo's Secretaria Municipal de Urbanismo e Licenciamento is sitting on a digital archive that its own technical staff describe as bloated with redundant imagery — aerial photographs, cadastral scans, and infrastructure survey files that have been copied, re-uploaded, and misfiled across multiple servers over at least a decade of patchwork digitisation. The problem is not unique to this city, but the scale here is hard to ignore: the metropolitan region covers more than 7,900 square kilometres, and systematic aerial re-surveys have been conducted at irregular intervals since the early 2000s, leaving layers of overlapping visual data with no unified deduplication protocol.
The issue gained administrative urgency this year after the city's ongoing flood-mapping initiative began relying on georeferenced imagery to track drainage infrastructure. The initiative was launched in response to the catastrophic drainage failures that submerged parts of Itaquera and the Córrego Ipiranga corridor in the 2025 wet season. Engineers working through the Programa Drenagem Inteligente reported internally that duplicate images were causing georeferencing conflicts, slowing the identification of blocked culverts and compromised embankments in the Zona Leste. When emergency response depends on accurate, non-duplicated visual data, administrative drag becomes an operational risk.
What São Paulo Is Doing — and What It Is Not
The city's current response centres on a contract issued in early 2026 to the Instituto de Pesquisas Tecnológicas, the state-linked research body headquartered on the Cidade Universitária campus near Butantã. The IPT is building a hashing-based deduplication pipeline — essentially software that generates a unique fingerprint for each image file and flags identical or near-identical copies for human review before deletion. The contract, awarded through a public tender published in the Diário Oficial do Município in February 2026, covers the Secretaria's primary geospatial archive but does not yet extend to the separate databases maintained by the Empresa Municipal de Urbanização, known as EMURB, or the housing authority SEHAB.
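The internals of the IPT pipeline have not been published, so the following is only a minimal sketch of the general technique described above, not the institute's implementation. It assumes exact copies can be caught with a SHA-256 digest of the file bytes, and that near-identical copies can be caught with a simple "average hash" compared by Hamming distance. The function names, the file names, the tiny pixel grids, and the `max_bits` threshold are all hypothetical; a real pipeline would first decode each image and downscale it to a small fixed-size grayscale grid.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, data in files.items():
        groups[content_fingerprint(data)].append(name)
    # Only groups with more than one member are duplicate candidates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels: list[list[int]]) -> int:
    """Build a bit string: 1 where a pixel is brighter than the grid mean.

    Assumes `pixels` is an already-downscaled grayscale grid (hypothetical
    preprocessing step, not shown here).
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]], max_bits: int = 2) -> bool:
    """Flag two same-sized grids for human review if their hashes nearly match."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_bits


# Hypothetical sample data for illustration only.
files = {"aerial_01.tif": b"abc", "aerial_01_copy.tif": b"abc", "cadastre_07.tif": b"xyz"}
exact_groups = group_exact_duplicates(files)

scan = [[10, 200], [220, 15]]
rescan = [[12, 198], [215, 20]]      # same scene, slightly different pixel values
other = [[200, 10], [15, 220]]       # different brightness pattern
```

In this sketch, `exact_groups` holds the two byte-identical files, `is_near_duplicate(scan, rescan)` is true, and `is_near_duplicate(scan, other)` is false. Flagged pairs would go to a reviewer rather than being deleted automatically, matching the human-review step the contract describes.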
That fragmentation is the critical gap. Rotterdam, which manages a comparable municipal geospatial dataset across a 325-square-kilometre port city, completed a city-wide deduplication project in 2023 using a centralised cloud repository under its Digitale Stad programme, ensuring that all departments draw from a single verified image library. Bogotá's Catastro Distrital launched a similar consolidation in 2024, reducing its image storage load by an estimated 34 percent and cutting processing time for urban permit applications. São Paulo, by contrast, is running parallel and non-communicating systems across at least four municipal secretariats, each with its own procurement cycle and IT staff.
The Practical Cost and What Comes Next
Storage is not cheap even at municipal scale. São Paulo's municipal IT budget for 2026, as approved by the Câmara Municipal in December 2025, allocated R$380 million to digital infrastructure across all secretariats. That figure sounds large until measured against the complexity of administering a city of 12.3 million people whose urbanisation records stretch back to analogue cadastral maps from the 1950s. Deduplication, when done properly, typically recovers between 20 and 40 percent of storage capacity in legacy public-sector archives, according to published findings from the European Union's Interoperable Europe initiative. Even a recovery at the conservative end of that range would free meaningful budget for the city's still-incomplete transition to a unified GIS platform.
The IPT contract runs through December 2026. If the pilot phase covering the Secretaria de Urbanismo delivers measurable results by October — the deadline written into the project timeline — Mayor Ricardo Nunes' administration has indicated it plans to extend the methodology to SEHAB and the infrastructure secretariat before the 2027 budget cycle. For residents in flood-prone districts like Jardim Pantanal and Grajaú, the practical payoff is simple: faster, cleaner data means faster infrastructure decisions when the next rainy season arrives in November. For now, the city is catching up to where Rotterdam and Bogotá already are — running the same race, a lap behind.