São Paulo's municipal database of public assets contains tens of thousands of duplicate photographic records — the same pothole, the same crumbling retaining wall, the same faded zebra crossing catalogued twice, three times, sometimes more — and city hall has no firm deadline for clearing the backlog. That is the situation facing the Secretaria Municipal de Infraestrutura Urbana e Obras (SIURB), which oversees the digital inventory of more than 17,000 kilometres of roadway across the 32 subprefeituras.
The duplication problem is not cosmetic. When field teams in Sapopemba or Itaquera log a repair request, a duplicate image entry can generate parallel work orders, dispatch two crews to one location and leave an identical problem three streets away unattended. For a city that has spent years fighting a chronic urban drainage crisis — the kind that floods the Marginal Tietê corridor every rainy season — wasted crew deployments are a budget drain that officials can no longer easily absorb.
The issue has sharpened in 2026 because Mayor Ricardo Nunes signed off in March on a five-year digital infrastructure modernisation plan that ties municipal spending to verified asset records. Contractors bidding on Paulista Avenue repaving contracts, for instance, now have to reconcile their photographic submissions against a central repository. If that repository is polluted with duplicates, validation slows, payments stall and the entire pipeline backs up. The city's technology arm, Prodam — the Empresa de Tecnologia da Informação e Comunicação do Município de São Paulo — has been tasked with building a de-duplication pipeline using hash-matching and machine-learning classification, with a pilot scheduled for the Centro Histórico and Pinheiros districts before the end of the third quarter.
What Bogotá and Seoul Got Right Earlier
Bogotá confronted a structurally similar problem when it digitised its street-furniture inventory under the 2020 Bogotá Resiliente programme. The Colombian capital deployed perceptual hashing across roughly 1.2 million stored images, reducing its duplicate rate from an estimated 34 percent to under four percent within 18 months, according to the Instituto Distrital de Gestión de Riesgos y Cambio Climático. The lesson Bogotá learned, that de-duplication has to happen at ingestion rather than retrospectively, is precisely the one São Paulo is still absorbing.
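For readers unfamiliar with the technique, perceptual hashing reduces an image to a short fingerprint that stays stable when the picture is recompressed or slightly brightened, so two photographs of the same pothole land on nearly identical values. The sketch below shows one common variant, a difference hash, in plain Python. It is an illustration only, not the code used in Bogotá or planned by Prodam; the function names, the 8x8 grid and the threshold of 5 bits are assumptions chosen for the example, and the crude `shrink` helper stands in for a proper image resize.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 brightness values (hypothetical input format)


def shrink(pixels: Gray, width: int, height: int) -> Gray:
    """Downsample by nearest-neighbour sampling (a crude stand-in for a real resize)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def dhash(pixels: Gray, size: int = 8) -> int:
    """Difference hash: one bit per left/right neighbour comparison on a small grid."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as duplicates when their hashes differ in only a few bits.

    The threshold is an assumed value; a real system would tune it on labelled data.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance
```

Because the hash records only whether each pixel is brighter than its neighbour, a uniformly brightened copy of a photograph produces the same fingerprint, while a genuinely different scene differs in many bits.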
Seoul is further along still. The city's Smart City Operations Center, operating out of the Digital Mayor's Office since 2022, flags duplicate asset images in near real time using a combination of GPS metadata cross-referencing and convolutional neural network classifiers. Seoul processes roughly 400,000 field images per month across 25 autonomous districts. London's equivalent system, run through the Greater London Authority's City Data Store, uses a supplier-side API that rejects duplicate submissions before they enter the master record. That approach transfers the de-duplication burden upstream, to the contractor, rather than letting city databases absorb the mess.
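The idea common to the Seoul and London approaches, checking a new image against location data and rejecting it before it reaches the master record, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of either city's system: the `Submission` and `IngestionGate` names, the 25-metre radius and the 5-bit hash tolerance are all hypothetical, and the image fingerprint is assumed to be a 64-bit perceptual hash computed elsewhere.

```python
import math
from dataclasses import dataclass, field
from typing import List

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass(frozen=True)
class Submission:
    image_hash: int  # assumed: a perceptual hash computed upstream
    lat: float       # decimal degrees
    lon: float       # decimal degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass
class IngestionGate:
    """Rejects a submission that is both close to and visually similar to a stored one."""
    radius_m: float = 25.0        # assumed search radius
    max_hash_distance: int = 5    # assumed tolerance, in differing bits
    accepted: List[Submission] = field(default_factory=list)

    def submit(self, new: Submission) -> bool:
        # Linear scan for clarity; a production system would use a spatial index.
        for old in self.accepted:
            close = haversine_m(old.lat, old.lon, new.lat, new.lon) <= self.radius_m
            similar = bin(old.image_hash ^ new.image_hash).count("1") <= self.max_hash_distance
            if close and similar:
                return False  # duplicate: never enters the master record
        self.accepted.append(new)
        return True
```

Requiring both conditions matters: two near-identical photographs taken a kilometre apart are probably different potholes, and two different photographs taken at the same spot may document different defects.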
São Paulo's challenge is scale and legacy. Prodam engineers are dealing with records that stretch back to at least 2011, when the city first mandated photographic documentation of street repairs under the Programa de Manutenção de Vias. Many of those early images carry inconsistent metadata — different coordinate standards, varying file-naming conventions across subprefeituras — which makes automated matching harder than it would be for a system built clean from scratch.
What Comes Next for Paulistanos
The Pinheiros pilot, if it hits its Q3 2026 target, will be the clearest test of whether Prodam's pipeline can handle real-world volume. Pinheiros was chosen partly because its subprefeitura has above-average digital literacy among field inspectors, and partly because the neighbourhood's mix of commercial strips, residential side streets and proximity to the Rio Pinheiros flood-control infrastructure makes it a representative stress test.
For residents, the practical payoff is faster repair turnaround. The city's 156 service line, São Paulo's main civic complaint channel, logged more than 2.3 million requests in 2025, according to figures published by Prodam. A meaningful slice of delayed responses traces back to duplicate work orders. Clearing the duplication backlog will not fix every pothole, but it should mean that the ones that get reported are fixed once, rather than generating two work orders for the same repair. That modest promise is, for now, what the city is selling.