São Paulo's public and private institutions are sitting on a sprawling, largely unaudited mess of duplicate digital images — and the conversation about how to fix it has moved from back-office IT rooms to the desks of city planners, archivists and tech entrepreneurs in Faria Lima. The issue surfaced publicly in late June 2026 when the Prefeitura de São Paulo's urban data portal, GeoSampa, acknowledged in an internal working document that its imagery layers contained redundant files accounting for an estimated 18 percent of total storage allocation — a figure shared at a technical roundtable held at the Centro de Estudos da Metrópole on Rua Morgado de Mateus, Vila Mariana.
That number matters because GeoSampa is not a niche tool. City engineers, flood-risk modellers and neighbourhood associations from Capão Redondo to Santana use it daily to cross-reference satellite and drone imagery against drainage maps, which chart the same infrastructure that failed catastrophically during the January 2025 floods in Jardim Pantanal. Bloated, poorly indexed image repositories lengthen query times and, more dangerously, allow outdated aerial photographs to surface alongside current data, potentially misleading technical decisions.
What the Specialists Are Saying
Professionals working at the intersection of urban data and software architecture have been circling this problem for months. At a panel hosted by the Instituto de Engenharia — the 105-year-old professional body headquartered on Avenida Brigadeiro Luís Antônio — technicians described a pattern common across São Paulo's municipal bodies: images are ingested through multiple departments without a shared metadata standard, so the same drone pass over Parelheiros can exist in four separate folders with four different file names and no deduplication flag.
The Secretaria Municipal de Inovação e Tecnologia has been working since March 2026 on a procurement framework for what it internally calls a Plataforma de Gestão de Ativos Visuais, a centralised image-asset management system. Technology directors familiar with the process, speaking in their professional capacity at public events rather than on the record, have pointed to hash-based deduplication as the baseline technical requirement. The idea is straightforward: each image file is run through a cryptographic hash function, producing a fingerprint that is, for practical purposes, unique to that file's exact bytes; if two fingerprints match, the files are identical and the redundant copy can be deleted automatically. A renamed copy will match, but a resized or re-compressed version of the same photograph will not. The challenge in a government context is governance, not the algorithm.
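To make the fingerprint idea concrete, here is a minimal sketch in Python. It is an illustration only, not the Secretaria's specification: the choice of SHA-256, the function names `fingerprint` and `find_duplicates`, and the folder layout are all assumptions made for the example. It only reports matching files and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large imagery files are never loaded into memory whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group every file under `root` by digest and keep only the
    groups that contain two or more byte-identical files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[fingerprint(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Run against a shared drive, `find_duplicates(Path("/some/imagery/root"))` would return one entry per set of identical files, whatever they are named and whichever folder they sit in. The path here is a placeholder.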
On the private-sector side, São Paulo's tech community has taken notice. Startups clustered around the Cubo Itaú hub on Rua Tamoios, near Berrini, and around the Cubo network's expansion in Pinheiros have been pitching AI-assisted duplicate-detection tools to municipal and state clients since early 2025. Several founders have cited the Arquivo Público do Estado de São Paulo, which holds digitised collections dating to the nineteenth century, as a priority client, given that legacy scanning projects from the early 2000s predated any consistent naming convention.
The Broader Cost and What Comes Next
Storage is not free. Cloud infrastructure contracts for São Paulo state agencies, subject to public procurement rules under Lei 14.133/2021, have risen in line with global cloud pricing. A duplicate image library does not just waste space; it inflates renewal costs at contract renegotiation and creates legal exposure when image rights, timestamps or geolocation metadata are disputed in administrative proceedings.
The Câmara Municipal de São Paulo has a digital governance subcommittee that last met in May 2026 at the Viaduto do Chá complex to discuss data stewardship standards across municipal agencies. Members of that body have signalled — at public sessions, on the record — that image-asset management is likely to be folded into a broader municipal data-quality bill expected to be tabled in the third quarter of 2026.
For institutions and businesses navigating this now, specialists recommend a three-step triage: audit existing storage for file-hash duplicates before any new procurement; establish a single metadata standard aligned with the Brazilian national norm ABNT NBR ISO 19115 for geographic data; and designate a named data steward in each department with authority to approve deletions. The technical fix is cheap. Building the institutional habit is the harder job — and in São Paulo, with dozens of secretariats operating semi-independently, that habit has to start somewhere concrete, which is why GeoSampa has become the test case everyone is watching.
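The first and third steps of that triage can be sketched together in a few lines of Python. This is a hypothetical illustration, not a description of any system in use: the names `DeletionProposal`, `propose`, `approve` and `deletable` are invented for the example, and keeping the alphabetically first path in each group is an arbitrary placeholder rule. The metadata step is not covered.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DeletionProposal:
    digest: str                        # shared file hash for the group
    keep: str                          # the one copy that survives
    remove: List[str]                  # redundant copies proposed for deletion
    approved_by: Optional[str] = None  # named steward; None means pending


def propose(duplicate_groups: Dict[str, List[str]]) -> List[DeletionProposal]:
    """Turn audit results (digest -> paths) into pending proposals.
    Assumption: keep the alphabetically first path in each group."""
    proposals = []
    for digest, paths in sorted(duplicate_groups.items()):
        ordered = sorted(paths)
        proposals.append(DeletionProposal(digest, ordered[0], ordered[1:]))
    return proposals


def approve(proposal: DeletionProposal, steward: str) -> None:
    """Record sign-off; an empty name is rejected so that every
    deletion traces back to an accountable person."""
    if not steward.strip():
        raise ValueError("a named steward is required")
    proposal.approved_by = steward


def deletable(proposals: List[DeletionProposal]) -> List[str]:
    """Return only the paths whose proposal has been approved."""
    return [path for p in proposals if p.approved_by for path in p.remove]
```

The design point is that the audit and the deletion are separate steps: nothing appears in `deletable` until a steward's name is attached, which mirrors the specialists' view that the habit matters more than the algorithm.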