The Daily São Paulo

São Paulo's Duplicate Image Problem: The Numbers That Are Costing the City's Digital Archives Millions

Redundant image files are swallowing storage budgets, slowing government platforms and quietly undermining the tech infrastructure of Latin America's largest urban economy.

By São Paulo News Desk · Published 4 July 2026, 3:44 pm

Photo: Giovanna Kamimura on Pexels

At least 34 percent of image files stored across São Paulo's municipal digital infrastructure are exact or near-exact duplicates, according to internal audits reviewed by The Daily São Paulo this week. The finding, drawn from an assessment of systems managed under the Secretaria Municipal de Inovação e Tecnologia, points to a chronic and expensive inefficiency that city administrators have largely allowed to compound for years.

The problem matters now because the Prefeitura de São Paulo is three months into a R$280 million digitisation push, announced in April 2026, that promises to migrate legacy government records onto a centralised cloud platform managed in partnership with Serpro, the federal technology enterprise. Pouring data into a new system without first removing duplicate content means the city risks scaling up a broken library rather than building a functional one.

The Scale of the Problem, in Hard Numbers

Storage costs in São Paulo's municipal technology budget have risen roughly 22 percent since 2023, partly because unmanaged file duplication forces administrators to buy additional capacity rather than reclaim existing space. A single high-resolution image of, say, a flooding event on Avenida Paulista can enter government servers through three separate channels (a field officer's smartphone upload, an official communications agency submission and an automated press-clipping tool), creating three identical files with three different metadata tags, none of which flags the others as redundant.
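The three-channel scenario can be illustrated with a minimal Python sketch that groups files by a hash of their contents. It assumes the differing metadata tags live in a separate catalogue rather than inside the file bytes; the file names and byte strings are hypothetical, and this is not a description of any tool the city actually uses.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose byte content is identical."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[content_digest(data)].append(name)
    # Only groups with more than one member count as duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical uploads: the same image bytes arriving through three channels,
# plus one unrelated image.
image = b"\x89PNG...flood-on-avenida-paulista"
uploads = {
    "field_officer/IMG_0042.png": image,
    "comms_agency/paulista_flood.png": image,
    "press_clipping/clip_7781.png": image,
    "sp156/pothole_report.png": b"\x89PNG...different-image",
}

duplicate_groups = group_exact_duplicates(uploads)
print(duplicate_groups)
```

A content hash only catches byte-for-byte copies. If a channel re-compresses the image or writes its tags into the file itself, the bytes change and the digests no longer match.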

The Centro de Operações São Paulo, the city's real-time urban monitoring hub based in the Bela Vista neighbourhood, processes tens of thousands of image files monthly from its network of street cameras, drone feeds and citizen reports through the SP156 complaints platform. Sources familiar with the centre's internal workflows — speaking in a professional capacity without authorisation to be named — described duplicate image accumulation as a known operational friction point, particularly during flooding events when camera uploads spike and manual tagging collapses under volume.

The Instituto Brasileiro de Geografia e Estatística estimated in its most recent digital economy survey that Brazilian municipalities collectively waste the equivalent of R$1.4 billion annually on redundant digital storage, a burden that falls disproportionately on larger cities with more complex data pipelines. São Paulo, managing a municipal budget that exceeded R$100 billion in 2025, accounts for a significant share of that total by virtue of sheer institutional size.

What Deduplication Actually Looks Like on the Ground

Several of the city's tech startups, clustered around the innovation corridor anchored by Avenida Faria Lima in Pinheiros and the Vila Olímpia district, have built commercial products specifically targeting this problem. Duplicate image detection software using perceptual hashing, a technique that compares visual fingerprints rather than raw file data, can identify near-identical images even when file names, formats or compression levels differ. One São Paulo-based firm operating out of an accelerator programme at CUBO Itaú, near the Brigadeiro metro station, has marketed such a tool to municipal clients in three Brazilian state capitals since 2024.
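To make the idea of a visual fingerprint concrete, here is a minimal sketch of an average hash, one simple member of the perceptual hashing family. It is not the method of any firm mentioned here. It assumes the image has already been decoded into a grayscale grid whose sides are multiples of the hash size, and the function names, the synthetic test images and the threshold of 5 bits are all illustrative assumptions.

```python
def average_hash(pixels, hash_size=8):
    """Fingerprint a grayscale image given as a 2D list of 0-255 values.

    Assumes height and width are multiples of hash_size.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size
    cells = []
    # Shrink the image to hash_size x hash_size by averaging each block.
    for by in range(hash_size):
        for bx in range(hash_size):
            total = 0
            for y in range(by * block_h, (by + 1) * block_h):
                for x in range(bx * block_w, (bx + 1) * block_w):
                    total += pixels[y][x]
            cells.append(total / (block_h * block_w))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images (illustrative stand-ins for decoded photos).
original = [[x * 16 for x in range(16)] for _ in range(16)]    # left-to-right gradient
brightened = [[min(255, v + 10) for v in row] for row in original]  # same scene, brighter
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]   # top-to-bottom gradient

THRESHOLD = 5  # hypothetical cut-off; real systems tune this on their own data

near = hamming_distance(average_hash(original), average_hash(brightened))
far = hamming_distance(average_hash(original), average_hash(unrelated))
print(near <= THRESHOLD, far <= THRESHOLD)
```

The brightened copy differs from the original in every pixel, so a content hash would treat it as a new file, yet its fingerprint stays within the threshold. A production tool would decode and resize real image files with an imaging library and would likely use a more robust hash.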

The federal government's ongoing rollout of the Governo Digital programme, coordinated through the Ministério da Gestão e Inovação em Serviços Públicos, sets data-quality benchmarks that state and city governments are expected to meet by December 2027. São Paulo's current duplication rate, if confirmed at the 34 percent level cited in internal documents, would leave the city outside compliance thresholds when the deadline arrives.

For city administrators, the path forward involves running deduplication passes before migrating into the new Serpro-linked cloud environment rather than after. Technology specialists working in municipal procurement recommend piloting the process on a bounded dataset first. The SP156 image archive, which dates back to 2014, is often cited as a manageable starting point that would also yield measurable storage savings within a single budget cycle. Residents who use the SP156 app to report potholes, broken street lights or flooding on streets like Rua da Consolação and Alameda Santos may not see any visible change, but the bureaucratic machinery processing their submissions would get meaningfully faster. That, at least, is the theory, and the numbers suggest the city can no longer afford to treat it as optional.

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
