São Paulo's municipal government is confronting a sprawling duplicate-image problem across its public digital systems, with redundant photographs and scanned documents clogging servers managed by the Secretaria Municipal de Inovação e Tecnologia — and a growing body of evidence suggests the city is handling it less efficiently than comparable metropolises in Europe and Asia.
The issue has sharpened in 2026 because the Prefeitura, under Mayor Ricardo Nunes, has pushed aggressively to digitise everything from building permits in Mooca to flood-damage assessments in Jardim Ângela. That digitisation drive, while broadly welcomed, has flooded city repositories with duplicate files. IT teams working out of the Centro de Operações São Paulo on Rua Líbero Badaró in the historic centre are now fielding complaints from multiple secretariats about systems slowing under the weight of unmanaged image libraries.
What Duplicate Images Actually Cost
Storage is not free. Municipal cloud contracts in Brazilian cities typically run on a per-gigabyte model, and duplicate image files, which can account for between 20 and 40 percent of total storage consumption in poorly managed public repositories according to general benchmarks published by data-management consultancies, translate directly into wasted public reais. The Prefeitura's 2025 technology budget, as reported in official municipal budget documents publicly available through the Diário Oficial do Município, allocated R$312 million to digital infrastructure across all city secretariats. If storage accounts for a sizeable share of that spending, even a conservative reduction in storage waste could redirect millions of reais toward services that São Paulo residents actually use.
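The 20 to 40 percent benchmark can be turned into a rough annual figure with simple arithmetic. The sketch below is illustrative only: the stored volume and the per-gigabyte price are hypothetical placeholders, not Prefeitura figures, and the function name is invented for this example.

```python
def wasted_storage_cost(total_gb: float, price_per_gb_month: float,
                        duplicate_share: float) -> float:
    """Annual cost, in the contract currency, of storing duplicate bytes."""
    if not 0.0 <= duplicate_share <= 1.0:
        raise ValueError("duplicate_share must be between 0 and 1")
    return total_gb * duplicate_share * price_per_gb_month * 12


# Hypothetical inputs: 500 TB stored at R$0.15 per GB per month.
TOTAL_GB = 500_000
PRICE_PER_GB_MONTH = 0.15

low = wasted_storage_cost(TOTAL_GB, PRICE_PER_GB_MONTH, 0.20)   # 20% duplicates
high = wasted_storage_cost(TOTAL_GB, PRICE_PER_GB_MONTH, 0.40)  # 40% duplicates
print(f"Estimated annual waste: R${low:,.0f} to R${high:,.0f}")
```

Swapping in a real contract price and measured volume would show how much of the infrastructure budget is actually at stake.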
The problem is not unique to São Paulo, but how cities address it varies sharply. London's Government Digital Service began a systematic deduplication programme for the Greater London Authority's image archives in 2023, deploying hash-based identification tools that automatically flag identical files before archiving. Seoul's Smart City Operations Center, part of the city's broader Seoul Digital Foundation initiative, integrated image deduplication into its data pipeline by 2022, meaning duplicate files are caught at the point of ingestion rather than cleaned up later. São Paulo is still largely in the clean-up phase, which is substantially more expensive and time-consuming.
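Catching duplicates at ingestion, in its simplest form, means hashing each file's contents and refusing to store a second copy of anything already seen. The following is a minimal sketch of that idea, not a description of the London or Seoul systems; the class name, the in-memory index and the file names are all assumptions made for illustration.

```python
import hashlib


class IngestIndex:
    """In-memory index of content hashes; a real system would persist this."""

    def __init__(self) -> None:
        # Maps SHA-256 hex digest -> name of the first file stored with it.
        self._seen: dict[str, str] = {}

    def ingest(self, name: str, data: bytes) -> tuple[bool, str]:
        """Return (accepted, name_of_stored_copy) for an incoming file."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._seen:
            return False, self._seen[digest]  # byte-identical file already held
        self._seen[digest] = name
        return True, name


index = IngestIndex()
print(index.ingest("alvara_mooca_001.jpg", b"example image bytes"))
print(index.ingest("alvara_mooca_001_copy.jpg", b"example image bytes"))
print(index.ingest("alvara_mooca_002.jpg", b"different image bytes"))
```

A content hash of this kind only catches byte-for-byte copies. A resized or re-compressed version of the same photograph produces a different digest and passes through.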
In Latin America, the comparison is no more flattering. Bogotá's Distrito digital programme, launched under the city's Plan Distrital de Desarrollo, introduced automated metadata validation for image uploads in 2024, a change that city officials say reduced repository bloat across Secretaría Distrital de Hacienda systems, citing a published government report. São Paulo has no equivalent mandatory deduplication standard across its agencies as of July 2026.
Local Efforts and Where They Fall Short
There are efforts underway. The Instituto de Pesquisas Tecnológicas, headquartered on Avenida Professor Almeida Prado in Cidade Universitária, has been in talks with city technology staff about building an automated image-processing pipeline for urban monitoring data, including the thousands of photographs generated weekly by the Sistema de Monitoramento de Chuvas e Enchentes, the flood monitoring network that has become critical after successive severe floods hit Itaquera, São Mateus and the Tietê floodplain. That pipeline, if implemented, would include deduplication as a core function.
The Paulista Avenue surveillance camera network alone generates a significant volume of overlapping still-image exports when footage is flagged for incident review, creating duplicates across multiple departmental drives. City IT staff, speaking in general terms through official Prefeitura communications channels rather than individually, have acknowledged the overlap problem in internal documentation reviewed by this newspaper.
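Finding copies that have spread across several departmental drives is, at its most basic, a matter of walking each folder tree and grouping files by content hash. The sketch below assumes the drives are reachable as ordinary directories; the folder and file names are hypothetical, and the demonstration runs against a temporary directory rather than any real city system.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large images do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots: list[Path]) -> list[list[Path]]:
    """Group byte-identical files found under any of the given root folders."""
    by_size: dict[int, list[Path]] = defaultdict(list)
    for root in roots:
        for path in root.rglob("*"):
            if path.is_file():
                by_size[path.stat().st_size].append(path)

    by_digest: dict[str, list[Path]] = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) > 1:  # a file with a unique size has no identical twin
            for path in same_size:
                by_digest[file_digest(path)].append(path)
    return [sorted(group) for group in by_digest.values() if len(group) > 1]


# Demonstration with two hypothetical departmental folders.
with tempfile.TemporaryDirectory() as tmp:
    transport = Path(tmp) / "transporte"
    housing = Path(tmp) / "habitacao"
    transport.mkdir()
    housing.mkdir()
    (transport / "incident_0412.jpg").write_bytes(b"frame-A")
    (housing / "incident_0412_copy.jpg").write_bytes(b"frame-A")
    (housing / "survey.jpg").write_bytes(b"frame-B")
    duplicate_names = [[p.name for p in group]
                       for group in find_duplicates([transport, housing])]

print(duplicate_names)
```

Comparing file sizes first avoids hashing files that cannot possibly match, which matters when an archive holds millions of images.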
What São Paulo lacks that London and Seoul both have is a city-level data governance policy that treats image deduplication as a baseline standard rather than an optional cleanup task. The federal government's Lei Geral de Proteção de Dados, in force since 2020, sets rules around personal data but does not mandate storage efficiency practices.
The practical path forward involves three things: adopting perceptual hashing tools similar to those used by the Greater London Authority, mandating deduplication checkpoints at the secretariat level before files enter shared city repositories, and auditing existing archives — starting with the highest-volume producers, which city technicians identify as the housing, transport and urban planning secretariats. Without a formal programme in place before the next municipal budget cycle closes in October 2026, São Paulo will keep paying for the same image twice.
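Perceptual hashing differs from ordinary file hashing in that visually similar images yield similar hashes, so a brightened or re-encoded copy can still be matched. The sketch below uses average hashing, one of the simplest members of that family, chosen here purely for illustration; nothing in the public record specifies which algorithm the Greater London Authority's tools use. It works on a grayscale image supplied as rows of 0 to 255 values, and the function names and synthetic test images are assumptions.

```python
def average_hash(pixels: list[list[int]], size: int = 8) -> int:
    """Average hash of a grayscale image given as rows of 0-255 values.

    The image is shrunk to size x size by averaging blocks of pixels, then
    each cell becomes one bit: 1 if brighter than the mean cell, else 0.
    """
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the hash grid")
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 test images (hypothetical stand-ins for real photographs).
original = [[x * 8 for x in range(32)] for _ in range(32)]   # left-to-right gradient
brightened = [[min(255, v + 3) for v in row] for row in original]
unrelated = [[y * 8 for _ in range(32)] for y in range(32)]  # top-to-bottom gradient

h_original = average_hash(original)
print(hamming_distance(h_original, average_hash(brightened)))  # small: near-duplicate
print(hamming_distance(h_original, average_hash(unrelated)))   # large: different image
```

In practice a checkpoint would flag pairs whose distance falls below a chosen threshold for human review rather than deleting them automatically, since the right threshold depends on the image collection.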