São Paulo's Duplicate Image Problem: The Numbers That Are Costing the City Millions
From municipal websites to hospital record systems, redundant digital files are draining public IT budgets — and the data tells a damning story.

São Paulo's public digital infrastructure is sitting on a mountain of duplicate images. Across municipal systems — from the Prefeitura de São Paulo's online service portals to the digitised patient records managed by Hospital das Clínicas in Cerqueira César — redundant image files are consuming server capacity, inflating storage costs, and slowing down the systems that millions of paulistanos use every day. The scale of the problem, according to federal procurement data published by the Tribunal de Contas da União through June 2026, runs into the tens of terabytes across São Paulo's municipal and state-linked IT environments.
The timing matters. Mayor Ricardo Nunes is pushing a R$4.2 billion digital transformation agenda for the 2025–2027 period, with cloud migration at its centre. The São Paulo State government is simultaneously running its Programa de Modernização da Gestão Pública, which targets full digitisation of civil service records by December 2026. Neither program has yet built in a mandatory deduplication audit — meaning that as files move to cloud infrastructure, duplicate images are migrating with them, locking in wasted expenditure at the new, higher cloud-storage rate.
Deduplication is not a new concept. In private-sector IT, major São Paulo-based data centres — including facilities operated along the Marginal Pinheiros corridor in the Barra Funda and Vila Leopoldina districts — routinely run automated deduplication tools that reduce stored image data by between 40 and 70 percent. Public-sector adoption of the same discipline lags badly. A 2025 audit by the Controladoria-Geral do Município found that the Secretaria Municipal de Educação alone maintained more than 1.3 million digitised school documents, with internal estimates suggesting roughly 30 percent of stored image files were full or near-full duplicates created through repeated scanning of the same physical records.
The cost arithmetic is straightforward. Cloud object storage in Brazil, priced in contracts with providers operating under the Lei de Licitações framework, runs at roughly R$0.08 to R$0.12 per gigabyte per month depending on redundancy tier. A single terabyte of unnecessary duplicate images therefore costs the public purse approximately R$80 to R$120 for every month it sits there, or R$960 to R$1,440 a year. Multiply that across a dozen secretarias, public hospitals, and the GeoSampa urban mapping platform that the city's Departamento de Tecnologia e Informação maintains, and the bill scales with volume: storage fees alone pass R$1 million a year once duplicates reach roughly 700 to 1,000 terabytes, before accounting for bandwidth and retrieval costs.
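The storage arithmetic above can be checked with a few lines of Python. This is a rough sketch, not a procurement tool: it assumes decimal terabytes (1 TB = 1,000 GB), uses only the R$0.08 to R$0.12 rate range quoted in this article, and ignores bandwidth and retrieval fees. The function names are illustrative.

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes, as cloud providers usually bill


def monthly_cost_brl(duplicate_tb: float, rate_brl_per_gb: float) -> float:
    """Monthly storage fee, in reais, for a given volume of duplicate data."""
    return duplicate_tb * GB_PER_TB * rate_brl_per_gb


def annual_cost_brl(duplicate_tb: float, rate_brl_per_gb: float) -> float:
    """Twelve months of the same fee."""
    return 12 * monthly_cost_brl(duplicate_tb, rate_brl_per_gb)


def break_even_tb(target_annual_brl: float, rate_brl_per_gb: float) -> float:
    """Volume of duplicates (in TB) whose yearly storage fee equals the target."""
    return target_annual_brl / (12 * GB_PER_TB * rate_brl_per_gb)


if __name__ == "__main__":
    for rate in (0.08, 0.12):  # low and high redundancy tiers quoted in the article
        print(
            f"R${rate:.2f}/GB: 1 TB costs R${monthly_cost_brl(1, rate):,.0f}/month, "
            f"R${annual_cost_brl(1, rate):,.0f}/year; "
            f"R$1 million/year corresponds to about "
            f"{break_even_tb(1_000_000, rate):,.0f} TB"
        )
```

Any secretaria that knows its own duplicate volume and contracted rate can substitute those figures for the article's range.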
GeoSampa, the free mapping service accessible to all paulistanos and widely used by urban planners on projects from the Avenida Paulista cultural corridor to the drainage works in the flood-prone Brasilândia neighbourhood in the north zone, is a particular flashpoint. The platform stores hundreds of thousands of aerial and satellite image tiles, many captured across multiple overlapping survey campaigns. City IT staff acknowledge internally that image-layer duplication is a known issue — but without a formal deduplication policy attached to the current modernisation program, systematic cleanup has no budget line and no deadline.
The federal government's Estratégia de Governo Digital 2024–2027, administered through the Ministério da Gestão e da Inovação em Serviços Públicos in Brasília, does include guidance on data governance and storage efficiency — but compliance at the municipal level is voluntary, not mandated. São Paulo's Câmara Municipal has received at least two proposals in the past eighteen months calling for a dedicated city-wide data hygiene ordinance, neither of which has reached a floor vote.
Practical steps are available now, without waiting for legislation. The Instituto de Pesquisas Tecnológicas, headquartered on the Cidade Universitária campus in Butantã, has published open-source deduplication toolkits designed specifically for Brazilian public-sector document management systems. Secretarias can run perceptual hash comparisons across stored image libraries to flag duplicates for human review — a process that IT specialists estimate can clear 30 to 50 percent of redundant files within 90 days on a typical mid-sized municipal archive. The bill for not doing so keeps compounding, one duplicate jpeg at a time.
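For readers wondering what a perceptual hash comparison involves, the following is a minimal sketch of one common variant, the average hash. It is not the Instituto de Pesquisas Tecnológicas toolkit, whose internals are not described here. It assumes each image has already been decoded into a grid of grayscale values (0 to 255) by an imaging library, and the threshold of 5 differing bits is an illustrative choice, not a recommended standard.

```python
from itertools import combinations


def shrink(pixels, size=8):
    """Downscale a grayscale grid to size x size by averaging pixel blocks."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        top = r * height // size
        bottom = max((r + 1) * height // size, top + 1)
        row = []
        for c in range(size):
            left = c * width // size
            right = max((c + 1) * width // size, left + 1)
            block = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """One bit per cell: 1 if the cell is brighter than the image's mean."""
    cells = [value for row in shrink(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def flag_duplicates(images, max_distance=5):
    """Return (name, name, distance) for pairs similar enough for human review."""
    hashes = {name: average_hash(grid) for name, grid in images.items()}
    flagged = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(hashes.items(), 2):
        distance = hamming(hash_a, hash_b)
        if distance <= max_distance:
            flagged.append((name_a, name_b, distance))
    return flagged


if __name__ == "__main__":
    # Synthetic stand-ins for scans: a gradient, a slightly brighter re-scan
    # of the same page, and an unrelated image.
    scan = [[x * 16 for x in range(16)] for _ in range(16)]
    rescan = [[min(255, value + 3) for value in row] for row in scan]
    unrelated = [[y * 16 for _ in range(16)] for y in range(16)]
    images = {"scan.jpg": scan, "rescan.jpg": rescan, "unrelated.jpg": unrelated}
    print(flag_duplicates(images))
```

Comparing every pair is quadratic in the number of files, so a real archive would need to bucket or index the hashes first. The sketch only flags candidates; as the article notes, deletion decisions stay with a human reviewer.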
Published by The Daily São Paulo