The Daily São Paulo

São Paulo news, every day


São Paulo's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Costly Story

From city hall's asset databases to the sprawling tech campuses of Vila Olímpia, redundant image files are eating storage budgets and slowing public-sector workflows across the city.

By São Paulo News Desk · Published 4 July 2026, 4:11 pm


Photo: Willian Santos on Pexels

São Paulo's municipal government is sitting on a digital storage problem it can no longer ignore. Across platforms managed by the Secretaria Municipal de Inovação e Tecnologia, internal audits have found that duplicate image files account for a disproportionate share of storage consumption in public-sector databases — a quiet inefficiency that carries a measurable price tag as the city scales up its e-government infrastructure ahead of the 2026 municipal budget cycle closing in December.

The issue matters now because São Paulo is mid-rollout on SP156, the city's centralised digital services portal, and the Prefeitura de São Paulo has been migrating legacy documentation — construction permits, flood-zone maps, public event records — into unified cloud environments. When image duplication rates run high, those migrations carry bloated costs and create indexing errors that slow down the very services residents use to report broken drainage on Avenida do Estado or request zoning certificates in Pinheiros.

The Scale of the Problem in Numbers

Duplicate image files are not a vanity problem. Research published by the International Data Corporation on enterprise and government cloud environments estimated that unstructured data, a category that includes images, grows at roughly 23 percent per year inside large organisations. In practice, storage teams report that duplicates can make up between 20 and 40 percent of total image volume in unmanaged repositories. Apply that range to a city the size of São Paulo, with dozens of secretarias each running semi-independent document systems, and the redundancy compounds fast.

Cloud storage in Brazil, billed in reais and sensitive to the dollar exchange rate, has not gotten cheaper. Major providers offering enterprise tiers in the Brazilian market have listed prices that, depending on contract terms, can run between R$0.08 and R$0.25 per gigabyte per month for active storage. A municipal department storing 10 terabytes of images, not unusual for an urban planning office that photographs every permitted construction site, could at a 20 to 40 percent duplication rate be paying for two to four terabytes of pure duplication every single month.
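The arithmetic behind that estimate can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, a terabyte is taken as 1,000 gigabytes, and the two scenarios simply pair the low and high ends of the duplication and price ranges quoted above, which may not match any actual contract.

```python
def duplicate_cost_brl(total_tb, dup_rate, price_per_gb_month, gb_per_tb=1000):
    """Monthly cost in reais of the duplicated share of an image store.

    total_tb            total image volume in terabytes
    dup_rate            fraction of that volume that is duplicated (0.0 to 1.0)
    price_per_gb_month  storage price in R$ per gigabyte per month
    gb_per_tb           assumed conversion factor (decimal terabytes)
    """
    duplicate_gb = total_tb * gb_per_tb * dup_rate
    return duplicate_gb * price_per_gb_month


# Low scenario: 20% duplication at R$0.08 per GB per month.
low_monthly = duplicate_cost_brl(10, 0.20, 0.08)
# High scenario: 40% duplication at R$0.25 per GB per month.
high_monthly = duplicate_cost_brl(10, 0.40, 0.25)

print(f"Low estimate:  R${low_monthly:,.2f} per month")
print(f"High estimate: R${high_monthly:,.2f} per month")
```

Under those assumptions, the duplicated share of a single 10-terabyte department costs roughly R$160 to R$1,000 per month. The total scales with the number of departments and the volume each one holds.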

The Instituto de Pesquisas Tecnológicas, headquartered on the campus of Cidade Universitária in Butantã on the west side of the city, has flagged image deduplication as a priority area in digital preservation projects it has partnered on with federal agencies. The technology to fix the problem — perceptual hashing algorithms, reverse-image matching, AI-assisted deduplication pipelines — exists and is commercially available. The gap is institutional: procurement cycles are slow, and the technical teams inside public bodies are stretched.
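To give a sense of what perceptual hashing means in practice, here is a minimal sketch of one of the simplest variants, usually called an average hash. It is not the method used by any institution named in this article. To stay self-contained it takes an already-decoded grayscale image as a list of pixel rows, whereas a real pipeline would first decode files with an image library. The function names and the sample images are hypothetical.

```python
def average_hash(pixels, hash_size=8):
    """Return a (hash_size * hash_size)-bit average hash as an integer.

    `pixels` is a list of rows of 0-255 brightness values. The image is
    assumed to be at least hash_size pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    # Shrink to hash_size x hash_size by averaging each block of pixels.
    small = []
    for by in range(hash_size):
        y0, y1 = by * height // hash_size, (by + 1) * height // hash_size
        for bx in range(hash_size):
            x0, x1 = bx * width // hash_size, (bx + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            small.append(sum(block) / len(block))
    # Each bit records whether a block is brighter than the overall mean.
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 test images: a left-to-right gradient, a slightly
# brightened copy of it, and an unrelated top-to-bottom gradient.
original = [[x * 8 for x in range(32)] for _ in range(32)]
brightened = [[min(255, v + 10) for v in row] for row in original]
unrelated = [[y * 8 for _ in range(32)] for y in range(32)]

near = hamming_distance(average_hash(original), average_hash(brightened))
far = hamming_distance(average_hash(original), average_hash(unrelated))
print("original vs brightened copy:", near)
print("original vs unrelated image:", far)
```

A small Hamming distance suggests two files show the same picture even when their bytes differ, for example after re-compression or a brightness tweak. The cut-off that separates "same" from "different" is a tuning decision that depends on the archive.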

Private Sector Pressure and What the City Can Learn

São Paulo's private tech sector, concentrated in the glass-tower corridors of Vila Olímpia and the startup clusters around Rua Funchal in Itaim Bibi, solved this problem years ago at scale. Platforms handling millions of user-uploaded images — e-commerce listings, real estate portals, media archives — run deduplication as a standard pipeline step, not an afterthought. Grupo ZAP, which operates one of Brazil's largest property listing databases from its São Paulo offices, processes image libraries where deduplication directly affects listing quality and server costs simultaneously.

The practical lesson for public administrators is that deduplication is not simply a storage play. When the Empresa de Tecnologia da Informação e Comunicação do Município de São Paulo, known as PRODAM, centralises image assets from multiple secretarias into a single environment, undetected duplicates generate conflicting metadata, broken links in public-facing portals, and audit trails that are impossible to reconcile. The Tribunal de Contas do Município, which reviews exactly these kinds of digital infrastructure expenditures, has the authority to flag unjustified storage costs as fiscal irregularities.

City administrators rolling out the next phase of SP156 integrations before the December budget close should treat duplicate image detection as a pre-migration checklist requirement, not a post-migration cleanup task. Procurement of deduplication tooling, even at modest scale, pays for itself within months at current Brazilian cloud storage rates. It also produces cleaner data for every resident who pulls up a flood-zone record or files a complaint, whether from Avenida Paulista or from the city's Zona Leste.
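As a rough idea of what a pre-migration check could look like, the sketch below groups byte-for-byte identical image files by their SHA-256 digest using only Python's standard library. It is an assumption-laden illustration, not a description of any tool the city uses. The function names and the list of file extensions are hypothetical, and it catches exact copies only. Resized or re-compressed versions would need perceptual matching on top.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image file types; adjust to the archive being migrated.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are never fully loaded in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Map each SHA-256 digest to the image files under `root` that share it.

    Only digests with two or more files (actual duplicates) are returned.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[sha256_of_file(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Run against a staging copy of a department's image folder, the returned groups give reviewers a list of files to reconcile before anything is uploaded, which is cheaper than untangling conflicting metadata afterwards.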


About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
