The Daily São Paulo

São Paulo news, every day

São Paulo's Duplicate Image Problem: The Numbers Reveal a Crisis Hidden in Plain Sight

City agencies, e-commerce platforms and media archives are drowning in redundant visual data — and the cost, measured in server space and wasted labour hours, is climbing fast.

By São Paulo News Desk · Published 4 July 2026, 4:16 pm

Photo: Sérgio Souza on Pexels

São Paulo's digital infrastructure is carrying a weight that most residents never see. Across municipal databases, news archives, retail platforms headquartered in the Vila Olímpia tech corridor, and the vast back-end systems that serve the city's 12.3 million people, duplicate images have become a measurable drag: they inflate storage costs, slow processing, and tie up workers paid to sort through the mess manually.

The timing matters. Brazil's federal government has been pushing a national data-governance agenda through the Estratégia Nacional de Dados, a framework that puts pressure on state and municipal bodies to clean up digital assets before the next round of infrastructure spending in late 2026. For a city that generated an estimated R$800 billion in GDP last year, the largest municipal economy in Latin America, sloppy digital housekeeping is no longer a minor administrative nuisance. It carries a real price tag.

What the Data Actually Shows

Duplicate image files are a specific, measurable problem. A 2025 audit of cloud storage practices across mid-sized Brazilian companies, published by the Fundação Getulio Vargas research centre on Rua Itapeva in Bela Vista, found that redundant files, images chief among them, accounted for between 23 and 31 percent of total stored data in the organisations surveyed. For companies paying market rates on Amazon Web Services or Google Cloud's São Paulo region servers, that redundancy translates directly to inflated monthly bills. Whatever the per-gigabyte price of standard object storage, a redundancy share in that range means organisations are effectively paying for one unnecessary gigabyte for every two to three they actually use.
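The arithmetic behind that ratio can be checked with a short Python sketch. The 23 and 31 percent figures come from the audit cited above; the function names are illustrative, and the storage volume and per-gigabyte price passed to `wasted_monthly_cost` are hypothetical placeholders, not actual cloud rates.

```python
def redundant_per_useful(redundant_share: float) -> float:
    """Gigabytes of redundant data stored for each gigabyte of useful data."""
    if not 0 <= redundant_share < 1:
        raise ValueError("redundant_share must be in [0, 1)")
    return redundant_share / (1 - redundant_share)


def wasted_monthly_cost(total_gb: float, redundant_share: float,
                        price_per_gb: float) -> float:
    """Monthly spend attributable to redundant data (price is an assumption)."""
    return total_gb * redundant_share * price_per_gb


if __name__ == "__main__":
    # Redundancy range reported in the audit: 23 to 31 percent of stored data.
    for share in (0.23, 0.31):
        ratio = redundant_per_useful(share)
        print(f"{share:.0%} redundant: 1 wasted GB per {1 / ratio:.1f} useful GB")

    # Hypothetical example: 50,000 GB stored at an assumed 0.10 per GB per month.
    print(wasted_monthly_cost(50_000, 0.23, 0.10))
```

At 23 percent redundancy the ratio works out to about one wasted gigabyte per 3.3 useful ones; at 31 percent it is about one per 2.2.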

The municipal government is not immune. The Secretaria Municipal de Inovação e Tecnologia, which oversees digital systems for the Prefeitura de São Paulo, manages image assets across dozens of public-facing portals, from the SP156 citizen services app to the Nota Fiscal Paulistana tax rebate programme. Document-management specialists who work with public-sector clients in the Anhangabaú and Sé district civic complex say duplicate photographs and scanned documents routinely bloat procurement and urban-planning databases. Exact figures for the city's internal redundancy rate have not been publicly released.

E-commerce tells a sharper story. The Associação Brasileira de Comércio Eletrônico reported that Brazilian online retail processed more than R$185 billion in transactions in 2024. Product image databases underpin almost every transaction, and platform operators, including several unicorns with offices concentrated along Avenida Brigadeiro Faria Lima in Pinheiros and Itaim Bibi, have privately acknowledged that catalogues built during rapid growth phases can contain the same product photograph uploaded five, ten, or twenty times under different file names. Deduplication projects at two São Paulo-based logistics-tech firms took between four and nine months to complete, according to industry reporting.

The Human Cost Behind Automated Solutions

Software that detects and removes duplicate images has existed for years, but adoption inside São Paulo's public sector and among smaller businesses on Rua 25 de Março — the city's historic wholesale district — remains uneven. Automated perceptual hashing tools can flag near-identical images in milliseconds, yet deploying them across legacy systems often requires integration work that smaller organisations cannot afford upfront.
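For readers curious how perceptual hashing flags near-identical images, here is a minimal Python sketch of one common variant, the average hash. It assumes each image has already been decoded, converted to grayscale and downscaled to a small grid (8x8 here), a step that real tools delegate to an imaging library. The function names and the distance threshold are illustrative, not taken from any particular product.

```python
def average_hash(pixels):
    """Build a bit fingerprint: 1 where a pixel is brighter than the grid mean.

    `pixels` is a 2D list of grayscale values (0-255), assumed to be
    already downscaled, for example to 8x8.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=5):
    """Treat two images as near-duplicates if few bits differ (threshold assumed)."""
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance


# Synthetic 8x8 test images: a gradient, a slightly brighter copy, and a negative.
original = [[r * 32 + c * 4 for c in range(8)] for r in range(8)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

if __name__ == "__main__":
    print(is_near_duplicate(original, brighter))  # re-exported copy of the same picture
    print(is_near_duplicate(original, inverted))  # visually different image
```

Because the fingerprint depends on relative brightness rather than exact bytes, a re-saved or slightly brightened copy lands close to the original, which is what lets this class of tool catch duplicates that a byte-for-byte comparison would miss.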

Labour costs compound the problem. A content moderator or digital archivist in São Paulo earned a median salary of approximately R$3,200 per month in 2025, based on data from the Cadastro Geral de Empregados e Desempregados. Organisations that rely on manual image review rather than automated deduplication are paying human wages for a task that, at scale, software handles faster and more consistently.

For businesses and city agencies looking to act before the federal data-governance deadlines sharpen at the end of the third quarter, specialists point to three concrete steps: auditing storage inventories using open-source hashing libraries, establishing file-naming conventions that prevent duplicate uploads at source, and budgeting a one-time migration project rather than absorbing the ongoing cost of redundancy. The numbers, already visible in monthly cloud invoices across Faria Lima and the civic halls of the Sé, make the case with little further argument.
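The first two steps can be sketched with nothing more than Python's standard library. The example below is an illustration under stated assumptions, not a description of any agency's or vendor's tooling: it audits a folder for byte-identical images using SHA-256 from `hashlib`, and shows one possible naming convention (naming each file after its content hash) under which a repeat upload collides with the existing file instead of creating a copy. The extension list and function names are hypothetical choices.

```python
import hashlib
import os
from collections import defaultdict

# Assumed set of extensions to treat as images; adjust to the archive in question.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_images(root):
    """Return {sha256: [paths]} for every group of byte-identical image files."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                path = os.path.join(dirpath, name)
                by_hash[file_sha256(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}


def content_addressed_name(path):
    """Name a file after its content hash, so identical uploads share one name."""
    extension = os.path.splitext(path)[1].lower()
    return file_sha256(path) + extension


if __name__ == "__main__":
    # Audit the current directory and list each group of identical files.
    for digest, paths in find_duplicate_images(".").items():
        print(digest[:12], paths)
```

An exact-hash audit of this kind only catches identical files uploaded under different names; resized or re-compressed copies need a perceptual comparison instead.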

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
