The Daily São Paulo

São Paulo news, every day


São Paulo's Duplicate Image Crisis: The Key Decisions That Will Shape the City's Digital Records

Municipal archives and tech platforms face a reckoning over how to handle thousands of redundant image files clogging public databases — and the clock is ticking.

By São Paulo News Desk · Published 4 July 2026, 3:51 pm

Photo: Rafael Rodrigues on Pexels

São Paulo's municipal data infrastructure has a clutter problem. Tens of thousands of duplicate digital images — ranging from zoning maps to public health campaign photographs — have accumulated across the Prefeitura de São Paulo's document management systems, creating storage bottlenecks, slowing inter-agency data transfers, and raising questions about which version of a record is the authoritative one. The city's Secretaria Municipal de Gestão (SMG) is now under pressure to decide, before the end of the third quarter of 2026, how to proceed with a deduplication process that will affect archives stretching back more than a decade.

The issue has landed at a particularly fraught moment. Mayor Ricardo Nunes is in the middle of a broader digital modernisation push ahead of municipal budget negotiations in October, and getting the data house in order is widely seen inside City Hall as a prerequisite for rolling out expanded e-government services. Disorganised image libraries mean that automated systems — including those used by the city's urban planning department along Avenida do Estado — can inadvertently pull the wrong version of a permit map or flood-risk overlay, with real consequences for construction approvals in flood-prone zones like Várzea do Tietê.

What the Process Actually Involves

Duplicate image replacement is not simply a matter of hitting delete. Each flagged file must be assessed: is it a true duplicate, a revised version, or a scan at a different resolution? The SMG's Centro de Tecnologia da Informação e Comunicação (CTIC) has been tasked with building a hash-based comparison pipeline, a method that computes a digital fingerprint from each image file's contents and cross-references it against the full archive, so that byte-identical copies surface as matches. A similar process was completed by the city of Bogotá between 2023 and 2024, cutting its municipal image archive by roughly 34 percent without losing a single legally required document, according to the 2024 annual report published by Bogotá's Secretaría General.
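To make the fingerprinting idea concrete, here is a minimal sketch in Python of how exact duplicates can be grouped by content hash. It is an illustration only, not CTIC's pipeline: the choice of SHA-256 and the names `file_digest` and `find_exact_duplicates` are assumptions made for this example. A byte-level hash like this catches only identical copies; a revised map or a rescan at a different resolution produces a different fingerprint and would still need the case-by-case assessment described above.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large scans do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Group files under `root` by digest; keep groups with 2+ files.

    Hypothetical helper for illustration: it only reports candidates
    and never deletes anything.
    """
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_exact_duplicates(Path("some/archive/folder"))` on a directory returns a mapping from each shared fingerprint to the list of files carrying it. Reporting candidates rather than deleting them leaves the retention decision to a separate, reviewable step.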

São Paulo's archive is considerably larger. The Arquivo Histórico de São Paulo, housed on Rua Voluntários da Pátria in Santana, alone holds digitised collections dating to the 1880s. Meanwhile, the more recent operational image databases — used daily by agencies from the Companhia de Engenharia de Tráfego (CET) to the Secretaria Municipal de Saúde — are stored on cloud infrastructure contracted through São Paulo's ongoing Projeto SP Sem Papel, which launched in 2021 with a mandate to eliminate paper-based workflows across 27 municipal secretariats. It is within those operational databases that the duplication problem is most acute.

The financial stakes are not trivial. Cloud storage costs for municipal government systems in São Paulo have risen steadily alongside global data pricing trends; internal procurement documents reviewed in 2025 suggested the city's annual cloud expenditure had crossed R$180 million across all secretariats, a figure that officials in the SMG have indicated could be trimmed meaningfully through deduplication, though no official reduction target has been published. Storage inefficiency also compounds latency issues on platforms like the SP156 citizen services portal, which logged more than 4.2 million service requests in 2025 alone.

The Decisions That Cannot Wait

Three choices are now on the table. First, the SMG must decide whether deduplication will be handled entirely by internal CTIC staff or whether a private contractor will be brought in, a politically sensitive question given the Lula-aligned federal government's scrutiny of municipal outsourcing contracts. Second, the city must establish a retention protocol. Which duplicates are archived offline and which are permanently deleted matters enormously for legal discovery and for compliance with Brazil's transparency law, the Lei de Acesso à Informação (LAI). Third, there is the question of timing. Running the deduplication pipeline during business hours risks disrupting live services, so a scheduled off-peak window will need sign-off from every affected secretariat.

Civic technology groups including Rede Nossa São Paulo have previously flagged data-quality issues in municipal systems as an obstacle to accountability, though the organisation has not specifically addressed the current image duplication question. What is clear is that the SMG's internal deadline of 30 September 2026 is firm, driven by the October budget cycle. Whatever framework the city adopts in the coming weeks will set the template for how São Paulo manages its digital records for the next decade.


About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
