The Daily São Paulo

São Paulo news, every day

São Paulo's Digital Archives Face Cleanup Crisis as Duplicate Image Problem Hits Municipal Systems This Week

City departments and local tech firms are scrambling to address a growing backlog of redundant image files clogging public databases and slowing down urban services.

By São Paulo News Desk · Published 4 July 2026, 4:28 pm

Photo: Th2city Santana on Pexels

São Paulo's municipal technology infrastructure hit a friction point this week when multiple city departments flagged an accelerating problem: duplicate images embedded in public-facing digital systems are degrading performance across everything from the Prefeitura's online permit portals to neighbourhood-level urban planning dashboards. The issue, which has been building for months, surfaced publicly on July 1 when the Secretaria Municipal de Inovação e Tecnologia acknowledged the scale of the backlog in an internal circular reviewed by The Daily São Paulo.

The timing is awkward. The Nunes administration has spent the past two years positioning São Paulo as a regional leader in smart-city infrastructure, channelling investment into integrated data platforms that are supposed to speed up licensing, flood monitoring, and public health reporting. Redundant image files — photographs uploaded multiple times, scanned documents saved in duplicate, satellite imagery layers stacked without deduplication — are quietly eating storage capacity and slowing query times on systems that city staff use daily. In a city where flooding season demands real-time drainage data, sluggish databases are more than a bureaucratic nuisance.

Where the Problem Is Showing Up

The heaviest impact is concentrated in two operational hubs. The Centro de Operações São Paulo, headquartered near the Marginal Tietê corridor in the Barra Funda district, relies on continuous image feeds from street cameras and drone overflights to coordinate emergency responses. Staff there have been flagging since at least April that search functions within the image archive were returning duplicate results, forcing manual cross-checks that add time to incident response. Separately, the Empresa de Tecnologia da Informação e Comunicação do Município de São Paulo — known as PRODAM — confirmed this week that its cloud storage costs for the first half of 2026 came in above the projected budget, partly because automated backup routines were saving image copies without checking for existing identical files. PRODAM declined to specify the overage figure.

The problem is not unique to government. Along Rua Frei Caneca in Consolação, a cluster of adtech and legaltech startups that feed data to public procurement platforms have been dealing with the same issue. Firms using shared image repositories to process document submissions — property deeds, construction plans, environmental licenses — have reported that their deduplication pipelines, many built on open-source tools, were not calibrated for the volume surge that followed an expansion of digital-first city services in early 2025. One São Paulo-based software firm, Idwall, which works in document verification, had previously noted publicly that image data volumes in Brazilian compliance workflows roughly doubled between 2023 and 2025, a trajectory that outpaced most infrastructure planning assumptions.

What Comes Next for City Systems

PRODAM has indicated it will roll out a forced deduplication pass across its primary storage environment during the week of July 13, a maintenance window timed to avoid overlap with the Carnaval planning cycle, which typically begins generating large imagery loads in August. The process will first use hash-matching algorithms to identify bit-for-bit identical files, then move to more computationally intensive perceptual hashing to catch near-duplicate images, such as re-saved copies of a photograph that differ only in compression artefacts or shots of the same site taken seconds apart.
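
For readers curious how such a two-stage pass can work, the following is a minimal sketch and not PRODAM's actual implementation, which has not been published. It assumes SHA-256 for the exact-match stage and a simple "average hash" for the perceptual stage; the function names, the 8x8 grayscale grid input, and the distance threshold of 5 bits are all illustrative assumptions.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files: dict[str, bytes]) -> list[list[str]]:
    """Stage 1: group file names whose contents are bit-for-bit identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def average_hash(pixels: list[list[int]]) -> int:
    """Stage 2: a simple perceptual hash.

    Expects a small grayscale grid (assumed already decoded and downscaled,
    e.g. to 8x8). Each pixel contributes one bit: 1 if brighter than the mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(p1: list[list[int]], p2: list[list[int]],
                      max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ by only a
    few bits. The threshold is an assumed value, not a recommended setting."""
    return hamming_distance(average_hash(p1), average_hash(p2)) <= max_distance
```

The exact-match stage is cheap because it only reads bytes, which is why it typically runs first. The perceptual stage requires decoding and resizing every image, and its threshold has to be tuned against real data to avoid merging images that are merely similar.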

For residents and businesses interacting with city portals, the practical effect should be faster load times on permit-status pages and the interactive flood-risk maps available through the GeoSampa platform, which covers drainage basin data across all 96 subprefeituras. GeoSampa's image layers, some of which date to aerial surveys conducted before 2018, have accumulated particularly dense duplication because successive survey cycles were uploaded without archiving or retiring older versions.

Experts in urban data management have suggested that cities at São Paulo's scale — managing a metropolitan population of roughly 22 million and one of Latin America's densest concentrations of connected public infrastructure — need automated deduplication baked into data ingestion pipelines rather than treated as a periodic cleanup task. The city has not yet announced a permanent policy change along those lines, but the Secretaria de Inovação is expected to present a revised data governance framework to the Câmara Municipal before the end of the third quarter.
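
As an illustration of what deduplication at ingestion could look like, here is a minimal sketch under stated assumptions: a hypothetical in-memory store, keyed by SHA-256 content digest, that checks for an existing identical file before writing. The class and method names are invented for this example and do not describe any system the city operates.

```python
import hashlib


class DedupingImageStore:
    """Hypothetical content-addressed store: identical bytes are kept once."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}  # digest -> content, stored once
        self._names: dict[str, str] = {}    # logical file name -> digest

    def ingest(self, name: str, data: bytes) -> bool:
        """Register `name`; return True only if new bytes had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        # The name always resolves, even when the content already existed.
        self._names[name] = digest
        return is_new

    def read(self, name: str) -> bytes:
        """Return the content registered under `name`."""
        return self._blobs[self._names[name]]

    def stored_bytes(self) -> int:
        """Total bytes physically held, counting each unique file once."""
        return sum(len(blob) for blob in self._blobs.values())
```

Under this design a repeated upload or a backup routine costs one hash computation and a lookup rather than a second stored copy, so the periodic cleanup pass has far less to find.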

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
