The Daily São Paulo

São Paulo news, every day

São Paulo's Digital Archive Crisis: The Key Decisions Ahead on Duplicate Image Replacement

City agencies and tech firms are being forced to choose between costly manual audits and automated AI tools as duplicated visual records clog public databases and slow urban services.

By São Paulo News Desk · Published 4 July 2026, 4:45 pm

Photo: Toni Ferreira on Pexels

São Paulo's municipal government is sitting on a problem it can no longer defer. Thousands of duplicate images — aerial photos, infrastructure inspection records, flood-zone surveys — have accumulated across at least a dozen city databases, slowing the delivery of urban services and creating legal exposure around which version of a document is authoritative. The question now is who pays to fix it, and which technology gets the contract.

The issue surfaced publicly in late June when the Secretaria Municipal de Urbanismo e Licenciamento flagged delays in building permit approvals in Tatuapé and Mooca, two eastern districts where digital file redundancy had caused processing backlogs stretching beyond 40 business days. Engineers cross-referencing aerial imagery from the Geosampa platform — the city's official geographic information portal — found multiple versions of the same cadastral photographs indexed under different file names, making it impossible to confirm which record was current.

Why the Timing Is Forcing a Decision

This is not a new technical problem, but three converging pressures are making inaction untenable in 2026. First, the city's ongoing drainage infrastructure programme, a direct response to the catastrophic flooding that has repeatedly struck Jardim Pantanal and the Várzea do Rio Tietê corridor since 2023, depends on accurate, deduplicated aerial mapping to site new retention basins. Contaminated image databases produce faulty modelling outputs. Second, Mayor Ricardo Nunes' administration faces a procurement deadline: the current data management contract with a consortium managed through the Empresa de Tecnologia da Informação e Comunicação do Município de São Paulo, known as PRODAM, expires in the fourth quarter of 2026, triggering a mandatory rebid. Third, the Lula government's push to digitise municipal services under the Programa Cidades Digitais framework means federal co-financing is available, but only if São Paulo can demonstrate a clean, non-redundant data architecture by the end of the fiscal year.

None of those deadlines is comfortable to miss. Federal co-financing under Cidades Digitais has reportedly supported data infrastructure upgrades in cities including Fortaleza and Recife, but municipal officials in São Paulo have not yet publicly confirmed whether a formal application is under way. The PRODAM rebid alone is expected to cover data governance services valued in the hundreds of millions of reais, based on the scale of previous contracts published on the city's transparency portal.

The Technical Fork in the Road

Two broad approaches are being evaluated. The first is a manual audit-and-tag methodology, in which human reviewers classify image files by capture date, sensor source, and geographic bounding box before flagging duplicates for deletion or archiving. It is reliable but slow: pilot tests conducted inside the Cohab-SP housing agency's image library — which holds land-use documentation for social housing zones in Heliópolis and Cidade Tiradentes — suggested a throughput of roughly 3,000 files per analyst per month, far below the scale needed.

The second option is automated deduplication using perceptual hashing algorithms combined with machine-learning classifiers trained on geospatial metadata. Several São Paulo-based tech firms operating out of the Cubo Itaú innovation hub on Avenida Brigadeiro Faria Lima have pitched variants of this approach to PRODAM over the past 18 months. The advantage is speed; the risk is false positives, where near-identical images taken days apart — showing, for example, a flooded street before and after water receded — get incorrectly merged, erasing an evidential record that regulators or courts may later need.
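For readers curious how the automated approach works in principle, here is a minimal Python sketch of perceptual hashing with a capture-date guard. It is an illustration only, not the method any bidder or PRODAM has confirmed using. It assumes each image has already been downscaled to a small grayscale grid; the names `average_hash`, `ImageRecord` and `is_probable_duplicate`, and both thresholds, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


def average_hash(pixels: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


@dataclass
class ImageRecord:
    name: str
    captured: date
    pixels: list[list[int]]  # assumed: small grayscale grid, values 0-255


def is_probable_duplicate(a: ImageRecord, b: ImageRecord,
                          max_bits: int = 2, max_days_apart: int = 0) -> bool:
    """Treat two records as duplicates only if they look alike AND share a capture date.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    looks_alike = hamming_distance(average_hash(a.pixels),
                                   average_hash(b.pixels)) <= max_bits
    same_capture = abs((a.captured - b.captured).days) <= max_days_apart
    return looks_alike and same_capture


grid = [[10, 200, 10, 200],
        [200, 10, 200, 10],
        [10, 200, 10, 200],
        [200, 10, 200, 10]]

original = ImageRecord("survey_a.tif", date(2026, 6, 1), grid)
renamed_copy = ImageRecord("survey_a_copy.tif", date(2026, 6, 1), grid)
later_capture = ImageRecord("survey_b.tif", date(2026, 6, 4), grid)

print(is_probable_duplicate(original, renamed_copy))   # same pixels, same day
print(is_probable_duplicate(original, later_capture))  # same pixels, three days apart
```

The date check is what keeps the before-and-after flood photographs described above from being merged: visual similarity alone is not treated as proof that two files are the same record.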

Civil engineering researchers at the Escola Politécnica da USP, whose campus sits on Rua do Lago in Butantã, have been studying exactly this false-positive problem in satellite image archives, though no published findings specific to the city's municipal databases are yet available.

The practical decisions landing on desks at the Secretaria de Inovação e Tecnologia over the coming 90 days include: whether to split the deduplication work between automated pre-screening and human verification; which datasets — drainage maps, building permits, or social housing records — get prioritised first; and whether the PRODAM rebid should include deduplication as a mandatory deliverable or treat it as an optional scope extension. Each choice carries a price tag and a political constituency. Getting it wrong means slower permits in Mooca, murkier flood data in Jardim Pantanal, and federal money left on the table.
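The first of those choices, splitting work between automated pre-screening and human verification, can be pictured as a simple routing rule. The Python sketch below is a hypothetical illustration, not a design the city has published. It assumes each candidate pair of images arrives with a perceptual-hash distance (in bits) and a gap between capture dates (in days); the `Route` names and both thresholds are invented for the example.

```python
from enum import Enum


class Route(Enum):
    AUTO_FLAG = "auto_flag"        # flagged as a duplicate without review
    HUMAN_REVIEW = "human_review"  # sent to an analyst to decide
    KEEP_BOTH = "keep_both"        # treated as distinct records


def route_pair(hash_distance: int, days_apart: int,
               auto_max_bits: int = 0, review_max_bits: int = 6) -> Route:
    """Route a candidate image pair; thresholds are illustrative assumptions."""
    if hash_distance > review_max_bits:
        # Too visually different to be worth an analyst's time.
        return Route.KEEP_BOTH
    if hash_distance <= auto_max_bits and days_apart == 0:
        # Visually identical and captured on the same day.
        return Route.AUTO_FLAG
    # Similar but not certain, or captured on different days.
    return Route.HUMAN_REVIEW


print(route_pair(hash_distance=0, days_apart=0))
print(route_pair(hash_distance=0, days_apart=3))
print(route_pair(hash_distance=20, days_apart=0))
```

Under a rule like this, only the ambiguous middle band reaches human reviewers. Where the thresholds sit determines how much of the workload stays manual and how much risk of a wrongful merge is accepted.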

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
