The Daily São Paulo

São Paulo news, every day

São Paulo's Digital Archives Face a Reckoning Over Duplicate Images: What Happens Next and the Key Decisions Ahead

City agencies and tech platforms are being forced to choose between costly manual review and automated AI tools as duplicated visual records pile up across public databases.

By São Paulo News Desk · Published 4 July 2026, 4:51 pm

Photo: Gabriel Schincariol Cavalcante on Pexels

São Paulo's municipal government is sitting on a growing problem it can no longer defer. Thousands of duplicate images — photographs, scanned documents, satellite snapshots, urban planning visuals — have accumulated across at least four major city data systems, creating redundancy that wastes storage, skews public data portals and complicates infrastructure projects from Avenida Paulista to the flood-prone banks of the Rio Pinheiros. The question now is who decides what gets deleted, what gets kept, and who pays for the cleanup.

The issue has been building for years but arrived at a pressure point in 2026 as Ricardo Nunes' city administration pushed forward its São Paulo Inteligente digitisation agenda, which aims to centralise urban data across departments by the end of this fiscal year. When engineers began migrating records from the Secretaria Municipal de Urbanismo e Licenciamento into the GeoSampa platform — the city's open geospatial data portal — the volume of duplicated aerial photography alone ran into the tens of thousands of files, according to technical documentation reviewed as part of public procurement disclosures.

Why the Backlog Got This Bad

The duplication problem is partly structural. São Paulo's 32 subprefeituras each generated their own image libraries over the past decade, often photographing the same street interventions, flood events and zoning inspections with no central coordination. The Subprefeitura da Sé, covering the historic centre, and the Subprefeitura de Pinheiros, which oversees some of the city's densest infrastructure corridors, both ended up with overlapping visual records of the same drainage work completed along Rua Estados Unidos in 2023. Nobody flagged the duplication at intake.

Federal data governance frameworks under the Lula administration's Estratégia de Governo Digital, which formally covers municipal partners through joint technical agreements, require that public bodies maintain accurate, non-redundant digital records. São Paulo signed onto a cooperative framework with the federal Ministério da Gestão e da Inovação em Serviços Públicos in late 2024. That agreement now creates legal exposure: duplicated records that distort official counts of infrastructure interventions or urban permits could constitute administrative irregularities under the Lei de Acesso à Informação.

The São Paulo tech sector is watching closely. Startups clustered around the Cubo Itaú hub in Faria Lima and the USP spin-off ecosystem in Cidade Universitária have been quietly pitching AI-based image deduplication tools to city procurement offices since early 2025. At least two proposals from local firms were submitted to the Secretaria de Inovação e Tecnologia before 30 June 2026, the deadline set internally for the GeoSampa migration phase. Neither contract had been publicly awarded at the time of publication.

The Decisions That Can't Wait

Three choices will define how this unfolds over the next six months. First, the city must decide whether to run automated deduplication — faster and cheaper but prone to false positives that could delete legally significant records — or commission a manual audit, which one internal estimate placed at roughly R$2.4 million for a full review of the urbanismo database alone. Second, city legal teams must determine whether images flagged as duplicates but containing different metadata — different timestamps on the same scene, for instance — must be preserved under records retention law. Third, and most politically sensitive, is the question of public access: if GeoSampa's image layer is temporarily taken offline for cleaning, civil society organisations like LabCidade at FAU-USP that rely on the portal for urban research will face a gap in open data availability.
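To make the first two choices concrete, the sketch below shows one way an automated pass could flag duplicates without deleting anything. It is an illustration only, not the city's or any vendor's method: the `ImageRecord` shape, the field names and the sample records are all hypothetical. It groups files by an exact SHA-256 content hash, then sends any group whose timestamps differ to a legal-review queue rather than marking it redundant. Exact hashing only catches byte-identical files; near-duplicates of the same scene would need a different technique, which is not shown here.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical record shape; field names are illustrative only.
    record_id: str
    pixels: bytes      # raw image content
    captured_at: str   # timestamp metadata stored alongside the image


def triage_duplicates(records):
    """Group records by content hash and sort the groups into two queues.

    Nothing is deleted: the function only returns lists of record IDs.
    """
    groups = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec.pixels).hexdigest()
        groups[digest].append(rec)

    redundant, legal_review = [], []
    for group in groups.values():
        if len(group) < 2:
            continue  # unique image, nothing to flag
        ids = sorted(r.record_id for r in group)
        timestamps = {r.captured_at for r in group}
        if len(timestamps) == 1:
            redundant.append(ids)      # same content, same metadata
        else:
            legal_review.append(ids)   # same content, differing metadata
    return redundant, legal_review


# Invented sample data for illustration.
records = [
    ImageRecord("se-001", b"drainage-photo", "2023-05-10T09:00"),
    ImageRecord("pin-114", b"drainage-photo", "2023-05-10T09:00"),
    ImageRecord("se-002", b"flood-photo", "2023-02-01T14:30"),
    ImageRecord("pin-207", b"flood-photo", "2023-02-03T08:15"),
    ImageRecord("se-003", b"zoning-photo", "2023-07-19T11:45"),
]
redundant, legal_review = triage_duplicates(records)
print(redundant)
print(legal_review)
```

Keeping the output as two review queues, rather than a delete list, reflects the false-positive risk described above: a human or legal team still makes the final call on every flagged group.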

The Nunes administration has not announced a final methodology. Budget discussions for the second half of 2026 are ongoing, and any contract above R$1.5 million requires a full tender process under Lei 14.133/2021, the federal procurement law that replaced the old Lei das Licitações. That process alone takes a minimum of 40 working days once published in the Diário Oficial.

Civil society groups and city councillors on the Câmara Municipal's Committee on Technology and Innovation are expected to request a public hearing on the matter before August. The window for course correction is narrow. Get the methodology wrong and São Paulo risks either losing irreplaceable urban records or locking in a bloated, legally vulnerable archive for another decade.

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.