São Paulo's municipal digital infrastructure holds tens of thousands of photographic records, including urban planning documents, public works files and cultural heritage images, and a growing chorus of archivists, technologists and city officials says a systemic problem with duplicate and mislabelled image files is quietly undermining the reliability of those databases. The issue surfaced publicly in recent months through discussions at the Centro de Documentação e Memória of the Universidade Estadual Paulista and through procurement reviews at the Secretaria Municipal de Gestão, which oversees city-wide data systems.
The timing matters. Mayor Ricardo Nunes' administration is currently midway through a R$180 million digital transformation program that was announced in 2024 and is scheduled to complete its second phase by December 2026. That program is migrating legacy records from dozens of municipal departments onto a unified cloud platform. If duplicates are not identified and resolved before migration, archivists warn, the problem gets embedded into the new system at scale — costing far more to untangle later.
What the Experts Are Saying
Specialists in digital preservation and records management gathered at a workshop held in late June at the Instituto de Estudos Avançados da USP, on the Butantã campus, flagged several specific vulnerabilities. Researchers from the university's library science faculty pointed to the absence of a standardised hash-verification protocol across municipal departments — meaning identical image files are routinely stored under different filenames, with conflicting metadata, across the Secretaria de Cultura e Economia Criativa, the Arquivo Histórico Municipal on Rua Quirino de Andrade, and urban mobility databases. Without automated deduplication tools running at the point of ingestion, the problem compounds with each upload cycle.
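For readers unfamiliar with hash verification, the idea can be sketched in a few lines of Python. This is an illustration only, not a description of any municipal system: the file paths and contents below are hypothetical, and SHA-256 is assumed as the digest because the workshop discussion, as reported, did not name one. Files with identical bytes produce the same digest regardless of filename or metadata, so grouping by digest exposes exact copies.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group filenames whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[sha256_digest(data)].append(name)
    # Keep only digests shared by more than one filename.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical uploads from three departments (names and bytes are invented).
uploads = {
    "cultura/IMG_0001.tif": b"same scan bytes",
    "arquivo_historico/foto_1890_a.tif": b"same scan bytes",
    "mobilidade/corredor_12.jpg": b"different bytes",
}

duplicates = group_exact_duplicates(uploads)
print(duplicates)
```

A check of this kind only catches byte-identical copies. Two separate scans of the same physical photograph would produce different digests and pass through undetected.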
The Arquivo Histórico Municipal, which holds photographic records dating to the late 19th century, began a partial digitisation partnership with the Fundação Seade in 2023. Staff there have described — in published institutional reports, not in interviews for this article — encountering cases where the same physical photograph had been scanned multiple times by different departments, each scan saved under a unique identifier, creating false impressions of a larger, more diverse record set than actually exists. The Fundação Seade's 2025 annual report noted that its shared digitisation projects had flagged duplicate asset ratios in some batches reaching as high as 23 percent of total files submitted.
Technology sector voices are weighing in too. São Paulo's startup ecosystem, centred around the Faria Lima corridor and the innovation district near Avenida Paulista, includes several computer vision companies that have built image-recognition and deduplication tools primarily for e-commerce clients. Representatives from at least two firms — neither of which has a current municipal contract — have presented proposals to the Secretaria Municipal de Inovação e Tecnologia arguing that the same perceptual-hashing technology used to flag counterfeit product photos on retail platforms could be adapted for public archive management at a fraction of what custom government software would cost.
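The firms' products have not been described in technical detail, so the following is a generic sketch of one well-known perceptual-hashing technique, the difference hash, and not a representation of their tools. It assumes each image has already been decoded and shrunk to a 9-by-8 grid of greyscale values (a step that would normally be handled by an imaging library), and the sample grids and the threshold of 10 bits are invented for illustration. Each bit records whether a pixel is brighter than its right-hand neighbour, so two scans of the same photograph with slightly different brightness yield hashes that are identical or differ in only a few bits.

```python
from typing import List

Grid = List[List[int]]  # rows of greyscale values in the range 0-255


def dhash(grid: Grid) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def is_probable_duplicate(a: Grid, b: Grid, threshold: int = 10) -> bool:
    """Treat two images as likely duplicates if few bits differ.

    The threshold is an assumed value, not a recommended setting.
    """
    return hamming_distance(dhash(a), dhash(b)) <= threshold


# Hypothetical 8-row, 9-column grids standing in for downscaled scans.
scan_a = [[(x * 37 + y * 11) % 256 for x in range(9)] for y in range(8)]
# A second scan of the same photo, slightly brighter.
scan_b = [[min(255, v + 3) for v in row] for row in scan_a]
# The photographic negative, used here as a clearly different image.
negative = [[255 - v for v in row] for row in scan_a]

print(is_probable_duplicate(scan_a, scan_b))
print(is_probable_duplicate(scan_a, negative))
```

Unlike a cryptographic digest, this comparison tolerates small differences between files, which is why it suits repeated scans of the same photograph. The trade-off is that the threshold has to be tuned, and borderline matches would still need review by an archivist.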
The Practical Stakes for the City
This is not only a technical inconvenience. Urban planning decisions, historic preservation rulings and public procurement disputes all rely on photographic documentation stored in municipal databases. A misidentified or duplicated image that is incorrectly indexed to a specific property on, say, Rua Augusta or in the Bela Vista neighbourhood can generate cascading errors in zoning records. In at least one documented case reviewed by the Tribunal de Justiça de São Paulo in 2025, a legal challenge to a demolition order or heritage listing hinged on whether the photographic evidence submitted matched the correct parcel of land.
Officials at the Secretaria Municipal de Gestão have not publicly confirmed a timetable for implementing deduplication standards, but the digital transformation program's next milestone review is scheduled for September 2026. Archivists and technology specialists say that review is the most realistic intervention point — a moment when procurement decisions are still open and technical specifications can be amended before the migration's final phase locks in the current data structure. Municipal residents who rely on public records for property research or heritage queries can currently submit correction requests through the Arquivo Histórico Municipal's online portal, which the city relaunched in March 2026 with an updated interface and a stated 15-business-day response commitment.