The Daily São Paulo

São Paulo news, every day

How São Paulo's Digital Archives Ended Up Flooded With Duplicate Images — And What Happened Next

Years of fragmented municipal digitisation projects, competing platforms and budget shortcuts left the city's public image databases clogged with redundant files, and the effort to clean them up is only now gaining traction.

By São Paulo News Desk · Published 4 July 2026, 4:00 pm

Photo: Elliott, L. E. (Lilian Ellwyn), b. 1894 / Public domain (Wikimedia Commons)
São Paulo's municipal governments have spent more than a decade digitising public records, urban planning documents and cultural heritage photographs — and the result, according to archivists and technology specialists familiar with the city's systems, is a sprawling mess of duplicate images that has quietly paralysed search functions across multiple official platforms.

The problem did not appear overnight. It accumulated across successive administrations, each of which launched its own digitisation initiative with its own file-naming conventions, storage vendors and quality controls — or the lack thereof.

A Patchwork of Projects, a Mountain of Redundant Files

The roots of the crisis trace back to at least 2013, when the Prefeitura de São Paulo began an ambitious push to digitise physical records held at the Arquivo Histórico Municipal Washington Luís, a repository on Rua Cantareira in the Centro district that holds colonial-era maps, urban planning blueprints and over a century of photographic documentation. That effort ran in parallel with a separate scanning programme coordinated by the Secretaria Municipal de Cultura, which operated out of offices near Avenida São João, and the two projects never shared a common metadata standard.

By the time later administrations layered in additional platforms, including the GeoSampa geographic information portal and the São Paulo Aberto open-data initiative, every new batch of scanned images was being uploaded alongside files that had already been uploaded, often more than once, under different filenames. A photograph of the Viaduto do Chá, for example, might exist simultaneously in three separate folders, each labelled by a different archivist with a different date format and resolution setting, and with no persistent unique identifier to flag the duplication.

Technology workers contracted by the city to audit the GeoSampa database described encountering image sets where individual aerial photographs of neighbourhoods like Pinheiros and Vila Madalena appeared in four or five separate subdirectories. The audit, conducted in late 2024, identified the structural cause: procurement rules had pushed each secretariat to contract its own cloud-storage provider, and none of those providers' systems synchronised automatically with the others.

Why the Problem Matters Beyond Filing Cabinets

Duplicate images are not merely a librarian's headache. For urban planners working on São Paulo's chronic flooding and drainage infrastructure — a priority for the Ricardo Nunes administration, which has publicly committed to expanding the city's network of piscinões, or retention reservoirs — outdated or mislabelled aerial imagery can mean engineers are referencing the wrong version of a neighbourhood map. Parelheiros and Grajaú, two southern districts repeatedly battered by flooding during the summer rainy season, have both been the subject of conflicting satellite-image datasets held across different city systems.

The legal stakes are real. Brazil's Lei de Acesso à Informação, in force since 2012, requires public bodies to respond to records requests within 20 days, extendable by a further 10 days with justification. Delays caused by staff having to manually triage duplicate files before releasing documents have contributed to municipalities across the state missing that deadline, according to monitoring data published by the Escola de Administração de Empresas de São Paulo da Fundação Getulio Vargas. São Paulo state bodies received more than 890,000 information requests in 2024 alone, a figure released by the state's Ouvidoria Geral.

The federal government's push under Lula's administration to expand the Conecta Gov digital-services framework has added pressure. Federal guidelines issued in early 2025 set interoperability standards that municipal databases must meet to qualify for infrastructure transfer payments — and duplicate, untagged image files are a direct obstacle to compliance.

What comes next is a phased deduplication programme that the Secretaria Municipal de Inovação e Tecnologia began piloting in March 2026 across a subset of the Arquivo Histórico's scanned collections. The approach uses hash-matching algorithms to flag identical binary files before human archivists confirm deletions. Officials familiar with the process say a full rollout across GeoSampa and São Paulo Aberto is targeted for completion before the end of 2027 — giving the city roughly 18 months to bring its image infrastructure in line with federal interoperability standards and, more practically, to ensure that the next time a planner in Grajaú pulls up a flood-plain map, it is the right one.
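For readers curious what hash-matching involves, the idea can be sketched in a few lines of Python. This is an illustration only, not the city's actual tooling: the choice of SHA-256, the function names `file_digest` and `find_duplicate_groups`, and the folder layout are all assumptions made for the example. The script computes a fingerprint for every file under a folder and groups files whose fingerprints match, leaving the decision about what to delete to a human reviewer.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()  # assumed algorithm; the article does not name one
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicate_groups(root: Path) -> list[list[Path]]:
    """Group files under `root` that share a digest, i.e. identical bytes."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only flag groups with more than one member; nothing is deleted here.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A check of this kind catches only byte-for-byte copies, whatever their filenames. Two scans of the same photograph saved at different resolutions produce different fingerprints and would pass through unflagged.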

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
