São Paulo's city hall is sitting on hundreds of thousands of duplicate images. That is the uncomfortable administrative reality behind a quiet but consequential reform effort now under way inside the Secretaria Municipal de Gestão, the body responsible for the city's internal digital infrastructure. The problem did not appear overnight — it is the accumulated result of at least three separate digitization waves stretching back to the early 2000s, each conducted by different contractors under different technical standards, and none of them speaking cleanly to the others.
The stakes are higher than they might sound. When government databases carry multiple copies of the same document scan, photograph, or urban planning image, retrieval systems slow down and storage costs balloon. Most critically, civil servants pulling records for licensing decisions, zoning reviews, or public works bids can end up working from the wrong version. In a city where the Prefeitura de São Paulo processes tens of thousands of administrative acts per year across 32 subprefeituras and 96 districts, version confusion is not a theoretical risk.
Three Digitization Waves, One Tangled Legacy
The first large-scale push came in the early 2000s, when the city contracted out the scanning of paper planning and property records held at the Arquivo Público Municipal on Rua Riachuelo, in the Liberdade neighbourhood. The second wave followed the 2013 municipal open-data decree, which pushed departments to upload documents to the Dados Abertos SP portal without a unified naming convention. The third and most recent surge happened between 2020 and 2022, during the pandemic, when remote-work pressures accelerated ad-hoc digitization across departments that had never before shared a common file management protocol.
Each wave deposited images into separate repositories. The Empresa de Tecnologia da Informação e Comunicação do Município de São Paulo — known as Prodam — has been responsible for maintaining the city's core IT infrastructure since 1971, but the company's mandate did not historically extend to enforcing image deduplication standards across every secretariat. Departments operated semi-independently, and a photograph of a flooded street in Capão Redondo might exist in four different folders under four different filenames, uploaded by four different civil servants across four different years.
A Prodam internal review completed in late 2024 — the existence of which has been reported by tech-sector outlets covering São Paulo's municipal IT contracts — identified the duplication issue as a top-tier data quality problem, though the precise volume of redundant files has not been made public. What is known is that the city's central document management system, the Sistema Eletrônico de Informações, or SEI, which São Paulo adopted from the federal government framework in 2017, does not natively prevent duplicate uploads. That structural gap was never patched.
What a Fix Actually Looks Like
The solution being piloted inside the Secretaria Municipal de Gestão combines two measures. The first is perceptual hashing, a technique that generates a compact digital fingerprint from an image's visual content rather than its filename or raw bytes, so that copies which are identical or near-identical can be flagged even after being renamed, resized or re-saved. The second is metadata standardisation across departments. The pilot is currently running on the image libraries held by the Secretaria Municipal de Urbanismo e Licenciamento, which oversees building permits and land-use records for the entire 1,521-square-kilometre municipality.
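For readers curious about what a perceptual hash does in practice, the Python sketch below shows one common variant, the "average hash". It is an illustration only: the city has not published which algorithm the pilot uses, and the function names, the 8x8 grid, the distance threshold of 5 and the sample filenames are all assumptions made for this example. Images are represented as plain grids of grayscale values so the code needs no imaging library.

```python
from itertools import combinations


def shrink(pixels, size=8):
    """Downscale a grayscale image (rows of 0-255 values) to size x size
    by averaging the pixels that fall into each grid cell."""
    h, w = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)  # each cell covers at least one row
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint: one bit per cell,
    set when the cell is brighter than the image's mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | int(v > mean)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def find_near_duplicates(images, max_distance=5):
    """Return filename pairs whose fingerprints differ in at most
    max_distance bits (an illustrative threshold, not a standard)."""
    hashes = {name: average_hash(px) for name, px in images.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(hashes), 2)
        if hamming(hashes[a], hashes[b]) <= max_distance
    ]


# Hypothetical example: the same scan saved twice, once slightly brighter,
# plus one unrelated image.
scan = [[x * 16 for x in range(16)] for _ in range(16)]
brighter_copy = [[min(255, v + 3) for v in row] for row in scan]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]

library = {
    "rua_alagada_2019.png": scan,
    "IMG_0042_copia.png": brighter_copy,
    "planta_lote_17.png": unrelated,
}
pairs = find_near_duplicates(library)
```

Because the fingerprint depends on relative brightness rather than exact pixel values, the brighter copy lands on the same hash as the original while the unrelated image does not. Comparing every pair, as this sketch does, would be too slow for an archive of hundreds of thousands of files; a production system would need an index over the hashes, which is outside the scope of this example.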
Getting every secretariat aligned will take time. São Paulo has 27 municipal secretariats, each with its own IT staff, budget cycles, and institutional culture. The Secretaria Municipal de Inovação e Tecnologia, created in 2021 under the Nunes administration, has been tasked with coordinating the broader data-governance reform, but coordination across bureaucratic silos in Latin America's largest city is a project measured in years, not quarters.
For residents, the practical payoff would be faster responses to public records requests filed under the Lei de Acesso à Informação, Brazil's freedom-of-information law, which has been in force since 2012. Duplicate and misfiled images are among the most commonly cited reasons that document searches inside municipal systems exceed the law's 20-day response deadline, which can be extended by a further 10 days with justification. Cleaning the archive does not just save server space; it makes the city's accountability machinery work faster. Officials at the Secretaria Municipal de Gestão are expected to present a phased rollout plan to the Câmara Municipal de São Paulo before the end of the third quarter of 2026.