São Paulo's municipal administration confirmed this week that a citywide audit of its digital document repositories has identified tens of thousands of duplicate scanned images clogging public records systems managed by the Secretaria Municipal de Gestão. The redundant files — many of them repeated ID photos, property registration scans, and infrastructure inspection images — have slowed database response times and complicated public access requests filed under Brazil's Lei de Acesso à Informação, the federal transparency law enacted in 2011.
The timing matters. Brazil's federal government under President Lula has been pushing a broader digitisation agenda through the Ministério da Gestão e da Inovação em Serviços Públicos, and São Paulo, as the country's largest urban economy, is expected to model best practice for Brazil's other 26 capital cities. Mayor Ricardo Nunes's administration flagged the duplicate-image problem internally in early 2026 after a migration project at the Central de Atendimento ao Cidadão, the main public services hub on Rua Líbero Badaró in the historic Centro district, surfaced file duplication rates that technicians described in internal documentation as unexpectedly high.
What the Duplication Problem Actually Looks Like on the Ground
The issue is not trivial. Municipal records systems in São Paulo span decades of accumulated paper documents that were digitised in batches, often by different contractors using inconsistent naming conventions. When the Arquivo Público do Município, located in the Bela Vista neighbourhood near Avenida Paulista, began consolidating legacy datasets in the first quarter of 2026, staff encountered image files that had been scanned multiple times across different departmental uploads. A single property deed photograph, for example, might exist in four or five nearly identical versions under different file names, each occupying server space and each appearing as a separate result in public search queries.
Residents bear the practical cost. A search request for building permits in the Pinheiros district, submitted through the city's online portal, can return duplicate results that force applicants to manually verify which version is authoritative, adding days to processes that are supposed to take hours.
São Paulo is not alone. Mexico City's Agencia Digital de Innovación Pública reported in late 2024 that a parallel digitisation push for its land registry had produced duplication rates of roughly 18 percent across certain document categories, according to figures the agency published in its annual transparency report. Bogotá's Secretaría General acknowledged similar challenges when it migrated cadastral records in 2023. Mumbai's Brihanmumbai Municipal Corporation ran a deduplication initiative across its property tax image database in 2022, contracting a local technology firm to apply perceptual hashing, a technique that derives a compact fingerprint from an image's visual content so that near-identical scans can be matched even when file names, formats, or compression settings differ.
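For readers curious how perceptual hashing works in principle, the following is a minimal sketch of one common variant, the average hash. It is an illustration only, not the tool used by any of the cities named here. It assumes each scan has already been reduced to a small grayscale grid (real tools typically downscale to something like 8 by 8 pixels first), and the function names, the sample grids, and the distance threshold are all hypothetical.

```python
def average_hash(pixels):
    """Fingerprint a small grayscale grid (rows of 0-255 values).

    Each pixel contributes one bit: 1 if brighter than the grid's mean,
    0 otherwise. Assumes the image was already downscaled.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_probable_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two grids as duplicates if their fingerprints nearly match.

    max_distance is an illustrative threshold, not a recommended value.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical 4x4 grids standing in for downscaled scans.
scan_a = [[10, 10, 200, 200],
          [10, 10, 200, 200],
          [200, 200, 10, 10],
          [200, 200, 10, 10]]
# Same page rescanned slightly brighter: every pixel shifted by 5.
scan_b = [[value + 5 for value in row] for row in scan_a]
# A visually different page.
scan_c = [[200, 10, 200, 10] for _ in range(4)]

print(is_probable_duplicate(scan_a, scan_b))
print(is_probable_duplicate(scan_a, scan_c))
```

Because the fingerprint depends on relative brightness rather than on the file's bytes or name, the brighter rescan produces the same bit pattern as the original, while the unrelated page does not.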
The Technical Fix and What São Paulo Is Deploying
São Paulo's Secretaria Municipal de Gestão is understood to be piloting a perceptual hashing tool developed in partnership with startups from the Cubo Itaú innovation hub in Itaim Bibi, according to procurement notices published on the city's Diário Oficial. The pilot, which began in May 2026, is being tested on a subset of approximately 200,000 images drawn from construction and zoning records for the Zona Leste administrative region before any broader rollout.
The cost of data storage is one driver of urgency. Cloud infrastructure contracts for municipal systems are costly at São Paulo's scale: the city's annual technology spending has historically run into the hundreds of millions of reais, with storage costs forming a meaningful share. Eliminating verified duplicates shrinks that footprint without destroying original records.
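The arithmetic behind that saving can be sketched in a few lines. The example below is a rough illustration under stated assumptions, not the city's procedure: it assumes each file already carries a fingerprint, keeps the first file seen in each group as the retained copy, and only lists the redundant copies rather than deleting anything. The file names, fingerprints, and sizes are invented for the example.

```python
from collections import defaultdict


def plan_deduplication(records):
    """Group files by fingerprint and report what could be reclaimed.

    records: iterable of (file_name, fingerprint, size_bytes) tuples.
    Returns (kept_names, redundant_names, reclaimable_bytes).
    Nothing is deleted; the result is only a plan for human review.
    """
    groups = defaultdict(list)
    for name, fingerprint, size_bytes in records:
        groups[fingerprint].append((name, size_bytes))

    kept, redundant, reclaimable = [], [], 0
    for copies in groups.values():
        # Assumption: the first copy encountered is treated as the original.
        kept.append(copies[0][0])
        for name, size_bytes in copies[1:]:
            redundant.append(name)
            reclaimable += size_bytes
    return kept, redundant, reclaimable


# Hypothetical records: three uploads of one deed and one unrelated permit.
records = [
    ("deed_0001.tif", "a1", 4_000_000),
    ("escritura_1.tif", "a1", 4_100_000),
    ("permit_77.tif", "b2", 2_500_000),
    ("deed_copy.tif", "a1", 3_900_000),
]

kept, redundant, reclaimable = plan_deduplication(records)
total = sum(size for _, _, size in records)
print(kept)
print(redundant)
print(f"{reclaimable / total:.0%} of storage reclaimable")
```

Separating the plan from any deletion step reflects the point made above: redundant copies are identified and counted, while the retained original stays untouched until someone verifies the grouping.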
Cities that have already completed similar exercises offer a useful reference. Bogotá reported a storage reduction of around 22 percent after its 2023 deduplication project, according to figures in that city's published audit. Mexico City's 2024 report cited improved search accuracy as the primary user-facing benefit rather than cost savings alone.
For São Paulo residents, the practical advice is straightforward: if you have a pending records request through the city portal at sp156.prefeitura.sp.gov.br, expect possible delays through August 2026 as the Zona Leste pilot runs. The Arquivo Público do Município has posted guidance on its website advising applicants to reference the specific protocolo number assigned at submission to avoid being caught in duplicate-result confusion. The administration has indicated a city-wide rollout decision will follow an evaluation of the pilot results, expected before the end of the third quarter.