São Paulo's municipal technology secretariat is pushing a city-wide effort to identify and remove duplicate images embedded across dozens of public databases, from building permit files maintained by the Secretaria Municipal de Urbanismo e Licenciamento to the street-level photo archives used by the city's Geosampa mapping platform. The cleanup, which began in earnest during the first quarter of 2026, targets an estimated backlog of redundant visual records accumulated over nearly a decade of uncoordinated digital uploads across more than 40 municipal departments.
The timing is not accidental. Brazil's Lei Geral de Proteção de Dados — the country's primary data privacy framework — is now firmly in enforcement mode, and federal auditors have been scrutinizing how municipal governments store personal and civic data. Bloated image archives create compliance headaches. Redundant files containing photographs of residents, property facades, or infrastructure can constitute unnecessary personal data retention, exposing city halls to regulatory risk. For São Paulo, Latin America's largest urban economy, the reputational and operational stakes are unusually high.
What São Paulo Is Actually Doing on the Ground
The practical work is concentrated in two places. The Centro de Operações São Paulo — the city's main urban monitoring hub, housed near Paulista Avenue — is coordinating deduplication scripts across real-time camera feeds and incident-photo logs. Meanwhile, the IPT, the Instituto de Pesquisas Tecnológicas do Estado de São Paulo, based in Cidade Universitária in the western zone, has been contracted to validate the algorithmic tools being used to flag duplicate files before deletion.
The city is using a hash-matching approach, assigning each image a unique digital fingerprint and comparing it against the full archive. Files that share an identical fingerprint are flagged as exact duplicates; those with near-identical fingerprints — the same photo saved at two different resolutions, for instance — are queued for human review before any deletion. According to the program's internal documentation, shared with the city council's technology committee in May 2026, the initial sweep identified duplicates representing roughly 18 percent of total image storage across three pilot departments, freeing an estimated 4.2 terabytes of server capacity in the first two months alone.
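The two-tier matching described above can be sketched roughly as follows. The program's actual algorithms have not been published, so everything here is an assumption made for illustration: SHA-256 stands in for the exact fingerprint, a simple "average hash" stands in for the near-identical fingerprint, and the function names and the `max_distance` threshold are hypothetical.

```python
import hashlib
from collections import defaultdict

Pixels = list[list[int]]  # grayscale image as rows of 0-255 values


def exact_fingerprint(data: bytes) -> str:
    """SHA-256 digest of the raw file bytes; byte-identical files share it."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: Pixels, size: int = 8) -> int:
    """Perceptual fingerprint (assumed technique, not the city's confirmed one).

    Shrinks the image to size x size by averaging blocks, then emits one bit
    per cell: 1 if the cell is brighter than the overall mean, else 0.
    Requires the image to be at least size pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_duplicates(images: dict[str, tuple[bytes, Pixels]], max_distance: int = 5):
    """Return (exact_groups, review_pairs) for a mapping name -> (bytes, pixels).

    exact_groups: lists of names whose file bytes are identical.
    review_pairs: pairs of distinct files whose perceptual fingerprints are
    within max_distance bits, intended for human review rather than deletion.
    """
    by_digest = defaultdict(list)
    for name, (data, _) in images.items():
        by_digest[exact_fingerprint(data)].append(name)
    exact_groups = [sorted(names) for names in by_digest.values() if len(names) > 1]

    # Compare one representative per exact group so copies are not re-flagged.
    representatives = sorted(min(names) for names in by_digest.values())
    hashes = {name: average_hash(images[name][1]) for name in representatives}
    review_pairs = []
    for i, first in enumerate(representatives):
        for second in representatives[i + 1:]:
            if hamming_distance(hashes[first], hashes[second]) <= max_distance:
                review_pairs.append((first, second))
    return exact_groups, review_pairs
```

Keeping the two results separate mirrors the workflow the documentation describes: byte-identical copies can be flagged automatically, while resized or re-encoded versions only enter a review queue. The pairwise comparison is quadratic in the number of files, so an archive of municipal scale would need an index over the fingerprints rather than this brute-force loop.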
That figure matters because São Paulo's municipal cloud infrastructure contract, renegotiated with a Brazilian data center operator in late 2024, charges a premium for storage beyond the base allocation. Reducing redundant files directly cuts operating costs, and because Mayor Ricardo Nunes's administration has flagged budget discipline as a priority for 2026, the savings argument has given the technical teams political cover to move quickly.
How São Paulo Compares to Seoul, Amsterdam, and Mexico City
Set against peer cities, São Paulo is making progress but remains a step behind the front-runners. Seoul's Smart City Division completed a city-wide image deduplication exercise across its public CCTV and urban-planning databases in 2023, reportedly reducing storage costs by more than 22 percent across its metropolitan data centers. The Gemeente Amsterdam, the Dutch capital's municipal government, ran a similar audit of its building permit photo archive in 2022 and subsequently published open-source deduplication tools that smaller Dutch municipalities adopted within months.
Mexico City, a closer regional peer, began a comparable program under its Agencia Digital de Innovación Pública in late 2024 but has moved more slowly, hampered by inter-agency data-sharing restrictions that São Paulo has, to its credit, partly resolved by centralizing authority under a single coordinating secretariat.
What São Paulo still lacks is a published, publicly accessible methodology document — the kind of transparency that allowed Amsterdam's approach to be replicated elsewhere. Without it, civil society groups in the Bixiga and Vila Madalena neighborhoods that track urban data governance have no easy way to verify that the deletion process is not accidentally removing legally required records, particularly from environmental licensing files tied to flooding-risk zones along the Pinheiros and Tietê rivers.
The city's technology committee is scheduled to vote in August 2026 on a resolution that would require the secretariat to publish quarterly deduplication reports. If passed, that would bring São Paulo meaningfully closer to the Amsterdam standard — and give the program a layer of public accountability that, right now, it does not have.