São Paulo's city hall is sitting on millions of duplicate photographs, redundant images embedded across property tax records, infrastructure inspection logs, and social services files, and the effort to clean them out is moving more slowly than in other major cities that tackled the same problem years ago. Municipal IT managers at the Secretaria Municipal de Inovação e Tecnologia, based in the Centro district, confirmed the duplicate-image backlog in internal audit documents circulated earlier this year, though the city has not released a full public accounting of how many records are affected.
The timing matters. São Paulo has spent heavily over the past three years expanding its urban data infrastructure, folding street-level inspection imagery, flood-sensor photography, and resident-submitted pothole reports into a single integrated platform. That expansion, accelerated under Mayor Ricardo Nunes's digital governance agenda, has also multiplied the opportunities for the same image to be ingested, stored, and billed against the city's cloud contracts more than once. Storage redundancy is not a theoretical inconvenience — it translates directly into wasted public spending and slower retrieval when emergency services need visual evidence fast.
What Other Cities Have Already Done
Seoul addressed a structurally similar problem between 2021 and 2023, when the city's Smart City Division implemented automated perceptual-hash deduplication across its public CCTV archive and citizen-report image libraries. The South Korean capital reduced its active image storage load measurably and cut cloud expenditure on visual data within two budget cycles, according to reporting by Korean municipal tech publications at the time. Amsterdam's city data team deployed an open-source deduplication pipeline across its Digitale Stad infrastructure in 2022, focusing first on imagery tied to its environmental permitting process along the canal districts.
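For readers unfamiliar with the technique, perceptual hashing reduces each image to a short fingerprint, so that near-identical pictures (a slightly brightened or re-compressed copy, say) produce fingerprints that differ in only a few bits. The Python sketch below shows one simple variant, an average hash. It is an illustration only and not the pipeline Seoul or Amsterdam deployed; the function names, the tiny 4x4 grayscale grids, and the 5-bit threshold are all assumptions made for the example, and a real system would first decode and downscale each image.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values (0-255), assumed already downscaled


def average_hash(pixels: Grid) -> int:
    """Return a bit fingerprint: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Hypothetical threshold: fingerprints within max_distance bits count as duplicates."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Made-up sample data: a dark-left, bright-right image and two variants of it.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
brighter = [[min(v + 10, 255) for v in row] for row in original]  # same scene, lighter
inverted = [[255 - v for v in row] for row in original]           # a different picture

print(is_near_duplicate(original, brighter))  # True
print(is_near_duplicate(original, inverted))  # False
```

Because the fingerprint records only whether each pixel is above or below the image's own average, a uniform brightness shift leaves it unchanged, which is why the lighter copy is still caught.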
São Paulo has no equivalent program running at scale yet. The closest operational effort is a pilot being run by CET — Companhia de Engenharia de Tráfego, the city's traffic engineering company — on its Avenida Paulista camera network and extending into parts of the Pinheiros neighbourhood. That pilot, which began processing images in March 2026, uses hash-comparison tools to flag obvious duplicates before they are written permanently into cold storage. CET has not published results, and the scope covers only a fraction of the city's total visual data estate.
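CET has not published which tools the pilot uses, so the following is only a minimal Python sketch of the general idea: compute a cryptographic digest of each incoming image and flag it if an identical digest has already been archived. The `IngestionFilter` class, the image IDs, and the sample bytes are hypothetical. An exact hash such as SHA-256 catches only byte-for-byte copies, which matches the "obvious duplicates" described above but would miss resized or re-compressed versions.

```python
import hashlib
from typing import Dict, Optional


class IngestionFilter:
    """Hypothetical gate that remembers the digest of every image already stored."""

    def __init__(self) -> None:
        # Maps SHA-256 digest -> ID of the first image stored with that content.
        self._seen: Dict[str, str] = {}

    def check(self, image_id: str, data: bytes) -> Optional[str]:
        """Return the ID of an earlier identical image, or None if the data is new."""
        digest = hashlib.sha256(data).hexdigest()
        earlier = self._seen.get(digest)
        if earlier is None:
            self._seen[digest] = image_id  # first copy: safe to archive
        return earlier


gate = IngestionFilter()
first = gate.check("cam-001-frame-a", b"identical bytes")   # new content
second = gate.check("cam-001-frame-b", b"identical bytes")  # same bytes again
third = gate.check("cam-002-frame-a", b"different bytes")   # new content

print(first, second, third)  # None cam-001-frame-a None
```

Running the check before the write to cold storage, rather than as a later clean-up job, means the duplicate is never stored or billed in the first place.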
São Paulo generates a vast volume of urban imagery. The city operates more than 14,000 surveillance and monitoring cameras across its 96 districts, according to figures the Nunes administration cited in a 2025 budget justification document. That continuous camera output, combined with resident uploads through platforms like the Fala.SP app, which logged more than 2.3 million service requests in 2024, puts constant pressure on storage systems that were not designed with aggressive deduplication in mind.
Cost and Urgency
Storage costs are not trivial at municipal scale. Cloud contracts for São Paulo's data operations have grown year-on-year, though the city does not itemise image storage as a separate line in its published accounts. Technology procurement experts who follow Latin American public-sector contracts note that image duplication rates of 15 to 25 percent are typical in large urban systems that have grown organically without enforced ingestion standards, a description that fits São Paulo's patchwork expansion.
The flooding crisis adds urgency. After the January 2025 storms that overwhelmed drainage infrastructure in zones including Vila Mariana and parts of the Zona Leste, emergency managers complained that retrieval of pre-event inspection photographs was slow, partly because search queries returned multiple identical images that had to be manually filtered. A leaner, deduplicated archive would directly reduce that friction.
City hall's roadmap, outlined in a presentation to the Câmara Municipal earlier this year, calls for a citywide deduplication standard to be adopted by all secretariats before the end of 2026. Whether that deadline holds will depend on whether the Secretaria de Inovação secures the additional contract funding it has requested. For residents and city workers who depend on those image databases daily, the practical test will come with the next rainy season — which forecasters expect to arrive earlier than usual this year.