São Paulo's public records offices and the tech companies anchoring the city's Faria Lima unicorn corridor are confronting the same unglamorous problem: hundreds of thousands of duplicate images clogging storage systems, draining budgets, and slowing the digital services that residents increasingly depend on. The question is no longer whether to replace them. It is who decides, who pays, and how fast.
The issue has sharpened in 2026 partly because of scale. The municipal government's Secretaria Municipal de Inovação e Tecnologia, headquartered near Praça do Patriarca in the Centro district, has been working through a multi-year platform consolidation since 2024. Sources familiar with the project, speaking in their official capacity at a public procurement forum held in May at the Centro de Referência em Informação Ambiental on Rua Dona Gertrudes de Lima, say deduplication is now a formal line item in the city's technology roadmap, though final budget allocations remain pending approval from the Câmara Municipal.
Why the Timing Is Critical
Storage is not cheap, and São Paulo's public agencies are not buying it at discount rates. Prices in enterprise cloud storage contracts for Brazil's public sector have trended upward alongside the real's volatility against the dollar, with some municipal contracts pegged to dollar-denominated pricing from major providers. A procurement notice published in the Diário Oficial do Município in March 2026 listed object-storage costs for one city agency at roughly R$0.23 per gigabyte per month. That figure adds up quickly when duplicates inflate total stored volume by an estimated 30 to 40 percent, a range commonly cited in technology audits for large urban governments in Latin America.
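To put that rate in perspective, the arithmetic can be sketched as follows. The R$0.23 price and the 30 to 40 percent range come from the figures above; the 100,000 GB archive size is purely hypothetical, and the sketch assumes that "inflate by 30 percent" means stored volume equals unique volume times 1.3.

```python
PRICE_BRL_PER_GB_MONTH = 0.23  # rate cited in the March 2026 procurement notice


def monthly_duplicate_cost(unique_gb, inflation, price=PRICE_BRL_PER_GB_MONTH):
    """Monthly cost (BRL) of the extra volume that duplicates add.

    inflation is the fraction by which duplicates inflate stored volume
    over the unique volume (0.30 means 30 percent). This reading of the
    audit range is an assumption.
    """
    duplicate_gb = unique_gb * inflation
    return duplicate_gb * price


# Hypothetical archive size, chosen only for illustration.
HYPOTHETICAL_UNIQUE_GB = 100_000

for inflation in (0.30, 0.40):
    monthly = monthly_duplicate_cost(HYPOTHETICAL_UNIQUE_GB, inflation)
    print(f"{inflation:.0%} inflation: R${monthly:,.2f} per month, "
          f"R${monthly * 12:,.2f} per year")
```

Under those assumptions, an agency with 100,000 GB of unique files would pay roughly R$6,900 to R$9,200 a month to store copies it does not need.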
Private platforms in the Itaim Bibi and Vila Olímpia tech cluster face a parallel reckoning. Several fintech and healthtech firms that expanded aggressively during the 2021-2023 growth cycle built image-heavy onboarding pipelines — document scans, identity photos, proof-of-address uploads — without robust deduplication at the point of ingestion. Now those pipelines are expensive to maintain and, in some cases, create compliance headaches under Brazil's Lei Geral de Proteção de Dados, which prohibits retaining personal data beyond what is strictly necessary.
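What deduplication at the point of ingestion can look like is sketched below, as a minimal illustration rather than a description of any firm's actual pipeline. The `IngestStore` class and its method names are hypothetical, and the in-memory dictionaries stand in for whatever object store a real pipeline would use.

```python
import hashlib


class IngestStore:
    """Toy in-memory store that keeps one copy per distinct file content.

    Hypothetical sketch: a real pipeline would write to object storage
    and a database rather than Python dictionaries.
    """

    def __init__(self):
        self._blobs = {}       # SHA-256 hex digest -> file bytes
        self._references = {}  # upload id -> digest of its content

    def ingest(self, upload_id, data):
        """Record an upload; return True only if its bytes were new."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        # Every upload keeps a pointer to the single stored copy.
        self._references[upload_id] = digest
        return is_new

    def stored_bytes(self):
        """Total bytes actually held, counting each distinct file once."""
        return sum(len(blob) for blob in self._blobs.values())


store = IngestStore()
scan = b"proof-of-address scan"
print(store.ingest("customer-1/upload-1", scan))  # first copy is stored
print(store.ingest("customer-1/upload-2", scan))  # same bytes, not stored again
```

A content hash of this kind catches only byte-identical files. A document that is rescanned or recompressed produces different bytes, so catching those would require a similarity technique such as perceptual hashing, which is not shown here.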
The Decisions That Will Define the Outcome
Three choices will shape what happens next, and none of them is purely technical.
First: automated versus human-reviewed replacement. Fully automated deduplication tools — several of them offered by São Paulo-based software firms including vendors accredited under the Programa Cidade Inteligente run through the Secretaria de Governo Digital — can sweep and flag redundant files in days. But for public archives, particularly those tied to legal proceedings or historical records held at the Arquivo Público do Estado on Rua Voluntários da Pátria in Santana, automated deletion without human sign-off carries legal risk. City attorneys have already flagged this tension in internal working groups.
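One way to reconcile the two approaches is to let software flag duplicates while routing protected records to a person. The sketch below is an illustration only: the `legal_hold` flag and the `triage` function are hypothetical and do not describe the city's systems or any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class DuplicateCandidate:
    path: str
    legal_hold: bool  # hypothetical flag: tied to legal or historical records


def triage(candidates):
    """Split flagged duplicates into auto-removable and review-required."""
    auto_remove, needs_review = [], []
    for candidate in candidates:
        if candidate.legal_hold:
            needs_review.append(candidate)  # a human must sign off first
        else:
            auto_remove.append(candidate)
    return auto_remove, needs_review


flagged = [
    DuplicateCandidate("onboarding/selfie_copy.jpg", legal_hold=False),
    DuplicateCandidate("archive/court_exhibit_copy.tif", legal_hold=True),
]
auto_remove, needs_review = triage(flagged)
print([c.path for c in auto_remove])
print([c.path for c in needs_review])
```

The design choice is that nothing marked as protected is ever deleted by the automated pass; it only reaches a review queue.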
Second: centralised versus distributed governance. The city's 32 subprefeituras each manage their own document repositories with varying degrees of interoperability. A deduplication policy imposed from the top, from Inovação e Tecnologia down, will face resistance unless it comes with technical support and clear standards. Without those, subprefeituras in peripheral districts such as Cidade Tiradentes and Parelheiros, which have thinner IT staffing, risk being left further behind.
Third: the procurement timeline. Any new software contract to handle replacement workflows at city scale must clear the Tribunal de Contas do Município, and procurement cycles in São Paulo typically run four to eight months from notice to execution. If the Secretaria de Inovação does not publish a formal call by September 2026, the consolidation project almost certainly slips into 2027 — past the municipal budget window currently under negotiation.
For the private sector, the calculus is faster but no less consequential. Firms that delay face growing storage bills and potential LGPD exposure. Those that act now can standardise their pipelines ahead of any regulatory guidance that the Autoridade Nacional de Proteção de Dados may issue on data minimisation in image-heavy sectors. The ANPD has signalled in its 2026 regulatory agenda that guidance on biometric and document image retention is a priority area. São Paulo's tech ecosystem, which hosts the highest concentration of LGPD-regulated startups in Brazil, cannot afford to wait for the rules to arrive before building the infrastructure to comply with them.