São Paulo's city government is sitting on a digital storage problem that specialists say costs public institutions millions of reais annually: duplicate images embedded in procurement databases, urban planning files, and citizen service portals are consuming server capacity that could otherwise be redirected to emergency infrastructure systems. The issue surfaced in renewed discussions this year after the Secretaria Municipal de Inovação e Tecnologia flagged redundant data as a growing drain on its cloud infrastructure budget.
The timing matters because São Paulo is in the middle of a major push to digitise city services under the Programa SP Digital, an initiative tied to the Ricardo Nunes administration's promise to modernise Prefeitura operations before the 2028 municipal election cycle. Storage inefficiency undermines that agenda directly — every duplicated file that occupies cloud space is a line item that competes with investments in flood-sensor networks in the Tietê River basin and real-time traffic monitoring on Avenida Marginal Pinheiros.
What Specialists Are Saying at Institutions Across the City
The debate is not confined to government corridors. At the Escola Politécnica da Universidade de São Paulo, in the Butantã neighbourhood, data engineering faculty have incorporated duplicate-image detection into graduate coursework on large-scale database management. The university's library system — one of the largest in Latin America — itself acknowledged in a 2025 annual report that its digital repository contained overlapping image assets across multiple collections, a problem the report described as a structural consequence of decentralised digitisation efforts carried out by different departments between 2018 and 2023.
Specialists in the field point to a cluster of technical and organisational causes. Institutions typically accumulate duplicate images when different teams upload the same files independently, when automated scraping tools pull images without deduplication checks, or when legacy systems migrate data without cleaning it first. The result is that identical or near-identical image files can appear dozens of times inside the same archive. In computing terms, the fix is well understood: perceptual hashing algorithms can identify visually similar images even when file metadata differs. The harder problem is mustering the institutional will to implement cleanup at scale.
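For readers who want to see what perceptual hashing looks like in practice, the following is a minimal sketch of one simple variant, the "average hash". It is not drawn from any system named in this article. It assumes each image has already been decoded into a grid of grayscale values at least 8 by 8 pixels in size, and the function names and the distance threshold of 5 bits are illustrative choices only.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale pixel values, 0-255 (assumed input format)


def shrink(img: Gray, size: int = 8) -> List[float]:
    """Downscale to size x size cells by averaging the pixels in each cell."""
    h, w = len(img), len(img[0])
    if h < size or w < size:
        raise ValueError("image must be at least size x size pixels")
    cells = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(img: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if brighter than the mean."""
    cells = shrink(img, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits.

    max_distance is a hypothetical threshold; a real deployment would tune it.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance
```

Because the fingerprint depends only on which regions are brighter than average, a copy that has been uniformly brightened or saved with different metadata produces the same bits, while a visually different picture produces a distant hash. Production tools generally rely on more robust variants of the same idea.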
At TOTVS, the São Paulo-headquartered enterprise software firm whose clients include numerous Brazilian municipal governments and healthcare systems, product discussions in the past eighteen months have repeatedly raised duplicate-media handling as a priority request from public-sector customers. The company has not published specific figures on the scope of the problem across its client base, but the pattern of client requests signals that the issue is widespread well beyond the capital.
The Numbers Behind the Problem
Cloud storage costs in Brazil are denominated in US dollars for most enterprise contracts, meaning the weak real amplifies the financial pressure. An institution storing one petabyte of redundant image data on a major cloud provider can face annual costs exceeding R$ 500,000 at mid-2026 exchange rates. That is spending the budget watchdogs at the Tribunal de Contas do Município on Rua Líbero Badaró have been scrutinising with increasing intensity as São Paulo's 2026 fiscal budget comes under pressure from flood-recovery spending in the Zona Leste.
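The arithmetic behind such an estimate is straightforward, and the sketch below shows how a dollar-denominated storage price translates into an annual bill in reais. The per-gigabyte price and the exchange rate used in the example are hypothetical placeholders, not quotes from any provider or from the institutions mentioned here. With those placeholder inputs, one petabyte works out to roughly R$ 600,000 a year.

```python
GB_PER_PB = 1_000_000  # decimal units: 1 petabyte = 1,000,000 gigabytes


def annual_cost_brl(redundant_pb: float, usd_per_gb_month: float, brl_per_usd: float) -> float:
    """Yearly cost in reais of keeping redundant data in dollar-priced cloud storage."""
    monthly_usd = redundant_pb * GB_PER_PB * usd_per_gb_month
    return monthly_usd * 12 * brl_per_usd


if __name__ == "__main__":
    # Hypothetical inputs: US$ 0.01 per GB per month and R$ 5.00 per US dollar.
    cost = annual_cost_brl(redundant_pb=1.0, usd_per_gb_month=0.01, brl_per_usd=5.0)
    print(f"R$ {cost:,.0f} per year")
```

Since the exchange rate multiplies the entire bill, any weakening of the real raises the cost in local currency even when the volume of stored data does not change.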
The federal government's data governance framework, updated under the Lula administration in 2024 through a directive from the Ministério da Gestão e da Inovação em Serviços Públicos, formally requires federal bodies to conduct annual audits of redundant digital assets — but the obligation does not automatically extend to municipal governments, leaving cities like São Paulo to set their own standards.
Technologists following the issue say the practical path forward involves three steps: deploying automated deduplication tools at the point of upload rather than retrospectively, establishing a shared image repository that different city secretariats can draw from instead of each maintaining parallel libraries, and training procurement staff to include deduplication requirements in vendor contracts. The Secretaria Municipal de Inovação e Tecnologia has not announced a formal timeline for any of those steps, but the budget pressure is unlikely to ease before the city's next technology procurement cycle, expected to open in the first quarter of 2027.
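The first two steps, checking for duplicates at the point of upload and keeping a single shared repository, can be illustrated with a short sketch. It assumes a simple in-memory store keyed by a SHA-256 digest of the file contents. The class and method names are hypothetical and do not describe any system the city operates.

```python
import hashlib
from typing import Dict, Tuple


class SharedImageStore:
    """Toy shared repository that keeps each distinct file only once."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}  # content digest -> file bytes
        self._names: Dict[str, str] = {}    # uploaded name -> content digest

    def upload(self, name: str, data: bytes) -> Tuple[str, bool]:
        """Register an upload; return (digest, True if the bytes were new)."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data      # store the bytes only the first time
        self._names[name] = digest          # every name still resolves to the content
        return digest, is_new

    def fetch(self, name: str) -> bytes:
        """Return the stored bytes for a previously uploaded name."""
        return self._blobs[self._names[name]]

    def stored_bytes(self) -> int:
        """Total bytes actually held, after deduplication."""
        return sum(len(blob) for blob in self._blobs.values())


if __name__ == "__main__":
    store = SharedImageStore()
    photo = b"example image bytes"
    store.upload("obras/ponte.jpg", photo)
    _, is_new = store.upload("transportes/ponte_copia.jpg", photo)
    print(is_new, store.stored_bytes())
```

A digest of this kind catches only byte-for-byte copies. Near-identical images, such as resized or recompressed versions, would need a perceptual hash in place of, or alongside, the exact digest.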