The problem is older than the smartphone, but it has never been more expensive to ignore. São Paulo's municipal archives, advertising agencies along Avenida Paulista and the sprawling network of tech startups clustered in the Vila Olímpia district are all grappling with the same operational headache: thousands of duplicate images clogging databases, distorting search results and inflating cloud storage costs at a moment when digital infrastructure budgets are already stretched thin.
The trigger this year was a procurement audit released by the Prefeitura de São Paulo in June 2026, which found that the city's Secretaria Municipal de Inovação e Tecnologia had flagged a significant volume of redundant image files across public-facing platforms, including the official city portal, SP156, and the urban planning database maintained by the Secretaria Municipal de Urbanismo e Licenciamento. The audit did not publish a final cost figure, and the review is ongoing. Even so, officials face pressure to act before the municipal budget cycle closes in October.
Why the Timing Is Critical
Three forces are converging. First, the federal government's push under the Lula administration to digitise public records has sent a flood of scanned documents and photographs into municipal systems that were not built to handle deduplication at scale. Second, São Paulo's tech unicorn ecosystem (anchored by names like Totvs, headquartered in Barra Funda, and a growing cluster of AI-focused startups in the Berrini corridor) has been offering competing proprietary solutions, creating a vendor selection dilemma for public administrators. Third, the calendar leaves little slack: any system adopted after October risks missing integration with the 2027 budget planning cycle entirely.
Cultural institutions are feeling the pressure too. The Pinacoteca do Estado, on Praça da Luz in the Luz neighbourhood, has been digitising its permanent collection for several years. Museum archivists have described the duplicate image problem, in general terms and in public forums, as one of the most labour-intensive aspects of digital cataloguing, requiring human review even when automated tools flag matches. The Museu de Arte de São Paulo, on Avenida Paulista itself, has a separate digitisation programme that uses open-source matching algorithms, but staff capacity limits how fast errors can be corrected.
On the commercial side, the advertising and media sector centred around Faria Lima and Paulista has its own stake. Brazilian digital advertising spending reached R$28 billion in 2024, according to IAB Brasil data published that year, and duplicate or misidentified creative assets represent a measurable source of campaign waste that agencies have been lobbying brands to quantify properly since at least 2023.
The Decisions Ahead
Three questions will define the next phase. The first is technical: does the Prefeitura de São Paulo adopt a perceptual hashing standard — the most widely used method for detecting near-identical images — or invest in the newer neural-network-based comparison tools that several Berrini-based startups are currently pitching? The second is governance: who holds the authority to mark an image as a duplicate and authorise its deletion from a public archive? A record removed in error from a planning database can have legal consequences for property disputes in dense neighbourhoods like Pinheiros or Mooca. The third question is procurement: open-source frameworks cost less upfront but require in-house technical staff that the municipal secretariats currently do not have in sufficient numbers.
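For readers unfamiliar with the first option, perceptual hashing can be illustrated with a short Python sketch. This is a minimal version of one common variant, the "average hash", and is not drawn from the audit or from any vendor's product. The function names, the 8x8 grid and the threshold of 5 differing bits are illustrative assumptions, and images are represented as plain grids of grayscale values so that the example has no external dependencies.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values, 0-255


def downscale(pixels: Grid, size: int = 8) -> List[List[float]]:
    """Box-average a grayscale image down to a size x size grid of cells."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for cy in range(size):
        y0 = cy * height // size
        y1 = max(y0 + 1, (cy + 1) * height // size)  # at least one row per cell
        row = []
        for cx in range(size):
            x0 = cx * width // size
            x1 = max(x0 + 1, (cx + 1) * width // size)  # at least one column
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        cells.append(row)
    return cells


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set when the
    cell is brighter than the mean of all cells."""
    cells = [value for row in downscale(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Treat two images as near-duplicates when their fingerprints differ
    in at most `threshold` bits (an illustrative cut-off, not a standard)."""
    return hamming_distance(average_hash(a), average_hash(b)) <= threshold
```

Because the fingerprint records only whether each region is brighter or darker than the image's average, a slightly brightened or recompressed copy tends to produce the same bits, while an unrelated picture does not. The threshold is where the governance question enters: set it too loosely and distinct records are flagged as duplicates, which is why human review before deletion remains part of the debate.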
A working group convened by the Secretaria de Inovação is expected to present preliminary recommendations before the end of August 2026. That timeline is tight. Vendors are already in conversations with city officials, and advocates for open-source and civic-tech approaches — including groups affiliated with the SP Tech ecosystem on Rua Pamplona — are pushing for a public consultation period before any contract is signed.
For residents and businesses, the practical advice is straightforward: if your organisation contributes imagery to any city-linked platform or participates in municipal tender processes that require photographic documentation, audit your own files now. The rules for what counts as a permissible duplicate — and what happens to flagged files during any automated review — are still being written, and early engagement with the working group's consultation process is the most direct way to influence the outcome.
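For organisations with in-house technical staff, a first-pass audit can start with byte-identical copies, which are the least contentious kind of duplicate. The Python sketch below is one possible approach under stated assumptions, not a procedure prescribed by the city: it groups files under a folder by the SHA-256 digest of their contents, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large images are never loaded into memory all at once."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Map each digest to the files sharing it, keeping only digests
    that occur more than once under `root` (searched recursively)."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

This only reports groups of identical files and deletes nothing. It will also miss resized or recompressed copies, which is the gap that perceptual hashing and neural comparison tools are meant to close.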