The Daily São Paulo

São Paulo's Duplicate Image Crisis: The Key Decisions That Will Shape the City's Digital Records

Municipal archives and tech firms face a reckoning over how to clean up years of redundant visual data clogging public databases — and the clock is ticking.

By São Paulo News Desk · Published 4 July 2026, 4:00 pm

Photo: Ariadne Barroso on Pexels

São Paulo's municipal government is sitting on a problem that has quietly ballooned for years: hundreds of thousands of duplicate images embedded in public databases, civic portals, and urban monitoring systems, creating a bureaucratic and financial burden that administrators can no longer ignore. City Hall sources confirmed this week that the Secretaria Municipal de Inovação e Tecnologia is preparing a formal review of the problem, with decisions expected before the end of the third quarter of 2026.

The timing matters. Brazil's Lei Geral de Proteção de Dados (LGPD), the national data protection law, is now being enforced in earnest, and the Autoridade Nacional de Proteção de Dados (ANPD) has signaled that redundant storage of personal images, including facial data captured by security cameras in public spaces, carries compliance risk. For a city of 12.3 million people with an expanding network of monitoring infrastructure, the legal exposure is real.

Where the Problem Is Concentrated

The duplication issue is most acute in two systems. The first is the Nota Fiscal Paulistana program, which processes millions of commercial transaction records annually and attaches scanned receipts and identity documents, many of them uploaded multiple times by the same user. The second is the GeoSampa platform, the city's official geospatial data portal, which hosts satellite and drone imagery of the urban grid. Database managers at the Empresa Municipal de Urbanização, known as EMURB, have flagged internal estimates suggesting that between 15 and 22 percent of stored image files are exact or near-exact duplicates. Those figures have not been independently verified or officially published.

In the private sector, the problem is even more visible. Along Avenida Faria Lima, the stretch of financial and tech offices between Itaim Bibi and Vila Olímpia that houses the bulk of São Paulo's startup ecosystem, product teams at several Series B and C companies told industry groups earlier this year that duplicate image management had become a line item in infrastructure budgets. Firms using object storage on Brazilian cloud providers reported storage cost inflation of roughly 30 percent year-on-year, with duplicate assets cited as a leading driver.

The Decisions That Now Define the Path Forward

Three choices will determine how quickly this gets resolved — and who bears the cost.

The first is whether City Hall mandates a single deduplication standard across all municipal secretariats or leaves each department to procure its own solution. The Secretaria de Gestão, which oversees procurement under Mayor Ricardo Nunes, is reportedly weighing both options. A unified mandate would be cheaper at scale but would require coordination across at least 27 secretariats. A decentralized approach would be faster to implement but would risk creating new silos.

The second decision involves open-source versus proprietary tooling. The Instituto de Pesquisas Tecnológicas, the state research body based in Cidade Universitária on the west side of the city, has developed a hash-based image comparison pipeline that could be adapted for public-sector use at low cost. Whether City Hall opts for that route or signs a licensing deal with a commercial vendor is a procurement decision whose cost could run into the tens of millions of reais.
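
For readers unfamiliar with the term, hash-based comparison can be illustrated with a minimal Python sketch. This is not the institute's pipeline, whose details have not been published; it is an assumed, simplified version that computes a SHA-256 digest of each file's raw bytes and groups files that share a digest. The function name and the sample file names are hypothetical. An approach like this catches only byte-for-byte copies; near-exact duplicates, such as the same photo resized or recompressed, would need a different technique.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a hypothetical file name to that file's raw bytes.
    Returns a sorted list of groups; each group is a sorted list of
    names that share one SHA-256 digest. Files with no twin are omitted.
    """
    by_digest = defaultdict(list)
    for name, content in files.items():
        # Identical bytes always produce the same digest.
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Keep only digests seen more than once, i.e. actual duplicates.
    groups = [sorted(names) for names in by_digest.values() if len(names) > 1]
    return sorted(groups)


if __name__ == "__main__":
    # Illustrative stand-ins for image files; real input would be read from disk.
    sample = {
        "receipt_a.jpg": b"\xff\xd8 same scan",
        "receipt_a_copy.jpg": b"\xff\xd8 same scan",
        "aerial_2024.tif": b"II* tile one",
        "aerial_2025.tif": b"II* tile two",
    }
    for group in group_exact_duplicates(sample):
        print(group)
```

Note that the sketch only reports groups and deletes nothing, so the decision about what to remove stays with a human or a separate approval step.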

The third, and politically most sensitive, decision is about deletion authority. Once duplicates are identified, who has the legal power to destroy a government image record? Under current archiving regulations, some categories of municipal visual data require a formal declassification process before deletion — a step that could add six to eighteen months to any cleanup timeline.

Civic technology advocates in Pinheiros, particularly groups affiliated with the Rede Nossa São Paulo coalition, have been pushing for a public consultation process before any automated deletion pipeline is approved. Their concern: that overzealous deduplication could erase legitimate variations in imagery that document urban change over time — flood lines along the Tietê River, before-and-after demolition records in Brás, or protest documentation from Paulista Avenue. That argument has traction, and it means the fastest technical solution may not be the one that gets approved. The review deadline of September 30 is the next date to watch.

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
