The Daily São Paulo

São Paulo news, every day

São Paulo's Digital Archivists Race to Solve a Duplicate Image Crisis Swamping City Records This Week

Municipal databases and newsroom photo libraries across the city are grappling with a surge in duplicate image files, exposing gaps in how São Paulo manages its visual documentation.

By São Paulo News Desk · Published 4 July 2026, 4:23 pm

Photo: Kaique Rocha on Pexels

São Paulo's public records offices and several major media organisations acknowledged a growing technical headache this week: tens of thousands of duplicate digital images are clogging municipal databases, slowing access to official documentation and straining storage budgets that are already under pressure. The problem, long simmering inside city hall's IT infrastructure on Viaduto do Chá, came into sharper focus after the Secretaria Municipal de Gestão flagged the issue in an internal review circulated to department heads on 1 July.

The timing matters. The city has been digitising decades of physical records as part of a broader modernisation push, accelerating uploads of flood-damage assessments, zoning permits, and public works photography, particularly after the severe drainage failures that hit the Tietê corridor and low-lying neighbourhoods like Capão Redondo earlier this year. That rush to digitise produced what archivists describe as cascading duplication: files uploaded multiple times by different departments, renamed but otherwise identical, now occupying server space that administrators say is limited.
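For the renamed-but-identical case the archivists describe, comparing a hash of each file's contents is usually enough to spot the copies. The following is a minimal Python sketch, not the city's actual tooling: the file names and byte strings are invented for illustration, and a real audit would read the contents from disk.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a file name to its raw bytes (a stand-in for reading from disk).
    """
    names_by_digest = defaultdict(list)
    for name, content in files.items():
        # Identical bytes always produce the same SHA-256 digest,
        # regardless of what the file is called.
        digest = hashlib.sha256(content).hexdigest()
        names_by_digest[digest].append(name)
    # Keep only digests shared by more than one file.
    return [sorted(names) for names in names_by_digest.values() if len(names) > 1]


# Hypothetical uploads from three departments; two are the same file renamed.
sample_files = {
    "obras/ponte_001.jpg": b"\xff\xd8 same image bytes",
    "zoneamento/IMG_4471.jpg": b"\xff\xd8 same image bytes",
    "defesa_civil/alagamento.jpg": b"\xff\xd8 a different image",
}

duplicate_groups = group_exact_duplicates(sample_files)
print(duplicate_groups)
```

A content hash of this kind only catches exact copies. An image that has been resized or re-compressed has different bytes and would not be grouped.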

Who Is Affected and Where

The Arquivo Histórico Municipal, located on Rua Vieira de Morais in Campo Belo, manages more than 1.2 million digitised images dating back to the early twentieth century. Staff there say the duplication problem arrived with the integration of newer departmental uploads into its shared repository. The institution has not yet released figures on how many redundant files it has identified, but the review process was confirmed as active this week. Separately, Agência Paulista de Notícias, the state government's own wire service operating out of Consolação, told production staff on Thursday to audit their photo asset management systems before the end of the month.

Beyond the public sector, the duplication issue has drawn attention from São Paulo's tech ecosystem. Distrito, the innovation hub headquartered near Faria Lima that tracks the city's startup activity, has seen at least three image-tech ventures pitch deduplication tools to municipal clients in the past six months. One Pinheiros-based startup, working with a pilot contract tied to the city's Programa de Modernização da Gestão Pública, is testing software across departmental servers that uses perceptual hashing, a technique that matches images by their visual content and so catches copies even when file names, formats, or compression settings differ. The pilot launched in May and is scheduled to run through September.
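To illustrate the general idea, here is a minimal Python sketch of one simple perceptual hash, the average hash. It is an assumption chosen for illustration; the pilot's actual algorithm has not been disclosed. Images are represented as plain grids of grayscale values (0 to 255) so the example runs without an imaging library, and the sample images are synthetic.

```python
def average_hash(pixels, size=8):
    """Compute a (size * size)-bit average hash of a grayscale image.

    `pixels` is a 2D list of values 0-255 whose height and width are
    multiples of `size`.
    """
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    cells = []
    for row in range(size):
        for col in range(size):
            # Shrink the image by averaging each block of pixels into one cell.
            block = [
                pixels[y][x]
                for y in range(row * block_h, (row + 1) * block_h)
                for x in range(col * block_w, (col + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if the cell is brighter than the overall mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


# Synthetic 16x16 images: a horizontal gradient, a slightly brighter copy
# of it, and an unrelated vertical gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter_copy = [[min(255, value + 5) for value in row] for row in original]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]

distance_to_copy = hamming_distance(average_hash(original), average_hash(brighter_copy))
distance_to_unrelated = hamming_distance(average_hash(original), average_hash(unrelated))
print(distance_to_copy, distance_to_unrelated)
```

A small distance suggests the same picture and a large one suggests different pictures. The cutoff between the two is a tuning decision that depends on the collection.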

The Cost of Getting This Wrong

Storage is not free, and at the scale São Paulo operates, duplication has a direct fiscal consequence. Cloud storage rates for government procurement in Brazil hovered around R$0.18 per gigabyte per month for standard-tier services as of mid-2025, according to pricing frameworks published by the Ministério da Gestão e da Inovação. When databases contain hundreds of thousands of redundant files, each image from a municipal surveillance camera or public works site averaging between 4 and 12 megabytes, the cumulative bill adds up quickly across a city operating more than 40 active secretariats.
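The arithmetic behind that bill can be sketched in a few lines of Python. The rate and the 4 to 12 megabyte range come from the figures above. The count of 300,000 duplicate files is a hypothetical placeholder, since no official tally has been released, and the conversion assumes 1,000 megabytes per gigabyte.

```python
from decimal import Decimal

# Figure quoted above: R$0.18 per gigabyte per month (standard tier, mid-2025).
RATE_BRL_PER_GB_MONTH = Decimal("0.18")

# Hypothetical placeholder: no official count of duplicates has been published.
ASSUMED_DUPLICATE_COUNT = 300_000


def monthly_cost_brl(file_count, avg_size_mb):
    """Monthly storage cost in reais for `file_count` files of `avg_size_mb` each."""
    size_gb = Decimal(file_count) * Decimal(avg_size_mb) / Decimal(1000)
    return size_gb * RATE_BRL_PER_GB_MONTH


low_estimate = monthly_cost_brl(ASSUMED_DUPLICATE_COUNT, 4)
high_estimate = monthly_cost_brl(ASSUMED_DUPLICATE_COUNT, 12)
print(f"R${low_estimate:.2f} to R${high_estimate:.2f} per month")
```

The estimate scales linearly with both the file count and the average file size, so a different duplicate tally can be substituted directly.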

The duplication problem also has a practical journalism dimension. Photo editors at several Paulista Avenue-based news operations say misidentified or duplicated images in shared municipal libraries have led to minor but embarrassing publication errors in recent months, including photographs tagged with incorrect dates or project names because metadata was copied alongside a duplicate file. No major corrections have been publicly issued, but internal style audits have become more frequent.

For residents and businesses that rely on public image records — contractors pulling zoning documentation from the Prefeitura's online portal, lawyers accessing flood-damage photographs for insurance disputes — the sluggish, duplication-bloated databases translate into real delays. The portal at sp156.prefeitura.sp.gov.br has registered increased load times on image-heavy request categories, though the city has not attributed this specifically to the duplication backlog.

What happens next depends largely on how quickly the Secretaria Municipal de Gestão moves from internal review to active remediation. The September deadline on the Pinheiros startup's pilot will be a practical test of whether algorithmic deduplication can scale to São Paulo's full archival footprint. If it does, the model could inform a citywide procurement process before the end of 2026. If it stalls, the city will face a choice between expensive manual auditing and living with a database that grows messier with every new upload.

About this article

Published by The Daily São Paulo

This article was produced by The Daily São Paulo editorial desk and covers news in São Paulo. See our editorial standards for how we use AI.
