São Paulo's public records offices and several major media organisations this week acknowledged a growing technical headache: tens of thousands of duplicate digital images are clogging municipal databases, slowing access to official documentation and straining storage budgets that are already under pressure. The problem, long simmering inside city hall's IT infrastructure on Viaduto do Chá, came into sharper focus after the Secretaria Municipal de Gestão flagged it in an internal review circulated to department heads on July 1.
The timing matters. The city has been digitising decades of physical records as part of a broader modernisation push, accelerating uploads of flood-damage assessments, zoning permits, and public works photography, particularly after the severe drainage failures that hit the Tietê corridor and low-lying neighbourhoods like Capão Redondo earlier this year. That rush to digitise produced what archivists describe as cascading duplication: files uploaded multiple times by different departments, renamed but otherwise identical, now occupy server space that administrators say is finite.
Who Is Affected and Where
The Arquivo Histórico Municipal, located on Rua Vieira de Morais in Campo Belo, manages more than 1.2 million digitised images dating back to the early twentieth century. Staff there say the duplication problem arrived with the integration of newer departmental uploads into its shared repository. The institution has not yet released figures on how many redundant files it has identified, but the review process was confirmed as active this week. Separately, Agência Paulista de Notícias, the state government's own wire service operating out of Consolação, told production staff on Thursday to audit their photo asset management systems before the end of the month.
Beyond the public sector, the duplication issue has drawn attention from São Paulo's tech ecosystem. Distrito, the innovation hub headquartered near Faria Lima that tracks the city's startup activity, has seen at least three image-tech ventures pitch deduplication tools to municipal clients in the past six months. One Pinheiros-based startup, working under a pilot contract tied to the city's Programa de Modernização da Gestão Pública, is testing software across departmental servers that uses perceptual hashing, a technique that matches images by their visual content, so copies are caught even when file names differ. The pilot launched in May and is scheduled to run through September.
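The article does not say which algorithm the pilot software uses. As a rough illustration of the general idea only, the Python sketch below implements an average hash, one of the simplest perceptual hashes: shrink the image to a small grid, then record which cells are brighter than the grid's mean. The function names, the 8x8 grid and the distance threshold of 5 are all assumptions made for this example, and the images are plain lists of grayscale values rather than real files.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale pixel values, 0-255


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Return a size*size-bit average hash of a grayscale image.

    Assumes the image is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            # Average the block of source pixels that falls in this grid cell.
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if brighter than the overall mean, else 0.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    # max_distance is a hypothetical threshold; real systems tune it on their data.
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images: a gradient, a slightly brightened copy,
# and an inverted version that looks nothing like the original.
original = [[r * 16 + c for c in range(16)] for r in range(16)]
brighter = [[min(255, v + 3) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(is_near_duplicate(original, brighter))  # True
print(is_near_duplicate(original, inverted))  # False
```

Files that are byte-for-byte identical apart from their names could also be caught with an ordinary content checksum; a perceptual hash is aimed at the harder case where a copy has been resized, recompressed or lightly adjusted.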
The Cost of Getting This Wrong
Storage is not free, and at the scale São Paulo operates, duplication has a direct fiscal consequence. Cloud storage rates for government procurement in Brazil hovered around R$0.18 per gigabyte per month for standard-tier services as of mid-2025, according to pricing frameworks published by the Ministério da Gestão e da Inovação. When databases contain hundreds of thousands of redundant files (each image from a municipal surveillance camera or public works site averages between 4 and 12 megabytes), the cumulative bill adds up quickly across a city operating more than 40 active secretariats.
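The figures above can be turned into a back-of-envelope estimate. The Python sketch below uses the quoted rate and the 4 to 12 megabyte range; the count of 200,000 duplicate files is a hypothetical stand-in for "hundreds of thousands", and the conversion of 1,000 megabytes per gigabyte is an assumption. It covers only the storage line item at that rate, nothing else.

```python
RATE_BRL_PER_GB_MONTH = 0.18  # mid-2025 standard-tier rate quoted in the article
MB_PER_GB = 1000              # assumption: decimal gigabytes


def monthly_duplicate_cost(num_files: int, avg_size_mb: float,
                           rate: float = RATE_BRL_PER_GB_MONTH) -> float:
    """Estimate the monthly storage cost, in reais, of redundant files."""
    gigabytes = num_files * avg_size_mb / MB_PER_GB
    return round(gigabytes * rate, 2)


duplicates = 200_000  # hypothetical count, not a figure from the city
low = monthly_duplicate_cost(duplicates, 4)    # smallest average file size
high = monthly_duplicate_cost(duplicates, 12)  # largest average file size
print(f"R${low:.2f} to R${high:.2f} per month")
```

Under those assumptions, 200,000 duplicates occupy between 800 and 2,400 gigabytes, which works out to between R$144 and R$432 a month for a single repository.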
The duplication problem also has a practical journalism dimension. Photo editors at several Paulista Avenue-based news operations say misidentified or duplicated images in shared municipal libraries have led to minor but embarrassing publication errors in recent months, including photographs tagged with incorrect dates or project names because metadata was copied alongside a duplicate file. No major corrections have been publicly issued, but internal style audits have become more frequent.
For residents and businesses that rely on public image records — contractors pulling zoning documentation from the Prefeitura's online portal, lawyers accessing flood-damage photographs for insurance disputes — the sluggish, duplication-bloated databases translate into real delays. The portal at sp156.prefeitura.sp.gov.br has registered increased load times on image-heavy request categories, though the city has not attributed this specifically to the duplication backlog.
What happens next depends largely on how quickly the Secretaria Municipal de Gestão moves from internal review to active remediation. The September deadline on the Pinheiros startup's pilot will be a practical test of whether algorithmic deduplication can scale to São Paulo's full archival footprint. If it does, the model could inform a citywide procurement process before the end of 2026. If it stalls, the city will face a choice between expensive manual auditing and living with a database that grows messier with every new upload.