Online content isn’t forever: Archiving as a digital publication

Archiving your stories isn’t only a safety precaution in case you ever close your doors; it’s also vital to the health of the news ecosystem at large. Here’s a primer on news archiving and why you should be thinking about your publication’s public record.

Shortly after Gawker suddenly announced its bankruptcy and shut down in 2016, a high-profile investigation was pulled from its site, spurring renewed concerns over the preservation of its archive.

Gawker’s articles were facing the “billionaire problem,” in which wealthy interests buy the remains of bankrupt sites to remove controversial content. To mitigate the risk, Freedom of the Press Foundation teamed up with the U.S.-based nonprofit Internet Archive to protect the website’s content and ensure the public record would remain.

But billionaires aren’t the only ones who see value in archives. After the hyperlocal outlet serving Chicago and New York, DNAinfo, shuttered in 2017, a group of former DNAinfo Chicago editors scooped up its brand and archive assets for free, courtesy of New York Public Radio WNYC (who had previously acquired them). They then leveraged the notoriety of the brand to launch Block Club Chicago in 2018.

The difference between backing up and archiving stories  

Do you have a system in place for backing up your new articles? Google Docs and Microsoft Word files are relatively stable in the short term, and can let you repopulate your stories in the unfortunate event that your site is hacked or crashes. But as Columbia University journalism professor Angela Woodall explains, backing up files this way does not ensure long-term access spanning decades or more.

There are a number of issues with both solid-state drives, like those you plug into your computer, and cloud storage. What happens if the storage software you use becomes obsolete? What happens if there’s a meltdown at the server farm for the cloud storage you use? Keep in mind that no matter the cloud storage, it’s probably outsourced to Amazon Web Services, Google or Microsoft, according to Woodall.

In the face of market volatility, what happens to this information? Backing up your articles is an important first step — but not enough to ensure ongoing access.

Local news archives could offer a clearer picture of history 

“Local, independent, and alternative news sources are especially at risk of not being preserved, threatening to leave critical exclusions in a record that will favor dominant versions of public history,” Woodall and colleague Sharon Ringel stated in a 2019 report called “A Public Record at Risk: The Dire State of News Archiving in the Digital Age.”

Long-term plans for preserving content help ensure increased access to vital information. It’s just a matter of publishers making archiving a priority. “Having the consciousness is the first thing,” Woodall says. “And then I think from there, you can assess what needs to be done.”

Jeremy Klaszus of The Sprawl has been thinking about archiving a lot lately. He points to OpenFile Calgary, which disappeared in 2012, as an example. “If you read around in the Wayback Machine long enough, you can find a way to certain stories,” he says. “But that’s an issue, you know, all this work, kind of assuming like, ‘Oh, yeah, it’s online. It’ll be online.’ No, it won’t, actually.”

Make your content redundant 

The biggest obstacle to archiving — besides just not saving anything — is that formats change a lot, explains Woodall.

There are different ways to proactively overcome this barrier depending on your budget and scope of what you want to preserve (Do you include social media posts? Newsletters? Comments on your site?).

“The bottom line: redundancy is probably the best bet,” Woodall says. “Having [your files] on a cloud server, a PDF format, and your original Word document, which is pretty stable. And then just printing it.”

If you choose to back up content as PDFs, Woodall suggests you revisit the strategy annually to assess whether there’s a possibility that the format will become inaccessible.
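For publishers comfortable with a little scripting, the redundancy advice can be sketched in Python. This is only an illustration, not a method described by Woodall: the folder names and the helper functions `make_redundant_copies` and `find_damaged_copies` are hypothetical, and a checksum is just one common way to notice that a copy has gone missing or changed when you do a periodic review.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_redundant_copies(story: Path, destinations: list[Path]) -> str:
    """Copy one story file into every destination folder and return its checksum."""
    checksum = sha256_of(story)
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        shutil.copy2(story, folder / story.name)
    return checksum


def find_damaged_copies(name: str, checksum: str, destinations: list[Path]) -> list[Path]:
    """Return the folders whose copy is missing or no longer matches the checksum."""
    damaged = []
    for folder in destinations:
        copy = folder / name
        if not copy.exists() or sha256_of(copy) != checksum:
            damaged.append(folder)
    return damaged
```

In practice you would record the returned checksum somewhere safe, point `destinations` at folders that live on genuinely separate storage (for example a synced cloud folder and an external drive), and rerun `find_damaged_copies` on whatever schedule you review your archive. The script only detects problems; it does not replace keeping formats readable or printing a copy.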

Third-party services like Preservica or the Internet Archive’s Archive-It can help. Just remember that this puts preservation in third-party hands.

Be wary of silver bullets — and blockchains

While Woodall and Ringel’s research found that most major newspapers have contracts with commercial search databases like ProQuest or Newspapers.com, many small digital outlets don’t have archiving plans. 

Klaszus of The Sprawl says he’s approached the Calgary Public Library a few times about the preservation conundrum without any luck. “They basically don’t know what to do with an online news publisher.”

“As for a silver bullet — there is none,” Woodall says. “Finding a silver bullet is a really demanding proposition because then you start getting into collective memory and history and a lot of other issues.”

Blockchain startups are marketing hard for asset storage, but what they’re offering is unrealistic, Woodall warns.

“These blockchain startups or anybody else who’s offering some, like, magic potions that are going to keep everything safe and nobody’s going to be able to touch it, they don’t actually store the content. They just store the [hashes] and other things.”
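To see why that distinction matters, here is a small Python sketch. It assumes the quote refers to cryptographic hashes, which is an interpretation rather than something the source spells out, and the sample article text and the `matches` helper are made up for illustration. A hash is a short, fixed-length fingerprint: it can confirm that a copy you still have is unchanged, but it cannot give the article back if every copy is lost.

```python
import hashlib

# Hypothetical article text, repeated to stand in for a long story.
article = "City council votes 5-2 to approve the transit budget. " * 200

# The fingerprint is always 64 hexadecimal characters, however long the text is.
fingerprint = hashlib.sha256(article.encode("utf-8")).hexdigest()


def matches(text: str, expected_fingerprint: str) -> bool:
    """Check whether a surviving copy of the text still matches the stored hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest() == expected_fingerprint
```

If only `fingerprint` is stored, `matches` is all you can do with it: verify a copy you kept elsewhere. The thousands of characters in `article` are not contained in those 64 characters, so the content itself still has to be preserved somewhere.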
