The Internet Archive’s Wayback Machine has preserved over one trillion web pages, creating a living history of the internet from a converted church in San Francisco.
Key Takeaways
- Wayback Machine archived its trillionth page last month
- Preserves web pages, AI content, and technical architecture
- Operates from a former church with global backup servers
- Faces new challenges from AI and political pressures
Just blocks from San Francisco’s Presidio stands a gleaming white building with gothic columns. What was once a Christian Science church now houses the Internet Archive – a non-profit library that has been preserving internet history for nearly 30 years.
Inside the stained-glass sanctuary, church sermons have been replaced by server hums. The Wayback Machine preserves web pages used by millions daily, helping academics and journalists access historical corporate, government and personal web content.
The Internet Archive also preserves music, television, newspapers, videogames and books, which archivists digitize page by page using bespoke machines. — CNN
Founder Brewster Kahle stated: “We are here to try to provide a record of what happened, so that people can learn and build on that to build a better future, or to build new ideas that are worthy of being in the library.”
The Internet’s Living Library
Kahle launched the archive in 1996, when a year’s worth of saved pages fit on about 2TB of hard drives, roughly the storage available in an iPhone today. The archive now saves nearly 150TB a day, equivalent to hundreds of millions of web pages.
The energetic founder purchased the church building for its resemblance to the archive’s logo and as a symbol of permanence, a nod to the Library of Alexandria. “Now that place is the internet, and the Internet Archive serves the whole internet as a library,” Kahle explained.
Brewster Kahle created the archive in 1996 when a year’s worth of saved pages could fit on about 2 terabytes worth of hard drives, the amount of storage you can get today in an iPhone. — CNN
Beyond Screenshots: Preserving Digital Architecture
The Wayback Machine saves a page’s technical architecture, including its HTML, CSS and JavaScript, so the page can be replayed even if the original servers fail, according to Director Mark Graham.
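For readers who want to see what looking up a saved page might involve, here is a minimal Python sketch. It assumes the Internet Archive's public availability endpoint at https://archive.org/wayback/available and its JSON response shape, neither of which is described in the reporting above. The function names and the sample response are hypothetical and written by hand for illustration. The sketch makes no network call and does not show how the archive itself captures pages.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the Internet Archive's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build a lookup URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return the replay URL of the closest capture, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hand-written sample shaped like the assumed response; not real archive data.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20060101000000/http://example.com/",
            "timestamp": "20060101000000",
            "status": "200",
        }
    }
})

if __name__ == "__main__":
    print(availability_query("example.com", "20060101"))
    print(closest_snapshot(SAMPLE_RESPONSE))
```

The returned address points at the archive's stored copy of the page rather than the live site, which is why a capture can still load after the original server has gone away.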
With the rise of AI, the archive now captures AI-generated content such as ChatGPT responses and Google search summaries. The team is also experimenting with preserving how chatbots handle the news by posing questions to them daily and recording the output.
Global Preservation Against Political Pressures
The archive maintains global server copies as protection against disasters and political pressures. The Trump administration’s website overhaul demonstrated this need when countless government pages disappeared during transitions.
“Whole sections of the web came down,” Kahle recalled. “That’s why we have libraries to go and have the record.”
Inside the Digital Sanctuary
Most servers reside in a San Francisco warehouse, but symbolic units occupy the former church sanctuary. Kahle hopes this display helps people understand “we’re all part of the collective protection for our knowledge.”
The 200-strong team of engineers, librarians and archivists works in a space decorated with statues of employees, a reference to China’s terracotta army. Archivists digitize books page by page while livestreaming on YouTube to lo-fi music.
Around 200 people work at the archive, a mix of engineers, archivists, librarians and more. — CNN
Wikipedia editor Annie Rauwerda noted the “cyberpunk atmosphere” at a trillion-page celebration, contrasting the corporate internet with the passionate community.
Despite the museum-like feel, Kahle emphasizes this isn’t about storytelling: “It’s trying to be a resource to make it so that other people can come up with their own ideas.”