Two thin strands of glass. When combined, these two strands of glass are so thin they still wouldn’t fill a drinking straw. That’s known in tech circles as a “fiber pair,” and these two thin strands of glass carry all the information of the world’s leading archive in and out of our data centers. When you think about it, it sounds kind of crazy that it works at all, but it does. Every day. Reliably.
Except this past Monday night, here in California…
On Monday, June 24, the real world had other ideas. As a result, the Internet Archive was down for about 15 hours. For Californians, this was less of a big deal: those 15 hours stretched from mid-Monday evening (9:11pm on the US West Coast) to 11:51am on Tuesday. Many Californians were asleep during several hours of that time. But in the Central European time zone (e.g. France, Germany, Italy, Poland, Tunisia), that fell on early Tuesday morning (06:11) to mid-Tuesday evening (20:51). And in the entire country of India, it was late Tuesday morning (09:41) to just after midnight on Wednesday (00:21).
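If you'd like to check the time zone math yourself, here is a minimal Python sketch. It rests on a few assumptions that aren't spelled out above: the year is taken to be 2019, and the zones are modeled as fixed summer offsets (PDT at UTC-7, CEST at UTC+2, IST at UTC+5:30) rather than looked up in a time zone database. The `local_window` helper is a name made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets (assumption: the summer offsets in effect in late June).
PDT = timezone(timedelta(hours=-7), "PDT")
CEST = timezone(timedelta(hours=2), "CEST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")

# Outage window in Pacific time. The year 2019 is an assumption.
start = datetime(2019, 6, 24, 21, 11, tzinfo=PDT)
end = datetime(2019, 6, 25, 11, 51, tzinfo=PDT)


def local_window(zone):
    """Return the outage start and end as 'Day HH:MM' strings in the given zone."""
    fmt = "%a %H:%M"
    return start.astimezone(zone).strftime(fmt), end.astimezone(zone).strftime(fmt)


# Exact length of the outage as a timedelta.
duration = end - start

for zone in (PDT, CEST, IST):
    print(zone.tzname(None), *local_window(zone))
print("duration:", duration)
```

The exact duration works out to 14 hours and 40 minutes, which rounds to the 15 hours quoted here.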
Here’s what we do know: on Monday, June 24 at 9:11 pm PDT, at a spot about one kilometer from our Richmond, CA data center, one strand of fiber broke. We don’t know what caused it to break. But we do know that the quietly heroic and unheralded Core Infrastructure Team at the Internet Archive sprang into action. They quickly figured out we needed to call our fiber vendor for help. Through the night, they worked with our vendor to fix it. Before noon the next day, our million+ patrons were back listening to the Grateful Dead, reading Proust, and playing Oregon Trail at archive.org. It’s crazy how much inconvenience such a tiny tear in the filament can cause. We love this library, so it breaks our hearts to see it offline for 15 hours. To our patrons: we apologize for the disruption in service.
“Move fast and break things” is the tired mantra of Big Tech. Last year, Cory Doctorow pointed out what was wrong with this Big Tech ethic in his DWeb Summit 2018 closing keynote, and of course, Randall Munroe put it very succinctly in a simple xkcd comic:
There are professions where it’s just not cool to glorify breaking things. At the Internet Archive, we run a library. We work hard to preserve things, and make those things available to the world.
As a non-profit library, we make the most of those two little strands of glass, and get really upset when one of them breaks because we need both fiber strands to make a working connection between our data centers. We get especially upset because we haven’t buried a backup pair of fiber strands, because that would be expensive. Those drinking straws of fast data cost a lot of money. Still, maybe we should splurge on a second drinking straw. No promises, but a donation to Internet Archive might make us more likely to splurge.
Most of you probably don’t know what an “optical transceiver module” is, or the relative performance characteristics of a 1.25mm fiber strand versus a 2.5mm fiber strand, but I’m hoping at least one or two of you do. If that describes you, I’m guessing you are frustrated by all of this vague chatter about drinking straws to describe a fiber cut. Or maybe you don’t know anything about fiber strands, but you know all about the problems caused when a faulty NIC causes a broadcast storm that brings an office network to its knees. If that’s you, we want to talk to you. And you probably want to talk to us.
The Core Infrastructure Team at the Internet Archive is the small and knowledgeable unit that really understands what happened. They can provide much more interesting details, and they want to tell the story. I help manage this team and I plan to help them tell their story. But I also want to give them time to get their jobs done and give them a little time to rest. And I need to hire someone to help them out: an Operations Engineer, a role we are hiring for now. If you understand any of the jargon above, please apply! You’ll really get our attention if you tell us in your cover letter why you want to learn more about the fiber cut.
We plan to provide a much more detailed summary of our fiber cut story by Friday, July 12th. Watch this blog to learn more. In the meantime, thank you for your patience, your loyalty, and your understanding that at the Internet Archive, we don’t glorify breaking things; we fix them.