Arctic Code Vault

The GitHub Arctic Code Vault is a data repository preserved in the Arctic World Archive (AWA), a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain.

The archive is located in a decommissioned coal mine in the Svalbard archipelago, closer to the North Pole than the Arctic Circle. GitHub captured a snapshot of every active public repository on 02/02/2020 and preserved that data in the Arctic Code Vault.

How the cold storage will last 1,000 years

Svalbard is regulated by the international Svalbard Treaty as a demilitarized zone. Home to the world’s northernmost town, it is one of the most remote and geopolitically stable human habitations on Earth.

The Arctic World Archive is a joint initiative between the Norwegian state-owned mining company Store Norske Spitsbergen Kulkompani (SNSK) and the very-long-term digital preservation provider Piql AS. AWA is devoted to archival storage in perpetuity. Data is written to film reels, which are stored in a steel-walled container inside a sealed chamber within the decommissioned coal mine. The AWA already preserves historical and cultural data from Italy, Brazil, Norway, the Vatican, and many others.

Svalbard is affected by climate change, but warming is likely to reach only the outermost few meters of permafrost in the foreseeable future and is not expected to threaten the stability of the mine. The mine’s proximity to the famous Global Seed Vault, only a mile away, reinforces Svalbard’s status as a stable, very-long-term archive site for humanity’s collective knowledge.

02/02/2020 snapshot and code deposit

The 02/02/2020 snapshot archived in the GitHub Arctic Code Vault swept up every active public GitHub repository. It included every repo with any commits between the announcement at GitHub Universe on November 13, 2019 and 02/02/2020; every repo with at least one star and any commits in the year before the snapshot; and every repo with at least 250 stars. (It also included gh-pages for those repositories.) The snapshot consists of the HEAD of the default branch of each repository, minus any binaries larger than 100KB. (Repos with 250+ stars retained their binaries.) Each repository was packaged as a single TAR file.
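The selection rules above can be read as a simple predicate over a repository's star count and most recent commit date. The following Python is only an illustrative sketch of that reading, not the process GitHub actually ran. The `Repo` class, the function names, the 365-day interpretation of "the year before the snapshot", the treatment of 100KB as 100,000 bytes, and the 2019 announcement year (inferred from the snapshot date) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Dates from the text; the announcement year is inferred, not stated in the source.
ANNOUNCEMENT = date(2019, 11, 13)
SNAPSHOT = date(2020, 2, 2)
# Assumption: "the year before the snapshot" means the preceding 365 days.
YEAR_BEFORE = SNAPSHOT - timedelta(days=365)
# Assumption: 100KB is treated as 100,000 bytes.
BINARY_LIMIT_BYTES = 100 * 1000


@dataclass
class Repo:
    """Hypothetical record for a public repository."""
    name: str
    stars: int
    # Most recent commit on or before the snapshot date, or None if there is none.
    last_commit: Optional[date]


def in_snapshot(repo: Repo) -> bool:
    """Return True if the repo meets any of the three stated inclusion rules."""
    # Rule 3: at least 250 stars, regardless of activity.
    if repo.stars >= 250:
        return True
    if repo.last_commit is None:
        return False
    # Rule 1: any commit between the announcement and the snapshot.
    if ANNOUNCEMENT <= repo.last_commit <= SNAPSHOT:
        return True
    # Rule 2: at least one star and a commit in the year before the snapshot.
    return repo.stars >= 1 and YEAR_BEFORE <= repo.last_commit <= SNAPSHOT


def keep_file(repo: Repo, size_bytes: int, is_binary: bool) -> bool:
    """Binaries over the limit are dropped unless the repo has 250+ stars."""
    if not is_binary or repo.stars >= 250:
        return True
    return size_bytes <= BINARY_LIMIT_BYTES
```

Under these assumptions, an unstarred repo last committed to in December 2019 qualifies, while an unstarred repo last touched in mid-2019 does not.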

For greater data density and integrity, most data was compressed and stored QR-encoded. A human-readable index and guide found on every reel explains how to recover the data. The 02/02/2020 snapshot, consisting of 21TB of data, was archived to 186 reels of film by our archive partner Piql and then transported to the Arctic Code Vault, where it resides today.

Arctic Code Vault Badge

Millions of developers around the world contributed to the open source software now stored in the Arctic Code Vault. To recognize and celebrate these contributions, we designed the Arctic Code Vault Badge, which is displayed in the highlights section of a developer’s profile on GitHub. Hover over the badge to see some of the repositories that developer contributed to.

GitHub Arctic Code Vault Svalbard 2019