The Internet Archive is an absolute treasure with a gigantic task ahead of them. They have now set their sights on vinyl LPs and started the work of digitising and archiving these recordings.
Earlier this year, the Internet Archive began working with the Boston Public Library (BPL) to digitise more than 100,000 audio recordings from their sound collection. The recordings exist in a variety of historical formats, including wax cylinders, 78 rpm records, and LPs. They span musical genres including classical, pop, rock, and jazz, and contain obscure recordings such as an album of music for baton twirlers and a record of radio’s all-time greatest bloopers.
Since all of the information on an LP is printed, the digitisation process must begin by cataloging data. High-resolution scans are taken of the cover art, the disc itself and any inserts or accompanying materials. The record label, year recorded, track list and other metadata are supplemented and cross-checked against various external databases.
The Archive is partnering with Innodata Knowledge Services, who digitise the LPs in their facility in Cebu, Philippines. Every album is set up and turned over by hand, and each side is recorded at normal speed.
Once recorded, each side of the LP exists as one large FLAC file, which needs to be segmented so listeners can easily begin at the desired song. Two different algorithms are used for segmenting. The first looks at images of the vinyl disc to locate gaps in its grooves, which usually line up with gaps between songs. The second listens to the audio file to find the silent spaces between songs. When the two algorithms agree, the Archive's engineers have a good measure of confidence that the machine has found the proper tracks.
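The Archive has not published how either algorithm works, so what follows is only a rough sketch of the general idea, not their implementation. It assumes the audio for one side has already been decoded into a list of samples between -1.0 and 1.0, and that the image-based step has produced a list of gap times in seconds. The function names, the RMS threshold, the frame size and the tolerance are all hypothetical values chosen for illustration.

```python
import math


def find_silent_gaps(samples, sample_rate, frame_ms=50, threshold=0.01, min_gap_s=1.0):
    """Return (start_s, end_s) spans where the frame RMS stays below threshold.

    All parameter defaults are illustrative guesses, not values from the Archive.
    Silence that runs to the end of the side is ignored, since it is not a gap
    between two songs.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    gaps = []
    gap_start = None
    for i in range(len(samples) // frame_len):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        t = i * frame_len / sample_rate  # start time of this frame in seconds
        if rms < threshold:
            if gap_start is None:
                gap_start = t  # a quiet stretch begins here
        elif gap_start is not None:
            if t - gap_start >= min_gap_s:
                gaps.append((gap_start, t))  # quiet stretch was long enough
            gap_start = None
    return gaps


def agreed_boundaries(audio_gaps, groove_gap_times, tolerance_s=2.0):
    """Keep the midpoint of each audio gap that has a groove gap nearby.

    groove_gap_times is assumed to come from a separate image-analysis step
    that is not shown here.
    """
    boundaries = []
    for start, end in audio_gaps:
        mid = (start + end) / 2
        if any(abs(mid - g) <= tolerance_s for g in groove_gap_times):
            boundaries.append(mid)
    return boundaries
```

A track boundary is kept only when both sources point to roughly the same moment, which mirrors the idea of trusting the result when the two algorithms line up. Real records have surface noise, so a practical version would likely need a threshold relative to the noise floor rather than a fixed one.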
(Taken from: https://kottke.org/19/11/the-internet-archive-is-now-working-to-preserve-vinyl-lps)