5 comments

  • uniqueuid 7 hours ago
    Since it seems unclear what they do and why it matters:

    CLOCKSS seems to be an organization designed to make sure scientific content does not disappear Library of Alexandria-style.

    The most important task here is staying legally safe, which is why they emphasize Ivy League credentials, their distributed nature, audits and so on. Technically it's not really difficult (except perhaps for dealing with publisher captchas, heh).

    They are legally safe because of this mechanism (rough sketch at the end of this comment):

    > Digital content is stored in the CLOCKSS archive with no user access unless a “trigger” event occurs.

    All in all I think it's absolutely necessary.
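
    A minimal sketch in Python of that "dark archive" gate. The names and the trigger event below are hypothetical, purely to illustrate the access rule quoted above, not the actual CLOCKSS implementation:

      # Illustrative only, not CLOCKSS's actual code: items sit dark until a trigger event.
      from dataclasses import dataclass

      @dataclass
      class ArchivedItem:
          doi: str
          content: bytes
          triggered: bool = False  # flipped only after an approved "trigger" event

      def read(item: ArchivedItem) -> bytes:
          # No reader access while the item is in the dark archive.
          if not item.triggered:
              raise PermissionError("dark archive: no user access before a trigger event")
          return item.content

      paper = ArchivedItem(doi="10.1234/example", content=b"<article/>")
      paper.triggered = True   # e.g. the publisher has permanently ceased operations
      print(read(paper))       # the content is now openly readable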

  • zvr 1 hour ago
    I remember when the first LOCKSS papers appeared: the idea of "Lots Of Copies Keep Stuff Safe" was so obvious once explained. The example used at the time was "can you imagine that we lose the Harry Potter book? No, since it's everywhere!"
  • yuvadam 7 hours ago
    I've read several pages on their website and still have no idea what this is.

    The ability of the internet to collectively archive content that is important to humanity as a whole - in fully distributed and legally questionable ways - is much more impressive IMO.

    • millicentricism 7 hours ago
      LOCKSS is short for "Lots Of Copies Keep Stuff Safe."

      This is for digital material where perpetual usage rights have been granted, but the original source might not be available. It’s also used to make sure that journal content from countries that censor remains available.

      This is all mainly for libraries and their online collections.

    • phrotoma 7 hours ago
      It looks like the core value prop is that it will publish documents which, for one reason or another, stop being available from the original source.
    • sciurus 7 hours ago
      See https://clockss.org/about/how-clockss-works/ for an overview of the implementation.
  • lyu07282 4 hours ago
    This would also be very helpful for datasets; we need to preserve those indefinitely in order to be able to run meaningful benchmarks in the future. Some datasets will only contain references to the source material and labels, and each researcher is then supposed to assemble the actual dataset from those references.
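
    A rough sketch in Python of that assembly step, assuming a hypothetical manifest file with url,label columns (not any particular dataset's actual format):

      # Illustrative only: a hypothetical manifest whose rows are url,label pairs.
      import csv
      import urllib.request

      def assemble(manifest_path: str) -> list[tuple[bytes, str]]:
          """Download each referenced document and pair it with its label."""
          examples = []
          with open(manifest_path, newline="") as f:
              for row in csv.DictReader(f):
                  try:
                      with urllib.request.urlopen(row["url"], timeout=30) as resp:
                          examples.append((resp.read(), row["label"]))
                  except OSError:
                      # If the source has disappeared, this example is lost for good.
                      print("source gone, example lost:", row["url"])
          return examples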