Now that the software is running with (at least for me) a low level of jank, it seems worth considering what we do with the years of accumulated sneer-strata over at the old place. Just speaking for myself, I think it would be nice if we had a static-site backup of the whole shindig. Unfortunately, since I’m a physicist by trade, anything I do with webstuff tends to involve starting from scratch with compass, straightedge and wget. There’s got to be a better method of archiving.
The other, not-mutually-exclusive option I can think of is to manually rerun “SneerClub classics”, the posts that one way or another helped define what sneering is all about.
N.B. Some of the test posts made today involved writing more on the serious-discussion side and have accordingly been marked NSFW.
Sounds like a good plan.
some work in progress on this is available here. the SneerClub directory is the output of the bulk downloader for all 1000 (deduplicated) posts it could grab from each of SneerClub’s hot, top, new, rising, and controversial tabs, and the jsonl files are just the ones you posted decompressed for convenience. so far I’m just using jq to process the data sets
SneerClub has 1940 posts with nested comments and attached media where the downloader could parse it; the archive team files have 3851 posts and 100149 comments in a (much less convenient) flattened format without media. both sets have a few posts from 2015, so I’ll need to do more looking to see how much we’ve salvaged overall
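for anyone who'd rather poke at the jsonl files without jq, here is a minimal Python sketch of the same kind of tally: count the records, find the earliest timestamp, and collect ids so the two sets can be compared. the field names `id` and `created_utc` are assumptions based on the usual shape of reddit dumps and may not match these files, and `summarize_jsonl` is just a name made up for this example.

```python
import json
from datetime import datetime, timezone


def summarize_jsonl(lines):
    """Tally an iterable of JSONL lines (one JSON object per line).

    Assumes each record may carry "id" and "created_utc" fields; both
    names are guesses, not confirmed against the actual archive files.
    """
    count = 0
    earliest = None
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        count += 1
        if "id" in record:
            ids.add(record["id"])
        ts = record.get("created_utc")
        if ts is not None:
            ts = float(ts)  # some dumps store the timestamp as a string
            if earliest is None or ts < earliest:
                earliest = ts
    earliest_dt = (
        datetime.fromtimestamp(earliest, tz=timezone.utc)
        if earliest is not None
        else None
    )
    return {"count": count, "earliest": earliest_dt, "ids": ids}


# Usage with a hypothetical file name:
# with open("posts.jsonl", encoding="utf-8") as f:
#     summary = summarize_jsonl(f)
# print(summary["count"], summary["earliest"])
```

if both sets turn out to share an id scheme, the size of the union of the two `ids` sets would give a rough figure for how many distinct posts were salvaged overall; if the ids are formatted differently, they would need normalizing first.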