Re-purposing a Discourse installation for a yearly event

It’s time to update this topic! I just finished the process of archiving all topics. I went with this suggestion:

Of course, the devil lies in the details, so here are all the details, for anyone interested. Note that I’m not claiming this is the perfect way to do it (I know it’s not), I just want to document this as a starting point for others. Prepare for a wild ride :wink:


Locking out all users

The first step is to prevent all those pesky users from getting in again. I did this by configuring my SSO provider to only allow system admins to log in. How to do this depends on the SSO provider. Either way, it’s good to leave yourself a way to log in :wink:

Deactivated users also cannot reply by mail :thumbsup:

The next step is to disable and lock out all users. Unfortunately, admins and moderators cannot be disabled (Why?), so this also means revoking all their access rights first. Oh, and revoking nonexistent access rights fails with a 403 error (Why?), so the script has to tolerate that.

Of course, doing this for hundreds of users isn’t fun, so let’s talk to the API via some JavaScript:

var request = require('request');
var q = require('q');
var _ = require('lodash');

var processRequest = function (requestData, allowForbidden) {
    var deferred = q.defer();

    var req = request(requestData.request, function (error, response, body) {
        if (error) {
            console.log("Got error: " + error.message);
            deferred.reject(error.message);
        } else {
            var status = response.statusCode;

            if (status === 200 || (status === 403 && allowForbidden)) {
                deferred.resolve(body);
                process.stdout.write(".");
            } else if (status === 429 || status === 502) {
                // rate limited
                deferred.resolve(q.delay(Math.random() * 10000).then(() => {
                    return processRequest(requestData, allowForbidden);
                }));
            } else {
                console.log(`Got error ${response.statusCode} from server when requesting ${requestData.request.uri}: ${body}`);
                deferred.reject(response.statusCode);
            }
        }
    });

    return deferred.promise;
};

var buildRequestBuilder = function (domain, apiKey) {
    return (method, fragment, headers, body) => ({
        request: {
            port: 443,
            uri: `https://${domain}/${fragment}?api_key=${apiKey}&api_username=system`,
            method,
            headers: headers || {},
            body: body
        }
    });
};

var requestBuilder = buildRequestBuilder('example.com', '■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■');

var logOutUser = function (id) {
    return processRequest(requestBuilder('POST', `admin/users/${id}/log_out`));
};

var deactivateUser = function (id) {
    // don't mess with system; still return a promise so callers can chain on it
    if (id < 0) return q.resolve();

    // return every step, so the promise settles only after the user has been
    // deactivated and a failure in any step propagates to the caller
    return logOutUser(id)
        .then(() => processRequest(requestBuilder('PUT', `admin/users/${id}/revoke_admin`), true))
        .then(() => processRequest(requestBuilder('PUT', `admin/users/${id}/revoke_moderation`), true))
        .then(() => processRequest(requestBuilder('PUT', `admin/users/${id}/deactivate`)));
};

This code retries a request after a random delay of up to ten seconds if the server rejects it due to rate limiting (status 429) or answers with a 502.
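If the server keeps answering 429, one possible refinement of the retry above (not part of the script I actually ran) is exponential backoff with jitter: the upper bound of the wait doubles with every attempt, and the actual delay is a random value below that bound. A minimal sketch; `backoffDelay`, its parameters, and the default values are made up for illustration:

```javascript
// Hypothetical helper, not part of the original script.
// Returns a wait time in milliseconds for the given retry attempt (0-based).
// The upper bound doubles per attempt and is capped at maxMs; the delay is a
// random value below that bound.
function backoffDelay(attempt, baseMs, maxMs, random) {
    baseMs = baseMs || 1000;        // assumed first bound: 1 second
    maxMs = maxMs || 60000;         // assumed cap: 1 minute
    random = random || Math.random; // injectable for testing

    var bound = Math.min(maxMs, baseMs * Math.pow(2, attempt));
    return Math.floor(random() * bound);
}
```

To use it, `processRequest` would need an extra attempt counter that is incremented on each retry and passed along, for example `q.delay(backoffDelay(attempt))`.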

Now, we have a function that can log out, revoke admin, revoke moderation and disable a user, given the user ID. Where do we get the IDs? That’s a case for the Data Explorer:

SELECT id
FROM users

The results can be downloaded as JSON (did I say that I :heart: the Data Explorer plugin?), which I just bind to a variable named input. Now, we can work with that:

_.each(input.rows, (row) => {
    var userId = row[0];
    deactivateUser(userId);
});

The result is a program that writes a lot of dots to the console, and logs out and disables all users. Hooray! Oh, it also logs you out. Did I say you should keep a way for you to log in? :wink:
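The loop above fires all requests at once and leaves the pacing to the retry logic. A gentler variant, sketched here with native promises instead of q and lodash, handles one user after the other. It assumes the downloaded JSON has the shape the loop above already relies on (a `rows` array whose entries hold the ID in the first column). `idsFromExport` and `runSequentially` are names made up for this sketch, and the sample data is invented:

```javascript
// Extract the first column of every row from a Data Explorer JSON export.
// Assumed shape (as used by the loop above): { "rows": [[-1], [1], [2]] }
function idsFromExport(jsonText) {
    var exported = JSON.parse(jsonText);
    return exported.rows.map(function (row) { return row[0]; });
}

// Call worker(item) for each item, starting the next call only after the
// promise returned by the previous one has resolved. A rejection stops the
// chain, so a worker that should never abort the run has to catch its errors.
function runSequentially(items, worker) {
    return items.reduce(function (chain, item) {
        return chain.then(function () { return worker(item); });
    }, Promise.resolve());
}

// Demo with a stand-in worker; in the real script this would be deactivateUser.
var sample = '{"rows": [[-1], [1], [2]]}';
var processed = [];
var done = runSequentially(idsFromExport(sample), function (id) {
    processed.push(id);
    return Promise.resolve();
});
```

With a real export, the JSON text could come from something like `require('fs').readFileSync('users.json', 'utf8')`, where the file name is just an example.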

Preparing the archive

Create one category for the archive. Allow no-one to write to it, and only your staff to read it. If you want different permissions for some topics, use sub-categories. Also, enable tagging if it hasn’t been enabled yet. You’ll probably want to lock it down so no-one can apply the tags you want to use for archiving. Easy-peasy!

Tagging and moving topics

Next, we need to move topics and tag them. Brace for more JavaScript:

var addTagsToTopic = function (topicId, newTags) {
    return processRequest(requestBuilder('GET', `t/${topicId}.json`)).then((topicJson) => {
        var topic = JSON.parse(topicJson);
        var slug = topic.slug;
        var tags = topic.tags || [];
        tags = _.uniq(_.concat(tags, newTags));

        var tagsEncoded = '';
        _.each(tags, (tag) => {
            if (tagsEncoded) tagsEncoded += '&';
            tagsEncoded += 'tags%5B%5D=' + encodeURIComponent(tag);
        });

        return processRequest(requestBuilder('PUT', `t/${slug}/${topicId}`, { 'Content-type': 'application/x-www-form-urlencoded' }, tagsEncoded));
    });
};

var moveTopicToCategory = function (topicId, categoryId) {
    return processRequest(requestBuilder('GET', `t/${topicId}.json`)).then((topicJson) => {
        var topic = JSON.parse(topicJson);
        var slug = topic.slug;

        return processRequest(requestBuilder('PUT', `t/${slug}/${topicId}`, { 'Content-type': 'application/x-www-form-urlencoded' }, `category_id=${categoryId}`));
    });
};

Both actions require the slug (Why?), so this code retrieves the slug via the API before continuing. Yuck!
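As a side note, the form body that `addTagsToTopic` assembles in a loop can also be built with `map` and `join`, which makes the `tags[]` encoding a bit easier to see. A small sketch using only built-in functions; `encodeTags` and `mergeTags` are made-up names, not part of the script above:

```javascript
// Build an application/x-www-form-urlencoded body of the form
// tags%5B%5D=first&tags%5B%5D=second, where %5B%5D is the encoded "[]".
function encodeTags(tags) {
    return tags
        .map(function (tag) { return 'tags%5B%5D=' + encodeURIComponent(tag); })
        .join('&');
}

// Merge existing and new tags without duplicates, keeping first occurrences.
// A topic without tags may have no tags property at all, hence the fallback.
function mergeTags(existing, added) {
    return Array.from(new Set((existing || []).concat(added)));
}
```

Inside `addTagsToTopic`, the request body would then be `encodeTags(mergeTags(topic.tags, newTags))`.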

Next up, let’s get the topic IDs for a category. Data Explorer (:heart:) to the rescue:

-- [params]
-- string :category = 

SELECT id
FROM topics
WHERE archetype = 'regular'
    AND category_id = (
        SELECT id
        FROM categories
        WHERE name = :category
    )

I hope you don’t have two categories with the same name like we do ¯\_(ツ)_/¯

So let’s grab the category ID where the topics should go (hint: Go to the category page and add .json to the URL), think of some tag names, and get moving!

_.each(input.rows, (row) => {
    var topicId = row[0];
    addTagsToTopic(topicId, ["archiv-2015", "archiv-intern"]).then(() => {
        return moveTopicToCategory(topicId, 17);
    });
});

This will likely throw some errors, because deleted topics apparently either cannot be moved or cannot be tagged. Why, oh why? :crying_cat_face:
I decided to ignore that (for now), and just watch the error messages scroll through.
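Instead of watching the errors scroll by, the failed topic IDs could also be collected and printed once at the end. A sketch, assuming a Node.js version that provides `Promise.allSettled` (12.9 or later); `collectFailures` is a made-up helper, and the worker passed in stands for whatever returns the promise chain for one topic:

```javascript
// Run worker(id) for every ID and resolve with a list of { id, reason }
// entries for the ones that failed. Successful IDs are not reported.
function collectFailures(ids, worker) {
    var attempts = ids.map(function (id) {
        // Start from a resolved promise so a synchronous throw inside the
        // worker also counts as a failure instead of aborting the run.
        return Promise.resolve().then(function () { return worker(id); });
    });

    return Promise.allSettled(attempts).then(function (results) {
        var failures = [];
        results.forEach(function (result, index) {
            if (result.status === 'rejected') {
                failures.push({ id: ids[index], reason: result.reason });
            }
        });
        return failures;
    });
}

// Demo with a stand-in worker that fails for topic 2.
var failedTopics = collectFailures([1, 2, 3], function (topicId) {
    return topicId === 2 ? Promise.reject(404) : Promise.resolve();
});
```

With the functions from above, the worker would be something like `(topicId) => addTagsToTopic(topicId, tags).then(() => moveTopicToCategory(topicId, 17))`. Since `processRequest` rejects with the HTTP status code, `reason` would then hold that code.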

Rinse and repeat with all your categories! Or write more code to automate requesting the IDs and build a tag name. I decided to do that part manually, so you’re on your own. :blush:
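For the tag-name part, something along these lines might be a starting point. It is only a sketch: the `archiv-` prefix mirrors the tags used above, but the slug rules are an assumption and may not match what your Discourse accepts as a tag name, and `archiveTagsFor` is a made-up helper:

```javascript
// Hypothetical helper: derive the two archive tags for a category.
// Note that the slug step only keeps a-z and 0-9, so umlauts and other
// non-ASCII letters are replaced by hyphens as well.
function archiveTagsFor(categoryName, year) {
    var slug = categoryName
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')  // collapse every other run into one hyphen
        .replace(/^-+|-+$/g, '');     // trim leading and trailing hyphens

    return ['archiv-' + year, 'archiv-' + slug];
}
```

The result can be passed straight to `addTagsToTopic` as its second argument.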

Clean up

All of this mostly worked, but some cleanup was needed.

First of all, some topics were already about organizing the next event, so I manually un-tagged them and moved them back.

Also, category description topics are special: Trying to move them fails silently (Why?). I just un-tagged them manually.

Results

Now, the result is a pristine forum with a staff-readable, searchable archive.
After re-enabling SSO (with the new event), old staff users who stayed with you can log in via SSO again, which also gives them back moderator privileges. You’ll also find that they cannot access the archive, get infuriated when you can’t figure out why, then dig up an old bug report by you :wink:


I hope this helps others in a similar situation. If you have any questions, feel free to ask.

Maybe the :discourse: team can help reduce the number of occurrences of Why? and Why, oh why? above :wink:
