Import questions in bulk

Hello

I have my questions and responses in a JSON format:

sample JSON document here
{
    "id": "x017c4h221p7T8sboHglB-7kQ==",
    "created": "2018-05-09T20:13:23Z",
    "title": "Docker i/o error",
    "body": "<p>Hey Stephane, I am having to restart docker to run Kafka everytime I exit since it errors out on tcp port binding. My understanding is that when I stop Kafka services and exit, the port is released.</p>",
    "course": {
        "_class": "course",
        "id": "x0190RCkGpZ6FMe4CPAF8aOoQ==",
        "title": "Apache Kafka Series - Learn Apache Kafka for Beginners v2",
        "url": "/apache-kafka/"
    },
    "replies": [{
            "_class": "answer",
            "id": "x01Qx2rNaX48kxP4NFFSSCK7g==",
            "created": "2018-05-10T07:04:10Z",
            "user": {
                "_class": "user",
                "id": "x01N-Fup_OULjTEtHPLwc8JSQ==",
                "name": "Ivan",
                "locale": "en_US"
            },
            "is_top_answer": null,
            "body": "<p>Hi Nandini,\u00a0</p><p>Can you please elaborate in more details what your problem is? If you stop Kafka services, ports used by Kafka\u00a0will be released, of course.</p><p>Regards,</p><p>Ivan</p>"
        },
        {
            "_class": "answer",
            "id": "x01bLG2QPhyLwZ_RJsbMge16A==",
            "created": "2018-05-10T20:45:39Z",
            "user": {
                "_class": "user",
                "id": "x01oX4mwhRQoLXKuhHXDHg3zg==",
                "name": "Nandini",
                "locale": "en_US"
            },
            "is_top_answer": null,
            "body": "<p>I stop kafka services and when i restart, docker ports are not released and I get a TCP binding error on ports 2181 and 3030</p>",
            "is_upvoted": false,
            "num_upvotes": 0
        },
        {
            "_class": "answer",
            "id": "x01yL8D1-inVZE3njAo08-uMw==",
            "created": "2018-05-11T00:32:46Z",
            "user": {
                "_class": "user",
                "id": "x01lNfqEyIqBf47SM76dxq0rw==",
                "name": "Stephane Maarek",
                "locale": "en_US"
            },
            "is_top_answer": true,
            "body": "<p>Restart the docker engine if you can, or your computer. See if that helps !\u00a0</p><p>Otherwise you have something running on port 2181. Please check the FAQ lecture (lecture 22) as a lot of students have been through these issues before </p>",
            "is_upvoted": false,
            "num_upvotes": 0
        }
    ],
    "user": {
        "_class": "user",
        "id": "x01oX4mwhRQoLXKuhHXDHg3zg==",
        "name": "Nandini",
        "locale": "en_US"
    }
}
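
For sizing the import, it can help to flatten each document into the ordered list of posts it will produce: one question followed by its replies. The sketch below does only that. The function name `flatten_thread` and the output keys (`kind`, `author`, and so on) are made up for illustration and are not Discourse API fields; it assumes every document has the shape shown above.

```python
from typing import Dict, List


def flatten_thread(doc: Dict) -> List[Dict]:
    """Turn one exported question into an ordered list of posts to create."""
    # The question itself always comes first.
    posts = [{
        "kind": "question",
        "title": doc["title"],
        "body": doc["body"],
        "created": doc["created"],
        "author": doc["user"]["name"],
    }]
    # Timestamps are ISO-8601 in UTC, so sorting the strings keeps
    # the replies in chronological order.
    replies = sorted(doc.get("replies", []), key=lambda r: r["created"])
    for reply in replies:
        posts.append({
            "kind": "answer",
            "body": reply["body"],
            "created": reply["created"],
            "author": reply["user"]["name"],
        })
    return posts
```

Each document yields one post per question plus one per reply, so the sample above would need four API calls.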

I have written a Python script that calls the API, but I’m getting a lot of throttling errors from it:

You’ve performed this action too many times. Please wait X seconds before trying again.

I have to import about 3000 questions in total (each with an average of 2 answers), so I feel the API route might take too long.

Is there a way to disable this API throttling?

Is there any other way I can import my data? The posts are only linked to two users (so there is no need to create users). I’m also using a hosted Discourse, so I don’t know if I have direct access to the underlying DB.

Happy to share the Python code I have, or open a bounty if this requires a lot of effort

Make sure the user performing the actions via the API key has staff privileges, even if only temporary.

That will help with some of the rate limits.

An easy approach I used was to add
sleep(0.7)
inside the Ruby loop (you may need to tweak that value).

For 3000 requests, that would take about 35 minutes to complete. A bit painful, but for a one-off I don’t think it would be all that bad.
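
Since the original script is in Python, here is a rough Python equivalent of the same idea: pause after every request and estimate the total run time up front. This is only a sketch. `post_all`, `estimate_minutes`, and the `send` callback are hypothetical names; `send` stands in for whatever function actually performs the API call.

```python
import time
from typing import Callable, Iterable, List


def estimate_minutes(num_requests: int, delay_seconds: float) -> float:
    """Lower bound on the run time: only the pauses are counted."""
    return num_requests * delay_seconds / 60


def post_all(payloads: Iterable[dict],
             send: Callable[[dict], object],
             delay_seconds: float = 0.7,
             sleep: Callable[[float], None] = time.sleep) -> List[object]:
    """Send payloads one at a time, pausing after each request."""
    results = []
    for payload in payloads:
        results.append(send(payload))
        sleep(delay_seconds)  # fixed pause to stay under the rate limit
    return results
```

`estimate_minutes(3000, 0.7)` gives the 35 minutes mentioned above. If each of the 3000 questions also has about 2 answers, that is roughly 9000 requests, or about 105 minutes at the same delay.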


I have admin permissions (using my own API key) and I still seem to get throttled every 60 API calls. I tried changing things under Settings > Rate limits, but that doesn’t seem to help.


I get that as well.

If I artificially add a 1-second delay, I don’t get it anymore.
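
An alternative to a fixed delay is to wait only when the throttle actually fires, using the wait time from the error message quoted earlier ("Please wait X seconds before trying again."). The sketch below assumes the throttled response comes back with HTTP status 429 and that the message contains the number of seconds as digits; both are assumptions, and `send_with_retry` and `parse_wait_seconds` are hypothetical helper names.

```python
import re
import time
from typing import Callable, Optional, Tuple

# Matches the "wait X seconds" part of the throttling message.
WAIT_PATTERN = re.compile(r"wait (\d+) seconds?")


def parse_wait_seconds(message: str) -> Optional[int]:
    """Extract the suggested wait from the error text, if present."""
    match = WAIT_PATTERN.search(message)
    return int(match.group(1)) if match else None


def send_with_retry(send: Callable[[], Tuple[int, str]],
                    max_attempts: int = 5,
                    default_wait: int = 1,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it is not throttled or attempts run out.

    send() must return (status_code, response_text).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Assumption: 429 is the status used for throttled requests.
        if status != 429 or attempt == max_attempts:
            return status, body
        wait = parse_wait_seconds(body)
        sleep(wait if wait is not None else default_wait)
    return status, body
```

This keeps the import running at full speed between throttles, at the cost of a few failed requests each minute.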

Okay, the limit of 60 calls per minute can be avoided as follows:

On the server:

cd /var/discourse

Open containers/app.yml in an editor (I use vi) and add the following line to the env section:

DISCOURSE_MAX_ADMIN_API_REQS_PER_KEY_PER_MINUTE : 6000

Save the file.

Rebuild (a restart alone won’t have any effect):
./launcher rebuild app
