After upgrading one of my Discourse instances and restoring an old backup, I can no longer get past the initial discobot message for new users, the one where it asks you to bookmark the message. Telling it to skip the step gets no response either.
It responds just fine to other text commands, so I’m curious what mechanism is used by the bot to look for the bookmark status changing and where I should start troubleshooting.
I can see the bookmark being set in production.log, but the bot doesn’t pick up on it for some reason:
Started PUT "/posts/386/bookmark" for 127.0.0.1 at 2018-05-30 18:00:36 +0000
Processing by PostsController#bookmark as */*
Parameters: {"bookmarked"=>"true", "post_id"=>"386"}
Completed 200 OK in 72ms (Views: 0.2ms | ActiveRecord: 36.0ms)
The weirdest thing is that the entire advanced user tutorial works just fine. It looks for deleted posts, edited posts, changed notification status and so on, all of which presumably work the same way under the hood.
There seems to be a queue of bookmarked tutorial posts building up: I asked another user to try the tutorial, and their post was added to the end of the queue.
I’ve tried restoring the same database dump to a different instance, and it seems to be working fine there. The broken one is running on an isolated intranet, so I must have messed something up during the installation.
I will try redoing the installation from scratch and restoring again, but it would be nice to know more about the internal workings of the bot and how I could further troubleshoot the job that is causing the issue.
Well then, that sure was a weird root cause!
After spending some time with tcpdump I noticed my Discourse host reaching out to text-lb.esams.wikimedia.org on port 443 right after I click the bookmark icon. Our firewall drops traffic it doesn’t like instead of rejecting it, and that often causes sessions to hang for a while.
The problem was that I had “enable inline onebox on all domains” turned on. Since the next message in discobot’s script contains three Wikipedia links, it got hung up on something in the onebox handler for too long. As soon as I disabled that setting, everything worked fine and discobot responded right away.
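For anyone chasing a similar hang, here is a minimal Python sketch of how to tell a firewall that drops outbound traffic from one that rejects it. It is not part of Discourse or discobot; the function name `probe_tcp` and its `connect` parameter are made up for illustration. It assumes a connect attempt that times out was probably dropped silently, although a very slow network would look the same.

```python
import socket


def probe_tcp(host, port, timeout=3.0, connect=socket.create_connection):
    """Classify one outbound TCP connection attempt (hypothetical helper).

    Returns "open", "rejected", "dropped" or "error: <details>".
    The `connect` argument only exists so the function can be tested
    without real network access.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except socket.timeout:
        # No answer at all within the timeout: the packets were most
        # likely dropped, which is what makes callers hang.
        return "dropped"
    except ConnectionRefusedError:
        # An immediate refusal: the caller fails fast instead of hanging.
        return "rejected"
    except OSError as exc:
        # DNS failures, unreachable networks and similar problems.
        return f"error: {exc}"
    conn.close()
    return "open"
```

Running `probe_tcp("text-lb.esams.wikimedia.org", 443)` on the Discourse host would show which case applies. A "dropped" result means anything that fetches external URLs without a short timeout can stall for a long time.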
Sorry about the unsupported edge case. This installation uses the method from “How to install Discourse on an isolated CentOS 7 server”, which makes my life harder than it has to be (at least until I can convince the security guys at work that Discourse isn’t scary and that they should just let it have internet access).