Error "Killed" while running rake posts:rebake

On a new Ubuntu 18 host, I’m having a problem with this step:

# cd /var/discourse/
# ./launcher enter app
# rake posts:rebake
Rebaking post markdown for 'default'
        5 / 2950 (  0.2%)/usr/local/bin/rake: line 2:  6788 Killed                  RAILS_ENV=production sudo -H -E -u discourse bundle exec bin/rake "$@"
# echo $?
137

While troubleshooting, I ran ./launcher rebuild all and then the rebake command again, and got a bit further:

# rake posts:rebake
Rebaking post markdown for 'default'
      503 / 2950 ( 17.1%)/usr/local/bin/rake: line 2:  4588 Killed                  RAILS_ENV=production sudo -H -E -u discourse bundle exec bin/rake "$@"
# echo $?
137

What would cause the rebake to fail? Lack of memory? This is a 2GB host, though, and there are only 2600 posts.

EDIT - I’ve removed everything about missing images… and re-asked that here.

You shouldn’t need to rebake if you kept using the same domain.

Ok, so now I have two problems :joy:

I can’t rebake, and I do have a lot of (all?) images showing as broken.

I am 99.9% certain that the backups were set to include images…

I see files in shared/standalone/uploads/default/original/1X/ (etc.) but none in default/238/*

Do you still have both servers up? Can you compare folders? Maybe rsync between old and new and see the differences?
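
For anyone who does still have both hosts, a dry-run rsync is one way to list what differs without copying anything. This is only a sketch: list_missing is a made-up helper name, and oldhost in the comment is a placeholder for wherever the old uploads live.

```shell
# List files that exist (or differ) under the first directory compared with the second.
# Nothing is copied: -n makes this a dry run. -r recurses, -c compares by checksum,
# and --out-format='%n' prints just the path of each item rsync would transfer.
# list_missing is a hypothetical helper name.
list_missing() {
  rsync -rnc --out-format='%n' "$1"/ "$2"/
}

# Example with a hypothetical old host (adjust host and paths to your setup):
# list_missing oldhost:/var/discourse/shared/standalone/uploads /var/discourse/shared/standalone/uploads
```

Anything it prints is present on the source side but missing or different on the destination.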

The old server is gone (the process seemed to go so smoothly, I didn’t think to check images before shutting the old one down)

Would rebaking re-create the missing images?

Any thoughts as to why it’s failing?

This evening I rebooted the host (though its uptime was under 2 weeks), and the rebake succeeded:

# rake posts:rebake
Rebaking post markdown for 'default'
     2950 / 2950 (100.0%)
2950 posts done!
-------------------------

But, as you suggested, this didn’t bring back the images. I’ll start a new thread for that.

As for this issue (not being able to rebake ~2600 posts), I’ll note that this thread seems useful:

EDIT: As for the initial issue I was trying to solve, this thread indicates that recover_from_tombstone is needed in addition to the rebuild, but that hasn’t worked yet.

I tried recover_from_tombstone. The command exited without warnings, and so I’m trying to rebuild.

That process fails at random places. I’ve tried rebooting the host and rebuilding the app, and it still fails anywhere from 5% to 55% of the way through the rebake process. Note that there are only ~2950 posts… pretty small by Discourse’s standards.

I am running a number of plugins; I’m not sure if that’s related:

      - git clone https://github.com/discourse/docker_manager.git
      - git clone https://github.com/cpradio/discourse-plugin-checklist.git
      - git clone https://github.com/discourse/discourse-assign.git
      - git clone https://github.com/discourse/discourse-data-explorer.git
      - git clone https://github.com/discourse/discourse-feature-voting.git
      - git clone https://github.com/discourse/discourse-push-notifications.git
      - git clone https://github.com/discourse/discourse-chat-integration.git
      - git clone https://github.com/discourse/discourse-solved.git
      - git clone https://github.com/discourse/discourse-staff-notes.git
      - git clone https://github.com/watchmanmonitoring/discourse-saml.git # cloned from discourse, to allow for more settings.

I’ve tried a couple of times, and still had errors.

Since I posted, the missing images were fixed, and I’ve rebuilt the container. The rebake still failed, but after a reboot of the host I was able to get the rebake to succeed.

# ./launcher enter app
root@community-2019-app:/var/www/discourse# rake posts:rebake
Rebaking post markdown for 'default'
     2976 / 2976 (100.0%)
2976 posts done!

FYI this nearly always means “ran out of memory or some other hard limit”
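
The 137 from echo $? fits that reading. By shell convention, an exit status above 128 means the process was ended by signal number (status - 128), and 137 - 128 = 9, which is SIGKILL. A small sketch of the arithmetic (signal_from_status is a made-up helper name, not anything from Discourse):

```shell
# Decode a shell exit status into the signal number that ended the process.
# signal_from_status is a hypothetical helper name.
signal_from_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    echo $((status - 128))
  else
    echo 0   # not a signal-based exit status
  fi
}

sig=$(signal_from_status 137)
echo "signal number: $sig"   # 137 - 128 = 9
kill -l "$sig"               # prints the name of that signal
```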

You can probably see a log entry in your kernel ring buffer (dmesg) saying something got killed to free memory.
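
Something along these lines should surface those entries. Treat it as a rough sketch: oom_lines is a made-up helper name, and the patterns are just the phrasings the kernel commonly uses, not an exhaustive list.

```shell
# Keep only kernel log lines that look like OOM-killer activity.
# oom_lines is a hypothetical helper name; the patterns are common kernel
# phrasings and may not cover every kernel version.
oom_lines() {
  grep -i -E 'out of memory|oom-killer|killed process'
}

# Reading the ring buffer may require root on some systems.
dmesg 2>/dev/null | oom_lines || echo "no OOM lines found (or dmesg not readable)"
```

Run it on the host rather than inside the container, since the kernel log belongs to the host.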

You mentioned it’s a 2GB server, but does it have swap?

Now that you mention it, no.

root@community-2019:~# swapon --show
root@community-2019:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           1.9G        1.4G        193M         70M        317M        273M
Swap:            0B          0B          0B
root@community-2019:~# 

I followed the steps in this version of INSTALL-cloud.md, which don’t mention swap, so I didn’t even consider it, given that I wasn’t using the bare minimum 1GB.

I’ve added swap via DigitalOcean’s guide. Will see how this goes.
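
For reference, the usual swap-file steps look roughly like the sketch below. It is not copied from that guide. The /swapfile path, the 2G size, and the APPLY switch are assumptions chosen for illustration. By default it only prints the commands, so it is safe to run as-is.

```shell
# Sketch: create and enable a swap file. Needs root to actually apply.
# SWAP_FILE and SWAP_SIZE are illustrative defaults (assumptions).
SWAP_FILE=${SWAP_FILE:-/swapfile}
SWAP_SIZE=${SWAP_SIZE:-2G}

# With APPLY=1 the commands are executed; otherwise they are only printed.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

make_swap() {
  run fallocate -l "$SWAP_SIZE" "$SWAP_FILE"   # reserve the space
  run chmod 600 "$SWAP_FILE"                   # swap must not be world-readable
  run mkswap "$SWAP_FILE"                      # write the swap signature
  run swapon "$SWAP_FILE"                      # enable it immediately
  run swapon --show                            # confirm it is active
}

make_swap
```

Running it as root with APPLY=1 executes the commands. To keep the swap across reboots, the usual approach is an /etc/fstab line such as `/swapfile none swap sw 0 0`. On filesystems where a fallocate-created file can’t be used for swap, writing the file with dd is the common fallback.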

That’s it, without a doubt. On 1GB it’s mandatory, on anything more than 1GB it’s only a matter of time before processes are killed when the system runs low on free memory:

The way Linux handles memory changes fundamentally if you don’t have swap available. Think of an Android smartphone terminating background processes for hungrier foreground processes.

Sure, makes sense.

I just did a rebake again, and sure enough, it was eating into the new swap.

  CPU[||||||||||||||||||||||||||||||||||||||||||||100.0%]   Tasks: 77, 84 thr; 1 running
  Mem[|||||||||||||||||||||||||||||||||||||||1.58G/1.95G]   Load average: 2.05 1.51 1.15 
  Swp[||||||||||||||||||||||                  431M/1024M]   Uptime: 2 days, 01:58:25

and the rebake finished fine. :raised_hands:

I don’t know if the powers-that-be want to (re)add a note about adding swap in the basic setup guide, but it would have prevented the issue here.

When you run discourse-setup, it says:

WARNING: Discourse requires at least 2GB of swap when running with 2GB of RAM
or less. This system does not appear to have sufficient swap space.

Without sufficient swap space, your site may not work properly, and future
upgrades of Discourse may not complete successfully.

Ctrl+C to exit or wait 5 seconds to have a 2GB swapfile created.

I guess the assumption is that power users who bypass discourse-setup already understand how to configure a server.
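
If you did bypass discourse-setup, a similar check is easy to run by hand. The sketch below is loosely modeled on the wording of that warning and is not discourse-setup’s actual code. check_swap is a made-up helper name, and the 2GB thresholds are taken from the message above.

```shell
# Hypothetical check modeled on the warning text; not discourse-setup's code.
# Reads /proc/meminfo-style text on stdin and prints "ok" or "needs-swap".
check_swap() {
  awk '
    /^MemTotal:/  { mem = $2 }    # values are reported in kB
    /^SwapTotal:/ { swap = $2 }
    END {
      two_gb = 2 * 1024 * 1024    # 2GB expressed in kB
      if (mem <= two_gb && swap < two_gb) print "needs-swap"; else print "ok"
    }'
}

if [ -r /proc/meminfo ]; then
  check_swap < /proc/meminfo
fi
```

On the host in this thread (1.9G of RAM, 0B of swap) a check like this would have reported needs-swap.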

Sure, but I had 2GB of RAM, and have nothing but Discourse on this host, and still ran into issues.

It’s quite possible that the rebake process is the only thing that would cause me grief, and I’m not sure I would have been doing that if I hadn’t hit some obscure issue on my host when trying to restore from a backup.

Correct, because as I quoted:

Discourse requires at least 2GB of swap when running with 2GB of RAM or less.

That perfectly describes your situation.

Oh, you think that maybe I skipped a step? Wouldn’t shock me… I was setting up the new host knowing full well that I was just going to restore from backup. There’s a good chance that I never ran discourse-setup.

If you copied the app.yml across, you skipped that step entirely; you’re basically in ‘no seatbelt’ mode.

You wouldn’t see the message about that, or any other of the system requirement checks which can save you pain down the line.

discourse-setup doesn’t just create a fresh app.yml; it checks all of the common gotchas which hinder new installs: Docker, disk, memory, scaling defaults, whether ports are already in use. It’s all there.

Take a look, it’s quite comprehensive.

What would be useful would be a way to pipe a migrated app.yml into discourse-setup, so that site-specific stuff could be treated as a default and preserved, while the other stuff would be configured for the system for the first time.

Thanks @Stephen… yeah I was out in a wasteland because I was “just” restoring from a backup.

In the end, anyone else having issues with rebake can find this thread & its solution. Also, due to the work @vinothkannans & team did recently relating to migrating images, my specific issue should be a moot point going forward.

Thanks again for the pointer on swap; it’s the solution to the problem I started this thread with.
