Best way to combine/merge categories when importing topics?


(Thomas Wilson) #1

I’ve just about finished migrating my old forum to discourse. The last thing I’d like to do is merge some of the categories as part of a cleanup exercise.

I’ve had a look around these forums and found this topic (and also this one), but neither solution will scale to the number of topics I’m dealing with. Does anyone have any other suggestions?


(Kane York) #2

You can use the rails console if you have way too many topics for the bulk select. But first, take a backup at /admin/backup and download it to your system; be prepared to restore from that if this goes wrong.

cd /var/discourse
./launcher enter app
rails c
cat_from = Category.find_by_slug('help')
cat_to = Category.find_by_slug('support')

Topic.where(category_id: cat_from.id).update_all(category_id: cat_to.id)
exit
exit

Edited to incorporate the ID fix


(Thomas Wilson) #3

Just what I was looking for - thanks!

Required a bit of tweaking (using the category IDs), but apart from that it worked a treat.

Topic.where(category_id: cat_f.id).update_all(category_id: cat_t.id)
=> 2326

(Yanhao Mo) #4

I tried combining two categories this way, but afterwards I found that the topic counts of both categories had not changed. Even though the topics have actually been moved to the cat_to category, I’m still not allowed to delete the cat_from category. Is that a bug?


(Yanhao Mo) #5

Never mind, I fixed this by updating the database manually.


#6

I had to do a major migration and had to do quite some DuckDuckGo-ing to be able to migrate everything the way I wanted to. So I’ve now extended the script from @riking to do the following:

  • Require less typing
  • Allow tagging all topics of a category at once
  • Update the topic count after moving topics from one category to another
  • Leave the “About the xxx category” topics untouched

Here’s how to use it (as suggested by @riking you should make a backup before doing this):

cd /var/discourse
./launcher enter app

There you should create a file named “MigrationCategory.rb” and copy the following into it:

class MigrationCategory < Category
    # Overwriting this to be able to also find sub-categories.
    def self.find_by_slug(category_slug)
        self.where(slug: category_slug).first
    end

    def self.retag(category_slug, tag_slugs)
        category = self.find_by_slug(category_slug)

        tags = []
        tag_slugs.each do |tag_slug|
            tags.push(Tag.find_or_create_by(name: tag_slug))
        end

        self.get_category_topics(category).each do |topic|
            topic.tags = tags
        end
    end

    def self.move(category_slug_from, category_slug_to)
        from = self.find_by_slug(category_slug_from)
        to = self.find_by_slug(category_slug_to)

        # Move to new category.
        topics = self.get_category_topics(from)
        topics.update_all(category_id: to.id)

        # Update topic counts in from and to category.
        topic_count_from = self.get_category_topics(from).count
        Category.where(id: from.id).update_all(topic_count: topic_count_from)
        topic_count_to = self.get_category_topics(to).count
        Category.where(id: to.id).update_all(topic_count: topic_count_to)
    end

    # The "About the ... category" topic doesn't count.
    def self.get_category_topics(category)
        topics = Topic.where(category_id: category.id)
        # Exclude the "About" topic by ID directly. This also works when the
        # category has no "About" topic (topic_id is nil).
        topics.where.not(id: category.topic_id)
    end
end

Then run rails c. Wait until the Rails console has started, then load the file you just created with require '/var/www/discourse/MigrationCategory.rb'.

Now everything is set up (but don’t leave the rails-console). You can start tagging topics like this:

MigrationCategory.retag('category-1-slug', ['tag1', 'tag2'])

You can move topics between categories like this:

MigrationCategory.move('category-1-slug', 'category-2-slug')

(Bertrand Bellenot) #7

@dailyph I just tried, and here is what I got:

[1] pry(main)> require '/var/www/discourse/MigrationCategory.rb'
=> true
[2] pry(main)> MigrationCategory.move('General Chat', 'Non-ROOT Specific')
NoMethodError: undefined method `id' for nil:NilClass
from /var/www/discourse/MigrationCategory.rb:37:in `get_category_topics'
[3] pry(main)>

Any suggestion?

Cheers, Bertrand


#8

It looks like you used the names of the categories with MigrationCategory.move(). You have to use the categories’ slugs instead. (probably something like MigrationCategory.move('general-chat', 'non-root-specific'))
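
For reference, a slug is typically the lowercased name with spaces and punctuation collapsed into hyphens. The sketch below is only a rough approximation in plain Ruby: guess_slug is a hypothetical helper, not part of Discourse, and the real slug may differ (it can be edited by hand), so it is worth checking the category's URL before relying on the result.

```ruby
# Hypothetical helper: approximates a category slug from its display name.
# Discourse's own slug generation may differ; treat the result as a guess.
def guess_slug(name)
  name.downcase
      .gsub(/[^a-z0-9]+/, '-') # collapse each non-alphanumeric run into one hyphen
      .gsub(/\A-+|-+\z/, '')   # trim leading and trailing hyphens
end

puts guess_slug('General Chat')       # general-chat
puts guess_slug('Non-ROOT Specific')  # non-root-specific
```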


(Bertrand Bellenot) #9

Oh, OK, thanks! It works (stupid me…) :confounded:


(Stefano Maffulli) #10

Thanks for sharing this. I’m moving a quite large forum to Discourse while simplifying its categories and this is exactly what I was looking for. I want to write a batch file to do all the tagging and moving in one big swoop. Does anybody know how to run this in a rails runner instead of console? Or some other way of running a sequence of MigrationCategory.retag and MigrationCategory.move in one command?


(Jay Pfaffman) #11

You could add that code to the importer.


(Stefano Maffulli) #12

I’d rather have them running separately: first check that the import has been executed correctly, then reshuffle things around. The testing process is more modular this way.

What I don’t understand (and it’s all my fault) is why if I run MigrationCategory.move('category-1-slug', 'category-2-slug') in rails console it gets executed but if I put a series of MigrationCategory.move in a .rb file and run rails r I get NameError: undefined local variable or method `migrate' for main:Object.

I put the content of the class MigrationCategory in a file called MigrationCategory.rb. Then I created a new file

require '/var/www/discourse/MigrationCategory.rb'

MigrationCategory.retag('dreamobjects', ['dreamobjects'])
MigrationCategory.retag('dreamcompute', ['dreamcompute'])

Then I ran rails r migrate.rb, which produced this error:

# rails r migrate.rb 
bundler: failed to load command: script/rails (script/rails)
NameError: undefined local variable or method `migrate' for main:Object
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/runner.rb:62:in `<top (required)>'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/runner.rb:62:in `eval'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/runner.rb:62:in `<top (required)>'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/commands_tasks.rb:123:in `require'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/commands_tasks.rb:123:in `require_command!'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/commands_tasks.rb:90:in `runner'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
  /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/railties-4.2.8/lib/rails/commands.rb:17:in `<top (required)>'
  script/rails:6:in `require'
  script/rails:6:in `<top (required)>'
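
The eval frame in the trace suggests the runner treated the argument as Ruby code rather than as a file to load, which would happen if migrate.rb was not found at the given path. The plain-Ruby sketch below models that behaviour under this assumption; run_like_runner is a hypothetical stand-in, not the Rails source.

```ruby
# Simplified model of the runner's argument handling (an assumption based on
# the `eval` frame in the trace, not a copy of the Rails source).
def run_like_runner(code_or_file)
  if File.exist?(code_or_file)
    load code_or_file    # an existing path is loaded as a script
  else
    eval(code_or_file)   # anything else is evaluated as Ruby code
  end
end

begin
  # No such file exists here, so the string itself is evaluated and Ruby
  # tries to call `.rb` on an undefined name.
  run_like_runner('no_such_migrate_script.rb')
rescue NameError => e
  puts "#{e.class}: #{e.message}"
end
```

If that is what happened here, running the command from the directory that contains migrate.rb, or passing its full path, may make the runner load the file instead.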

(Coin-coin le Canapin) #13

Hi, I successfully used your excellent script, what a time saver.

How can I modify it so I can move topics from a certain category when the topic title mentions a specific word?

My old forum has a buy/sell category, and after my import, I wish to have separate categories for buying and selling.
So I want to move the topics from the buy/sell category to the new ones.
Topics mentioning “buy” or “looking” or “search” for example will go in my new buy category, and those mentioning “sell” etc will go in my new sell category.


(Jay Pfaffman) #14

Something like

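# find_by returns a single record (or nil); where would return a relation.
buycat = Category.find_by(name: "buy")
# ILIKE matches case-insensitively in PostgreSQL.
Topic.where("title ILIKE ?", "%buy%").update_all(category_id: buycat.id)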
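
To check which titles would end up where before touching the database, the keyword rule can be tried out in plain Ruby first. This is a minimal sketch: the keyword lists, the 'buy' and 'sell' slugs, and the target_category helper are all hypothetical and only mirror the example words from the question.

```ruby
# Hypothetical keyword lists and category slugs; adjust to your forum.
KEYWORDS = {
  'buy'  => %w[buy looking search],
  'sell' => %w[sell]
}.freeze

# Returns the target slug for a title, or nil when no keyword matches.
# Matching is case-insensitive and substring-based, like ILIKE '%word%'.
def target_category(title)
  lowered = title.downcase
  KEYWORDS.each do |slug, words|
    return slug if words.any? { |word| lowered.include?(word) }
  end
  nil
end

puts target_category('Looking for a used amp')  # buy
puts target_category('SELLING my old guitar')   # sell
p    target_category('Forum rules')             # nil
```

Because matching is by substring, a word like "research" would also match "search", and a title containing both kinds of keyword goes to whichever list is checked first.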

(Coin-coin le Canapin) #15

Sounds nice, I’ll try that tomorrow. :slight_smile:

edit:

    def self.move(keyword, category_slug_from, category_slug_to)
        from = self.find_by_slug(category_slug_from)
        to = self.find_by_slug(category_slug_to)

        # Move to new category.
        topics = self.get_category_topics(keyword, from)
        topics.update_all(category_id: to.id)

        # Update topic counts in from and to category.
        topic_count_from = self.get_category_topics(from).count
        Category.where(id: from.id).update_all(topic_count: topic_count_from)
        topic_count_to = self.get_category_topics(to).count
        Category.where(id: to.id).update_all(topic_count: topic_count_to)
    end

    # The "About the ... category" topic doesn't count.
    def self.get_category_topics(keyword, category)
        topics = Topic.where("title LIKE '%#{keyword}%'", category_id: category.id)
        about_topic = Topic.where(id: category.topic_id).first
        topics_without_about_topic = topics.where.not(id: about_topic.id)
        return topics_without_about_topic
    end

How does this line look?

topics = Topic.where("title LIKE '%#{keyword}%'", category_id: category.id)

Should I use SQL syntax in the where string?
Maybe add lower(title) so I don’t have to worry about case sensitivity?


(Gerhard Schlager) #16

You could use the following line if you want to do a case insensitive search and don’t want your keyword to break the query.

topics = Topic.where("title ILIKE ? AND category_id = ?", "%#{keyword}%", category.id)
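
The bound ? parameters keep quotes in the keyword from breaking the SQL, but % and _ inside the keyword still act as wildcards. A minimal sketch of escaping them before building the pattern follows. escape_like and like_pattern are hypothetical helper names, and backslash is assumed to be the escape character (PostgreSQL's default for LIKE).

```ruby
# Hypothetical helper: prefixes LIKE metacharacters with a backslash so they
# are matched literally. Assumes backslash is the escape character.
def escape_like(keyword)
  keyword.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Wraps the escaped keyword in wildcards for a "contains" search.
def like_pattern(keyword)
  "%#{escape_like(keyword)}%"
end

puts like_pattern('buy')      # %buy%
puts like_pattern('50%_off')  # %50\%\_off%
```

The resulting string would then be passed as the bound value in place of "%#{keyword}%".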

(Coin-coin le Canapin) #17

Thanks!
It’s almost working. It moves the topics but doesn’t update the categories’ post counts when I run the command. Here’s my full code:

class MigrationCategory < Category
    # Overwriting this to be able to also find sub-categories.
    def self.find_by_slug(category_slug)
        self.where(slug: category_slug).first
    end

    def self.retag(category_slug, tag_slugs)
        category = self.find_by_slug(category_slug)

        tags = []
        tag_slugs.each do |tag_slug|
            tags.push(Tag.find_or_create_by(name: tag_slug))
        end

        self.get_category_topics(category).each do |topic|
            topic.tags = tags
        end
    end

    def self.move(keyword, category_slug_from, category_slug_to)
        from = self.find_by_slug(category_slug_from)
        to = self.find_by_slug(category_slug_to)

        # Move to new category.
        topics = self.get_category_topics(keyword, from)
        topics.update_all(category_id: to.id)

        # Update topic counts in from and to category.
        topic_count_from = self.get_category_topics(keyword, from).count
        Category.where(id: from.id).update_all(topic_count: topic_count_from)
        topic_count_to = self.get_category_topics(keyword, to).count
        Category.where(id: to.id).update_all(topic_count: topic_count_to)
    end

    # The "About the ... category" topic doesn't count.
    def self.get_category_topics(keyword, category)
        topics = Topic.where("title ILIKE ? AND category_id = ?", "%#{keyword}%", category.id)
        #about_topic = Topic.where(id: category.topic_id).first
        #topics_without_about_topic = topics.where.not(id: about_topic.id)
        #return topics_without_about_topic
        return topics
    end
end

Edit: I avoided the issue by hard-coding the number of topics.
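
One possible cause, judging only from the code above: after the move, the recount goes through get_category_topics(keyword, ...), so it counts only topics whose title contains the keyword rather than every topic in the category. The plain-Ruby model below illustrates the difference with made-up data. FakeTopic and count_in are hypothetical stand-ins for the real model and query.

```ruby
# Stand-in for the real Topic model; the data is invented for illustration.
FakeTopic = Struct.new(:title, :category)

# Counts topics in a category, optionally restricted to a keyword in the title.
def count_in(topics, category, keyword: nil)
  topics.count do |t|
    t.category == category && (keyword.nil? || t.title.downcase.include?(keyword))
  end
end

# State after moving the "buy" topics out of 'buy-sell' into 'buy'.
topics = [
  FakeTopic.new('Want to buy a bike', 'buy'),
  FakeTopic.new('Selling my bike', 'buy-sell'),
  FakeTopic.new('Old thread', 'buy')
]

# Keyword-filtered recount, as in the posted code.
puts count_in(topics, 'buy-sell', keyword: 'buy')  # 0
puts count_in(topics, 'buy', keyword: 'buy')       # 1

# Recount on category alone.
puts count_in(topics, 'buy-sell')                  # 1
puts count_in(topics, 'buy')                       # 2
```

In the script, the second variant would correspond to recounting with a query that filters on category_id only.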


#18

Is it about time we had a GUI facility to do this?

Perhaps there is one and I’ve missed it?