Import posts from Facebook group into Discourse


(Vu Huynh) #85

Thank you very much. As you suggested, I’m reading some Ruby documentation.

(Vu Huynh) #86

@meriksson Hello, I get this error when I run the script:

root@ubuntu-1gb-nyc3-01-discourse-app:/var/www/discourse# bundle exec rake import:facebook_group
URGENT: FATAL:  Peer authentication failed for user "discourse"
 Failed to initialize site default
*** Running in TEST mode. No changes to Discourse database are made
*** Using fake email addresses
*** Storing fetched data to disk, loading from disk when possible
rake aborted!
PG::ConnectionBad: FATAL:  Peer authentication failed for user "discourse"
/var/www/discourse/lib/tasks/import_facebook.rake:678:in `dc_user_exists'
/var/www/discourse/lib/tasks/import_facebook.rake:39:in `block in <top (required)>'
/usr/local/bin/bundle:22:in `load'
/usr/local/bin/bundle:22:in `<main>'
Tasks: TOP => import:facebook_group
(See full trace by running task with --trace)

(Martin Eriksson) #87

This has to do with basic database setup. PG stands for PostgreSQL, the database technology used by Discourse. Basically, you are unable to connect to your database.

Check the link below for a possible solution, but note that you need a working Discourse instance before running the import script. So you need to do additional steps before you can import anything: fix your database setup and complete the Discourse installation, including creating and populating the database. Commands for this include rake db:create and rake db:migrate. If you have not successfully run these and other commands, you will not be able to run the importer. You need to follow all the instructions for installing Discourse and make sure that it is working. There is no point in importing anything if you do not get Discourse running first.

(Vu Huynh) #88

@meriksson Hello, thank you for your help. When I use rake db:create it shows this:

Couldn't create database for {"adapter"=>"postgresql", "pool"=>8, "timeout"=>5000, "socket"=>"/var/run/postgresql", "username"=>"discourse", "host_names"=>[""], "database"=>"discourse", "prepared_statements"=>false}

If I can’t connect to the database, why is Discourse still alive, and why can I still post? :frowning:

(Vu Huynh) #89

@meriksson Hello, the script finally runs and the facebook-data folder has been created with many JSON files :smiley: but nothing is posted to the Discourse forum. It shows me some notifications, but when I click on the category, nothing shows up. Can you help me?

(Martin Eriksson) #90

Great to hear that you got the importer to run!

By default the script runs in test mode which means that nothing is imported to Discourse. To change this, look for this line in the config file:

test_mode: true

Change it to false and run the script again to import to Discourse. Everything you have already exported and saved to disk, i.e. the JSON files, will be imported from the files instead of downloaded from Facebook an additional time. The best way to do it is to make a complete export in test mode, then run the import completely from disk.
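To illustrate the behaviour described above, here is a minimal Ruby sketch. It is not the importer's actual code. It assumes the config is YAML with a test_mode key and that fetched data is cached as one JSON file per item. The helper names load_or_fetch and maybe_import, and the file layout, are hypothetical.

```ruby
require "yaml"
require "json"
require "fileutils"
require "tmpdir"

# Hypothetical helper: return the cached JSON for `name` if it is already on
# disk; otherwise call the block (standing in for a Facebook API request),
# save the result to disk, and return it.
def load_or_fetch(dir, name)
  path = File.join(dir, "#{name}.json")
  return JSON.parse(File.read(path)) if File.exist?(path)

  data = yield
  FileUtils.mkdir_p(dir)
  File.write(path, JSON.generate(data))
  data
end

# Hypothetical helper: only write to Discourse when test_mode is false.
def maybe_import(post, test_mode:)
  return :skipped if test_mode

  # A real importer would create the topic or post in Discourse here.
  :imported
end

# Inline YAML stands in for the config file; test mode is assumed by default.
config = YAML.safe_load("test_mode: true")
test_mode = config.fetch("test_mode", true)

Dir.mktmpdir do |dir|
  api_calls = 0
  2.times do
    posts = load_or_fetch(dir, "feed") do
      api_calls += 1
      [{ "id" => "1", "message" => "Hello" }]
    end
    posts.each { |post| maybe_import(post, test_mode: test_mode) }
  end
  puts "API calls made: #{api_calls}" # the second pass reads from disk
end
```

With this shape, a complete run in test mode fills the cache, and a later run with test_mode set to false can import without contacting Facebook again.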

(Vu Huynh) #91

Hello, thank you for your reply. I’ve just edited it as the readme file says. When it finished importing to the facebook-data folder, I changed test_mode to false in the import_facebook.yml file and tried to run the script again, but it just shows me this:

(Martin Eriksson) #92

When it says “Already imported post …”, that means that the importer has found a post in the Discourse database with a matching Facebook ID. In other words, your import is complete.
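The check described here can be sketched roughly as follows. This is an illustration only: FakePostStore is a hypothetical in-memory stand-in for the Discourse database, and import_once is an invented name, not a function from the importer script.

```ruby
# Hypothetical in-memory stand-in for the Discourse database.
class FakePostStore
  def initialize
    @by_facebook_id = {}
  end

  # Returns the stored post for this Facebook ID, or nil if there is none.
  def find_by_facebook_id(fb_id)
    @by_facebook_id[fb_id]
  end

  def create(fb_id, raw)
    @by_facebook_id[fb_id] = { facebook_id: fb_id, raw: raw }
  end
end

# Import a Facebook post unless a post with the same Facebook ID already exists.
def import_once(store, fb_post)
  if store.find_by_facebook_id(fb_post["id"])
    puts "Already imported post #{fb_post['id']}"
    return :already_imported
  end

  store.create(fb_post["id"], fb_post["message"])
  :created
end

store = FakePostStore.new
post = { "id" => "fb_1", "message" => "Hello from Facebook" }
import_once(store, post) # first run creates the post
import_once(store, post) # second run finds it and reports "Already imported"
```

The message only appears when the lookup succeeds, so seeing it means the post exists in whichever database the script is connected to.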

A simple way of accessing a particular post is to put the topic ID in the URL. For example, to open the last post in your screenshot, you would go to /t/3142 – but note that this of course requires that you have Discourse up and running.

If you have trouble running Discourse, I cannot help you with that, and you will need to find support in some other thread since this is not related to the importer script.

Good luck!

(Vu Huynh) #93

Thank you @meriksson, my Discourse is running normally. When I try to import, the process runs fine, but nothing is imported to my Discourse, although notifications pop up.

(Vu Huynh) #94

Hello, @meriksson @Sander78 I think there are some errors in this script.

I installed a clean package from Digital Ocean on Ubuntu 14 and tried to run the script step by step. The command line shows me a success screen, but nothing is posted to my Discourse, even though it says it finished creating the category and importing the posts.

I don’t know which database the script is writing to.

Have you ever run into this problem?

Thank you

(Martin Eriksson) #95

I don’t think this is an error in the script. If it says that something is imported it means that it successfully fetched this object from the database. So your posts have been saved somewhere and the script can access them.

If I had to guess, perhaps you have created two databases. The import script is writing to one of them, and your Discourse is reading from the other. This could happen depending on how you run the script, e.g. you might need to specify which environment you are running in. For example, you could be running the script in development and your site in production. In that case you would need to run the script differently, for example like this:

RAILS_ENV=production bundle exec rake import:facebook_group

(Note that this is explained in the readme under “Instructions for usage”.)
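As a rough sketch of why the environment matters: Rails takes its environment from the RAILS_ENV variable (falling back to RACK_ENV, then to "development"), and each environment can point at a different database. A variable simply named ENV is not consulted. The database names below are assumptions taken from the names that appear in this thread, and the method names are hypothetical.

```ruby
# Mirrors how Rails chooses its environment: RAILS_ENV, then RACK_ENV,
# then "development" when neither is set.
def rails_environment(env = ENV)
  env["RAILS_ENV"] || env["RACK_ENV"] || "development"
end

# Assumed mapping for illustration only; a real install reads this from
# its own database configuration.
DATABASES = {
  "development" => "discourse_development",
  "production"  => "discourse"
}.freeze

def database_for(env = ENV)
  DATABASES.fetch(rails_environment(env))
end

puts database_for({})                              # nothing set
puts database_for({ "ENV" => "production" })       # wrong variable name
puts database_for({ "RAILS_ENV" => "production" }) # what Rails actually reads
```

Under these assumptions the first two calls resolve to the development database and only the third resolves to the production one, which would explain an import that "succeeds" but never shows up on the live site.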

In any case, here is the part of the script which determines whether something is already imported. Note on line 189 that it fetches the post from the database. If it gives the message on line 192, that means that it was able to fetch a post from the database with a matching Facebook ID:

(Vu Huynh) #96

Yes, maybe I took a wrong step.

At the beginning, I installed a clean package. After adding the gems, running bundle install, and editing and copying import_facebook.yml and import_facebook.rake, I tried to run ENV=production bundle exec rake import:facebook_group

The command line showed me psql: fatal: database "root" does not exist, so I created this user with sudo -u postgres createuser root

Next, the command line told me that discourse_development doesn't exist, so I created the database with sudo -u postgres createdb discourse_development

Then I ran rake db:migrate before running ENV=production bundle exec rake import:facebook_group as you said

The command line told me that the admin user does not exist. Fine, I created an admin with rake admin:create and ran the rake import again

Now the script runs normally but nothing shows up in Discourse. Sadly, I don’t know which step I got wrong.

(Vu Huynh) #97

Hello all, and thank you very much @meriksson. I finally succeeded in running this script. Let me share my experience; I will write it down step by step:

  1. Install Discourse.
  2. Add the required gems to the Gemfile (please read the readme on GitHub), clone the Git repository, then edit and copy the necessary files (import_facebook.yml, import_facebook.rake).
    Run in test mode on the first run; the script will fetch all data from Facebook to your local storage.
  3. If you are logged in as root, switch to the discourse account with su - discourse before running the script.
  4. As the discourse user, run export RAILS_ENV=production
  5. Run bundle exec rake import:facebook_group to fetch the data.
  6. After the first run in test mode, return to root with exit and edit import_facebook.yml again, this time changing test_mode to false.
  7. Switch back to the discourse account (step 3), set the environment again by re-typing export RAILS_ENV=production (step 4), and run bundle exec rake import:facebook_group again.
  8. Congratulations!


rake aborted!
NameError: uninitialized constant DISCOURSE

Works like a charm! It was my mistake :slight_smile:

(Andrew ) #99

We used the older script a couple years ago. One of the sticky bits was matching up existing Discourse users with imported Facebook users. Is there a new way to handle this in your update?

(Martin Eriksson) #100

I have only used this script for importing to an empty Discourse, letting the importer create all Discourse users. This seems to be the main use case of the script, i.e. a one-off migration from Facebook to a clean Discourse. I have not really thought about the case you seem to be describing, but here are some thoughts off the top of my head:

Since the importer matches users by Facebook ID, you probably need to store Facebook IDs for all affected accounts in your Discourse database and you need to do this before running the import. Regardless of how you manage the import, you will probably need to ask users to connect their Facebook accounts (i.e. with login/SSO) to their existing Discourse account, i.e. to properly claim their imported Facebook data. So you could probably solve your problem by first requiring users to connect their Facebook accounts (if they want their Facebook data imported) and then running the import.

You will have some additional issues to think about. For example, what to do with Facebook posts made by people who have not opted in by connecting their Facebook account? The most reasonable approach is probably to skip importing them. The importer does not currently have support for this but it would be a small modification. There is already more general support for skipping posts by certain users, since this is sometimes necessary due to e.g. privacy settings. Depending on your relationship to your users, this might also solve a moral issue. Are users OK with you importing their data? By having them opt in explicitly beforehand you eliminate any such doubt.
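The opt-in approach could look something like the sketch below. It is only an outline of the idea, not part of the importer: the data shapes (a Hash of Facebook ID to Discourse username, and posts carrying an author_id field) and the name partition_posts are all hypothetical.

```ruby
# Split Facebook posts into those whose author has connected a Facebook
# account to an existing Discourse user, and those whose author has not.
# `connected_users` is a hypothetical Hash of facebook_id => username.
def partition_posts(fb_posts, connected_users)
  fb_posts.partition { |post| connected_users.key?(post["author_id"]) }
end

connected_users = { "1001" => "alice" }
fb_posts = [
  { "id" => "p1", "author_id" => "1001", "message" => "Hello" },
  { "id" => "p2", "author_id" => "2002", "message" => "Hi there" }
]

to_import, skipped = partition_posts(fb_posts, connected_users)

to_import.each do |post|
  puts "import #{post['id']} as #{connected_users[post['author_id']]}"
end
skipped.each do |post|
  puts "skip #{post['id']} (author has not opted in)"
end
```

Keeping the skipped list around would also make it easy to import those posts later if their authors connect their accounts.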

(Heikki Wilenius) #101

Does anybody know if the script still works? The latest commit a few months back by @Sander78 implies that it needs an update. I’m asking this because I have a couple of FB groups in mind to which I’d like to propose a FB to Discourse migration.

(Sander Datema) #102

I’m pretty sure my latest version doesn’t work. Since then Discourse has changed a lot (for the better).

You could have a look at some of the forks people already made, some of them recently.

(Heikki Wilenius) #103

Thanks for the quick reply. =) I clicked through the forks, none of them unfortunately have any commits since @meriksson’s work two years ago.

(Martin Eriksson) #104

I just did a quick test run of the script. Only spent a few minutes on this, but for what it’s worth, here is what I found:

It seems that there are some problems with fetching stuff from Facebook. This is not surprising since Facebook has made API changes during the past year, with a particular focus on privacy protection (in the wake of the Cambridge Analytica affair). The script connects to the Facebook API without any issues. And it fetches some kinds of content without problems (e.g. posts). But some other kinds of content are not fetched (e.g. user names).

So the first step to make the importer work again is probably to debug the collection of data from Facebook. It might just be a matter of making some small modifications to the API calls. If Facebook no longer allows some calls, it would be easy to modify the script to do imports but skip some kinds of content (in fact, this would probably happen by default since the script is fault-tolerant as it is).
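A fault-tolerant collection step of the kind described above might be structured like this. It is a sketch under assumptions, not the script's real code: the fetchers are plain lambdas standing in for Facebook API calls, and fetch_all is a hypothetical name.

```ruby
# Run each fetcher and keep whatever succeeds. A failure in one kind of
# content is logged and recorded, but does not abort the rest.
# `fetchers` is a hypothetical Hash of content kind => callable.
def fetch_all(fetchers)
  results = {}
  skipped = []

  fetchers.each do |kind, fetcher|
    begin
      results[kind] = fetcher.call
    rescue StandardError => e
      warn "Skipping #{kind}: #{e.message}"
      skipped << kind
    end
  end

  [results, skipped]
end

fetchers = {
  posts:      -> { [{ "id" => "1", "message" => "Hello" }] },
  user_names: -> { raise "request not permitted" } # simulated API refusal
}

results, skipped = fetch_all(fetchers)
puts "fetched: #{results.keys.inspect}"
puts "skipped: #{skipped.inspect}"
```

With this structure, content that Facebook still serves is collected as usual, and anything it refuses is simply left out of the import.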

I also tried to import collected data to Discourse and I did not run into any issues at all. So while the export-from-Facebook parts have issues, I think the import-to-Discourse parts are fine. If there are any issues with creating topics, posts, user accounts etc in Discourse, I am sure they are very easy to resolve. But as far as I can tell this part is working perfectly.