Import from a Russian-language phpbb3


(Denis Safronenkov) #1

I’m using script/import_scripts/phpbb3.rb and it works fine, but there is a small problem: my phpbb3 installation contains Cyrillic almost everywhere, so I am getting weird text everywhere (posts, categories, etc.). Sample output:

да, самостоятельно, помещение уже куплено, щас проводится ремонт и другая мелочовка. через 2-3 недели открытие.

And usernames, which are sometimes in Cyrillic too, turn into something strange like this:

D_D_D_D_D_D_D_D_D_D6

At the same time, everything in English works perfectly.
My phpbb3 database charset settings are:

ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Please help.


(Gerhard Schlager) #2

Discourse doesn’t support usernames with Cyrillic letters in them.

Anyway, I’m not sure how the importer should handle those usernames. @neil Do you have any suggestions?

However, everything else (topic titles, posts) should work with Unicode text.

Edit: I’ve just tested it with the exact same charset and collation. Posts look fine. Only usernames get converted to rubbish.

@d3zorg Are you using the original phpbb database or are you using a database dump and importing that into a new MySQL database for the import?
I am using the following commands for exporting / importing a database dump:

mysqldump -uroot -p phpbb -r phpbb.sql
mysql -uroot -p --default-character-set=utf8 --database=phpbb < phpbb.sql

(Denis Safronenkov) #3

I’ve tried dumping and restoring it in multiple ways, but that didn’t help. So I pointed phpbb3.rb directly at the current phpbb DB and it worked perfectly. So it is possibly a MySQL charset problem, but actually I don’t care about it anymore. Thanks for the help :smile:
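
For anyone hitting the same thing: the garbled sample in the first post looks like UTF-8 bytes that were read as Windows-1252 somewhere along the dump/restore path. That is only a guess based on the sample, not something confirmed in this thread. A minimal Ruby sketch of how such garbling arises and how it can be reversed, using the first word of the sample ("да"):

```ruby
# Assumption: the text was stored as UTF-8 but read back as Windows-1252.
original = "да"

# Reinterpret the UTF-8 bytes as Windows-1252, then convert to UTF-8.
# This reproduces the kind of garbling seen in the sample output.
garbled = original.dup.force_encoding("Windows-1252").encode("UTF-8")
puts garbled   # => Ð´Ð°

# Reverse the mistake: convert back to the Windows-1252 bytes and
# relabel those bytes as the UTF-8 they really are.
repaired = garbled.encode("Windows-1252").force_encoding("UTF-8")
puts repaired  # => да
```

The reverse step raises an error for text whose bytes have no Windows-1252 mapping, so it is a diagnostic aid rather than a general repair tool. Importing from a database with the correct charset, as described above, avoids the problem altogether.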


(Gerhard Schlager) #4

Just out of curiosity: What is the engine, charset and collation of the database that worked?
And how did you handle the usernames containing Cyrillic letters? Do the users not mind when their username changes?


(Denis Safronenkov) #5

It’s a default phpbb3 installation on Percona Server 5.5 with utf8/utf8_bin charset/collation.

Handling usernames is a bit tricky in my case: I just asked everyone to change their username.
The migration to Discourse is planned for the next few days.
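
A rough sketch of how the affected usernames could be listed before asking people to rename. The allowed character set (ASCII letters, digits, underscore, dot, dash) is an assumption for illustration, not Discourse's exact rule, and the helper name and sample usernames are hypothetical:

```ruby
# Assumption: a username is acceptable only if it consists of
# ASCII letters, digits, underscores, dots and dashes.
ASCII_USERNAME = /\A[A-Za-z0-9_.\-]+\z/

# Hypothetical helper: true when the username would need to be changed.
def needs_rename?(username)
  !username.match?(ASCII_USERNAME)
end

# Hypothetical sample data; in practice this would come from the phpbb users table.
usernames = ["d3zorg", "Денис", "gerhard"]

to_rename = usernames.select { |name| needs_rename?(name) }
puts to_rename.inspect  # => ["Денис"]
```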