1: from /usr/local/rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/rfc3986_parser.rb:73:in `parse' /usr/local/rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/rfc3986_parser.rb:21:in `split': URI must be ascii only "http://url-redacted.com - \u00E6\u00E5\u00F1\u00F2\u00EA\u00EE\u00E5 \u00EF\u00EE\u00F0\u00ED\u00EE \u00F1\u00EC\u00EE\u00F2\u00F0\u00E5\u00F2\u00FC \u00EE\u00ED\u00EB\u00E0\u00E9\u00ED" (URI::InvalidURIError)
And:
1: from /usr/local/rvm/gems/ruby-2.6.5/gems/addressable-2.7.0/lib/addressable/uri.rb:2394:in `defer_validation' /usr/local/rvm/gems/ruby-2.6.5/gems/addressable-2.7.0/lib/addressable/uri.rb:2475:in `validate': Invalid character in host: 'url-redacted.com.com - æåñòêîå ïîðíî ñìîòðåòü îíëàéí' (Addressable::URI::InvalidURIError)
When we migrated our old vB3 forum, the migration script threw a lot of errors whenever it hit some strange mojibake or odd-ball attachment that had been polluting the DB over 15 years of posts.
What I did was not very elegant, but it worked for us:
When we ran into these errors, I simply edited the migration script at the point where the error happened and wrapped the offending line with
begin
# offending Ruby line here
rescue
puts "here is some interesting offending information"
end
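For anyone who wants something a little more reusable than editing each line by hand, here is a minimal sketch of the same idea. The helper name `import_safely`, the `failures` list, and the sample rows are all made up for illustration; none of them come from the actual import script. It rescues `StandardError` only and records which record failed, so you can review the skipped items afterwards.

```ruby
# Hypothetical helper (not part of the Discourse import scripts):
# runs the block for one record and logs the failure instead of
# aborting the whole import.
def import_safely(label, failures)
  yield
rescue StandardError => e
  failures << "#{label}: #{e.class}: #{e.message}"
  nil
end

failures = []

# Stand-in data; a real import would read rows from the vB3 database.
rows = [{ id: 1, raw: "fine" }, { id: 2, raw: nil }]

imported = rows.map do |row|
  # Calling upcase on nil raises NoMethodError, which plays the role
  # of a malformed post here.
  import_safely("post #{row[:id]}", failures) { row[:raw].upcase }
end.compact

puts "imported #{imported.size}, skipped #{failures.size}"
failures.each { |line| puts line }
```

Rescuing `StandardError` rather than using a bare catch-all for `Exception` means Ctrl-C and out-of-memory errors still stop the script.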
My experience with migrating around 1 million vB3 posts to Discourse is that one of my best friends was:
begin
rescue
end
We lost a few malformed posts, but that was OK by me to drop a handful of posts out of a million.
Since I don’t know Rails, I just replaced the content of the website field with an empty string. It was a spam account anyway, and it’s the only error so far with almost half the users already processed.
So it’s not a big issue if this never gets fixed in the script code, though it would be nice if it didn’t block the import.
As for the attachments, if I don’t get this info in the meantime, I’ll try importing the messages anyway and see whether the attachments come through or whether I get some kind of error message.
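If someone wanted to handle this in code rather than by editing the database, one possible approach is to validate the website value before it is used and fall back to an empty string. This is only a sketch: `clean_website` is a hypothetical helper, not something that exists in the import script, and where it would need to be called depends on the script.

```ruby
require "uri"

# Hypothetical helper: returns the URL unchanged if it parses as an
# http(s) URL with a host, otherwise returns an empty string.
def clean_website(value)
  url = value.to_s.strip
  return "" if url.empty?

  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && uri.host ? url : ""
rescue URI::InvalidURIError
  # Raised for non-ASCII or otherwise malformed input, like the
  # spam profile value in the backtrace above.
  ""
end

puts clean_website("http://example.com").inspect
puts clean_website("http://url-redacted.com - \u00E6\u00E5\u00F1").inspect
```

This drops the bad value silently, which matches the manual workaround; logging the user id alongside would make it easier to check what was discarded.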
We also had to write some Ruby begin/rescue/end wrappers when we imported our vB3 attachments into Discourse, but in the end we got almost all attachments, avatars and profile images migrated.
When we started, I had no experience with Ruby and Rails. Now, thanks to Discourse, I am a big fan of both and write a bit of Rails code every day.
It’s good to learn a bit when migrating, I think, if you have the interest.