vBulletin 5 import error (invalid character in website field) and a quick question about attachments

Hi guys!

After importing my first old forum from phpbb 3 years ago, I’m happy to have motivated an international community forum admin to migrate another, larger forum (180000 members, 1.6M messages) to Discourse. :tada:

This forum uses vBulletin 5.

The import worked well until user number 71712, the reason being invalid characters in the website field:

:website=>"http://url-redacted.com - æåñòêîå ïîðíî ñìîòðåòü îíëàéí",

The two resulting error messages:

1: from /usr/local/rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/rfc3986_parser.rb:73:in `parse'
/usr/local/rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/rfc3986_parser.rb:21:in `split': URI must be ascii only "http://url-redacted.com - \u00E6\u00E5\u00F1\u00F2\u00EA\u00EE\u00E5 \u00EF\u00EE\u00F0\u00ED\u00EE \u00F1\u00EC\u00EE\u00F2\u00F0\u00E5\u00F2\u00FC \u00EE\u00ED\u00EB\u00E0\u00E9\u00ED" (URI::InvalidURIError)

and:

1: from /usr/local/rvm/gems/ruby-2.6.5/gems/addressable-2.7.0/lib/addressable/uri.rb:2394:in `defer_validation'
/usr/local/rvm/gems/ruby-2.6.5/gems/addressable-2.7.0/lib/addressable/uri.rb:2475:in `validate': Invalid character in host: 'url-redacted.com.com - æåñòêîå ïîðíî ñìîòðåòü îíëàéí' (Addressable::URI::InvalidURIError)

If the URL “must be ascii only”, maybe the field content could simply be removed or sanitized instead of throwing an error and stopping the script? :thinking:
I’ll try to have this user removed from the database before trying to import again.
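
For reference, the kind of sanitizing suggested above could look roughly like this in Ruby. It is only a sketch using the standard `uri` library: `sanitize_website` is a made-up helper name, not something that exists in the Discourse import scripts, and where it would be called in the vBulletin 5 importer is an assumption.

```ruby
require "uri"

# Hypothetical helper (not part of the Discourse import scripts):
# returns the value if it parses as a plain http(s) URL, otherwise nil,
# so a bad website field is dropped instead of raising.
def sanitize_website(raw)
  value = raw.to_s.strip
  return nil if value.empty?
  # Ruby's URI parser rejects non-ASCII input, so check that up front.
  return nil unless value.ascii_only?

  uri = URI.parse(value)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return nil unless uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?

  uri.to_s
rescue URI::InvalidURIError
  # Covers ASCII values that are still not valid URIs (e.g. embedded spaces).
  nil
end
```

With something like this, the spam value shown above would come back as `nil` and the user would be imported without a website.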

Plus I have another question. The avatars are stored in the database and they were successfully imported.

But what about the attachments? In my case, they are also in the database as it is by default with vBulletin. Will they be imported, or must they be stored as separate fields?

Hey @Canapin

When we migrated our old vB3 forum, we hit a lot of errors in the migration script, caused by strange mojibake or odd-ball attachments that had been polluting the DB over 15 years of posts.

What I did, not very elegant, but it worked for us:

When we ran into these errors, I simply edited the migration script where the errors happened and wrapped the offending line in:

begin
  # offending Ruby line here
rescue => e
  # a bare rescue only catches StandardError and its subclasses
  puts "here is some interesting information about the offender: #{e.message}"
end

My experience with migrating around 1 million vB3 posts to Discourse is that one of my best friends was:

begin
rescue
end

We lost a few malformed posts, but that was OK by me to drop a handful of posts out of a million.
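
A slightly more structured version of the same idea, as a rough sketch: rescue per record, log what was skipped, and keep a list so you know how much was dropped at the end. `import_each` is a made-up name for illustration, and the records here are stand-ins, not the importer's real data structures.

```ruby
# Hypothetical helper: runs the block once per record, logs and skips any
# record whose processing raises, and returns the list of skipped records.
def import_each(records)
  skipped = []
  records.each do |record|
    begin
      yield record
    rescue StandardError => e
      # Report the failure on stderr and carry on with the next record.
      warn "Skipping #{record.inspect}: #{e.class}: #{e.message}"
      skipped << record
    end
  end
  skipped
end

# Stand-in data: Integer("oops") raises ArgumentError, so that one is skipped.
imported = []
skipped = import_each(["1", "2", "oops", "4"]) { |row| imported << Integer(row) }
puts "Imported #{imported.size}, skipped #{skipped.size}"
```

Rescuing `StandardError` rather than `Exception` means Ctrl-C and out-of-memory errors still stop the script, which is usually what you want during a long import.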

Hope this helps.

Since I don’t know Rails, I just replaced the website field content with an empty string. It was a spam account anyway, and it’s the only error so far, with almost half the users already processed.

So it’s not a big issue if it is not fixed in the script code after all, though it would be nice if it didn’t block the import :wink:

As for the attachments, if I don’t get an answer in the meantime, I’ll import the messages anyway and see whether the attachments come through (or whether there is any kind of error message).

Hi @Canapin

We also had to write some Ruby begin/rescue/end wrappers when we imported our vB3 attachments into Discourse, but in the end we got almost all attachments, avatars and profile images migrated.

When we started, I had no experience with Ruby and Rails. Now I am a big fan of both, thanks to Discourse, and I write a bit of Rails code every day.

It’s good to learn a bit when migrating, I think, if you have the interest.
