Email domain blacklist with wildcards (revisited)


(Stephen Chung) #1

The current implementation already matches black-listed domains with EndsWith type string matching.

I would like to propose adding regex type matching as well.

Reason: Recently, a lot of spammers have been coming from email domains made up of a long string of digits, for example 76987.com, 245934.net, etc.

I would like to be able to filter them out, but with the standard EndsWith matching, it is impossible.
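To illustrate the gap, here is a minimal Ruby sketch. The two helpers (`blocked_by_suffix?` and `blocked_by_pattern?`) are hypothetical names made up for this example, not the actual Discourse code; they only mimic EndsWith-style matching versus pattern matching.

```ruby
# Hypothetical helper: suffix-style matching, similar in spirit to an EndsWith check.
def blocked_by_suffix?(email, suffixes)
  domain = email.split('@').last.downcase
  suffixes.any? { |s| domain == s || domain.end_with?(".#{s}") }
end

# Hypothetical helper: match the whole domain against a pattern.
def blocked_by_pattern?(email, pattern)
  domain = email.split('@').last
  pattern.match?(domain)
end

suffixes = ['mailinator.com']
numeric  = /\A\d{4,}\.\w+\z/ # four or more digits, a dot, then a TLD

puts blocked_by_suffix?('a@mailinator.com', suffixes) # true
puts blocked_by_suffix?('a@76987.com', suffixes)      # false: no fixed suffix to list
puts blocked_by_pattern?('a@76987.com', numeric)      # true
puts blocked_by_pattern?('a@example.com', numeric)    # false
```

A suffix list would need one entry per numeric domain, which is why a pattern is the only practical option here.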


(Stephen Chung) #2

EDIT:

Oops, just looked through the commit. Seems like the domain string is passed straight into the regex:

regexp = Regexp.new("@(.+\.)?(#{domains})", true)

Not sure if #{domains} gets sanitized beforehand. Unlikely, because of the line:

domains = setting.gsub('.', '\.')

which only sanitizes the dot.

But it does appear to support putting a regex inside #{domains}, except that the regex cannot use the dot (.) as a wildcard, because every dot gets escaped.
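A small sketch of what those two lines seem to do, under my own assumptions: the setting value below is invented for illustration, the pattern string is single-quoted so the backslashes stay literal, and Regexp::IGNORECASE stands in for the `true` argument.

```ruby
# Invented example setting: one plain domain and one regex-style entry.
setting = 'mailinator.com|\d{4,}.com'

# Only the dot is escaped; every other regex metacharacter passes through as-is.
domains = setting.gsub('.', '\.')
puts domains # mailinator\.com|\d{4,}\.com

regexp = Regexp.new('@(.+\.)?(' + domains + ')', Regexp::IGNORECASE)

puts regexp.match?('x@mailinator.com')     # true
puts regexp.match?('x@sub.mailinator.com') # true, via the optional subdomain group
puts regexp.match?('x@76987.com')          # true, the \d{4,} entry is treated as regex
puts regexp.match?('x@mailinatorXcom')     # false, the dot is always literal
```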


(Stephen Chung) #3

I’ve tried it with the following blacklist domain:

\d+\d\d\d.\w+

which should match any domain whose name is made up of four or more digits (and digits only).

I checked the setting with Data Explorer and confirmed that email_domains_blacklist is mailinator.com|cc|\d+\d\d\d.\w+, which is correct.

The code regexp = Regexp.new("@(.+\.)?(#{domains})", true) should now filter out emails coming from domains made up of long strings of digits.

EDIT: Note: not sure what the second parameter true is doing in the call to Regexp.new(), which takes the matching options as its second parameter.
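For what it is worth, in Ruby a second argument of `true` makes the pattern case-insensitive, the same as passing Regexp::IGNORECASE. A quick check (the pattern here is just an example):

```ruby
re = Regexp.new('@example\.com', true)

case_insensitive = re.casefold?
has_ignorecase   = (re.options & Regexp::IGNORECASE) != 0

puts case_insensitive             # true
puts has_ignorecase               # true
puts re.match?('x@EXAMPLE.COM')   # true
```

So the `true` should not be what stops the blacklist from matching.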

However, I just had three new spammers come through:

ddef@445555.com
mnopp@89990001.com
gg5@996399.net

So obviously it is not working.

Trying the RegExp in a JavaScript console confirms that it works:

/@(.+\.)?(mailinator\.com|cc|\d+\d\d\d\.\w+)/.test("ddef@445555.com")
true

Therefore, the domain blacklist is not working as it should.
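The same check can be repeated in Ruby. This is only a sketch that rebuilds the regex from the setting value the way the two quoted lines appear to, with the pattern single-quoted and Regexp::IGNORECASE used in place of `true`; it is not the actual validator code.

```ruby
setting = 'mailinator.com|cc|\d+\d\d\d.\w+'
domains = setting.gsub('.', '\.')
regexp  = Regexp.new('@(.+\.)?(' + domains + ')', Regexp::IGNORECASE)

spam = %w[ddef@445555.com mnopp@89990001.com gg5@996399.net]

# Print whether each of the three addresses is caught by the pattern.
spam.each { |email| puts "#{email}: #{regexp.match?(email)}" }
```

All three addresses match, so the pattern does not look like the problem; the more likely culprit is that the check never runs for these signups.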


(Stephen Chung) #4

I wonder… does it require restarting the VM for the settings to “set”?


(Stephen Chung) #5

Tracing through the code, there is something very suspicious.

EmailValidator seems to be only used when a user updates his/her email address (in EmailUpdater).

When a user is created, the code only validates that the email is properly formatted, not whether the email domain is blacklisted.

In particular, it is not used when a staged user is created via email in: process_internal in email/receiver.rb only checks the following things:

Regexp.new(SiteSetting.ignore_by_title) =~ @mail.subject  # Blacklisted topic title
raise BouncedEmailError if is_bounce?  # Bounced mail
raise NoSenderDetectedError if @from_email.blank?  # No From field
raise ScreenedEmailError if ScreenedEmail.should_block?(@from_email)  # Screened email address

After this, a new staged user is created via find_or_create_user.

Shouldn’t EmailValidator.validate_each be called on @from_email to make sure that the incoming email is not from a blacklisted domain?

Or, better, check first if the user with that email address already exists. If so, let it pass. Otherwise, call EmailValidator.validate_each to check if it is blacklisted. DO NOT create a staged user if the email is blacklisted.
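Roughly what I have in mind, as a self-contained sketch. Everything here is a hypothetical stand-in (`BlacklistedDomainError`, `blacklisted?`, `find_or_create_staged_user`, and the `known_emails` array that plays the role of the user lookup); none of it is the real receiver code.

```ruby
# Hypothetical error type for a rejected sender.
class BlacklistedDomainError < StandardError; end

# Hypothetical check, rebuilding the regex the way the validator appears to.
def blacklisted?(email, setting)
  return false if setting.to_s.empty?
  domains = setting.gsub('.', '\.')
  Regexp.new('@(.+\.)?(' + domains + ')', Regexp::IGNORECASE).match?(email)
end

# known_emails stands in for looking up an existing user by email address.
def find_or_create_staged_user(from_email, known_emails, setting)
  # Existing users are let through without a blacklist check.
  return :existing_user if known_emails.include?(from_email.downcase)
  # New senders from a blacklisted domain never get a staged user.
  raise BlacklistedDomainError, from_email if blacklisted?(from_email, setting)
  known_emails << from_email.downcase
  :staged_user_created
end

known   = ['alice@example.org']
setting = 'mailinator.com|\d+\d\d\d.\w+'

puts find_or_create_staged_user('alice@example.org', known, setting) # existing_user
puts find_or_create_staged_user('bob@example.net', known, setting)   # staged_user_created
begin
  find_or_create_staged_user('ddef@445555.com', known, setting)
rescue BlacklistedDomainError => e
  puts "rejected: #{e.message}"
end
```

Checking for an existing user first means that a blacklist entry added later does not lock out people who already have accounts.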


(Stephen Chung) #6

Created bug report. Hope I got it right. :grin: