Where does Discourse stand on minor refactoring PRs


(Sam Saffron) #1

I got this question on twitter and wanted to touch on it here.


Discourse accept the vast majority of pull requests we receive. We love all the contributions and thank you all for being so awesome.

Recently however we rejected the following PR

It was not that the PR was bad, most of it was good, its just that it became a big burden to take it from good to the state that everybody is happy with it.

Style is a very complicated topic. Certain things are completely non controversial.


# worse
def hello
  result = "hello"
end

# better
def hello
  "hello"
end

# worse
def self.[](name)
  unless g = lookup_group(name) 
      g = refresh_automatic_group!(name) 
  end
  g   
end

# much better
def self.[](name)
   lookup_group(name) || refresh_automatic_group!(name)
end

Certain things are a little bit more open to interpretation:


# which is better?
string.upcase! unless string.blank?
string.upcase! unless string.nil?
string.upcase! if string.present?
string.upcase! if string

# I much prefer the last but its taste thing 

# which is better?
something if Rails.env == "development" || Rails.env == "profile"
something if Rails.env.development? || Rails.env == "profile"

# which is better?
products.where('deleted_at IS NULL')
products.where(:deleted_at => nil)


As a rule, we always prefer PRs with substance

  • Add / refactor a few tests AND improve surrounding code.
  • Add some new functionality, improve surrounding code.
  • Highlight a new interesting pattern we should be using
  • Do some “real” refactoring, break up a big class, extract the onebox gem, and so on.

We also accept PRs that address egregious style crimes:

  • non-controversial lookup_group(name) || refresh_automatic_group!(name) change.
  • clean up some of the mess left from the coffeescript -> js move

We are very worried about huge PRs that enforce a certain style, the famous remove all trailing white spaces or for example walking the entire code base to enforce a “taste” style change. Its tricky, we don’t want to discourage contributions.

If you are looking for projects we try to keep: Discourse Development Contribution Guidelines up to date.

I would also like to open this whole topic to discussion, and ideas on what you think should be our best approach.


Discourse Development Contribution Guidelines
(Sander Datema) #2

I wonder what you think about this situation:

  • Someone (me) is smart enough to add a bit of useful code (like requested by you here)
  • However, that person doesn’t know how to write a test for this case

Would you rather have no PR at all, meaning it might have to wait for ever to be fixed, or is it ok to submit the PR with a request for someone else to add a test to it?


(Dan Neumann) #3

I had to look up inquiry.

Rails’ API introduces some nice extensions to Ruby’s standard lib. But this is not one of them. Props to the contributor for attempting to be idiomatic in the sense of leaning on Rails’ API, but the pull request should’ve been sent github.com/rails/rails’ direction to remove that code from ActiveSupport. It probably would’ve been merged.

Also, class<<self should be reserved as a last resort for when you need to, say, apply attr accessors or alias_method to a class (because it’s the only way). Otherwise it just makes class methods look like instance methods.


(Sam Saffron) #4

nahh … you can do it, I am totally confident, have a look at the tests in the components/post_creator_spec.rb

All you need is to add a test that sets the created date and then tests that it was created, you can start with the other test as a template.

In general we prefer tests, we will help you get them going, you can discuss code or attempts at this on meta.


(Jeff Atwood) #5

We already updated the contributing.md to reflect this a while ago:

However, please note that pull requests consisting entirely of style changes are not welcome on this project. Style changes in the context of pull requests that also refactor code, fix bugs, improve functionality are welcome.


(Matt Van Horn) #6
# which is better?
string.upcase! unless string.blank?  
string.upcase! unless string.nil?  
string.upcase! if string.present?  
string.upcase! if string

The above example shows the blurry line between ‘style’ changes and ‘substance’ changes.

#blank? and #nil? have different meanings.

Upcasing a blank string is fine, it has no effect.
Upcasing nil will raise an error.
So in the example there is never really a need for using blank? or present?, just nil?

Better Example:

# which is better?
string.upcase! unless string.nil?  
string.upcase! if string
string && string.upcase!
string.try(:upcase!)

And I would argue that all of these are best in certain contexts.

That said, I don’t consider my original request to be an example of a purely stylistic change. Even though I find using !foo.present? to be confusing, the reason I would change it is more that the implementation of #present? is:

def present?
  !blank?
end

which means that using !foo.present? is equivalent to !(!foo.blank?)) which I think anyone would be quick to change.


(Sam Saffron) #7

You would think that but I have an open disagreement with @eviltrout about a similar issue, which I do not see being resolved


# I strongly prefer
def can_recover_post?(post)
   return false unless @user # or perhaps return false if @user.nil?
   is_staff?
end

# Robin strongly prefers 
def can_recover_post?(post)
   return false unless @user.present?
   is_staff?
end

Robin argues that the use of .present? is more consistent and reads better.

I argue that its very annoying that I need to patch active record to make it have no performance impact (1.5%), and more importantly I really dislike its mode of failure.

If you initialise a Guardian object with Guardian.new(" ") or Guardian.new([]), when only .present? is used everything just works and and behaves just the same as passing in nil, which is the semantics for no user is present. I think it should fail earlier, that is a good thing cause initializing Guardian with " " or [] is an error.

Robin sees, this as a virtue, I see it as a crutch.


That said, I could not accept a PR that just made that sweeping change across Guardian, even though I agree with it.


Note its a bit of a bad example, cause this works fine:

def can_recover_post?(post)
   is_staff?
end

But the point here is the use of .present? on Active Record objects.


(Matt Van Horn) #8

Well, I have to disagree with Robin. I save present/blank for situations where there is more ambiguity than just nil or not nil. It’s really for things like strings that might be empty or whitespace, or arrays that might be empty, and the like. I try to code with a minimum of nil checks anyway - I prefer using NullObjects where feasible.

For the example, I think both are really confusing compared to:

def can_recover_post?(post)  
   @user && is_staff?
end

I will have a look at the Guardian you mentioned - it sounds like there may be a better way of handling that situation. In general, I am in favor of failing early, and close to a boundary. Check out Avdi’s talks on confident code (and also Exceptional Ruby) for more on that.

In my code, I would tend towards something like:

class User
  def do_something
    keep(:calm).carry_on
  end
end

class NullUser
  def do_something
    # maybe a NO-OP or some other appropriate handling...
  end
end

class Guardian
  def initialize(user = NullUser.new)
    user.do_something
  end
end

(Robin Ward) #9

I’d like to explain myself in more detail:

Aesthetics and English

Aesthetically, I do think .present? and .blank? produce code that is easier to read. Now the example Sam gives is not what I’d consider an ideal use. Personally I think return false unless @user.present? is much more confusing than return false if @user.blank?

We program in english, and “return false if user blank” makes a lot more sense than “return false unless user”. Unless user what?

Conventions

Nobody so far has argued that using .blank? and .present? on Strings or Collections is wrong as they do evaluate “truthiness” in a safer way in those cases.

I am hearing here that “well if you know the parameter is an activerecord object, use nil. If you know it’s a String, use blank?”.

First of all, does a programmer ever fully know that the input is going to be good? It’s a good habit to be overzealous when it comes to input. I bet there are a bunch of C programmers who said “I don’t have to check the length of this buffer for an overrun because my function can never receive the wrong input!”

By saying use blank? sometimes and nil? at other times, it demands that a programmer thinks about how the guard condition works and use the appropriate one. If she gets in the habit of always using .blank?, she doesn’t have to think about it. She adds it and moves on.

A secondary benefit to using it as a condition is that all code that works with an array can be refactored to an object and vice versa safely. Additionally, methods that we have in controllers such as requires_parameter don’t need separate signatures depending on type. If some parameters are strings and some are objects, it’ll work properly either way.

Performance

I argue that its very annoying that I need to patch active record1 to make it have no performance impact (1.5%)

Let’s be clear, the 1.5% came from benchmark where caching was turned off, and where no requests were served in parallel.

With caching turned on, like in any install of Discourse, the performance difference is unmeasurable. So to say you need to patch it is suspect.

Even if it wasn’t ever cached, I would argue that less than 1ms difference on average is worth it for readable code. If not, why are we even using Ruby? Why not use a much uglier language that gives us better performance?


(Sam Saffron) #10

Of course you do.

# Take this existing code we have in Discourse
topic = Topic.where(id: topic_id).first

if topic.present?
    url = "#{Discourse.base_url}#{topic.relative_url}"
    url << "/#{post_number}" if post_number.to_i > 1
end

# You argue its better cause, this works provided no topics are found
topic = Topic.where(id: topic_id)

if topic.present?
    url = "#{Discourse.base_url}#{topic.relative_url}"
    url << "/#{post_number}" if post_number.to_i > 1
end


# I argue it worse cause its good that this blows up EVERY time
topic = Topic.where(id: topic_id)

unless topic.nil?
    url = "#{Discourse.base_url}#{topic.relative_url}"
    url << "/#{post_number}" if post_number.to_i > 1
end

This pretty much sums the argument, I don’t know what more there is to say.

Which brings us full circle.

We can’t accept a PR that takes

if topic.present? and converts to unless topic.nil? we don’t even agree internally on what is better and in the context of the code I showed it make no practical difference.


We can accept a PR that takes:

def can_recover_post?(post)  
   return false unless @user # or perhaps return false if @user.nil?
   is_staff?
end

And turns it to:

def can_recover_post?(post)  
   is_staff?
end

No question whatsoever, its better code.


Small note: personally I prefer if user over unless user.nil? it provides adequate safety and is readable enough.


(Sam Saffron) #11

The caching we were talking about is unconditionally turned off for logged on users. The performance game is often a game of eliminating 100 small papercuts.

Anyway, I think

unless user.nil? is pretty much as readable as if user.present? in most cases. If you need to add clarity instead of having a ton of if @user.present? sprinkled in guardian we should add if anonymous? and if authorized? which much better communicate the intent.

I also strongly believe that Guardian.new(" ") should fail horribly, cause Guardian.new("bob") does and should.

Anyway, we can agree to disagree here. Bigger fish to fry.


(Jeff Atwood) #12

I am not a fan of this line of reasoning at all, because nobody can even agree which kinds of English are aesthetically superior. And programming languages that try to ape English are grotesque and misshapen beasts because of that one fundamentally bad decision:

http://www.codinghorror.com/blog/2006/08/computer-languages-arent-human-languages.html


(Robin Ward) #13

I admit that if your argument comes down to aesthetics you will never reach a consensus. I was just trying to explain why I personally prefer that syntax.

I do think my other arguments about consistency and safety still stand.

I agree.

I don’t think we should avoid an otherwise good pull request that converts .nil? to .blank? or vice versa. I think if the pull request has merit we can accept it rather than arguing endlessly. But if that’s all the pull request does, we can reject it based on what we’ve already agreed.

I’m much more concerned about the code we can all agree is bad rather than the debatable stuff.


(Dan Neumann) #14

It’s possible yall are discussing different things.

In most Ruby code, I’d argue for the this_thing if that_thing idiom. Sending that_thing messages like .nil?, .blank?, .present? may indicate that the sender knows too much about the possible values of that_thing. And a method that differentiates between "", nil, false, [], "hello" is too complex.

However, I think @eviltrout’s argument makes sense in, say, the context of a validator that does know a lot about the possible values of a symbol and those messages may help express the intent of the validation.

Finally, in response to the “Which is better” list of upcase conditions, I’d short-circuit nils early so that string && string.upcase can just be string.upcase. I feel bad for that method because it has to do the nil check at all. Can’t we be civil and just hand it a String?!


(Matt Van Horn) #15

I’d say another thing to be aware of is what conversions NilClass supports.
Often, something like:

value = maybe_nil.to_s.upcase

is all you need.


(Dan Neumann) #16

That’s a good one.

nil.to_i #=> 0
nil.to_f #=> 0.0
nil.to_a #=> []
nil.to_s #=> ""

(Sam Saffron) #17

On the topic of blank vs nil… blindly using .blank? is very very dangerous when you are dealing with AR queries.

https://github.com/discourse/discourse/commit/8ec6d0ea6c1e62b9410bca4e6bdcc163543d00fc

This change set saves about 40ms of server time just by avoiding 2 of the 3 AR queries.

.blank? is very magical, it can trigger queries. .nil? and similar can not.


(Kane York) #18

Should I convert the ruby side of the code to all single-quotes where that’s correct?

Based on the recommendations in

Note that there is no performance difference between the two, just style.

I bought a student license of RubyMine, and the two are visually very different (pretty sure that was on purpose), and double-quotes without any interpolation or escaping raises a style warning:

  versus  

Another screenshot, highlighting the randomness/inconsistency:


(Régis Hanol) #19

If you’re submitting a PR for a bug fix or a new feature, it’s fine to make the styles consistent.


(Sam Saffron) #20

To echo @zogstrip a bit.

Every PR we process has a cost, by changing " to ' you:

  1. Take ownership of the line making blame more complicated
  2. You introduce busywork for the team

Now, if you were to refactor a file and happen to also clean up style while at it, sure, fine.

But sweeping through is just too expensive.

I am sure you can amend your styling and get rid of the squigglies in ruby mine using some setting.

Also, strongly recommend you learn Vim and read Practical Vim. I use Vim exclusively, I try to learn a tiny bit every week. Its a skill that was very much worth having 20 years ago and still is very useful today.