Discourse, the GPL, and per-file notice

(steveklabnik) #1

Sorry if this is in the wrong place; I figured this would be the best place to put this.

The GNU Project page “How to use GNU licenses for your own software” describes how to make a software project GPL licensed, and it includes this:

Whichever license you plan to use, the process involves adding two elements to each source file of your program: a copyright notice (such as “Copyright 1999 Terry Jones”), and a statement of copying permission, saying that the program is distributed under the terms of the GNU General Public License (or the Lesser GPL).

Discourse does not include either of these with each file.

Is this a problem? Have you cleared that with a lawyer?

I’m interested because I will be GPL-ing (though probably AGPL) a Rails application of my own, and putting it in each source file is annoying. It seems required, though.

(Jeff Atwood) #2

Interesting. Have you verified this to be the case with other well known GPL projects like WordPress?

(Sam Saffron) #3

WordPress don’t do it:

Linux don’t do it:

We seem to be in safe territory

(steveklabnik) #4

Seems good!

Of course, I should have checked other big projects, brains are very silly things. My thought was “oh, I don’t know if this is true, I’ll look at Discourse!” And rather than keep looking, I just stopped there.

(Sam Saffron) #5

No worries Steve, any time :smile:

(Kevin P. Fleming) #6

Just because the other projects are not diligent about it does not mean that you should not try to do the right thing yourselves :slight_smile: I suspect that the vast majority of the files in the Linux kernel do have copyright/license notices, but there are some that don’t.

As a consumer and distributor of open source software, it’s always far better to have each file clearly indicate the copyright ownership and license, because those files tend to have ways of becoming separated from the main source distribution. When your license places obligations on those who modify and distribute your code (like the GPL’s requirements to indicate which files were modified and what modifications were made), it’s in your best interests to ensure that each file clearly states that it falls under that license.

I’m curious what is ‘annoying’ about putting a 4/5 line block of text at the beginning of each source file; anyone care to elaborate? It’s a pretty mechanical (i.e. easily scriptable) task.
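To make "easily scriptable" concrete, here is a minimal sketch in Python of what such a task could look like. The notice text, the `*.rb` pattern, and the function names `add_notice` and `add_notice_to_tree` are all assumptions for illustration, not anything Discourse ships. It prepends the notice to every matching file and skips files that already contain it, so it is safe to run repeatedly.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical notice text: substitute the project's real copyright line and license.
NOTICE = (
    "# Copyright (C) YEAR AUTHOR\n"
    "# Distributed under the terms of the GNU General Public License.\n"
    "# See the LICENSE file in the project root for the full text.\n"
)


def add_notice(path: Path, notice: str = NOTICE) -> bool:
    """Prepend the notice to one file; return True if the file was changed."""
    text = path.read_text(encoding="utf-8")
    if notice in text:
        # Already present, so leave the file untouched.
        return False
    path.write_text(notice + "\n" + text, encoding="utf-8")
    return True


def add_notice_to_tree(root: Path, pattern: str = "*.rb") -> list[Path]:
    """Apply add_notice to every matching file under root; return the changed files."""
    return [p for p in sorted(root.rglob(pattern)) if add_notice(p)]
```

Calling `add_notice_to_tree(Path("."))` from the project root would be the whole build task. This version assumes `#` comments and UTF-8 files, and it does not handle files whose first line has to stay first.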

(Jeff Atwood) #7

Care to submit a PR for such a build task?

Also, if it is mechanical, why not provide a utility that users could use to execute on their own files to add it? Or GitHub could do it.

(steveklabnik) #8

Things like

“What about shell scripts? It says it goes at the top, but does that mess up the shebang?”

“What about HTML? Doesn’t the DOCTYPE have to be the first line?”

“I work on a 13" MBA most of the time, and having a 15 line copyright notice means I need to scroll down a bunch to even see what’s going on in the file.”

“What about autogenerated files that get checked in? Am I going to have to re-add the notice to the top of db/schema.rb every time I migrate?”

All of these questions have answers, except the real-estate one, but they’re minor annoyances that don’t come into play if the license is just at the root.
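As a rough sketch of one possible set of answers, assuming Python and hypothetical names throughout (`insert_notice`, `must_stay_first`, `should_skip`, and the `GENERATED` set are made up for illustration): keep a shebang or DOCTYPE as the first line and put the notice directly after it, and leave checked-in generated files out of the run entirely. The caller is assumed to pass a notice already written in the right comment syntax for the file type.

```python
# Hypothetical set of generated files that the tool leaves alone.
GENERATED = {"db/schema.rb"}


def should_skip(relative_path: str) -> bool:
    """Return True for checked-in generated files that should get no notice."""
    return relative_path in GENERATED


def must_stay_first(line: str) -> bool:
    """Return True if this first line is a shebang or a DOCTYPE declaration."""
    return line.startswith("#!") or line.lstrip().lower().startswith("<!doctype")


def insert_notice(source: str, notice: str) -> str:
    """Return source with notice added, keeping a shebang or DOCTYPE on line one.

    The notice is expected to end with a newline and to use the comment
    syntax of the target file.
    """
    if notice in source:
        # Already present, so return the text unchanged.
        return source
    first, _, rest = source.partition("\n")
    if must_stay_first(first):
        return first + "\n" + notice + rest
    return notice + source
```

For example, `insert_notice("#!/bin/sh\necho hi\n", "# GPL\n")` keeps `#!/bin/sh` on the first line and places the notice on the second. Skipping `db/schema.rb` is only one way to handle generated files; re-running the tool after each migration would be another.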

(Luis Villa) #9

If you want your code to be reused, it is good practice to put at least some licensing information in each file. Individual files can (and will) be separated from the rest of the project, and once that happens, it is a courtesy to the people who come across it to have copyright information in the file. I’ve babbled about this at some length elsewhere.

Software Freedom Law Center also recommends having at least some licensing information in each file.

IANYL, TINLA. :slight_smile:

(Jeff Atwood) #10

Shouldn’t there also be a license file in the project? Well of course there already is.

Why wouldn’t you just check the license file in the root of the project rather than blindly looking in random files for important information?

(Luis Villa) #11

Files get separated from the project (and therefore from the license file in the root) on a very regular basis as people use/reuse code they find on the web and on GitHub. Once that happens, it is hard (especially post-Google-Code-Search) to match the file with the root.

You can always say “that’s the problem of the people who take it out of the root”. As far as the license goes, that is 100% correct - they are the ones who should add the license information when they take the file out. But it’s still sort of a pain for the people who end up with the file third-hand to figure out. So it is a courtesy to them (who are legitimately trying to figure out what your license is, and comply with it) to include at least a brief header/statement in each file.

(Callan Bryant) #12

Not necessarily, I have wondered the same thing but never bothered to research it. Thanks for posting.

(Sam Saffron) #13

I agree that the per file preamble protects you there, however I find it incredibly rare that developers would take a single file verbatim from a project and start using it. It is far more likely they would take a function or snippet. The preamble does nothing to protect you there.

This is the crux of the issue. As a developer I really dislike this noise at the top of my files; it is purely noise and provides no value to me. We are optimising for the legitimate developers on the project, not for the 3rd parties that get detached files.

Additionally, for Discourse, we require a CLA. Developers can never say to us, “oh, we did not know we were contributing to a GPL project, there was no preamble on the file.” We own the code they contribute.

I guess the crux of the question raised by @steveklabnik was, does lack of preamble somehow invalidate the GPL? Is there a clause in the GPL somewhere that says it only applies to files with preambles?

Seeing that Linux and WordPress are not diligent about it (and, on the other hand, MediaWiki is), it seems that pragmatically there is no requirement to have the preamble.

Having it increases usability for 3rd party consumers that happen upon the source, especially detached.

Not having it optimises the experience for developers on the project.

The big question seems to be, who are you optimising for?

(Kevin P. Fleming) #14

The GPL et al. don’t place these requirements directly on the licensed work; they rely upon copyright law, which is what they are built on. In the most common interpretations of copyright law, all written works have copyright owned by their creators, whether the work includes that statement or not. In addition, without an explicit statement to the contrary, the copyright owner is the sole owner of all rights to the work. Thus, a file without a copyright statement and a license statement is owned by the author(s) of that file, and no other parties have permission to distribute it (or potentially even use it), even if it was published by the authors on a public website. Having the LICENSE file at the top of the tree is of course quite useful, as is a top-level copyright statement, but just like pages in a book, they aren’t useful if the files are separated from the tree. As @sam says, this is not as common as copying individual functions, but it does happen quite often, and since the cost of having the explicit statements is generally low, it’s worth doing.

(Kevin P. Fleming) #15

I’ll give it a try, but I’ll have to keep the script pretty short, as otherwise I’d have to sign a CLA and that would require getting my employer to also sign one :slight_smile: