Add headers to denote version on API responses

(Sawood Alam) #1

In addition to the meta tag in the markup, it would be useful to also include a response header such as x-generator or x-powered-by. For example, Drupal uses x-generator in combination with the generator meta tag. Having this information in the header would be great for tools that interact with the API, where it is often easier to discover meta information without requesting or parsing the content.

I think the right place to inject this header would be in the application controller where x-discourse-route header is being added, but I might be wrong as I don’t know the code base well.
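
To make the idea concrete, here is a minimal sketch of the technique in Python rather than in Discourse's own code. It is not how Discourse is implemented: the middleware class, the header value, and the version number are all hypothetical placeholders, and the only point is to show a single place that stamps a generator header on every response.

```python
# Hypothetical WSGI middleware; the header value is a placeholder,
# not a version taken from this thread.
GENERATOR = "Discourse 1.6.0"


class GeneratorHeaderMiddleware:
    """Wrap a WSGI app and append an X-Generator header to every response."""

    def __init__(self, app, generator=GENERATOR):
        self.app = app
        self.generator = generator

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Build a new list so the wrapped app's own list is not mutated.
            headers = list(headers) + [("X-Generator", self.generator)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


def demo_app(environ, start_response):
    """Tiny stand-in application used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def call_app(app):
    """Invoke a WSGI app once and return (status, headers dict, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling `call_app(GeneratorHeaderMiddleware(demo_app))` returns the original status and body with the extra header present, which is the behaviour a controller-level hook would be expected to have.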

(Jeff Atwood) #2

It seems pointless to add overhead on every request for such a rarely needed function.

(Sam Saffron) #3

The API argument is reasonably strong; I am not strongly against the extra header.

(Sawood Alam) #4

I would mildly disagree with that, because advertising the software and version responsible for generating a response helps feature discovery, expectation, and detection, and third-party tools can react accordingly. With that in mind, one might argue that a separate endpoint could exist just to query the capabilities or environment, but that would require the software to advertise such an endpoint (ref: HATEOAS). Additionally, HTTP is a stateless protocol, so the tool would have to keep some sort of session record to relate successive requests after discovering the meta info; hence a self-contained response would be preferable.

Some tools (apart from those interacting with the JSON API) that might be interested in knowing the version of a web application include crawlers and web archives. For example, web archives later perform some rewriting to make sure pages render properly, with relevant assets loaded from the archive from the nearest time frame. As the web progresses, technologies change, so the archive replay system would get great help if some specific tweak is needed for known mass-deployed frameworks or CMSs. I would point out that archives don’t store session information, but they do archive all the request and response headers along with the payload.
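
As a rough illustration of what such a tool could do with a self-contained response, the sketch below reads the generator from response headers alone, using a HEAD request so no body is downloaded or parsed. The header name `X-Generator` and the `Name Version` value format are assumptions made for the example, not something Discourse is known to send, and both function names are hypothetical.

```python
import urllib.request


def generator_from_headers(headers):
    """Return (name, version) from an X-Generator header, or None if absent.

    Assumes a value shaped like "Name Version"; anything after the second
    whitespace-separated token is ignored.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    parts = lowered.get("x-generator", "").split()
    if not parts:
        return None
    version = parts[1] if len(parts) > 1 else None
    return parts[0], version


def fetch_generator(url):
    """Issue a HEAD request and inspect only the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return generator_from_headers(dict(response.headers.items()))
```

Because everything needed is in one response, the client does not have to remember the result of an earlier discovery request, and an archive that stores headers alongside the payload keeps the same information for later replay.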

(Jeff Atwood) #7

Only on API responses, or on all responses?

(Sawood Alam) #8

I think it would be helpful to add it globally to all responses, as some tools, such as search engine or web archive crawlers, would mostly hit the user-facing content, while other tools made specifically for Discourse would hit the API more often.

(Matt Palmer) #9

This is what we have API versioning for. User agent detection was a complete clusterfudge when it was sites trying to figure out how to be compatible with browsers (why does every browser advertise itself as Mozilla? Because UA detection), and it’s an equal clusterfudge when clients try to do it for servers.

That being said, I support reporting the version of Discourse used to generate the response, as part of the Server response header, because it is useful debugging information when trying to figure out why things may be behaving in an unexpected manner, and can be included in bug reports.

(Sawood Alam) #10

API versioning based on URL namespacing is good for allowing a compatibility grace period for tools while they catch up with the new version (such as a few years ago when Twitter migrated to API v2 and kept the older API available for more than a year to avoid breaking millions of applications). URL-based API versioning also has some practical limits: only major versions get their own namespace, not every single update, since otherwise applications would find it a disaster to keep up. It also assumes that every instance of the service running the same software is bound to follow the same URL-based versioning rules (which may be enforced in this case, but the point is still valid from a broader perspective).

The situation with the browser user agent string is very different. It is a legacy of the dark ages of the browser wars, when Mozilla and IE were the only two players in the market and were intentionally introducing incompatibilities to gain a monopoly on the browser market. Web application developers started using browser detection based on the UA string to branch their code for the respective platforms. Later, as the web became more standardized and the feature sets of the browsers started to converge, it was in the browsers’ interest to exploit the way application programmers had coded certain features, so that those features became available to their users too.

Let’s take an example. Say Mozilla introduced a unique capability that had no alternative in IE. Application developers added some features to their sites behind an if UA ~= Mozilla condition to leverage that new capability. Then, in its next release, Microsoft introduced another capability that was unique to IE, and some web application programmers leveraged it with another conditional. Later, when things started to converge on the browser end and the intentional incompatibility war was over, browsers had no choice but to add to the UA every term that application programmers had checked for in their past code, so that those applications would work in all browsers without the legacy code being fixed. Chrome, which entered the market quite late, had no choice but to carry on with that legacy practice or break the past web.

I would strongly vote against reusing the Server header for this purpose, because web servers such as Apache or Nginx (not the framework) often override this header. Even if those servers don’t override it, one would lose the useful information about which server was used to serve the content.

(Matt Palmer) #11

URL based API versioning is not the only option. Even if it was, your assertions about the limitations of the technique are completely and utterly incorrect.

No, it really isn’t.

RFC7231 says otherwise.

(Sawood Alam) #12

I am aware of this; in fact, the almost two-decade-old RFC2616 says similar things, but I have seen the header being overwritten. I think the issue is the independence of the application server and the web server. The application server might not know about the web server or load balancer, so it may not be able to include it while constructing the value string. Similarly, the web server might not be aware of the application server, so it may just end up overriding the value. Even if the web server decides to be a good citizen and just adds its identity to the existing value, who will determine the order of precedence: should it report itself before or after the application server’s supplied value? In fact, I have encountered proxies that override this header, although they are explicitly told not to do so and to use the Via field instead.

That being said, I won’t mind if it is reported in the Server header reliably, honoring the RFC and being understood by tools, as opposed to introducing another x-* header. However, my initial reasoning for using a different header was to avoid the issue described above.
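
For illustration only, the "good citizen" behaviour described above could look like the sketch below: append a product token to whatever Server value is already present instead of replacing it, and parse the result back into its parts. It assumes the `product/version` token form that RFC 7231 defines for Server, with optional parenthesised comments. The function names, the product names, and the version numbers are hypothetical, and placing the application after the existing value is just one possible answer to the ordering question, not a settled rule.

```python
import re

# Characters allowed in an HTTP token (the `tchar` set).
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_PRODUCT = re.compile(rf"({_TOKEN})(?:/({_TOKEN}))?")


def append_product(existing, name, version):
    """Append `name/version` to a Server value instead of replacing it."""
    product = f"{name}/{version}"
    return f"{existing} {product}" if existing else product


def parse_products(server_value):
    """Return [(name, version_or_None), ...], skipping (non-nested) comments."""
    without_comments = re.sub(r"\([^()]*\)", " ", server_value)
    return [
        (match.group(1), match.group(2))
        for match in _PRODUCT.finditer(without_comments)
    ]
```

With this shape, a value such as `nginx/1.10.0 Discourse/1.6.0` still tells a client which web server handled the request, which is the information that is lost when one layer simply overwrites the header.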