Shields.io unable to retrieve Discourse statistics API

haikalpribadi · September 14, 2019, 1:25pm

Hi everyone,

We’ve had a Discourse shield on our repository for a while now, and it recently stopped working.

If you go to https://shields.io/category/chat and select any Discourse shield, you can enter your discourse domain address and it will show you the shield with the correct statistics. You can try this with meta.discourse.org.

However, when we enter our discourse host address (https://discuss.grakn.ai), for any statistics and for both http/https, it always returns “invalid”.

When a host is not found, Shields.io returns “inaccessible”. So we assume “invalid” means the host is reachable, but there is either an access-rights issue or an invalid response.

Is it possible that a recent update/upgrade broke something on Discourse statistics API that Shields.io uses?

Thank you so much!

marianord · September 15, 2019, 3:00am

It’s working for me on my site. Maybe you’re not setting the protocol correctly? Or the Grakn Discourse has some kind of modification that breaks that endpoint.

gerhard · September 15, 2019, 10:35am

You might want to ask Shields.io about that problem. It works with all other sites I tested, so this isn’t our bug.

haikalpribadi · September 15, 2019, 7:20pm

@marianord that’s exactly my question: where are the “protocols” you’re mentioning? How can they be configured? I’ve not changed any settings.

@gerhard given that shields.io is working for other Discourse sites, it does not seem likely to be an issue on their side, unless they’re not reading the output from our site statistics properly. But how can we find out? What is the Discourse endpoint that is used to query the statistics? Perhaps we should start there?

marianord · September 15, 2019, 7:27pm

I mean http vs https.

max_grakn · September 19, 2019, 11:05am

This happened because our Discourse installation blocked Shields.io’s user agent (Shields.io). This setting is named whitelisted crawler user agents and can be edited at
<discourse_server>/admin/site_settings/category/all_results?filter=crawler
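One way to check for this kind of block is to request the statistics endpoint with two different user agents and compare the HTTP status codes. The following is only a minimal sketch: the `/site/statistics.json` path is the one linked later in this thread, the `Shields.io` user-agent string is taken from the post above (the exact string Shields.io sends may differ), and the function names are hypothetical. It assumes Node 18+ for the built-in `fetch`, and a Discourse installed at the root of the host.

```javascript
// Hypothetical helper: build the statistics URL for a Discourse host.
// Assumes Discourse is served from the root of `server` (no subfolder).
function statisticsUrl(server) {
  return new URL('/site/statistics.json', server).toString()
}

// Request the endpoint with a given User-Agent and report the HTTP status.
// `fetchImpl` is injectable so the function can be tried without network access.
async function probeStatistics(server, userAgent, fetchImpl = fetch) {
  const response = await fetchImpl(statisticsUrl(server), {
    headers: { 'User-Agent': userAgent, Accept: 'application/json' },
  })
  return { userAgent, status: response.status, ok: response.ok }
}

// Compare the badge service's user agent against a browser-like one.
async function compareUserAgents(server, fetchImpl = fetch) {
  const agents = ['Shields.io', 'Mozilla/5.0']
  return Promise.all(
    agents.map(agent => probeStatistics(server, agent, fetchImpl)),
  )
}

// Usage (performs real requests):
// compareUserAgents('https://discuss.grakn.ai').then(console.log)
```

If the two results differ (for example an error status for the first and 200 for the second), the user agent is the likely culprit rather than the endpoint itself.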

haikalpribadi · September 19, 2019, 11:18am

Interesting! Thank you @max_grakn! We did add Googlebot to the whitelist recently, I think that may be the cause.

@codinghorror are we meant to use Blacklist and Whitelist at the same time? As in, if you add things to the whitelist, does that mean everything else is blacklisted (which therefore makes the blacklist redundant)?

codinghorror · September 19, 2019, 12:41pm

No, the crawler whitelist is very dangerous, and should only be used carefully per the help text.

User agents of web crawlers that should be allowed to access the site. WARNING! SETTING THIS WILL DISALLOW ALL CRAWLERS NOT LISTED HERE!
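To illustrate what that warning implies, here is a simplified model of the two lists. This is an assumption based only on the help text quoted above, not Discourse's actual code: the function name, the case-insensitive substring matching, and the rule that a non-empty whitelist makes the blacklist irrelevant are all illustrative.

```javascript
// Simplified model (an assumption, not Discourse's implementation):
// once the whitelist has any entry, only listed crawlers are allowed
// and the blacklist is never consulted.
function isCrawlerAllowed(userAgent, { whitelist = [], blacklist = [] } = {}) {
  const agent = userAgent.toLowerCase()
  const matches = list =>
    list.some(entry => agent.includes(entry.toLowerCase()))

  if (whitelist.length > 0) {
    return matches(whitelist)
  }
  return !matches(blacklist)
}

// Adding Googlebot to an otherwise empty whitelist shuts out Shields.io:
console.log(isCrawlerAllowed('Shields.io', { whitelist: ['Googlebot'] })) // false
// With no whitelist, only blacklisted agents are refused:
console.log(isCrawlerAllowed('Shields.io', { blacklist: ['BadBot'] })) // true
```

Under this model, adding a single entry to the whitelist is enough to explain the “invalid” badge described earlier in the thread.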

barto_95 · November 13, 2020, 10:16am

Hi, I have the same problem: when I test it, I receive invalid data… do you have any idea?

It’s OK now. It was necessary to enable anonymous statistics in:

Admin → setting → other → share anonymized statistics = Enabled

and now it works with shields.io.

spdegabrielle · November 15, 2023, 6:22pm

I have the same problem, but share anonymous statistics is already enabled

![Racket Discourse](upload://6fa5jbSn04vRLXdubAYmFJt5emf.svg)

Arkshine · November 15, 2023, 7:41pm

Related to:

github.com/discourse/discourse

DEV: Ability to collect stats without exposing them via API

discourse:main ← discourse:dev/ability-to-collect-stats-without-public-exposing

opened 06:37PM - 13 Oct 23 UTC

AndrewPrigorshnev

+228 -119

This adds the ability to collect stats without exposing them among other stats via API.

The most important thing I wanted to achieve is to provide an API where stats are not exposed by default, and a developer has to explicitly specify that they should be exposed (`expose_via_api: true`). Implementing an opposite solution would be simpler, but that's less safe in terms of potential security issues.

When working on this, I had to refactor the current solution. I would go even further with the refactoring, but the next steps seem to be going too far in changing the solution we have, and that would also take more time. Two things that can be improved in the future:

1. Data structures for holding stats can be further improved
2. Core stats are hard-coded in the About template (it's hard to fix it without correcting data structures first, see point 1): https://github.com/discourse/discourse/blob/63a0700d45f755d0f432a9075ae7afbed9cd6ab0/app/views/about/index.html.erb#L61-L101

The most significant refactorings are:

1. Introducing the `Stat` model
2. Aligning the way the core and the plugin stats' are registered

It’s because a few fields have been renamed to their plural form.

topic_count → topics_count
post_count → posts_count
user_count → users_count
like_count → likes_count

Someone will need to modify the code here to fall back to the plural form:

github.com

badges/shields/blob/master/services/discourse/discourse.service.js

import camelcase from 'camelcase'
import Joi from 'joi'
import { metric } from '../text-formatters.js'
import { nonNegativeInteger, optionalUrl } from '../validators.js'
import { BaseJsonService } from '../index.js'

const schema = Joi.object({
  topic_count: nonNegativeInteger,
  user_count: nonNegativeInteger,
  post_count: nonNegativeInteger,
  like_count: nonNegativeInteger,
}).required()

const queryParamSchema = Joi.object({
  server: optionalUrl.required(),
}).required()

class DiscourseBase extends BaseJsonService {
  static category = 'chat'

This file has been truncated.
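As a rough illustration of the fallback idea, the sketch below accepts either spelling of each field and normalizes the result. It is plain JavaScript rather than Joi, and it is not the Shields.io implementation: the `STAT_KEYS` table and the `normalizeStatistics` name are hypothetical, and only the eight key names come from the rename listed above.

```javascript
// Hypothetical lookup table: for each statistic, try the plural key first,
// then fall back to the singular key used by older Discourse instances.
const STAT_KEYS = {
  topics: ['topics_count', 'topic_count'],
  posts: ['posts_count', 'post_count'],
  users: ['users_count', 'user_count'],
  likes: ['likes_count', 'like_count'],
}

function isNonNegativeInteger(value) {
  return Number.isInteger(value) && value >= 0
}

// Returns { topics, posts, users, likes } or throws if a statistic
// is missing under both of its candidate keys.
function normalizeStatistics(json) {
  const result = {}
  for (const [name, candidates] of Object.entries(STAT_KEYS)) {
    const key = candidates.find(k => isNonNegativeInteger(json[k]))
    if (key === undefined) {
      throw new Error(`invalid response: missing ${candidates.join(' / ')}`)
    }
    result[name] = json[key]
  }
  return result
}

// Both response shapes normalize to the same object:
console.log(normalizeStatistics({ topic_count: 1, post_count: 2, user_count: 3, like_count: 4 }))
console.log(normalizeStatistics({ topics_count: 1, posts_count: 2, users_count: 3, likes_count: 4 }))
```

Trying the plural key first means updated instances take the common path, while older instances still resolve through the singular fallback.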

spdegabrielle · November 20, 2023, 3:52pm

This is still broken. Is it working for anyone else?

![Racket Discourse](upload://7asTK98zfWLXRebm11uEN3KPM5N.svg)

BryanV · December 6, 2023, 5:35pm

There was a PR opened a few days ago:

github.com/badges/shields

[Discourse] Update schema keys to use plural form (`topic_count` -> `topics_count`)

badges:master ← joshuacwnewton:jn/9776-fix-discourse-js-schema

opened 08:16PM - 03 Dec 23 UTC

joshuacwnewton

+110 -71

This PR does a simple find and replace to update the Discourse schema to use the current keys found in https://meta.discourse.org/site/statistics.json, as the older keys now return "Invalid response".

> [!NOTE]
> I was wondering, though, if the old keys should still be preserved, as to not break backwards compatibility with servers using the old names.
>
> (I tried finding the source of the change in Discourse's release notes/changelog, but there's surprisingly little information when querying e.g. "statistics.json" or "topic_count"/"topics_count". So, I'm not sure when this change actually took place, and thus it's hard to estimate the likelihood of there being servers out there using the old keys.)
>
> If we do want to support both sets of keys, then I'm not 100% sure what to do, as I've never written JavaScript before. (I tried reading the Joi docs, and got as far as using `.or()` in the schema definition for each pair of keys + using a `try`/`catch` when creating the `metricIntegrations` object, but I wasn't confident enough to commit that solution before discussing it first.)

Fixes #9776.

But naturally this change places a burden on shields.io to support both versions (in perpetuity, I guess), since there is no guarantee that any given Discourse instance has been updated. So it is not as simple as changing to plurals.
