Why doesn't Discourse support IndexNow?

In today’s fast-paced environment, where information quickly becomes obsolete, indexing speed is one of the most important factors for success.

But why doesn’t Discourse support this protocol at all? https://www.indexnow.org/

3 Likes

Because no one cared enough to build a plugin or a pull request to support it. I’d say that’s probably because Google, the search engine most people care about, doesn’t support IndexNow.

But if you want to build a plugin to add this feature, that’s a welcome contribution!

13 Likes

We would like to contribute to the community and program this extension, but we are not programmers.

Google’s stance on IndexNow is that it is testing the protocol, so we’ll have to wait and see.

Any news on IndexNow? Now that even OpenAI runs a search engine that draws from the Bing index, which is linked to IndexNow, it makes even more sense.

Then you could commission the plugin in Marketplace. I can imagine solutions in the $500 to $2000 range. Others might have better imaginations than I.

3 Likes

I agree, now seems a great time for Discourse to support IndexNow :))

1 Like

After reviewing the IndexNow capability, I agree this should be one of the core features/plugins. I also understand that developer resources are limited.

Here are my thoughts on the required plugin to assist the core team. Please feel free to add additional comments.

Assumptions:

  1. IndexNow Plugin will use bulk notifications on a scheduled time model - See Design Consideration #1
  2. Bulk notifications will be set on a time interval
  3. Notifications will only use public topics
  4. Notifications will only be for new/changed/deleted topics when the plugin is enabled.
  5. Plugin will not retroactively notify historical changes/events.

Instructions for Users:

  1. Sign up with the IndexNow search engine of choice.
    • Obtain your API Key
    • Obtain the search engine endpoint URL
  2. Install Plugin
  3. Set up the admin settings

Use Case - Admin Settings

  1. Allow user to turn on/off auto submission capabilities
  2. Allow user to enter the IndexNow search engine endpoint. See Design Consideration #3.
    • Input field is a text parameter
    • Input field must be a valid URL
    • default to Bing URL at https://www.bing.com/indexnow
  3. Allow user to input and store API key
    • Input string field to store API key
    • Input field is alphanumeric
    • Default value will be “”
  4. Allow user to define scheduled time parameters for bulk notifications
    • Time parameter will be set by interval hours
    • Input string to store hour value
    • Valid inputs will be integers
    • Valid inputs can vary from 1 to 24
    • Default value will be 12
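
To make the validation rules above concrete, here is a minimal sketch in Python (the real plugin would be Ruby, using Discourse site settings). The names `IndexNowSettings` and `validate_settings` are hypothetical and only illustrate the four settings and their constraints as listed.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

DEFAULT_ENDPOINT = "https://www.bing.com/indexnow"  # default from the list above


@dataclass
class IndexNowSettings:
    """Hypothetical container for the four admin settings."""
    enabled: bool = False
    endpoint: str = DEFAULT_ENDPOINT
    api_key: str = ""          # default is the empty string
    interval_hours: int = 12   # default is 12


def validate_settings(s: IndexNowSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings are valid."""
    errors = []
    parsed = urlparse(s.endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("endpoint must be a valid http(s) URL")
    # An empty key is allowed as a stored value (it is the default);
    # it only means that nothing can be submitted yet.
    if s.api_key and not (s.api_key.isascii() and s.api_key.isalnum()):
        errors.append("API key must be alphanumeric")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(s.interval_hours, bool) or not isinstance(s.interval_hours, int):
        errors.append("interval must be an integer")
    elif not 1 <= s.interval_hours <= 24:
        errors.append("interval must be between 1 and 24 hours")
    return errors
```

The function collects every problem instead of stopping at the first one, so an admin screen could show all of them at once.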

Use Case - Text Key File

  1. The system will generate a file called indexnowkey.txt
  2. The key file must be stored at the root level.
  3. The system will populate the file with the API key
  4. The file will be accessible to any remote user or system over HTTP/HTTPS
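
One thing that might be worth adding here is a pre-flight check that the published key file really contains the configured key before anything is submitted. This is my own suggestion, not part of the outline above. A rough Python sketch, where `key_location`, `key_file_matches`, and the injected `fetch` callable are all hypothetical names:

```python
from typing import Callable

KEY_FILE_NAME = "indexnowkey.txt"  # file name proposed in the use case above


def key_location(site_host: str) -> str:
    """URL where the key file is expected to be served (root level)."""
    return f"https://{site_host}/{KEY_FILE_NAME}"


def key_file_matches(site_host: str, api_key: str,
                     fetch: Callable[[str], str]) -> bool:
    """True when the published file contains exactly the configured key.

    `fetch` is any function that takes a URL and returns the response body
    as text; it is injected so the check can be tested without a network.
    """
    if not api_key:
        return False  # nothing configured, so nothing can match
    body = fetch(key_location(site_host))
    return body.strip() == api_key
```

Passing the fetcher in as a parameter keeps the check independent of any particular HTTP client.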

Use Case - Scheduling of Bulk Notification process

  1. The system will schedule the jobs to process on an interval basis based on the setting defined in the admin settings.
  2. The interval value defines the delay between jobs in hours. For example, an input value of 2 would indicate the job should run every 2 hours. A value of 4 indicates the job should run every 4 hours. A value of 24 would indicate that the job should run once a day.

Use Case - Bulk Notification process

  1. The system will determine if the notification process is activated via the site setting defined in the admin settings.
  2. The system will check that an API key is set in the site settings (not “”).
  3. The system will create a list of topics based on the defined time interval setting. See Design Consideration #2 on query time frames. The topic parameters for inclusion are:
    • Topics must be Public View only
    • New Topics
    • Topics with new posts
    • Topics with edited posts
    • Deleted topics
    • Topic list must be distinct - no duplicates
  4. The system will create the JSON packet using the following format.
{
  "host": "current_site",
  "key": "api_key",
  "keyLocation": "https://current_site/indexnowkey.txt",
  "urlList": [
      "https://www.example.com/url1",
      "https://www.example.com/folder/url2",
      "https://www.example.com/url3"
      ]
}
  5. The JSON packet will be sent via HTTP POST to the following
    • URL: sitesettings.search_engine_indexnow_endpoint
  6. The JSON packet will be sent with the following headers
    • Content-Type: application/json; charset=utf-8
    • Protocol: HTTP/1.1
    • Host: host name of the endpoint (for example, www.bing.com for the default Bing URL)
  7. Validate submission receipt of HTTP request
    • HTTP 200 - successful submission - end process
    • HTTP 429 - too many submission attempts - send notification to the administrator to increase interval timing
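
The packet-building and response-handling steps above could look roughly like this. It is a sketch in Python using only the standard library, not the actual plugin code. The function names and the return labels ("ok", "notify_admin", "unexpected") are made up for illustration. Only the two status codes from the list are handled explicitly.

```python
import json
import urllib.error
import urllib.request


def build_payload(site_host, api_key, urls):
    """Build the JSON packet; the URL list is de-duplicated, order preserved."""
    distinct = list(dict.fromkeys(urls))
    return {
        "host": site_host,
        "key": api_key,
        "keyLocation": f"https://{site_host}/indexnowkey.txt",
        "urlList": distinct,
    }


def build_request(endpoint, payload):
    """Prepare the POST request. urllib adds the Host header itself."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def interpret_status(status):
    """Map an HTTP status code to the next action (labels are hypothetical)."""
    if status == 200:
        return "ok"            # successful submission, end process
    if status == 429:
        return "notify_admin"  # too many attempts, suggest a longer interval
    return "unexpected"        # anything else is not covered by the outline


def submit(endpoint, payload, timeout=10):
    """Send the packet and return the action label for the response."""
    request = build_request(endpoint, payload)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses, so 429 arrives here.
        return interpret_status(err.code)
```

A scheduled job would call `submit(endpoint, build_payload(host, key, urls))` once per interval and act on the returned label.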

Design considerations:

  1. Bulk Notifications vs. Single Notifications—A single notification would be acceptable for small domains, but for larger boards, adding a notification for every new/updated post could create many event processes. From a search engine indexing performance perspective, bulk notifications on an hourly basis would be acceptable for 80% of the forums.
  2. Bulk notifications query timing - Sidekiq controls interval timings. If Sidekiq is under heavy load, the bulk notification process could be delayed. The bulk notification process could miss new/updated topics if the query timeframe equals the scheduling interval. Should a time parameter extend the query to cover delayed processes? Or is it possible to have the Scheduler pass initiated timestamps to control query time intervals? Or do we need to create a database table/value for submitted topics with a time stamp?
  3. Should we build an internal table with each search engine and the defined IndexNow URL endpoint? The user can select the choices from a drop-down menu instead of entering a URL. This removes potential human error.
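
For Design Consideration #2, one of the options raised is storing a timestamp. A minimal sketch of that idea in Python: keep a "last successful run" value and query from it, rather than from "now minus the interval". That way a delayed or failed job does not leave a gap. The `state` dict stands in for whatever storage the plugin would use. The names `run_once`, `find_changed`, and `submit` are hypothetical.

```python
from datetime import datetime, timezone


def run_once(state, now, find_changed, submit):
    """Run one bulk notification pass using a stored watermark.

    state        -- dict holding "last_success" (a datetime) between runs
    find_changed -- callable(start, end) returning changed public topic URLs
    submit       -- callable(urls) returning True on a successful submission
    """
    last = state.get("last_success")
    if last is None:
        # First run after enabling: start the clock, submit nothing,
        # so history is not notified retroactively.
        state["last_success"] = now
        return "initialized"
    urls = find_changed(last, now)
    if urls and not submit(urls):
        # Keep the old watermark so the next run retries the same window.
        return "failed"
    state["last_success"] = now
    return "submitted" if urls else "nothing_to_do"


if __name__ == "__main__":
    state = {}
    print(run_once(state, datetime.now(timezone.utc), lambda a, b: [], lambda u: True))
```

Because the window always starts at the stored timestamp, the scheduler interval only decides how often the job runs, not which topics it sees.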

What is missing? What would you add?

Is there a way to leverage our existing outgoing webhook support to accomplish some/all of what you want?

1 Like

That seems like a pretty decent outline. I think I’d do only bulk/batch submissions to avoid having two methods to write, debug, test, and maintain.

Or maybe a single bulk/batched job could avoid the rate limiting issues and then have just one way of submitting stuff (just in a batch, never on a per-post level).

A version that submitted to a single endpoint might cost anywhere from $2000, for something that appeared to work and had minimal error handling, to $5000, for something with at least some specs for testing that could maybe also handle notifying multiple endpoints.

You are asking a great “How” question. I am not the best person to ask Discourse “How” questions.

I am good at documenting the “What” is needed. Getting a good, clean definition of “What” is needed will make the coding go faster and thus cheaper.

To answer the “What” for webhooks, I believe it references single vs. bulk notifications. I have a medium-sized forum and would prefer bulk notifications.

  1. I do not need the search engines to be notified when a topic is created or updated.
  2. I do not like adding lower-priority events within critical processes like topic and post creation. Adding additional events increases the wait time for the users. A bulk method only requires one SQL query and an HTTP send. It can be processed as a back-end event outside of user interaction.

The plugin would only need to be developed for one endpoint. The IndexNow agreement requires search engines to share submissions between them, i.e., you submit to Bing, and then Bing submits to the other IndexNow-compliant search engines.

We need 30 members to crowd-fund at $100 each to get the plugin developed.

1 Like