Algolia reindex problems

While running the `algolia:reindex` rake task, it crashed with the following error:

```text
Clearing users from Algolia
Pushing users to Algolia
...
Successfully pushed 1849 users to Algolia
Clearing tags from Algolia
Pushing tags to Algolia
..
Successfully pushed 53 tags to Algolia
Clearing posts from Algolia
Pushing posts to Algolia
rake aborted!
Algolia::AlgoliaHttpError: Record at the position 662 objectID=690 is too big size=20920/20000 bytes. Please have a look at https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-your-data/in-depth/index-and-records-size-and-usage-limitations/#record-size-limits (Algolia::AlgoliaHttpError)
```

As far as I can see, the plugin does not contain any functionality to split longer posts into separate chunks. I was able to work around the problem by skipping long posts, adding

```ruby
objects.reject! { |object| object.to_json.bytesize > 20000 }
```

right before the `@index.save_objects` call, but this also means those posts are not indexed at all.
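A small variation on that workaround, sketched below, keeps a record of what was dropped instead of silently discarding it. The stubbed `objects` array and the commented-out `@index` call stand in for the plugin's actual rake-task context, which I'm only assuming here:

```ruby
require "json"

# Stub stand-ins so the sketch runs standalone; in the plugin these
# come from the reindex rake task (assumption).
objects = [
  { objectID: 1, content: "short" },
  { objectID: 2, content: "x" * 25_000 } # serializes past the 20,000-byte limit
]

# Split the batch instead of reject!-ing, so oversized records can be logged.
too_big, to_push = objects.partition { |o| o.to_json.bytesize > 20_000 }
too_big.each { |o| warn "Skipping oversized record objectID=#{o[:objectID]}" }

# @index.save_objects(to_push)  # the actual push, as in the original snippet
```

This at least leaves a trail of which posts were excluded, which helps when deciding whether chunking is worth implementing.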


Any ideas what the API expects? Are we supposed to truncate or send it through in multiple chunks?

Per Indexing long pages | Algolia, records need to be chunked, and `distinct` needs to be set to `true` when searching so that only one chunk per page is returned.

The maximum record size depends on the plan, but I cannot find a way to query it through the API. Given that certain plans also impose an "average record size across all records" limit, it might be good to chunk at 10 KB (10,000 bytes).
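A chunking pass along those lines could look like the sketch below: split a post's content into several records, each under a byte budget once serialized, sharing a common attribute to set `distinct` on at search time. The record shape (`post_id`, `part`, `content`) and the sentence-based splitting are assumptions for illustration, not the plugin's actual schema:

```ruby
require "json"

# Hypothetical helper: split one post into multiple Algolia records, each
# staying under max_bytes when serialized to JSON. All chunks share post_id,
# which would be the `distinct` attribute at search time.
def chunk_post(post_id, content, max_bytes: 10_000)
  records = []
  part = 0
  buffer = ""

  # Split on sentence boundaries (whitespace following a period).
  content.split(/(?<=\.)\s+/).each do |sentence|
    candidate = buffer.empty? ? sentence : "#{buffer} #{sentence}"
    probe = { objectID: "#{post_id}-#{part}", post_id: post_id,
              part: part, content: candidate }
    if probe.to_json.bytesize > max_bytes && !buffer.empty?
      # Flush the last candidate that still fit, start a new chunk.
      records << { objectID: "#{post_id}-#{part}", post_id: post_id,
                   part: part, content: buffer }
      part += 1
      buffer = sentence
    else
      # Note: a single sentence larger than max_bytes would still slip
      # through here; real code would need a hard character-level split too.
      buffer = candidate
    end
  end

  unless buffer.empty?
    records << { objectID: "#{post_id}-#{part}", post_id: post_id,
                 part: part, content: buffer }
  end
  records
end
```

The distinct `objectID` per chunk matters because Algolia otherwise overwrites records with the same ID, and deleting a post would mean deleting every `objectID` prefixed with its `post_id`.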
