I think the approach I suggested in that topic needs to be improved. One problem with it is that only TL3 and above users can tag posts on Meta. That means that the majority of the site’s users wouldn’t be able to follow my instructions. The other issue is that we’ll end up with both unanswered topics and answered topics having the data-explorer tag. That won’t help much with searching for queries.
Sorry for the late answer. I got caught up in the question about how to organize Data Explorer queries on the site. Using the data-explorer tag seems like the ideal solution, but topics that contain a Data Explorer query will need to be tagged by a user with TL3 status.
I think that something like the following query will give you the information you’re looking for:
SELECT
topic_id,
category_id,
SUM(total_msecs_viewed) / 60000 AS estimated_minutes_read
FROM topic_users tu
JOIN topics t ON t.id = tu.topic_id
WHERE t.deleted_at IS NULL
AND t.archetype = 'regular'
GROUP BY tu.topic_id, category_id
ORDER BY estimated_minutes_read DESC
LIMIT 100
The LIMIT 100 statement in the last line of the query could be adjusted or removed if you want more results to be returned.
If I pick four or five at random, and compare the estimated read time column result in this query, to the estimated read time in the topic itself, I’m getting two very different numbers?
You can use the average time that the users take to read the topic.
In that case, you can just change the SUM function to AVG, it would look like this:
SELECT
topic_id,
category_id,
AVG(total_msecs_viewed) / 60000 AS estimated_minutes_read
FROM topic_users tu
JOIN topics t ON t.id = tu.topic_id
WHERE t.deleted_at IS NULL
AND t.archetype = 'regular'
GROUP BY tu.topic_id, category_id
ORDER BY estimated_minutes_read DESC
LIMIT 100
But that would mean, on average, a person takes 438 minutes to read that top topic? That seems unlikely. This may sound silly, but did you have enough 0s in 60,000?
Edit: Or maybe the AVG includes all the re-readings of a topic too? So once through would be 61 minutes, but actually users spend on average 438 minutes in there.
Though now I’m quite interested to know how the Estimated Read Time is worked out for the Summary, as ideally you’d want those to match. Even shrinking those by a factor of ten would only ballpark it.
I struggle with deciphering these things, but it seems it’s uses a word count x time figure (plus a minimum time to cover posts with no words, like images).
There was also this one which gave a hint as to what the final value may be called: (though its old, so may have changed?)
Not massively helpful, I’m sure, but thought I’d share just in case.
He vuelto a examinar esto y parece que (en su forma más simple) es el topic.word_count multiplicado por la configuración de administración de ‘tiempo de lectura por recuento de palabras’ (500 palabras/min por defecto). Así que creo que esta consulta produciría los “temas más largos de leer” principales X:
-- [params]
-- integer :limit = 10
SELECT t.id as topic_id, (t.word_count)/500+1 AS estimated_read_time
FROM topics t
WHERE t.word_count IS NOT NULL
AND t.archetype = 'regular'
ORDER BY t.word_count DESC
LIMIT :limit
Aunque también existe la alternativa de ‘4 segundos mínimo’: (cantidad de publicaciones x 4)/60. Que está ahí para tener en cuenta los temas de fotos sin recuento de palabras. Así que funciona para ambos y muestra el que sea mayor. Pero aún no he descubierto cómo añadir eso.
Desafortunadamente, no tengo un sitio lo suficientemente grande como para probarlo adecuadamente. Pareció funcionar en una pequeña muestra de prueba, pero puede que necesite ajustes.
Editar: Añadí un parámetro ‘limit’ para que se acerque más a la especificación del OP.
Lo he intentado de nuevo. No estoy 100% seguro de este, ya que no tengo una muestra lo suficientemente grande para probarlo, pero ha recogido mis temas de prueba.
-- [params]
-- integer :limit = 10
WITH read_time AS (
SELECT t.id as topic_id,
(t.word_count)/500+1 as word_count_time,
(t.posts_count*4)/60+1 as post_count_time
FROM topics t
WHERE t.word_count IS NOT NULL
AND t.archetype = 'regular'
AND t.deleted_at IS NULL
)
SELECT topic_id, CONCAT (CASE WHEN word_count_time > post_count_time THEN word_count_time ELSE post_count_time END, ' min') AS estimated_reading_time
FROM read_time
ORDER BY estimated_reading_time DESC
LIMIT :limit