For 1.9 we plan to add a search log.
Unfortunately, the naive “log every search the server performs” approach will not work: we run searches as people are typing, so it would produce a massively noisy log.
- Create a new table with the following columns:
term, user_id (nullable), ip_address, created_at, clicked_topic_id (nullable), source_id (either header or full page)
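As a sketch, a Rails migration along these lines could create the table (all names here are assumptions, not a final schema):

```ruby
# Hypothetical migration sketch; table and column names are assumptions.
class CreateSearchLogs < ActiveRecord::Migration[5.1]
  def change
    create_table :search_logs do |t|
      t.string :term, null: false
      t.integer :user_id, null: true          # nil for anonymous searches
      t.inet :ip_address, null: false
      t.integer :clicked_topic_id, null: true # filled in on click-through
      t.integer :source_id, null: false       # e.g. 0 = header, 1 = full page
      t.datetime :created_at, null: false
    end
    # the update-or-insert step looks rows up by user and recency
    add_index :search_logs, [:user_id, :created_at]
  end
end
```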
- Log on server with the following algorithm on search:

      UPDATE search_logs
      SET term = :new_term, created_at = :now
      WHERE created_at > :now - interval '5 seconds'
        AND position(term IN :new_term) = 1
        AND (user_id = :user_id OR ip_address = :ip_address)

  (bound as new_term: new_term, now: Time.zone.now, ip_address: request.ip)

  If the update touches zero rows, then insert a **new** search log row
Or, in English
Update existing search log row IF:
- Same user (for anon use ip address, for logged in use user_id)
- Current search starts with the text of the previous search, eg: previous was “dog” and new is “dog in white”
- Previous search was logged less than 5 seconds ago
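The update-or-insert rule above can be sketched in plain Ruby, using an in-memory array as a stand-in for the table (method and constant names are hypothetical):

```ruby
# In-memory stand-in for the search_logs table.
SearchLog = Struct.new(:term, :user_id, :ip_address, :created_at)

LOGS = []

# Log a search, collapsing incremental "as you type" searches:
# reuse the previous row when the same user (user_id, or ip for anon)
# searched less than 5 seconds ago and the new term extends the old one.
def log_search(term, user_id: nil, ip_address: nil, now: Time.now)
  row = LOGS.find do |r|
    r.created_at > now - 5 &&
      term.start_with?(r.term) &&
      ((user_id && r.user_id == user_id) ||
       (ip_address && r.ip_address == ip_address))
  end

  if row
    row.term = term
    row.created_at = now
  else
    LOGS << SearchLog.new(term, user_id, ip_address, now)
  end
end

t = Time.now
log_search("dog", user_id: 1, now: t)
log_search("dog in white", user_id: 1, now: t + 2) # updates the "dog" row
log_search("cat", user_id: 1, now: t + 3)          # not a prefix match: new row
puts LOGS.length # => 2
```

Typing “d”, “do”, “dog”, “dog in white” within the window therefore collapses into a single row holding the final term.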
On click on a search result (in either full page search or header), update
clicked_topic_id: have search results return the log id, then update the row matching that log id, the same user, and a created_at within the last 10 minutes.
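The click-through update can be sketched the same way, with an in-memory stand-in and hypothetical names:

```ruby
ClickLog = Struct.new(:id, :user_id, :clicked_topic_id, :created_at)

LOGS = [
  ClickLog.new(42, 1, nil, Time.now - 60),   # recent search by user 1
  ClickLog.new(43, 2, nil, Time.now - 3600), # hour-old search by user 2
]

# Record a click on a search result: the client sends back the log id it
# received with the results; only update when the id, the user, and the
# 10-minute window all match.
def log_click(id, user_id, topic_id, now: Time.now)
  row = LOGS.find do |r|
    r.id == id && r.user_id == user_id && r.created_at > now - 600
  end
  row.clicked_topic_id = topic_id if row
  !row.nil?
end

log_click(42, 1, 7) # matches: sets clicked_topic_id
log_click(43, 2, 9) # no-op: search is older than 10 minutes
log_click(42, 2, 9) # no-op: wrong user for that log id
```

Requiring the user to match means a leaked or guessed log id cannot be used to pollute someone else's click data.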
Limiting log size
So the log does not grow forever, there should be a site setting for the maximum rows to store. The default should be about a million.
A weekly job can delete the oldest rows beyond that limit.
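The weekly trim job amounts to keeping only the newest N rows. A minimal sketch, again against an in-memory stand-in (the cap of 5 here is just for illustration):

```ruby
MAX_SEARCH_LOG_ROWS = 5 # stand-in for the site setting; real default ~1,000,000

# id 1 is the oldest row, id 8 the newest.
LOGS = (1..8).map { |i| { id: i, created_at: Time.now - (100 - i) } }

# Weekly job: drop the oldest rows so only the newest N remain.
def trim_search_logs!
  excess = LOGS.length - MAX_SEARCH_LOG_ROWS
  return if excess <= 0
  LOGS.sort_by! { |r| r[:created_at] } # oldest first
  LOGS.shift(excess)                   # delete the oldest rows
end

trim_search_logs!
puts LOGS.length     # => 5
puts LOGS.first[:id] # => 4 (oldest surviving row)
```

In SQL this would be a single `DELETE` ordered by `created_at`, which is why the job can be cheap enough to run weekly.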