Discourse AI + Data Explorer?

With ChatCPT 3.5 I’m getting unexpected results when I provide the full schema. For example, it keeps trying to find the topic author in the topic_users table.

I get successful results by limiting the information in the prompt to the tables and columns that are required to write the query. Also by adding some additional information about how the columns are used. For example:

# Table name: user_actions
#
#  id              :integer          not null, primary key
#  action_type     :integer          not null (:like=>1,:was_liked=>2,:new_topic=>4,:reply=>5,:response=>6,:mention=>7,:quote=>9,:edit=>11,:new_private_message=>12,:got_private_message=>13,:solved=>15,:assigned=>16)
#  user_id         :integer          not null (the user who is to be credited with the action)
#  target_topic_id :integer
#  acting_user_id  :integer (the user who performed the action, for example, a staff user can perform an action on behalf of a regular user)
#  created_at      :datetime         not null

Another way to look at getting this to work with the Data Explorer plugin would be to have users fill out a dynamically generate form that lists the data they are looking for, and what conditions they want to apply to the data. The prompt could be generated programmatically by Discourse, then sent to the LLM to have the query written.

Edit: @sp-jordan-violet, this might be a useful starting point. Here’s an annotated partial schema that’s working for me. It works well as long as I limit my queries so that they can be answered by the provided schema. I’ve added additional details to the schema to clarify details that seem to confuse ChatGPT. The obvious downside of this approach is that a fully annotated schema will exceed ChatGPT 3.5’s token limit. I’ve included some information in the schema that could be omitted to reduce the number of tokens that are used.

Summary
# == Schema Information
#
# Table name: application_requests
#
#  id       :integer          not null, primary key
#  date     :date             not null
#  req_type :integer          not null ("http_total"=>0,"http_2xx"=>1,"http_background"=>2,"http_3xx"=>3,"http_4xx"=>4,"http_5xx"=>5,"page_view_crawler"=>6,"page_view_logged_in"=>7,"page_view_anon"=>8,"page_view_logged_in_mobile"=>9,"page_view_anon_mobile"=>10,"api"=>11,"user_api"=>12)
#  count    :integer          default(0), not null
#
# Table name: users
#
#  id                        :integer          not null, primary key
#  username                  :string(60)       not null
#  created_at                :datetime         not null
#  updated_at                :datetime         not null
#  name                      :string           (the user's real name)
#  last_posted_at            :datetime
#  active                    :boolean          default(FALSE), not null
#  username_lower            :string(60)       not null
#  last_seen_at              :datetime
#  admin                     :boolean          default(FALSE), not null
#  trust_level               :integer          not null
#  approved                  :boolean          default(FALSE), not null
#  approved_by_id            :integer
#  approved_at               :datetime
#  previous_visit_at         :datetime
#  suspended_at              :datetime
#  suspended_till            :datetime
#  date_of_birth             :date
#  ip_address                :inet
#  moderator                 :boolean          default(FALSE)
#  title                     :string
#  locale                    :string(10)
#  primary_group_id          :integer
#  registration_ip_address   :inet
#  staged                    :boolean          default(FALSE), not null
#  first_seen_at             :datetime
#  silenced_till             :datetime
#
# Table name: topics
#
#  id                        :integer          not null, primary key
#  title                     :string           not null
#  last_posted_at            :datetime
#  created_at                :datetime         not null
#  updated_at                :datetime         not null
#  views                     :integer          default(0), not null
#  posts_count               :integer          default(0), not null
#  user_id                   :integer          (the id of the user who created the topic)
#  last_post_user_id         :integer          not null (the id of the user who created the last post in the topic)
#  reply_count               :integer          default(0), not null
#  deleted_at                :datetime
#  highest_post_number       :integer          default(0), not null
#  like_count                :integer          default(0), not null
#  category_id               :integer
#  visible                   :boolean          default(TRUE), not null
#  moderator_posts_count     :integer          default(0), not null
#  closed                    :boolean          default(FALSE), not null
#  archived                  :boolean          default(FALSE), not null
#  bumped_at                 :datetime         not null
#  archetype                 :string           default("regular"), not null (can be set to either "regular" or "private_message")
#  slug                      :string
#  deleted_by_id             :integer          (the id of the user who deleted the topic)
#  participant_count         :integer          default(1)
#  word_count                :integer
#  excerpt                   :string
#  highest_staff_post_number :integer          default(0), not null
#
# Table name: posts
#
#  id                      :integer          not null, primary key
#  user_id                 :integer          (the id of the user who created the post)
#  topic_id                :integer          not null
#  post_number             :integer          not null (indicates the post's order in its topic)
#  raw                     :text             not null (the post's content)
#  created_at              :datetime         not null
#  updated_at              :datetime         not null
#  reply_to_post_number    :integer          (the post_number that the post is a reply to)
#  reply_count             :integer          default(0), not null
#  deleted_at              :datetime
#  like_count              :integer          default(0), not null
#  bookmark_count          :integer          default(0), not null
#  reads                   :integer          default(0), not null (the number of times the post has been read)
#  post_type               :integer          default(1), not null (:regular=>1, :moderator_action=>2, :small_action=>3, :whisper=>4)
#  last_editor_id          :integer          (the id of the user who last edited the post)
#  hidden                  :boolean          default(FALSE), not null
#  hidden_reason_id        :integer          (:flag_threshold_reached=>1,:flag_threshold_reached_again=>2,:new_user_spam_threshold_reached=>3,:flagged_by_tl3_user=>4,:email_spam_header_found=>5,:flagged_by_tl4_user=>6,:email_authentication_result_header=>7,:imported_as_unlisted=>8)
#  edit_reason             :string
#  word_count              :integer
#  wiki                    :boolean          default(FALSE), not null
#
# Table name: categories
#
#  id                                        :integer          not null, primary key
#  name                                      :string(50)       not null
#  topic_id                                  :integer          (the id of the topic that is used for the category's description)
#  topic_count                               :integer          default(0), not null
#  created_at                                :datetime         not null
#  updated_at                                :datetime         not null
#  user_id                                   :integer          not null (the id of the user who created the topic)
#  topics_year                               :integer          default(0)
#  topics_month                              :integer          default(0)
#  topics_week                               :integer          default(0)
#  slug                                      :string           not null
#  description                               :text
#  text_color                                :string(6)        default("FFFFFF"), not null
#  read_restricted                           :boolean          default(FALSE), not null
#  auto_close_hours                          :float
#  post_count                                :integer          default(0), not null
#  latest_post_id                            :integer
#  latest_topic_id                           :integer
#  position                                  :integer
#  parent_category_id                        :integer
#  posts_year                                :integer          default(0)
#  posts_month                               :integer          default(0)
#  posts_week                                :integer          default(0)
#  topics_day                                :integer          default(0)
#  posts_day                                 :integer          default(0)
#
# Table name: groups
#
#  id                                 :integer          not null, primary key
#  name                               :string           not null
#  created_at                         :datetime         not null
#  automatic                          :boolean          default(FALSE), not null
#  user_count                         :integer          default(0), not null
#  title                              :string
#  bio_raw                            :text             (the group's description)
#  allow_membership_requests          :boolean          default(FALSE), not null
#  full_name                          :string
#  visibility_level                   :integer          default(0), not null (who can see the group :public=>0, :logged_on_users=>1, :members=>2, :staff=>3, :owners=>4)
#  messageable_level                  :integer          default(0) (who can message the group :public=>0, :logged_on_users=>1, :members=>2, :staff=>3, :owners=>4)
#  mentionable_level                  :integer          default(0) (who can mention the group :public=>0, :logged_on_users=>1, :members=>2, :staff=>3, :owners=>4)
#  members_visibility_level           :integer          default(0), not null (who see the group's members :public=>0, :logged_on_users=>1, :members=>2, :staff=>3, :owners=>4)
#
# Table name: group_users
#
#  id                 :integer          not null, primary key
#  group_id           :integer          not null
#  user_id            :integer          not null
#  created_at         :datetime         not null
#
# Table name: user_actions
#
#  id              :integer          not null, primary key
#  action_type     :integer          not null (:like=>1,:was_liked=>2,:new_topic=>4,:reply=>5,:response=>6,:mention=>7,:quote=>9,:edit=>11,:new_private_message=>12,:got_private_message=>13,:solved=>15,:assigned=>16)
#  user_id         :integer          not null (the user who is to be credited with the action)
#  target_topic_id :integer
#  target_post_id  :integer
#  target_user_id  :integer          (for example, the id of the user whose post was liked)
#  acting_user_id  :integer          (the user who performed the action, for example, a staff user can perform an action on behalf of a regular user)
#  created_at      :datetime         not null
#  updated_at      :datetime         not null
#
# Table name: topic_views
#
#  topic_id   :integer          not null
#  viewed_at  :date             not null
#  user_id    :integer         (will be set if the topic was viewed by a logged in user)
#  ip_address :inet            (will be set if the topic was viewed by an anonymous user)
#
# Table name: user_visits
#
#  id         :integer          not null, primary key
#  user_id    :integer          not null
#  visited_at :date             not null
#  posts_read :integer          default(0)
#  mobile     :boolean          default(FALSE) (will be set to TRUE if the user visited on a mobile device)
#  time_read  :integer          default(0), not null (the value returned is the number of seconds of the visit)

Here’s a link to a chat session where I tried it out: Total Hours Anonymous August

If you want to try expanding on this list, the easiest way to do it is to copy the schema from the bottom of the model you’re interested in, directly from the Discourse code: https://github.com/discourse/discourse/tree/main/app/models.

3 Likes