Observability for Discourse AI

aas · July 12, 2024 16:00

Monitoring and evaluating LLMs is critical:

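I started working with language models five years ago, when I led the team that created CodeSearchNet, a precursor to GitHub CoPilot. Since then, I have seen many successful and unsuccessful approaches to building LLM products. I have found that unsuccessful products almost always share a common root cause: a failure to create robust evaluation systems.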

If Discourse AI is going to support business-critical LLM tasks, I think support for monitoring tools like LangSmith should be a priority.

Using LangSmith is simple: just run yarn add langchain langsmith and set a few environment variables.

Has the Discourse team considered how LLM tracing could be configured? Also, do you have any thoughts on how we could achieve this before discourse-ai officially supports it?

Falco · August 1, 2024 16:10

Haha, I wish.

We log every single request and response to LLMs in a table, and allow admins to query those at any time via Data Explorer. Have you tried this already?

{
  "max_tokens": 2000,
  "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
  "temperature": 0,
  "stop": [
    "\n</output>"
  ],
  "messages": [
    {
      "role": "system",
      "content": "You are a markdown proofreader. You correct egregious typos and phrasing issues but keep the user's original voice.\nYou do not touch code blocks. I will provide you with text to proofread. If nothing needs fixing, then you will echo the text back.\nYou will find the text between <input></input> XML tags.\nYou will ALWAYS return the corrected text between <output></output> XML tags.\n\n"
    },
    {
      "role": "user",
      "content": "<input>We log every single request and response to LLMs in a table, and allow admins to query those at any time via Data Explorer. Have you tried already?</input>"
    }
  ]
}

{
  "id": "chat-45cd241b6e0f4a58840fcc9f49dfa56a",
  "object": "chat.completion",
  "created": 1722528517,
  "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "<output>We log every single request and response to LLMs in a table, and allow admins to query those at any time via Data Explorer. Have you tried this already?</output>",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 135,
    "total_tokens": 174,
    "completion_tokens": 39
  }
}
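For anyone post-processing these logged payloads, here is a minimal Python sketch of reading the response above: it pulls the proofread text out of the <output> tags and checks that the token counts add up. The helper names are made up for illustration, and the dict is a trimmed copy of the response shown above, not something read from the actual log table.

```python
import re

# Trimmed copy of the logged response shown above (illustrative only).
response = {
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "<output>We log every single request and response to LLMs in a table, and allow admins to query those at any time via Data Explorer. Have you tried this already?</output>",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 135, "total_tokens": 174, "completion_tokens": 39},
}


def extract_output(resp: dict) -> str:
    """Return the text inside <output> tags of the first choice.

    The closing tag is optional because the request uses a stop sequence,
    which may cut the reply before the tag is emitted.
    """
    content = resp["choices"][0]["message"]["content"]
    match = re.search(r"<output>(.*?)(?:</output>|\Z)", content, re.DOTALL)
    return match.group(1) if match else content


def usage_is_consistent(resp: dict) -> bool:
    """Check that prompt + completion tokens equal the reported total."""
    usage = resp["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


if __name__ == "__main__":
    print(extract_output(response))
    print(usage_is_consistent(response))
```

Checks like these are a small first step toward an eval: the same functions can be run over many logged rows to flag malformed replies.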

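Creating evals for our features is definitely on our roadmap for the 3.4 release, especially for tuning our Related Topics and Summarization features.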

aas · August 12, 2024 16:18

I didn't say that was all there is to it. But I guess it's a moot point, since I think the LLM calls are made from Ruby.

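I haven't tried it yet, but that's awesome, thank you! In theory, I could export these and programmatically create traces in LangSmith for evals and experiments.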
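A rough sketch of what that export step might look like in Python, under some loud assumptions: the column names raw_request_payload and raw_response_payload are hypothetical stand-ins for whatever the Data Explorer export actually contains, both payloads are assumed to be JSON strings in the chat-completions shape shown earlier in the thread, and the returned dict is only an illustrative trace record. The upload to LangSmith itself is deliberately left out; the field names should be checked against the LangSmith SDK docs before sending anything.

```python
import json


def log_row_to_trace(row: dict) -> dict:
    """Convert one exported log row into a trace-like record.

    `row` is assumed to carry the request and response as JSON strings
    under hypothetical column names; adjust to match the real export.
    """
    request = json.loads(row["raw_request_payload"])
    response = json.loads(row["raw_response_payload"])
    return {
        "name": "discourse-ai-llm-call",  # arbitrary label, not a Discourse name
        "run_type": "llm",
        "inputs": {"messages": request["messages"]},
        "outputs": {"message": response["choices"][0]["message"]},
        "metadata": {
            "model": request.get("model"),
            "usage": response.get("usage", {}),
        },
    }


if __name__ == "__main__":
    # Tiny made-up row in the same shape as the payloads in this thread.
    sample_row = {
        "raw_request_payload": json.dumps({
            "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
            "messages": [{"role": "user", "content": "<input>Hello wrold</input>"}],
        }),
        "raw_response_payload": json.dumps({
            "choices": [{"message": {"role": "assistant",
                                     "content": "<output>Hello world</output>"}}],
            "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15},
        }),
    }
    print(json.dumps(log_row_to_trace(sample_row), indent=2))
```

Keeping the conversion as a pure function means it can be tested against a handful of exported rows before any tracing backend is wired in.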
