Discourse API: get just the number of search results

gradam · December 21, 2017, 10:18 AM

Hi. I am trying to get just the number of search results from the API.
I have the following query, /search.json?q=query, but I just need to know how many results there are, not the blurbs, cooked content, etc.
Is this possible with the Discourse API?

blake · December 21, 2017, 3:44 PM

I don’t think we return a “count” in the response, but it is something you can calculate yourself.

See the search API docs for a more detailed response example, but it will look something like this:

{
    "posts": [],
    "topics": [],
    "users": [],
    "categories": [],
    "grouped_search_result": {}
}

By default the API will return a maximum of 50 results. To calculate the count, you just need to count the number of items in the posts array. The number of items in the topics array should be the same, so there is no reason to count that array too.
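
A minimal sketch of that counting step in Ruby, assuming the response body has the top-level posts array shown above. The helper name search_result_count and the sample body are made up for illustration, and the number it returns is the count of results in this response, not a site-wide total:

```ruby
require "json"

# Count the results in a /search.json response body.
# Assumes the body is a JSON object with a top-level "posts" array;
# a missing or null "posts" key is treated as zero results.
def search_result_count(body)
  data = JSON.parse(body)
  Array(data["posts"]).length
end

# Hypothetical sample body with two results.
sample = '{"posts": [{"id": 1}, {"id": 2}], "topics": [{"id": 10}, {"id": 11}], ' \
         '"users": [], "categories": [], "grouped_search_result": {}}'

puts search_result_count(sample) # prints 2
```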

vsoch · September 22, 2019, 4:31 PM

I'm trying every way I can think of to download all topics and posts from my site. The latest and top options are limited, so now I'm trying to fetch all categories and then run a search within each category (similar to how I do it on the site). For example, if I search our site for “Q&A #q-a” here, I get more than 50 results. But when I use the exact same string with the discourse_api Ruby library, I get only 5 results:

irb(main):123:0> topics["posts"].length
=> 5
irb(main):124:0> topics["topics"].length
=> 5

Why doesn't this match the interface and what is reported there? What is the easiest way to export the data? I'd like to do some natural language processing (NLP) on our site's content, but just getting the data has proven to be very hard. Thank you!

sam · September 30, 2019, 7:59 AM

Pages in “latest” are available; you just need to pass the params correctly and you will be able to reach all topics through the API.

Search also supports pagination.

I recommend visiting Reverse engineer the Discourse API as a crash course on working out all the params you need.

vsoch · September 30, 2019, 4:27 PM

Thank you @sam! I can see (even from the GET request) that this should be fairly intuitive: when I want page 2, I add an extra page option. I can also see that “options” is something I can define with the discourse_api function:

# frozen_string_literal: true
module DiscourseApi
  module API
    module Search
      # Returns search results that match the specified term.
      #
      # @param term [String] a search term
      # @param options [Hash] A customizable set of options
      # @option options [String] :type_filter Returns results of the specified type.
      # @return [Array] Return results as an array of Hashes.
      def search(term, options = {})
        raise ArgumentError.new("#{term} is required but not specified") unless term
        raise ArgumentError.new("#{term} is required but not specified") unless !term.empty?

        response = get('/search/query', options.merge(term: term))
        response[:body]
      end
    end
  end
end

So, trying it out, I'd expect to get different results here for page 1 and page 2. Or let's give it a bit more separation and try pages 1 and 3. The query is for all Q&A topics:

 query = category["name"] + " #" + category["slug"]
=> "Q&A #q-a"

Now let's retrieve pages 1 and 3 using the discourse_api client:

topics1 = client.search(query, options={"page": "1"})
topics3 = client.search(query, options={"page": "3"})

I can look at the first topic of each:

=> {"id"=>220, "title"=>"Why am I exceeding the quota?", "fancy_title"=>"Why am I exceeding the quota?", "slug"=>"why-am-i-exceeding-the-quota", "posts_count"=>3, "reply_count"=>0, "highest_post_number"=>3, "image_url"=>nil, "created_at"=>"2018-06-01T12:56:12.120Z", "last_posted_at"=>"2018-06-15T16:41:44.736Z", "bumped"=>true, "bumped_at"=>"2018-06-15T16:41:44.736Z", "unseen"=>false, "pinned"=>false, "unpinned"=>nil, "visible"=>true, "closed"=>false, "archived"=>false, "bookmarked"=>nil, "liked"=>nil, "tags"=>["storage", "quota"], "category_id"=>26, "has_accepted_answer"=>false}

irb(main):148:0> topics3['topics'][0]
=> {"id"=>220, "title"=>"Why am I exceeding the quota?", "fancy_title"=>"Why am I exceeding the quota?", "slug"=>"why-am-i-exceeding-the-quota", "posts_count"=>3, "reply_count"=>0, "highest_post_number"=>3, "image_url"=>nil, "created_at"=>"2018-06-01T12:56:12.120Z", "last_posted_at"=>"2018-06-15T16:41:44.736Z", "bumped"=>true, "bumped_at"=>"2018-06-15T16:41:44.736Z", "unseen"=>false, "pinned"=>false, "unpinned"=>nil, "visible"=>true, "closed"=>false, "archived"=>false, "bookmarked"=>nil, "liked"=>nil, "tags"=>["storage", "quota"], "category_id"=>26, "has_accepted_answer"=>false}

They are exactly the same, which I think means the page variable isn't working? When I inspect in Chrome developer tools, the endpoint is triggered when I scroll down (since posts are loaded into the window automatically), and I can confirm that page=2 is the correct parameter:

Request URL: https://ask.cyberinfrastructure.org/search?q=Q%26A%20%23q-a&page=2
Request Method: GET
Status Code: 200  (from ServiceWorker)
Referrer Policy: strict-origin-when-cross-origin

Or better, just look at the parameter list:

Query String Parameters
q: Q&A #q-a
page: 2

This isn't a form submission, so I don't see any “Form Data” as in the example.
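
For reference, a minimal Ruby sketch of how the q and page parameters could be encoded into a request URL by hand. The helper name search_url and the forum.example.com host are placeholders, and the /search.json path is the one from the first post in this thread. The & and # in the query have to be percent-encoded, otherwise they would be read as a parameter separator and a fragment marker:

```ruby
require "uri"

# Build a search URL with the query and page number as query-string
# parameters. base_url is a placeholder; point it at your own forum.
def search_url(base_url, query, page)
  uri = URI.join(base_url, "/search.json")
  # encode_www_form escapes reserved characters such as "&" and "#"
  # (spaces become "+", which decodes the same way as "%20" in a query).
  uri.query = URI.encode_www_form({ "q" => query, "page" => page })
  uri.to_s
end

url = search_url("http://forum.example.com", "Q&A #q-a", 2)
puts url

# Decoding the query string gives back the original values.
p URI.decode_www_form(URI(url).query)
```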

vsoch · October 18, 2019, 8:48 PM

Does anyone have any wisdom here? I tried what was suggested, but I don't see a logical next step. It seems the page variable doesn't work when it is provided with the request.

simon · October 18, 2019, 10:27 PM

The Discourse API gem uses the /search/query route, which doesn't seem to respond to pagination. The Discourse UI uses the /search route, which does respond to pagination.

You can test this in your browser by going to http://forum.example.com/search.json?q=test and then trying http://forum.example.com/search.json?q=test&page=2.
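
The same check can be scripted. Below is a rough Ruby sketch that requests successive /search.json pages with Net::HTTP instead of the gem. It rests on assumptions that are not confirmed in this thread: that an empty posts array marks the last page, and that the forum is readable without an API key. The names collect_search_posts, max_pages, and fetch are made up for illustration, and the demo runs against canned responses rather than a live site:

```ruby
require "json"
require "net/http"
require "uri"

# Collect posts from successive /search.json pages.
# Assumption: a page with an empty "posts" array means nothing is left.
# max_pages is a safety cap for this sketch, not a Discourse setting.
# fetch is any callable that takes a URL string and returns the body.
def collect_search_posts(base_url, query, max_pages: 10,
                         fetch: ->(url) { Net::HTTP.get(URI(url)) })
  posts = []
  (1..max_pages).each do |page|
    params = URI.encode_www_form({ "q" => query, "page" => page })
    body = fetch.call("#{base_url}/search.json?#{params}")
    batch = Array(JSON.parse(body)["posts"])
    break if batch.empty?
    posts.concat(batch)
  end
  posts
end

# Offline demonstration with canned pages instead of a live forum.
pages = {
  1 => '{"posts": [{"id": 1}, {"id": 2}]}',
  2 => '{"posts": [{"id": 3}]}'
}
stub = lambda do |url|
  page = URI.decode_www_form(URI(url).query).to_h["page"].to_i
  pages.fetch(page, '{"posts": []}')
end

all_posts = collect_search_posts("http://forum.example.com", "test", fetch: stub)
puts all_posts.length # prints 3
```

Passing the fetcher in as an argument keeps the paging logic testable without network access; leaving it out falls back to a plain GET request.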

You may need to find a way to make the API call without using the Discourse API gem. If your goal is to get all of the topics and posts on your site, using the /search route doesn't seem like the best approach.

You could try making an API call to http://forum.example.com/c/your-category-slug.json. If not all of the category's topics are returned, the topic_list in the response will have a more_topics_url property that gives you the path to the next page of topics. It will look something like "/c/site-feedback?page=2". You will need to add .json to the URL to get JSON data (/c/site-feedback.json?page=2).
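
A minimal Ruby sketch of following more_topics_url, under a few assumptions: the topics sit in a topics array inside topic_list, the forum is readable without an API key, and a missing more_topics_url means the last page has been reached. The names json_path, collect_category_topics, max_pages, and fetch are made up for illustration, and the demo uses canned responses rather than a live site:

```ruby
require "json"
require "net/http"
require "uri"

# Turn a more_topics_url such as "/c/site-feedback?page=2" into its
# JSON form, "/c/site-feedback.json?page=2".
def json_path(more_topics_url)
  path, query = more_topics_url.split("?", 2)
  path += ".json" unless path.end_with?(".json")
  query ? "#{path}?#{query}" : path
end

# Walk a category listing by following more_topics_url until it is absent.
# max_pages is a safety cap for this sketch, not a Discourse setting.
# fetch is any callable that takes a URL string and returns the body.
def collect_category_topics(base_url, slug, max_pages: 100,
                            fetch: ->(url) { Net::HTTP.get(URI(url)) })
  topics = []
  path = "/c/#{slug}.json"
  max_pages.times do
    topic_list = JSON.parse(fetch.call(base_url + path)).fetch("topic_list", {})
    topics.concat(Array(topic_list["topics"]))
    more = topic_list["more_topics_url"]
    break unless more
    path = json_path(more)
  end
  topics
end

# Offline demonstration with canned responses instead of a live forum.
responses = {
  "/c/site-feedback.json" =>
    '{"topic_list": {"topics": [{"id": 1}, {"id": 2}], ' \
    '"more_topics_url": "/c/site-feedback?page=2"}}',
  "/c/site-feedback.json?page=2" =>
    '{"topic_list": {"topics": [{"id": 3}]}}'
}
stub = ->(url) { responses.fetch(url.sub("http://forum.example.com", "")) }

topics = collect_category_topics("http://forum.example.com", "site-feedback", fetch: stub)
p topics.map { |t| t["id"] } # prints [1, 2, 3]
```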

vsoch · October 19, 2019, 3:41 PM

Thank you! That worked absolutely perfectly, and it's much easier in Python with the requests library (I was deliberately making things harder to get more comfortable with Ruby, but the client didn't have what I needed). I've finished most of the export and haven't done any machine learning yet, but if anyone is interested in the requests I made, the quick scripts are here: GitHub - hpsee/discourse-cluster: Simple scripts to export posts for a discourse category, and do a clustering. I hope to do some cool clustering soon!

vsoch · October 20, 2019, 5:30 PM

Thanks again @sam and @simon. In case others are ever interested in doing a simple export of topics, or going further and doing clustering with d3 visualizations, I've written a quick post that walks through the process in detail: AskCI Discourse Clustering | VanessaSaurus. And again, everything you need to get started is in the repository I linked earlier.
