Data Explorer Plugin

official

(Kane York) #66

All queries have a LIMIT 500 (?) by default; you can change that by either:

  1. downloading the results, which raises the limit to either 10K or 100K records (I forget which), or
  2. manually running the query via the HTTP API and increasing the limit parameter to what it needs to be.
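
As a rough illustration of the second option, here is a minimal Python sketch that only builds the HTTP request without sending it. The endpoint path, the `Api-Key` / `Api-Username` header names, and the `params` form field are assumptions about how the plugin is exposed on a given Discourse version; `build_run_request` is a hypothetical helper, and only the `limit` parameter is taken from the description above. Check all of them against your own install before relying on this.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(base_url, query_id, api_key,
                      api_username="system", limit=None, params=None):
    """Build (but do not send) a POST request that runs a saved query.

    Assumed, not confirmed by this thread: the URL layout and the
    Api-Key / Api-Username headers. Adjust for your Discourse version.
    """
    url = f"{base_url.rstrip('/')}/admin/plugins/explorer/queries/{query_id}/run"

    form = {}
    if params:
        # Query parameters are sent as one JSON-encoded form field (assumption).
        form["params"] = json.dumps(params)
    if limit is not None:
        # The limit parameter mentioned in the post above.
        form["limit"] = str(limit)

    return urllib.request.Request(
        url,
        data=urllib.parse.urlencode(form).encode("utf-8"),
        headers={
            "Api-Key": api_key,
            "Api-Username": api_username,
            "Accept": "application/json",
        },
        method="POST",
    )


# To actually send it (needs a real site and an admin key):
# with urllib.request.urlopen(build_run_request(...)) as resp:
#     result = json.load(resp)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body first, which matters because the request carries an admin API key.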

As @Mittineague said, be careful: the posts table includes private messages, for example. You may want to exclude those from analysis. The schema browser on the right side of the editor flags some especially sensitive columns, such as password_hash.


(Justin Veenema) #67

@riking / @Mittineague:

Thanks for your responses. We’ll definitely need to make sure our approach is kosher and ensure that sensitive data isn’t available to the students. We’re mostly looking to mine things like text data / user behaviours… not anything personally identifiable, just things that are generally publicly available to anyone who can view our Community.

Good call on the direct user messages, didn’t even realize that was in there. This changes things considerably, so we’ll proceed with caution before moving forward.

However - any idea why we’re getting mixed results for the # of posts?


(Kane York) #68

No idea! That’s getting into the realm of software forensics: “what particular set of circumstances must have happened for one of all the revisions that ran from time X until now to produce this result?” If it’s not hurting anything, it’s not worth the time & money to figure it out.


(Pad Pors) #69

Hi,

Are the outputs of this plugin only accessible as a file? Or, put another way, can I somehow read the output from a query or from code?


(Rafael dos Santos Silva) #70

Using the results of queries as an endpoint for customization is out of scope, and dangerous.


(Pad Pors) #71

Why dangerous? Dangerous in terms of harm to the main database, or in terms of data manipulation?


(Rafael dos Santos Silva) #72

You would need to expose an admin API key.


(Pad Pors) #73

Thanks. Is there any other way to use the output data as the source for other code?

A simple way that comes to mind is to post the output to an IP address and then fetch it from there in the code, but I’m not sure this is the best approach for large databases. Is there any other possible way?


(Rafael dos Santos Silva) #74

Do the same thing you do with the query in a plugin.


(SMHassanAlavi) #75

How can I execute a query in a JavaScript plugin? The languages are different.
Is there any reference for Discourse plugins?


(Justin Veenema) #76

Hi all,

Sorry to resurrect this thread, but we’d like to move forward with this without exposing any private info from our Community.

How can we differentiate posts that are private messages from public ones? We’d obviously like to take these out of the dataset (as well as any other personal/private information). We’ll have complete control over the dataset and will hand the students a file they can work with; I just need to know what I should be looking for.

Does anyone have any idea on what I would need to do in this case?


(Jay Pfaffman) #77

Another solution, which I don’t think I mentioned here, is to use the API to download the messages. You could use this as a starting point.


(Tom Newsom) #78

We use

WHERE topics.archetype = 'regular'

to only include regular topic posts in searches. Sometimes we also use

WHERE categories.read_restricted='false'

to exclude categories that are not visible by the general public.

Your code will vary depending on what JOINs you’re using etc.
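
To show how the two conditions combine once the JOINs are in place, here is a small self-contained Python sketch that runs the filter against a toy in-memory SQLite database. The three tables are heavily simplified stand-ins for the real Discourse schema, the sample rows are invented, and `public_posts` is a hypothetical helper; in Data Explorer you would run only the SQL, against PostgreSQL, where `read_restricted` is a boolean.

```python
import sqlite3

# Both filters combined. The inner JOIN on categories also drops topics
# that have no category, which in this toy data is the private message.
PUBLIC_POSTS_SQL = """
SELECT p.id, p.raw
FROM posts p
JOIN topics t ON t.id = p.topic_id
JOIN categories c ON c.id = t.category_id
WHERE t.archetype = 'regular'
  AND c.read_restricted = 0   -- PostgreSQL: c.read_restricted = false
ORDER BY p.id
"""


def make_toy_db():
    """Create a tiny stand-in schema with one public, one restricted and one PM topic."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, read_restricted INTEGER);
        CREATE TABLE topics (id INTEGER PRIMARY KEY, title TEXT, archetype TEXT, category_id INTEGER);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER, raw TEXT);
        INSERT INTO categories VALUES (1, 'General', 0), (2, 'Staff', 1);
        INSERT INTO topics VALUES
            (10, 'Welcome', 'regular', 1),
            (11, 'Staff notes', 'regular', 2),
            (12, 'Hello', 'private_message', NULL);
        INSERT INTO posts VALUES
            (100, 10, 'public post'),
            (101, 11, 'staff only'),
            (102, 12, 'a private message');
    """)
    return conn


def public_posts(conn):
    """Return (id, raw) for posts in regular topics of unrestricted categories."""
    return conn.execute(PUBLIC_POSTS_SQL).fetchall()


if __name__ == "__main__":
    print(public_posts(make_toy_db()))
```

Only the post in the regular, unrestricted topic survives; the staff-category post and the private message are both filtered out.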


#79

What would be the most convenient way to change the timeout for queries? It’s set to ten seconds and I believe I see where it’s set in plugin.rb but I’m not sure if it’s worth it to make a fork just for that.


(Jeff Atwood) #80

16 posts were split to a new topic: Strange problem with Data Explorer


(Joe Buhlig) #81

The more plugins I build the more I find myself using the PluginStore. In most cases, it seems best to store data there as a JSON object. I’d love to have a way to use Data Explorer to parse a JSON object but I’m aware this is not something native to SQL. So I’m not sure if there’s anything that can be done here. But still, it’d be nice to have. :wink:


(Rafael dos Santos Silva) #82

PostgreSQL has great JSON support, for example:

SELECT
  value::json->>'username'
FROM
  public.plugin_store_rows

will return the value of the username key from the JSON stored in our plugin_store_rows table.


(Jeff Atwood) #84

A post was split to a new topic: Find old topics with images?


(Joshua Rosenfeld) #86

3 posts were merged into an existing topic: Admin Statistics Report


(Rob Meade) #87

Just a quick question if I may…

Is there any way to have a multiline description for a query that is created and saved? Despite being able to insert carriage returns, these seem to be stripped out when viewed outside of the Edit functionality.


Updated

Not entirely sure where to post this, as I believe it’s a bit of a bug; please let me know if this isn’t the right place and I will re-post this part elsewhere.

When you browse to the Data Explorer, the fields are filled with the last query you visited by default.
The dropdown menu shows that nothing is selected; it could perhaps benefit from a default value of “Select Query” or similar.

It would be nice if the fields were not populated by default, as this could lead to a query being modified accidentally.