Data mining a Discourse site for links

(Jeffrey L Frusha) #1

Could a plug-in, like the feature that lets me archive and download my posts, be adapted to archive and download the links from all posts in a Discourse forum, so that nobody has to read every post and dig each link out manually?

This would allow someone to categorize and create a library of those links, for a ‘wiki’…

The advanced search (‘Google’) only brought up some 752 results (which seems low) with: http, and 732 results with: www

The question was brought up by another forum member. I’m just another forum user, not a mod, or anything.

(Jeff Atwood) #2

Just download your posts from your user page and extract the links using a regex in an editor, or write a program to do it.
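As a rough illustration of the "extract the links using a regex" route, here is a minimal Python sketch. The pattern is deliberately simple and is not a full URL parser. The function names and the idea of reading the whole export as plain text are assumptions for illustration; nothing here depends on how the downloaded archive is laid out.

```python
import re

# Rough pattern: an http(s) scheme or a bare "www." prefix, followed by a run
# of characters that are not whitespace, quotes, angle brackets, or closing
# brackets. It will miss some links and over-match others.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"')\]]+", re.IGNORECASE)


def extract_links(text):
    """Return the unique links found in text, in order of first appearance."""
    seen = set()
    links = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the link.
        url = match.group(0).rstrip(".,;:!?")
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def extract_links_from_file(path):
    """Read a downloaded export as plain text and pull the links out of it."""
    with open(path, encoding="utf-8") as handle:
        return extract_links(handle.read())
```

Calling `extract_links_from_file` with the path of the downloaded file should give a de-duplicated list of links that can be pasted into a spreadsheet or wiki page.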

The latter is beyond the scope of what we do here at meta – we can’t teach you programming, or regular expressions.

(Mittineague) #3

Might it be possible to use the Data Explorer plugin to query the Topic Links table?

(Jeffrey L Frusha) #4

If I wanted mine and mine alone, I could do that. However, the goal is to get all, or as many as possible, into one download, to be categorized as a library of sorts.

(Jeff Atwood) #5

Perhaps @riking could weigh in on this?

(Kane York) #6

{"query":{"id":10,"sql":"SELECT url, count(*)\nFROM topic_links\nGROUP BY url\nORDER BY count DESC","name":"All outbound links","description":"Use the \"Download Results\" button to get it as a file.","param_info":[]}}

Import that, run the query, then click the “Download Results CSV” button.
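Once the CSV is downloaded, a first pass at categorizing the links for a library or wiki could be to group them by domain. The sketch below assumes the file has a header row with columns named `url` and `count`, matching the query above; that header is an assumption, and the helper names are hypothetical.

```python
import csv
from collections import defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Best-effort domain for a link; relative links get a placeholder."""
    if "://" not in url:
        if url.startswith("/") and not url.startswith("//"):
            return "(relative)"
        # Scheme-less links need a leading "//" for urlsplit to see the host.
        url = "//" + url.lstrip("/")
    try:
        host = urlsplit(url).hostname or "(unknown)"
    except ValueError:
        return "(unknown)"
    return host[4:] if host.startswith("www.") else host


def group_by_domain(rows):
    """Map each domain to its (url, count) pairs, most-linked first."""
    groups = defaultdict(list)
    for row in rows:
        groups[domain_of(row["url"])].append((row["url"], int(row["count"])))
    for links in groups.values():
        links.sort(key=lambda pair: pair[1], reverse=True)
    return dict(groups)


def load_rows(path):
    """Read the downloaded CSV; assumes 'url' and 'count' header columns."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Running `group_by_domain(load_rows(path))` on the downloaded file gives one bucket per site, which may be an easier starting point for sorting links into categories than a flat list.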

(Jeffrey L Frusha) #7

I assume that’s something only an admin can do. Thank you.

@kensims Would you be so kind as to try this? KatOn Tri would certainly like a copy, as would I. When I’m up to it, I’ll try to go through and categorize the links, so the wiki can be updated.

(Kane York) #8

Yes, you might want to filter it to exclude links from PMs…

(Jeffrey L Frusha) #9

Could this be done as a feature? Maybe a plug-in capable of archiving and downloading the links from a topic, or category, similar to the way I can archive and download my posts?

(Jeff Atwood) #10

No, it cannot. Not by us anyway.