Cross-site monitoring of external links?

Hi friends,
how do you deal with external links?

You might have noticed the debates about fake news and unreliable sources. I‘d like to have a feature or plugin that lets me keep track of content changes, or at least a hash-based solution. I‘d like to validate the originally linked page :white_check_mark:, warn users about changes (above a threshold) :warning:, and inform admins/moderators about dead links, with some kind of simple fraud detection.
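Just to make the idea concrete, here is a minimal sketch of the hash-based variant (Python, with a placeholder URL; a real plugin would persist the hash with the post and re-check it on a schedule):

```python
import hashlib
import urllib.request

def fingerprint(url: str) -> str:
    """Fetch the page and return a SHA-256 hash of its raw body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

url = "https://example.com/article"      # hypothetical linked source
hash_at_post_time = fingerprint(url)     # stored when the post is created

# ... later, on a scheduled re-check ...
if fingerprint(url) != hash_at_post_time:
    print("warn: the linked content changed since the post was created")
```

A raw hash flags every byte-level change, of course, including ads and markup noise; that is where the threshold comes in.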

Nowadays, civilized discussion and knowledge-base management need more control over the sources. Wikipedia pages and other sites "under construction" should be handled a little differently, for example by providing a link to the revision history or to the exact version.

What’s your opinion on this?

Best

Er… what? I’m not following. What are you asking for?

Imagine some users of a Discourse community talking about politics or any other topic. To provide evidence-based information, they add links to external websites, e.g. news articles, papers, or statistics. The linked target could be a static document (e.g. a PDF or HTML file) or a CMS-powered page.

My intention is that if something changes (above a threshold), I get notified about the changes. Users should be able to trust the linked third-party site as much as they trust the user who provided the original source. It’s a kind of cache functionality based on the plain text (not style or ad changes) that fingerprints the external website.
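A rough sketch of what I mean by fingerprinting the plain text only (Python; the naive tag stripping and the 15% threshold are just placeholders, a real plugin would use a proper HTML parser):

```python
import difflib
import re

def plain_text(html: str) -> str:
    """Very naive tag stripper; ignores styling, scripts, and markup changes."""
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def changed_beyond_threshold(cached_html: str, current_html: str,
                             threshold: float = 0.15) -> bool:
    """True if the visible text differs by more than `threshold` (0..1)."""
    similarity = difflib.SequenceMatcher(
        None, plain_text(cached_html), plain_text(current_html)).ratio()
    return (1.0 - similarity) > threshold
```

The result of `changed_beyond_threshold(cached, fresh)` would then drive the :warning: next to the link.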

And if community members and visitors read the post and the link has broken in the meantime or the content has changed, they will also see a little emoji or some other kind of notice next to the link.

If an external link breaks (404 errors or weird forwardings), the author of the post and the community staff will be informed: the staff via daily or weekly reports, and the post author immediately (via push notification) plus a PM report.
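The link check itself could be as simple as this sketch (Python; treating "redirected to another host" as the weird-forwarding case is an assumption on my side):

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse

def check_link(url: str) -> str:
    """Classify a link as 'ok', 'dead', or 'redirected' to another host."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            final_host = urlparse(resp.geturl()).netloc
    except urllib.error.HTTPError as err:    # e.g. 404, 410, 500
        return f"dead ({err.code})"
    except urllib.error.URLError:             # DNS failure, timeout, ...
        return "dead (unreachable)"
    if final_host != urlparse(url).netloc:
        return f"redirected to {final_host}"   # the weird-forwarding case
    return "ok"

# this status could feed the daily/weekly staff report and the author's PM
print(check_link("https://example.com/article"))
```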

I want to improve the overall quality of my posts and get the opportunity to fix external links. Otherwise I have no chance, as a user or as an admin, to keep track of the information behind the link.

I could see you building a plugin that grabs an archive.org snapshot for every link in your posts. It may be useful in very rare, extraordinary cases.

For example it could automatically insert:

https://edition.cnn.com (original)

Definitely not core functionality though. And the monitoring would be even harder.
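Looking up a snapshot is the easy part, though; archive.org has a public availability API. A rough sketch of how a plugin could resolve the “(original)” link (Python, error handling omitted):

```python
import json
import urllib.parse
import urllib.request

def wayback_snapshot(url: str) -> str | None:
    """Return the closest archive.org snapshot URL for `url`, if one exists."""
    api = ("https://archive.org/wayback/available?url="
           + urllib.parse.quote(url, safe=""))
    with urllib.request.urlopen(api, timeout=10) as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest and closest.get("available") else None

print(wayback_snapshot("https://edition.cnn.com"))
```

If no snapshot exists yet, the plugin would still have to trigger one (archive.org’s “Save Page Now”) at posting time.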


Archive.org is an outstanding project. Unfortunately, it has never really been on the radar of ordinary people. Mostly professionals like journalists and very critical minds take full advantage of the Wayback Machine.

Fingerprinting websites is actually an old idea. It protects against man-in-the-middle attacks, for example by governments and other attackers. Have a look at the Tor network.

I agree with you that it would be very challenging, especially if we monitored every aspect of each site. There would definitely have to be a set of priorities for the feature detection. I studied medical engineering with a focus on imaging technologies. We did a lot of crazy math with DICOM CT/MRI data to build image processing and feature detection algorithms. Website data is far more “one-dimensional” and follows stable web standards. The anatomy and pathology of the human body don’t :stuck_out_tongue_winking_eye:

Design aspects: the official Discourse calendar plugin uses a pop-up message to display additional time zones. This is how links should look if they have to display more information without forcing users to read too much.

I truly believe in going this way. I’m not sure whether this task should be solved with client- or server-side tools, but I wouldn’t trust any centralized SaaS provider to do this for thousands of sites. The biggest advantage of the internet, its decentralization, is also a critical security issue, and not just a technical one.