I am concerned about Discourse being spyware?

Some valid reasons for not using discourse include its copious amount of tracking and spyware-y things. I’m not sure if those can be disabled. This is what makes the antispam work tho - they just send everything to their servers and do stuff there to check if it’s spam across all forums.

Very spyware-y.

That is… not the case.

Zero traffic is sent to Discourse if you install Discourse on your own servers. There are periodic version checks to let you know in the admin console if your Discourse is out of date, but you can disable that in your site settings if you prefer. We do want Discourse instances to stay up to date so they get new features and security fixes.

There is an Akismet plugin you can run, and I recommend that most sites do run it, but it is entirely optional. Even then, only new users’ first few posts are checked for spamminess. And that data is sent to Automattic, Inc. (the creators of WordPress), who run the Akismet service, not to us.

21 Likes

As someone who has deployed Discourse in environments which include high security sites I can guarantee that none of the above is true.

16 Likes

You sure you’re not confusing Discourse with Windows 10 there? :slight_smile:

In all seriousness though, even if that were the case, I’d let it happen for anti-spam’s sake. Maybe I’m just weird, but having something like that take care of anti-spam would seem cool. It would put a lot of users off, though, and would probably need a big disclaimer.

But, Discourse is open-source so this sort of claim is easily validated or debunked by looking through the code. I’m sure if that stuff were found it would’ve been put way out there already.

7 Likes

Which of the following are you saying here? Because only one of them is true.

An instance of Discourse sucks up and stores lots of data about its users, such as how much you read, in its own database

An instance of Discourse sucks up copious amounts of information and sends it to discourse.org

6 Likes

I stand corrected. But I still don’t like the amount of local tracking it does (including links you share off-site, etc).

1 Like

Isn’t that decided by the user?

Tracked
https://meta.discourse.org/t/convincing-a-stubborn-admin-to-let-me-use-discourse/106314/40?u=remah

Untracked
https://meta.discourse.org/t/convincing-a-stubborn-admin-to-let-me-use-discourse/106314/40
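For anyone who would rather not trim the link by hand, here is a minimal sketch of removing the attribution parameter programmatically. It assumes, based only on the two URLs above, that the `u` query parameter is the one carrying the sharer's username; the function name `strip_share_attribution` is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_share_attribution(url: str, param: str = "u") -> str:
    """Return the URL with the given query parameter removed.

    Assumption: `param` ("u" by default) is the share-attribution
    parameter, as suggested by the tracked link in this thread.
    """
    parts = urlsplit(url)
    # Keep every query parameter except the attribution one.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


tracked = (
    "https://meta.discourse.org/t/"
    "convincing-a-stubborn-admin-to-let-me-use-discourse/106314/40?u=remah"
)
print(strip_share_attribution(tracked))
```

Other query parameters are left alone, so only the attribution is dropped.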

1 Like

Pretty much nothing you are saying here is based on any kind of facts or reality that I can tell @soni.

The only credible claim is version checks, but as previously stated

  • we need Discourse instances to stay patched for the security and health of the web

  • you can disable version checking in your site settings if you don’t want it

6 Likes

I only interact with discourse because I’m forced to, because everyone else decided discourse was good and futuristic and the new thing or w/e.

The moment I got a badge for sharing a link off-site I wanted to get as far away from it as I possibly could. which, happens to be “not very”, because of the aforementioned reasons.

Anyway, I’ll stop, I didn’t mean to hijack the thread and I’m sorry.

1 Like

Don’t worry too much about it. It’s already been split into its own thread by a moderator. Just another sign that we’re living in the future.

I get it. It came across as creepy and unexpected. It’s also avoidable, but that’s not a good counterargument; Discourse URLs are often kind of long, so expecting users to trim off a query at the end is blaming the user for a UI that violated their reasonable expectations.

Discourse is in a no-win UI design scenario: we want to encourage users to share links (so removing the feature entirely would be bad), we want to avoid presenting users with complicated choices with unclear trade-offs (so presenting a checkmark to let them choose whether the URL is tracked or not would be bad), and we don’t want to scare off privacy-conscious users who are probably Discourse’s best advocates as an alternative to groups on a site like Facebook (so the status quo is kinda bad).

But just to reiterate: Discourse tracking is always on an instance-by-instance basis, modulo Akismet (which isn’t given your link-sharing information). If you self-host, then the discourse.org crew will literally not have access to the data that your forum collects*.

* Alright, in a truly adversarial scenario, they could probably sneak a vulnerability in :small_red_triangle: :eye: but let’s assume abject lying and underhanded coding is too risky.

7 Likes

Is the fear that a malicious admin on an individual Discourse site could use the data against you? They’re generally the only ones who have direct access to it.

I’m also curious what you’d consider a more secure alternative to Discourse?

4 Likes

stackoverflow has a nice non-intrusive way of nagging ppl to share stuff ;). (altho they also track this stuff >.<)

the biggest risk isn’t so much a bad admin as it is a potential leak. (but, ofc, a bad admin could be the cause of a leak…)

I’m not sure what is “more secure” - more secure… to whom? in my case, I don’t like this (IMO unnecessary) tracking. I also don’t like how passwords are exchanged with the server. I also don’t like how I’m forced to use my password to login - a usability issue that has some security tradeoffs. (I don’t believe being able to “share” a login between devices is a serious security issue - we do have a session manager so we can just force-logout other (unwanted) sessions. (well, okay, it’s only barely passable as a session manager tbh :p. but that’s beside the point.))

Ah ok, so in this case the concern is if data were to leak from an individual Discourse install that there’s generally too much available?

Understandable, but as mentioned by others this is just a fundamental difference in opinion — a lot of the features built into Discourse wouldn’t be possible without tracking some stats.

The overall hope is that Discourse can help foster individual communities that have control over their own data and aren’t under the singular umbrellas of Facebook, Google, etc. By comparison the internal tracking on any individual Discourse site is a fraction of a fraction of what the big social networks are doing.

I’m asking you! I was wondering if you had anything in mind that gets things right. We might not agree, but I’m just trying to understand your opinion better.

6 Likes

No it doesn’t. Stack Overflow is employing exactly the same technique.


4 Likes

You really need to revisit the definition of spyware. It’s widely accepted as the covert and typically unsolicited monitoring of a user and their behaviors by a third party. Discourse, as the Data Controller, is by definition the first party here; all tracking is done on-site for the purpose of specific features which are also contained on the site.

For example:

  • Topics scroll infinitely, so Discourse remembers how far you have read and lets you resume reading from the same point.
  • Link clicks and other interactions are recorded for basic gamification, which is proven to drive user engagement and assist in community building.

None of the specific data is available to anyone other than the user and the administrators. Other forums generate this data too in their web server logs, Discourse merely makes better use of it.

funny that:

9 Likes

Yes, they also track this stuff. But that’s not what I’m talking about. Just post any question and scroll down a little bit, and you’ll see text similar to “please share this with friends it’ll help you get answers”. We don’t have that here!

Well, I’d like to ask about this. General practice on the Internet is that if you’re discussing things on an open forum, then you should probably be prepared to have your own words used against you. Beyond that, the kind of sharing data you’re talking about has very little value. You shared a link, somewhere. What kinds of informational security concerns bother you if Big Brother or Unnamed Company X find out not only that you posted something but that you also shared that link elsewhere? What kinds of problems could come from that? Again, if a share-link is accessible to the Internet, then anyone using any search engine could potentially find out that you posted something.

I don’t mean to sound snippy or otherwise negative, but I am seriously curious what negative impact are you concerned with?

So you don’t want a password to be used to log in? What would you prefer? I understand that a site with a password could be hacked, but AFAIK Discourse uses properly hashed and securely stored passwords. Are you looking for biometrics? Are you looking for 2-factor (correct me if I’m wrong, but don’t they already have 2-factor)? Are you looking for public key cryptographic login?
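To make “properly hashed” a bit more concrete, here is a generic sketch of salted password hashing with PBKDF2 from the Python standard library. It is not Discourse’s actual code; the iteration count and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor (an assumption, not a recommendation from this thread).
ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The server stores only the salt and the digest, so a database leak does not directly reveal the password, although weak passwords can still be guessed offline.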

What about how the password is sent to Discourse bothers you? A site admin is responsible for setting up SSL; once the connection is SSL-secured, that is pretty standardly recognized as a secure way to transmit data, even banking data (which is certainly far higher risk than posts to Random Web Forum X). Now, if an administrator hasn’t configured SSL properly, it is easy to tell. A security-conscious user can run any URL through checker sites such as SSL Server Test (Powered by Qualys SSL Labs), which is an excellent site to tell whether a site admin has properly configured SSL.
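For a quick local check (far less thorough than the SSL Labs test), something like the following sketch works with Python’s standard `ssl` module. The helper names are made up for illustration. `fetch_certificate` needs network access and raises an error if the certificate chain or hostname does not verify.

```python
import socket
import ssl


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect with full verification and return the server's certificate."""
    # The default context checks the certificate chain and the hostname.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def summarize_certificate(cert: dict) -> dict:
    """Pull out the fields a user would look at behind the lock icon."""
    subject = {key: value for rdn in cert.get("subject", ()) for key, value in rdn}
    issuer = {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}
    return {
        "common_name": subject.get("commonName"),
        "organization": subject.get("organizationName"),
        "issuer": issuer.get("organizationName"),
        "expires": cert.get("notAfter"),
    }


# Example (requires network access):
# print(summarize_certificate(fetch_certificate("meta.discourse.org")))
```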

The security risks inherent in sending form data into a web forum for login are the same risks as they have been for phpBB, vBulletin, or any of the other standard packages that run forums. The same goes for unofficial text-based options such as BBS packages, which mostly rely on insecure Telnet (Synchronet is still around and supports SSH, but only as of last year, so that is an exception). Thus, I am again curious as to what exactly bothers you about using your password. The only more secure method than a password coupled with 2-factor authentication is some kind of public key cryptography, which could certainly be done using GPG keys or SSH keys but would have a high encumbrance for users. SSL does support validating the connection with a proper public key (in SSL terms called a Certificate), but then the site admin would be forced to run their own Certificate Authority and pay for that Authority to be tied to one of the big names in CA authentication (VeriSign, for instance), which is exceedingly expensive. The only other way to accomplish such an upgrade in login security would be to have users install a custom CA on each system to validate the keys provided to the users; most OS packages won’t import self-signed keys and allow them to be readily used without jumping through hoops. This is all for the incredibly high standard of security SSL can offer.

Just brainstorming other ideas… Would you prefer to use SSH to log in to your Discourse forum? Perhaps you’d prefer to use a VPN connection that uses your SSH key (OpenVPN is free, but most free clients are difficult for users to set up). Perhaps you’d prefer Tor (Pirate Bay is now also directly accessible as a Tor node, if I remember correctly).

I’m not being flippant, there are more secure ways to send data back and forth, but I have to wonder, what software packages have you used that used more secure options and what options were they?

3 Likes

I haven’t looked at the discourse internals so I don’t know exactly what it tracks about you, but I assume some of that data could be concerning in some situations. Maybe it’s easy to workaround tho, but I haven’t looked deep into it.

Anyway, if you want me to expand on the login stuff:

  1. Browsers should support key derivation from username/password combos.
  2. Since it’s key derivation, it can take as long as you want, maybe you need some sort of preset for setting up some stuff but w/e it’s just keys at the end of the day. Also, since it’s keys, the other side would be forced to attack your username/password combo (oh hey look, it’s no longer just password), which should be a lot harder than plain old password cracking.
  3. Username and password are never sent to servers, only public key. Websites implement display names separate from username, altho I suspect a lot of users would just have them be the same. Anyway, it’s keys, no MITM or server-side hack can break this. With a few extra steps it’s even physically impossible to trick someone into logging into a different service through this mechanism. This pretty much defeats 90% of modern phishing attacks. Maybe more. If only websites would stop with their custom login pages and just use a browser-provided thing. Imagine if phishing attacks had to go an extra length to try and emulate the user’s desktop environment (impossible on linux), browser, etc! A small tweak to how we do logins would eliminate ALMOST ALL PHISHING ATTACKS!
  4. I could go on for days, really. Anyway, in a perfect world the key mechanism would go well beyond just the login page. OAuth and stuff would also be done with keys, pretty much defeating all those OAuth-based attacks where you redirect the tokens elsewhere.
  5. Additionally, since you have a whole key-based system with some automated tool access, you can additionally put QR codes on top of it. So if you’re on a device, you can generate a QR code which contains a derived (or even random) key, and the user can scan it on another device and automatically share the login. This would be a huge win for convenience, and using the key mechanism there’s very little to go wrong! Arguably if you already have access to a logged in account, well, you already have access to it! As such, being able to clone that access (especially with a temporary, short-lived key) shouldn’t be a big deal! And ofc this all assumes browser support.
  6. I definitely hate the modern web. We moved away from the ability for users to easily distinguish a phishing page from a legit page, and we’re still using the old flawed system where phishing pages with the same interface as the legit pages allow the attackers to login on the legit pages as you (something that would be impossible with keys, as the keys would be locked down to specific websites - and the phishing websites aren’t the legit websites unless you go as far as DNS hijacking and somehow coming across a valid TLS certificate for it. and good luck with that.). The web is a failure at protecting the users. We still use plaintext passwords in far too many websites. Any properly trained engineer should be capable of looking at this and staying far, far away from anything to do with computers.
  7. Seriously, tho. Hacking a forum (i.e. the server) shouldn’t give you access to ppl’s emails. But it does. Phishing pages shouldn’t be possible. But they are. None of this is to blame on the users. Yet we keep blaming the users. “Why did you not check if it was a phishing page?” “Why aren’t you using a password manager?” “Why are you reusing passwords?” well, how about this: “Why are we using a flawed system when there are better systems out there, using modern cryptography, that trivially mitigate all those issues and doesn’t involve changing the user’s workflow at all?” This isn’t about users. It’s not users who are doing it wrong.
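The core of the proposal in the list above, keys derived from the username/password combo and locked to a specific website, can be sketched roughly as follows. This is only an illustration under stated assumptions: no browser exposes such an API as far as this thread knows, the function name and the example origins are hypothetical, and the step that turns the seed into an actual key pair is omitted because it would need a public key library outside the Python standard library.

```python
import hashlib

# Illustrative work factor; the proposal notes the derivation "can take as
# long as you want". The exact value here is an assumption.
ITERATIONS = 200_000


def derive_site_seed(username: str, password: str, site_origin: str) -> bytes:
    """Derive a 32-byte secret seed bound to one username and one site origin.

    The seed would be used to generate a key pair, and only the public
    half would ever be sent to the server (not shown here).
    """
    # Mixing the origin and username into the salt means the same password
    # yields unrelated seeds on different sites.
    salt = hashlib.sha256(f"{site_origin}\x00{username}".encode("utf-8")).digest()
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


legit = derive_site_seed("soni", "hunter2", "https://forum.example")
phish = derive_site_seed("soni", "hunter2", "https://forum-example.invalid")
print(legit != phish)  # a look-alike origin gets a different, useless key
```

Because the origin is part of the derivation, a phishing page on a different origin never sees material that works on the real site. The sketch does not address where the keys live or how they are revoked.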

Oh well, I think I might’ve gone a little too far with this one. But what I’m really asking for is a rather basic set of security guarantees, which basically boils down to 3 main ones: 1. mitigate phishing attacks 2. mitigate the effects of server-side attacks on login information (hacked servers sending username/password combos to an attacker) 3. mitigate the issues of reusing passwords

that’s it. this is more than good enough for 95% of all internet users, and with the client-side key derivation, well, the client could even use its own set of keys that are not derived from a username/password combo, so I’d even argue it’s more than good enough for 99% of all internet users.

I should probably go with “I am concerned about the internet being spyware”, tbh. because, it actually is.

Before I go into your points here, I’d like to say that I understand that not knowing what might happen doesn’t mean you shouldn’t be concerned. But I also have to point out that you’re free to go into the code, or even ask here, about what Discourse tracks and why. You call Discourse spyware, or view it as spyware, because it tracks what you’ve read, when you’ve logged in, how often you post, and what is linked back to your posts or what you’ve linked back to other posts. I have yet to hear anyone tell me, even conceptually, where that would go bad for anyone. If you’re that concerned, look at the code for Discourse; it is all there. These guys do a great job of being transparent. There’s a lot to learn here about how the “modern web” (as you call it later) works, and most of us here are readily willing to help you out.

On to your points about login methods…

• Browsers should support key derivation from username/password combos.

That isn’t the fault of Discourse, though you presented it as a flaw in Discourse. As far as I know, no browsers have this capability for built-in, cryptographically generated public key login credentials.

• Since it’s key derivation, it can take as long as you want, maybe you need some sort of preset for setting up some stuff but w/e it’s just keys at the end of the day. Also, since it’s keys, the other side would be forced to attack your username/password combo (oh hey look, it’s no longer just password), which should be a lot harder than plain old password cracking.

See above, this is not implemented anywhere that I know of.

• Username and password are never sent to servers, only public key. Websites implement display names separate from username, altho I suspect a lot of users would just have them be the same. Anyway, it’s keys, no MITM or server-side hack can break this. With a few extra steps it’s even physically impossible to trick someone into logging into a different service through this mechanism. This pretty much defeats 90% of modern phishing attacks. Maybe more. If only websites would stop with their custom login pages and just use a browser-provided thing. Imagine if phishing attacks had to go an extra length to try and emulate the user’s desktop environment (impossible on linux), browser, etc! A small tweak to how we do logins would eliminate ALMOST ALL PHISHING ATTACKS!

Well, I get the impression that you would benefit from reading up on this. There are a multitude of books on the subject, but if you don’t have time for that, I highly recommend the Security Now podcast. Computer security is an incredibly fascinating but also incredibly complicated field. To be fair, a server-side attack could certainly compromise your credentials: once you’re into a server you can pretty much do what you want with the site, as you then have access to the source code.

“A Few Extra Steps” also kind of shows how much you might benefit from reading about how complicated the issue of providing identity online truly is. I don’t say that to be mean or insulting; you’re clearly passionate about this subject, and I highly encourage you to read about it. It may become easier or harder to trick someone into providing login credentials to sites, but the human factor will always be the weakest link. You’re just assuming that this public key credential plan would end most modern security attacks, which is not really true. The situation would change, of course.

Also, a Linux based desktop environment is impossible to emulate? Can you go into more detail over this issue in particular? I’m afraid that I don’t know what you mean by that in this context. Certainly, if I gather what you mean correctly, personal computer logins would benefit from some kind of public key cryptographic requirement; you can get that now in the form of Smart Card based logins, but that would require the technology be implemented on a wider basis than it is.

Again, none of this has anything to do with Discourse, and your original message made it sound like Discourse has a bad login interface.

• Additionally, since you have a whole key-based system with some automated tool access, you can additionally put QR codes on top of it. So if you’re on a device, you can generate a QR code which contains a derived (or even random) key, and the user can scan it on another device and automatically share the login. This would be a huge win for convenience, and using the key mechanism there’s very little to go wrong! Arguably if you already have access to a logged in account, well, you already have access to it! As such, being able to clone that access (especially with a temporary, short-lived key) shouldn’t be a big deal! And ofc this all assumes browser support.

Something like this is done on Keybase, and it is an interesting idea if you want to share logins between devices. Session keys are widely used for this purpose; however, many less technically savvy users may still have issues with it. On a computer you could certainly generate a code to share with a mobile device, but going the other way, or sharing between computers without a mobile device to act as a conduit, is a bit more problematic. Of course, being able to clone login credentials defeats the purpose of having public key cryptographic logins, as someone who had gained access through a server-side or cross-site attack could just copy your login information. This could be mitigated in some ways, such as the use of OTPs, but that gets beyond the scope of what you’ve discussed.

• I definitely hate the modern web. We moved away from the ability for users to easily distinguish a phishing page from a legit page, and we’re still using the old flawed system where phishing pages with the same interface as the legit pages allow the attackers to login on the legit pages as you (something that would be impossible with keys, as the keys would be locked down to specific websites - and the phishing websites aren’t the legit websites unless you go as far as DNS hijacking and somehow coming across a valid TLS certificate for it. and good luck with that.). The web is a failure at protecting the users. We still use plaintext passwords in far too many websites. Any properly trained engineer should be capable of looking at this and staying far, far away from anything to do with computers.

Actually, this is entirely false and shows your lack of understanding of the “modern web”. SSL implementations in browsers have gotten very good at telling people a web site is a potential problem. The problem here is not the technology; it is the people. Your disdain for the modern web is really a disdain for people who don’t check things. Anyone can click on the lock symbol in a toolbar and see “Oh… Hey… This company doesn’t match” (try it with Amazon, Google, Apple, etc.), but people don’t. You don’t have to check DNS records; just pay attention to the SSL indicator, make sure it is there, and make sure the company matches the certificate. Browsers readily warn if a certificate is invalid, and certificates can be invalidated by the issuer relatively easily. Thus a phishing site that has been caught using an SSL certificate to con people will be easily discovered and its certificate invalidated by the issuer.

• Seriously, tho. Hacking a forum (i.e. the server) shouldn’t give you access to ppl’s emails. But it does. Phishing pages shouldn’t be possible. But they are. None of this is to blame on the users. Yet we keep blaming the users. “Why did you not check if it was a phishing page?” “Why aren’t you using a password manager?” “Why are you reusing passwords?” well, how about this: “Why are we using a flawed system when there are better systems out there, using modern cryptography, that trivially mitigate all those issues and doesn’t involve changing the user’s workflow at all?” This isn’t about users. It’s not users who are doing it wrong.
Oh well, I think I might’ve gone a little too far with this one. But what I’m really asking for is a rather basic set of security guarantees, which basically boils down to 3 main ones: 1. mitigate phishing attacks 2. mitigate the effects of server-side attacks on login information (hacked servers sending username/password combos to an attacker) 3. mitigate the issues of reusing passwords

I think perhaps you don’t understand what hacking any server, of any kind, does. If you get into the system, you can get anything that server has. Now, if you like email notifications, the ability to do some kind of password recovery operation, and all that kind of thing, you really have to store emails on the server. This means that if the server is breached, the attackers can get at them. Period. I can feel your frustration, but if public key cryptography would “trivially mitigate all those issues”, then it would be done already.

To sum up though, the problem with computer security isn’t the technology. It is the people. The problem with people is… People.

I wanted to add one more point I forgot… So these cryptographic key pairs (public and private keys), where would they be stored to make them secure? The server would only get signed data (signed by your private key and verified with your public key, which the server has). However, on your system, how would you keep these secure? How would you keep less informed people from uploading the wrong key? Do you have a key for every site (most secure)? If you do, where do you keep the keys? Keeping them online is insecure; keeping them offline makes it a pain to get them where you need them. Does every device have its own key for each site? If so, how is that managed? What algorithms would you use? How do you manage killing off all your keys if the device you had them on is stolen?

There are a number of issues.

9 Likes

FUD

I’m not sure why this topic keeps being revived, honestly.

6 Likes