Transparency in moderation really matters

Does Transparency in Moderation Really Matter?: User Behavior After Content Removal Explanations on Reddit

When posts are removed on a social media platform, users may or may not receive an explanation. What kinds of explanations are provided? Do those explanations matter? Using a sample of 32 million Reddit posts, we characterize the removal explanations that are provided to Redditors, and link them to measures of subsequent user behaviors—including future post submissions and future post removals.

Adopting a topic modeling approach, we show that removal explanations often provide information that educates users about the social norms of the community, thereby (theoretically) preparing them to become productive members. We build regression models that show evidence of removal explanations playing a role in future user activity. Most importantly, we show that offering explanations for content moderation reduces the odds of future post removals. Additionally, explanations provided by human moderators did not have a significant advantage over explanations provided by bots for reducing future post removals.

We propose design solutions that can promote the efficient use of explanation mechanisms, reflecting on how automated moderation tools can contribute to this space. Overall, our findings suggest that removal explanations may be under-utilized in moderation practices, and it is potentially worthwhile for community managers to invest time and resources into providing them.


(I haven’t read the contents of that paper btw, so this is just a general comment on moderation and moderation transparency.)

Discourse is already excellent and better than most forum software - users get notified of most mod actions. However I feel we could certainly do with expanding moderation features.

To give you an example, we routinely edit out bad language or personal attacks from posts and some users have complained about this - they would rather be asked to make the changes themselves than have us touch their posts.

Of course that is not practical - the whole point of a moderator removing personal attacks is so that the person that the attack is aimed at doesn’t see it, which could lead to things spiralling. Additionally, moderators don’t have the time to PM a user asking them to make the change, make a note of the incident, remember to check, PM the user again or take other action if they haven’t made the required changes - it’s just too much of a burden on moderator teams.

An alternative could be to use the flag system which hides a post - but what happens if the user just unhides it or does not make an adequate change? We’d still have to make a note to check, hide it again, etc. (Additionally, removing the post from view also has other negatives - it can slow the flow of a conversation, can make people feel like it’s too much bother posting, etc - which can all usually be avoided by a very simple mod edit.)

I do firmly believe that Discourse could take moderation to the next level just like they have in other aspects of forum software :smiley: It would require a lot more thought than just a single post or thread though - moderation is hard (nobody likes being moderated!)

I’d be happy to be part of such discussions and I am sure @HAWK would be invaluable in such an effort.


It kinda depends.

For heavy stuff, e.g. racist / sexist / bigoted invective, I do not think there is any value in leaving toxic waste lying around to continually poison the community.

For lighter violations, sure, edit and then leave a staff notice on the post.


I completely agree - I have a note about this in our staff room:

Hi all,

One of our members has come out as _____ recently, and although it is unlikely that they will suffer any abuse from any genuine member of the community, it’s possible that someone with an axe to grind against [topic of forum] or the community may pick on them as an easy target or a way to create intercommunity conflict.

  • If a poster is ‘new’ (i.e. someone who has registered recently and has very few posts) and they make any racist, transphobic, homophobic or sexist remarks - please ban their account immediately and also remove any offending posts. The ban notice can simply be “Comments contravene our Code of Conduct”.
  • If a poster is a regular member, please delete the offending post/s and PM the user to let them know the post was deleted because you believe it is against our rules and to please refrain from making any further posts in that thread (or contacting the other user if another user was involved) until a member of the Admin team has reviewed the situation and followed up/contacted them directly. If they ignore this notice and continue to post in the thread or continue to make abusive posts, suspend their account with the message “Failure to comply with Moderator request. Pending Admin review.”

Then please log any incidents in this thread and we’ll review each case to see what action might be necessary. If I’m not around and issues are ongoing (for example, where others are debating the issue in that or another thread) then just lock the thread with a “Pending Admin review” notice.

It’s the other stuff that’s tricky - I’m off to bed now but will try to come back to this tomorrow.

(Are those staff notes publicly viewable or just by the person who made the post btw?)

Add a staff notice to a post yourself via the wrench menu and see…


We don’t want to dictate how people moderate their communities, we just want to provide the tools so that they can do it as they see fit.

So I’m curious here – what could be done differently? You can hide, you can unhide, you can delete, you can empower the community to hide, you can add staff notices.


I can see you Jeff! :smiley:

With regards to such notices, we would almost certainly not use them for any kind of user-moderation because we prefer to avoid publicly embarrassing (or chastising) a user if at all possible - only dealing with user issues more publicly if we have been left with little choice (for instance, by their behaviour or conduct increasingly affecting more and more people on the forum).

If those notices could only be seen by the creator of that post, we could use them in some instances so long as we could make obvious that only they can see it.

Why do we take this path? Because people hate being moderated, and hate being ‘shamed’ publicly even more - it can be infuriating (and somewhat disrespectful - if someone has made a genuine mistake they shouldn’t be made to feel like an idiot in front of the whole community). Offending users like this can create huge problems further down the line - if they take umbrage they will never forget, and it will compound future issues, which could lead to inter-community issues or them falling out with moderators (which unfortunately never ends well). At worst they could leave or get banned and carry out more malicious attacks. It is just not worth it, particularly if you were once reasonably confident they were a genuine user.

I agree - I just think at the moment the tools seem to reflect how moderation was traditionally carried out (with some improvements of course) but I think there’s probably a much better way (just like you all at DC have demonstrated for other areas of forum software)… we just need to work out what that better path is.

It’s a huge topic, but I think the main thing that requires our focus right now is making moderation as palatable as possible while not burdening the mod team.

Unfortunately I don’t have the time to go into greater detail atm, but would love to chat about this more in the new year perhaps? :blush:


I will admit that it was me that did that.

I spent many months thinking on this (while revisiting the way that we moderated SitePoint prior to moving to Discourse) and I feel like we reached a really good place [for that particular forum] using new processes which were driven by the tools available here. We pretty much threw everything out and started again, which was well overdue.

I’m not sure what moderation problems we need to solve that aren’t already being solved so I’m certainly interested in your thoughts. I do concede that it’s difficult to think outside of the realm of the communities that you’ve individually managed, but I do think it’s important to start with real problems, rather than assumptions.

Ditto. The NY sounds good. :slight_smile:


This thread kinda went off on a tangent, but the main point I think we should take away from this article is that it is worth it to privately explain to users who post content against the guidelines why their own stuff is getting removed. The study indicated that the more detail provided, the better.

So, concretely, if somebody’s content is removed because N people flagged them, I believe right now the automated message to the user says “your post was removed because N people flagged you;” the message provides no information about why the post may have violated community guidelines.

That’s just an example. I feel like the overall gist is: removals should be associated with reasons that the post was removed, reasons that the user could act upon to change/fix the post, even (especially) if they’re just canned explanations selected from a drop down.


This is incorrect. It provides information on the type of flag per the flag dialog – off topic, inappropriate, spam, or “other”.


“Inappropriate” could use a lot more structure, I say. “Whaddya mean, inappropriate?! This post is totally fine!”
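
To make the "canned explanations from a drop down" idea a bit more concrete, here is a rough sketch of what a more structured removal notice could look like. This is not how Discourse works internally: only the four flag types come from the flag dialog mentioned above, while the sub-reasons, the wording, and the `removal_message` function are all hypothetical and made up for illustration.

```python
# Flag types as listed in the flag dialog; everything else here is hypothetical.
FLAG_TYPES = ("off_topic", "inappropriate", "spam", "other")

# Made-up sub-reasons that could be picked from a drop-down.
CANNED_REASONS = {
    ("inappropriate", "personal_attack"): "It contains a personal attack on another member.",
    ("inappropriate", "bad_language"): "It contains language our guidelines do not allow.",
    ("off_topic", None): "It does not relate to the topic being discussed.",
    ("spam", None): "It reads as an advertisement or promotion.",
}


def removal_message(flag_type, sub_reason=None, flag_count=1):
    """Build a private notice that says why a post was hidden, not just how many flags it got."""
    if flag_type not in FLAG_TYPES:
        raise ValueError(f"unknown flag type: {flag_type}")
    # Prefer the specific sub-reason, then the generic reason for the flag type.
    reason = CANNED_REASONS.get((flag_type, sub_reason)) or CANNED_REASONS.get((flag_type, None))
    if reason is None:
        # No canned text fits, so promise a human follow-up instead of guessing.
        reason = "A moderator will follow up with details."
    return (
        f"Your post was hidden after {flag_count} flag(s) ({flag_type}). "
        f"{reason} You can edit the post to address this."
    )


if __name__ == "__main__":
    print(removal_message("inappropriate", "personal_attack", flag_count=3))
```

The point of the sub-reason is that "inappropriate" on its own gives the user nothing to act on, whereas "personal attack" or "bad language" tells them what to change.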