Part 4: Random stuff
In this section, I'll answer analysis questions, present a few miscellaneous findings, and muse about what I really figured out.
Originally Posted by theshim
Here's a thought - what, if anything, is the correlation between Ally to Enemy report ratios and punishment margins? This is one that I'd like to see.
First, allies are nearly 2.5 times as likely to be the reporters as enemies, despite there being fewer of them (4 vs 5, mostly).
However (and this one is a bit surprising), there is no meaningful correlation between the ratio of ally reports to enemy reports and the chance of punishment. Each extreme has a higher punishment rate than the middle, but that’s simply because a 10:1 ratio requires at least 11 reports in total, and cases with more reports are punished more often anyway.
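To make that confound concrete, here's a rough sketch of the fewest reports a case needs before it can show a given ally:enemy ratio. The function name is my own invention for illustration, and it assumes both sides filed at least one report.

```python
from math import gcd

def min_total_reports(ally_reports: int, enemy_reports: int) -> int:
    """Fewest total reports that can produce the given ally:enemy ratio."""
    if ally_reports <= 0 or enemy_reports <= 0:
        raise ValueError("both counts must be positive")
    # Reduce the ratio to lowest terms, then add the two sides together.
    divisor = gcd(ally_reports, enemy_reports)
    return (ally_reports + enemy_reports) // divisor

for ratio in [(1, 1), (2, 1), (10, 1), (1, 10)]:
    print(ratio, min_total_reports(*ratio))
# (1, 1) 2
# (2, 1) 3
# (10, 1) 11
# (1, 10) 11
```

A balanced ratio can show up in a 2-report case, but a lopsided one can only show up in a case that already has a lot of reports.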
Originally Posted by theshim
Hmmm...you did racism there, and I know you're probably getting tired of crunching these, but do you think you could add a couple for homophobia? Maybe just "gay" and "f*g", I'd be interested to see how quick the community punishes for those.
Excluding “gay” itself, 2.4% of players used at least one antigay slur in chat. One player managed an astounding 407! (They weren’t the reported player, but boy I hope they got banned.)
8.1% of reported players used an antigay slur. One of these managed a mere 107 in a single game, and was indeed banned for his efforts.
As for the effects on punishment rates, any usage at all halves the pardon rate from 36% to 18%. More than 5 usages drops the pardon rate into the single digits, but there are still occasional pardons at any degree of usage (including one pardon with 13 usages).
As for the word gay itself, it is, rather surprisingly, used less often than its more offensive counterparts (at least when those are combined). The total prevalence is 2.2%, and usage among reported players is at 4.8%.
The punishment profile for gay itself actually looks rather similar to the one for the slurs, starting at a 36% pardon rate and dropping to single digits at around 5 usages/game.
I was expecting to see more usage of gay compared to harsher insults, and less punishment for it. I'm not sure what that means.
Originally Posted by Mokkun
Things that I'm somewhat curious about.
% of different types of reports, and how often the types get punished for.
(maybe which champs garner which reports more often)
(maybe which type the perma-bans tend to come from, vs initial warnings)
I don’t have individual report types, but I do have a “most common report type” variable, sent to the client but not displayed.
Not all report types are equally used. This is the breakdown.
27.7% Intentionally feeding
25.8% Offensive Language
21.3% Verbal Abuse
14.1% Negative Attitude
7.8% Assisting Enemy Team
1.5% Inappropriate Name
A small fraction of cases have no report type specified.
Remember, this is the proportion of most common report types, not individual report types. Refusal to communicate does not appear at all, meaning it is never the most common report type in a case and is presumably very uncommon overall.
Most of these types are punished at roughly the same rate as cases as a whole. Only one shows real statistical significance: Inappropriate Name cases, which are punished more often (at a 69% conviction rate).
Report type has no correlation to the nature of the punishment (warning, ban, or permaban).
I wrote code to analyze item builds, but it was complicated and non-intuitive in output, and I didn't know how to analyze it.
I never managed to extract summoner spells in a useful fashion.
14.1% of players are using a skin. Reported players are slightly more likely to have a skin than the population at large (15.2%). The use of a skin does not affect conviction rate, though permabanned players only have skins 13.1% of the time (possibly indicating that a non-trivial number of permabanned players are on second or third accounts and have learned not to spend real money on the game).
Fizz, Lee Sin, Cassiopeia, Evelynn, and Nocturne players talk the most (measured by lines), and Heimerdinger, Nasus, Caitlyn, Kog’Maw, and Sona players talk the least. Trundle players are the most erudite, averaging more than 3 words/line (there are so few Trundles, however, that this could be an aberration), followed by Nautilus and Evelynn. Kassadin, Tryndamere, and Wukong have the shortest lines at around 2.75 words per line.
1.) The Tribunal handles a LOT of cases.
My estimate is that the Tribunal handles sixty thousand cases a week. If the average Tribunal voter judges 40 cases a week (an aggressive estimate), and a case needs 20 votes to close (a conservative estimate), there must be at least 30,000 Tribunal judges.
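For anyone who wants to plug in their own guesses, the judge estimate above is just total votes divided by votes per judge. This is only a sketch of my back-of-envelope math; the function and parameter names are made up, and the inputs are my estimates, not Riot's numbers.

```python
def judges_needed(cases_per_week: int, votes_per_case: int, cases_per_judge: int) -> float:
    """Judges required if every judge votes on cases_per_judge cases a week."""
    total_votes = cases_per_week * votes_per_case
    return total_votes / cases_per_judge

# 60,000 cases x 20 votes each = 1,200,000 votes, spread over judges doing 40 each.
print(judges_needed(60_000, 20, 40))  # 30000.0
```

Lowering the cases per judge or raising the votes per case only pushes the number of judges up.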
To manage this workload in-house, it would take roughly 400 full-time Riot employees. I figure an employee would average 4 minutes a case, each case would go before 3 employees, and an employee would be able to spend 30 hours a week actually reviewing cases (the rest going to meetings, breaks, training, review, etc.).
Even just manually reviewing the 1400 permabans a week with these estimates takes about 10 people.
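The staffing numbers above come from the same kind of arithmetic. Here's a sketch using my assumptions as defaults (4 minutes a case, 3 reviewers per case, 30 review hours a week); the function name is hypothetical and none of these figures come from Riot.

```python
import math

def reviewers_needed(cases_per_week: int,
                     minutes_per_case: float = 4,
                     reviews_per_case: int = 3,
                     hours_per_week: float = 30) -> float:
    """Full-time reviewers required to clear a weekly caseload."""
    review_hours = cases_per_week * reviews_per_case * minutes_per_case / 60
    return review_hours / hours_per_week

print(reviewers_needed(60_000))            # 400.0 for the full caseload
print(math.ceil(reviewers_needed(1_400)))  # 10 for the permabans alone
```

The permaban figure works out to a bit over 9 people, which I rounded up to 10.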
2.) Autopunishing is not a problem in the Tribunal.
Tribunal cases are reasonably evenly spread across the spectrum from overwhelming majority pardon to overwhelming majority punish. This means that either Riot has picked a punish threshold that accounts for and effectively ignores autopunishers, or there are in fact very few autopunishers (or Riot has an undiscussed vote weighting algorithm that compensates for them, or some other unknown solution).
3.) By deciding how many reports and incidents to collect before generating a case, Riot controls the punish rate.
The difference in punish rates between cases with very few reports or incidents and cases with many is larger than the difference for nearly any other descriptive factor. But these factors aren’t directly within the control of the reported player; they are under Riot’s control. Rather than making a 1-report Tribunal case that has an 80% pardon rate, Riot could wait for more reports before generating it, or decide not to submit it at all. Sometimes they do, and sometimes they don’t. I don’t know why.
4.) Punishable behaviors are non-orthogonal.
It is easy to find explanations for why a player was punished. If you examine deaths, you’ll find that players are punished more often for more deaths. If you examine kills, you’ll find players are punished less often for more kills. This might lead you to conclude that gameplay, not chat, was the driving force behind the Tribunal.
But if you start with the chat, you’d find just as strong an explanation there. The answer is that most punishable behaviors come hand in hand, due to the Tribunal’s agglomerative case nature. A punished player may have one game in which they fed but didn’t insult, and another in which they insulted but didn’t feed. It’s impossible to tell which game was the justification for punishment.
The Tribunal-report infrastructure is a complicated system, and despite having access to reams of data from it, most of the deeper questions can't really be answered without seeing report rates and understanding more about how the cases are built.
What I have been able to learn has not given me any reason to doubt the Tribunal's effectiveness. Autopunishers do not dominate the voting results, and most hypotheses that I tested and wanted to be true (racism gets punished, etc) turned out to be true.
The dominance of report total in predicting the verdict bothers me, but without understanding more about how the decision to generate a case is made, I can't claim that it makes the Tribunal worse.