I want you to remember, Clark…In all the years to come… in your most private moments… I want you to remember my hand at your throat… I want you to remember the one man who beat you.
Chilling quote, isn’t it? That was said by Batman to Superman in The Dark Knight Returns, a comic book miniseries written and drawn by Frank Miller.
One of the greatest debates in comic book lore, and a fun discussion to have, is pitting two superheroes against each other… Who wins and why? The data story below introduces a data science approach to answering this debate. To have fun with it… I’ve thrown characters from the video game Injustice 2 into a Superhero Throw Down Tournament.
Before we dive into the tournament and the results of the throw down, I’d like to touch on the approach: Propensity modeling.
Propensity modeling has been around since 1983 and is a statistical approach to measuring uplift (think return on investment). The goal is to measure the uplift of similar or matched groups.
The heart of this approach lies in two machine learning techniques: segmentation and probability estimation.
Why propensity modeling for this exercise? I wanted to rank my superheroes for the bracket using statistics (i.e. Batman is not getting a number one seed.)
35 characters were segmented on strength, ability, defense and health. For the propensity score, I gathered ranking information from crowdsourced websites and surveys. Using this, I was able to give each character an intangible skill score. The reasoning was that I wanted the medium of comics to do the majority of the work for me. Comics are stories, and the narrative drives the inner core of a character. The higher a character ranks on a fan-sourced website, the more I’m assuming they are written well and are timeless.
The next step was to take the mean of the intangible skill score and flag those characters above the average (this flag is the dependent variable for the logistic regression that calculates the propensity score).
What was thrown into the propensity model? The skill sets gathered from the Injustice game. The assumption here is that a character with Superman’s skill set would be written much differently than, say, Catwoman.
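To make the flag-then-score step concrete, here is a minimal sketch in plain Python. It is not the model used for the tournament: the character names, stat values and intangible scores are made-up placeholders, and the small gradient-descent fit stands in for whatever logistic regression tool you prefer. It only shows the shape of the idea: flag characters above the mean intangible score, then regress that flag on the skill stats to get a propensity between 0 and 1.

```python
import math

# Hypothetical characters: (strength, ability, defense, health) scaled to 0-1,
# plus a made-up "intangible" score. None of these values come from the real data.
characters = {
    "Hero A": ((0.9, 0.8, 0.7, 0.9), 9.0),
    "Hero B": ((0.8, 0.9, 0.6, 0.8), 8.0),
    "Hero C": ((0.4, 0.5, 0.5, 0.4), 5.0),
    "Hero D": ((0.3, 0.2, 0.4, 0.3), 3.0),
    "Hero E": ((0.6, 0.7, 0.8, 0.6), 7.5),
    "Hero F": ((0.2, 0.3, 0.3, 0.5), 2.5),
}

names = list(characters)
X = [characters[n][0] for n in names]
intangible = [characters[n][1] for n in names]

# Dependent variable: 1 if the intangible score is above the mean, else 0.
mean_score = sum(intangible) / len(intangible)
flags = [1 if s > mean_score else 0 for s in intangible]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted probability of being flagged, given stats x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=3000):
    """Fit a logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


w, b = fit_logistic(X, flags)

# The propensity score is the fitted probability for each character.
propensity = {n: predict(w, b, x) for n, x in zip(names, X)}

# Seed the bracket from highest to lowest propensity.
for name in sorted(propensity, key=propensity.get, reverse=True):
    print(f"{name}: {propensity[name]:.2f}")
```

Sorting by the fitted probability is what produces a seeding: characters whose stats look like those of the well-regarded group float to the top, whether or not they were flagged themselves.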
Now it’s time for our throw down.
The top four characters by propensity score were:
To determine a winner in the throw-downs, characters were put up against each other in 11 categories.
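A category-by-category match-up can be tallied with a few lines of Python. This is only a sketch of the idea: the two characters, the category names and the values are hypothetical placeholders (the real categories are not listed here), and it assumes that the higher value wins a category and equal values tie.

```python
def match_record(a, b):
    """Compare two characters category by category.

    Returns (wins, ties, losses) from the point of view of character `a`.
    Assumes both dicts share the same category keys and higher is better.
    """
    wins = ties = losses = 0
    for category, a_val in a.items():
        b_val = b[category]
        if a_val > b_val:
            wins += 1
        elif a_val == b_val:
            ties += 1
        else:
            losses += 1
    return wins, ties, losses


# Hypothetical characters and categories, for illustration only.
hero_x = {"strength": 8, "ability": 7, "defense": 6, "health": 9, "speed": 5}
hero_y = {"strength": 6, "ability": 7, "defense": 8, "health": 7, "speed": 4}

wins, ties, losses = match_record(hero_x, hero_y)
record = f"{wins}-{ties}-{losses}"  # wins-ties-losses
winner = "hero_x" if wins > losses else "hero_y" if losses > wins else "draw"
print(record, winner)
```

The record string follows a wins-ties-losses convention, and the character with more category wins advances.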
Round 1 Takeaways:
Our number one seed, Cyborg, nearly lost to Atrocitus. The result was 6-2-5, which is read as 6 wins, 2 ties and 5 losses.
There were no upsets in the first round of play. A few characters did not win a single category in their match-ups:
Harley Quinn (vs. Captain Cold)
Green Arrow (vs. Batman)
Black Manta (vs. Black Canary)
These three characters were ill-equipped to take on their opponents. It is possible they would have advanced against a different opponent.
Round 2 Takeaways:
Cyborg (our number one seed) defeated Captain Cold by a larger margin (+3 winning categories) than in the previous match-up against Atrocitus, but he scored one fewer win.
We begin to see upsets in Round 2:
Robin defeated Black Adam by 1 winning category. Wonder Woman defeated Firestorm by 4 winning categories. Batman defeated Supergirl by 3 winning categories.
On propensity scores these were upsets, but from a comic book debate standpoint you could argue for these results, e.g. given enough time to prepare, Batman could defeat Supergirl.
Round 3 Takeaways:
Cyborg falls to Superman, losing by 4 categories. This was the toughest fight Superman had faced in the tournament to date (in both previous rounds he had 9 winning categories).
The upsets keep coming in:
Robin sneaks in a win again by 1 winning category (over Brainiac). Wonder Woman defeats the top seed in her region of the bracket (Aquaman) by 4 winning categories. Batman defeated Green Lantern by 3 winning categories.
Final 4 Takeaways:
Robin’s Cinderella story comes to an end at the hands of Superman (winning in 9 categories). Robin did fare better than the earlier opponents who also gave Superman 9 category wins… Robin won in 2 categories.
Batman was able to upset Wonder Woman by 2 winning categories. We’re set for a championship round, the original “who wins?”… Batman versus Superman!
Our winner is…
Superman defeats Batman, though not in a landslide. Batman lost by two categories, but he was able to win in 5 categories. Previously, the most categories anyone had won against Superman was 3.
What did we learn from diving into the DC data? Comic book writing and fan perception go a long way in determining who wins a throw down debate. If we use propensity modeling, we can have a more even playing field and limit the number of unfair battles.