K-Means Clustering, Logistic Regression, nintendo, Propensity Modeling, Regression Modeling, Super Mario

TBCC 2019 Smash Brothers, Segmentation & Strategy: Panel Recap



This Panel was held on:

Friday, August 2, 2019 at 7:30 PM – 8:30 PM

During the Tampa Bay Comic Convention 2019, held at the Tampa Convention Center.

The Panelists were:

Tom Ferrara (@pancake_analytics) , Kalyn Hundley (@kehundley08), Andy Polak (@polak_andy)


I want to take a quick moment to discuss the panelists.  I love bringing as many different points of view as possible to these data science panels.  Without this variety, a panel is more of a lecture and less of a discussion.  This mix of panelists gave the audience the data science view, the tech industry view and the biological sciences view.  The best part is that Smash Bros. brought us all together.


Changing the Tier Conversation


One of the main objectives of this panel was to get a discussion going on tier selection in Smash: how do we ground tier selection in data science, and how do we validate our findings against one of the best players in the game?

K-means clustering uncovers trends within our Smash Brothers data so we can understand the relational similarities and differences on key in-game attributes.

Adding clusters gives a finer-grained picture and a deeper understanding of the pros and cons of each main selection, as long as the elbow curve supports the extra clusters.



A brief overview of a k-means cluster:

  • Standardize your variables
  • Analyze your elbow curve
  • Validate your clusters
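
The three steps above can be sketched in a few lines of plain Python. This is only an illustration, not the code behind the panel: the nine rows of (run speed, weight, air speed) are invented toy numbers, the farthest-first starting centroids are my own choice to keep the run repeatable, and the silhouette score stands in as one possible way to validate clusters.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Step 1: z-score each column so no single attribute dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def init_farthest(rows, k):
    """Deterministic start: first row, then repeatedly the row farthest from all picks."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(math.dist(r, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(rows, k, iters=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    centroids = init_farthest(rows, k)
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(r, centroids[j])) for r in rows]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:
                centroids[j] = [mean(c) for c in zip(*members)]
    inertia = sum(math.dist(r, centroids[l]) ** 2 for r, l in zip(rows, labels))
    return labels, centroids, inertia

def silhouette(rows, labels):
    """Step 3: mean silhouette; values near 1 suggest tight, well-separated clusters."""
    scores = []
    for i, (r, l) in enumerate(zip(rows, labels)):
        by_cluster = {}
        for j, (o, m) in enumerate(zip(rows, labels)):
            if i != j:
                by_cluster.setdefault(m, []).append(math.dist(r, o))
        if l not in by_cluster:  # a cluster of one has no within-cluster distance
            scores.append(0.0)
            continue
        a = mean(by_cluster[l])
        b = min(mean(d) for m, d in by_cluster.items() if m != l)
        scores.append((b - a) / max(a, b))
    return mean(scores)

# Invented toy data: (run speed, weight, air speed) for nine hypothetical characters.
toy = [
    [1.20, 80, 0.90], [1.30, 82, 1.00], [1.25, 79, 0.95],
    [1.80, 100, 1.10], [1.90, 98, 1.15], [1.85, 102, 1.05],
    [2.40, 120, 1.30], [2.50, 118, 1.35], [2.45, 122, 1.25],
]
scaled = standardize(toy)
# Step 2: inertia for each k; the "elbow" is where the drop levels off.
elbow = {k: kmeans(scaled, k)[2] for k in range(1, 6)}
labels, centroids, _ = kmeans(scaled, 3)
score = silhouette(scaled, labels)
```

Plotting `elbow` against k gives the elbow curve; on real character data the bend is rarely as clean as in this toy set, which is why the validation step matters.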

Treat each game release as a new product launch or a change in the market.

You would re-score your data to understand the current market, then track how characters migrate between clusters to see how the meta-game has changed.
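
One way to read that migration is a simple cross-tab of old segment versus new segment. The sketch below assumes you already have a segment label per character for each release; the character names and assignments are placeholders I made up, not results from the panel.

```python
from collections import Counter

def migration_table(old_segments, new_segments):
    """Count (old segment, new segment) pairs for characters present in both releases."""
    shared = old_segments.keys() & new_segments.keys()
    return Counter((old_segments[c], new_segments[c]) for c in shared)

# Hypothetical assignments, for illustration only.
wii_u = {"character_a": "Floaters", "character_b": "Dashers", "character_c": "Dashers"}
ultimate = {
    "character_a": "Floaters",
    "character_b": "Speedsters",
    "character_c": "Dashers",
    "character_d": "Air Tanks",  # new to this release, so it has no migration
}

moves = migration_table(wii_u, ultimate)
for (old, new), count in sorted(moves.items()):
    print(f"{old} -> {new}: {count}")
```

Pairs where the old and new segment differ are the characters whose role in the meta-game has shifted.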



We end up with five unique clusters:

Floaters:

This group is the slowest by run speed and lightest by weight.

Jack Of All Trades:

They are middle of the road on everything; there is no distinct trend.

Dashers:

Like the Jack of All Trades group but faster.

Air Tanks:

Fast in aerial attacks and the heaviest of the characters.

Speedsters:

This group is the fastest and the lightest.



A propensity model is a statistical scorecard used to predict the behavior of your customer or prospect base. Propensity models are often used to identify those most likely to respond to an offer, or to focus retention activity on those most likely to churn.

So who should be your main?  In this segment I rely on industry knowledge as well (ZeRo’s tiers as the dependent variable).   I’ll build a propensity score with the following independent variables:

  • Change in air acceleration
  • Base air acceleration
  • Base speed in the air
  • Base Run Speed
  • Character Weight
  • Ultimate Smash Bros. Cluster
  • Wii-U Smash Bros. Cluster
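
A propensity score of this kind can be sketched as a small logistic regression. The example below is a toy under stated assumptions: the tier is collapsed to a yes/no "top tier" flag, only two standardized features are used (air acceleration and weight), and every number is invented. The cluster variables in the list above would need to be one-hot encoded before they could be added as columns.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the log-loss; w[0] is the intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity(w, x):
    """Predicted probability that a character with features x is top tier."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Invented, already-standardized features: [air acceleration, weight].
X = [[1.0, 0.1], [0.8, -0.2], [0.9, 0.0], [-0.7, 1.2],
     [-1.0, -1.1], [-0.9, 0.3], [0.2, 0.0], [-0.1, 0.1]]
y = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = top tier in this made-up example

w = fit_logistic(X, y)
fast_air = propensity(w, [1.0, 0.0])
slow_air = propensity(w, [-1.0, 0.0])
```

Ranking characters by `propensity` gives the scorecard: the higher the score, the more a character's attributes resemble those of the top-tier group.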



What makes these three (Wario, Palutena and Yoshi) stand above the crowd?

They are middle ground on weight and fast air accelerators.

What are the differences between the three?

Wario has a slow run speed.

Palutena is the lightest.

Yoshi is the middle ground of this group.


The Curious Case of Ganondorf


Ganondorf has more in common with Jigglypuff than he does with Bowser.

The reason is that he is quicker and adapts better in aerial attacks and in falling than Bowser does.

On the flip side, Bowser in Super Smash Bros. Ultimate more accurately represents how he is viewed in the Super Mario franchise.


Game Time: Name that segment: Overview


I personally feel one of the best ways to reinforce learning is through a game.  For this panel I decided to reinforce the k-means segmentation by having volunteers guess which segment the 3 characters on the screen fall into.

Here was the overview:

5 Volunteers

On the screen will be 3 characters

All 3 characters belong to the same segment

Volunteers will do their best to convince the panel of which segment the characters fall into:

  • Floaters
  • Jack of All Trades
  • Dashers
  • Air Tanks
  • Speedsters

For participating, volunteers received a fabulous prize.

For this particular game the prize was an amiibo of their choice that works with Smash Ultimate for the Nintendo Switch.


I want to personally thank everyone who attended the panel in Tampa, at the Tampa Bay Comic Convention.  I look forward to meeting again in 2020.



Marvel Comics, Propensity Modeling, Regression Modeling

TBCC 2019 Avengers, Algorithms, and Analytics: Panel Recap



This Panel was held on:

Friday, August 2, 2019 at 9 PM – 10 PM

During the Tampa Bay Comic Convention 2019, held at the Tampa Convention Center.

The Panelists were:

Tom Ferrara (@pancake_analytics) , Kalyn Hundley (@kehundley08), Andy Polak (@polak_andy)


I want to take a quick moment to discuss the panelists.  I love bringing as many different points of view as possible to these data science panels.  Without this variety, a panel is more of a lecture and less of a discussion.  This mix of panelists gave the audience the data science view, the tech industry view and the biological sciences view.  The best part is that the Avengers brought us all together.


003

When I pitched this panel, the idea was: what happens when a data scientist gets hold of the Infinity Gauntlet?  Pictured above is a visual representation of how I’m going to use each stone.

Use the Time Stone to predict the box office sales for the MCU and determine the top influencers for success.

Use the Power Stone to eliminate low-hanging fruit.

Use the Soul Stone to uncover the underlying attributes of the Marvel universe.

Use the Space Stone to transport the Marvel universe to their closest match.

Use the Reality Stone to show you the Marvel universe in a new light, perfectly balanced.

Use the Mind Stone to convince you this matching worked.


Time and Power Stones: What is influencing the MCU box office success?


I walked those in attendance through the output of a regression model I built to unlock the key influences on the Marvel Cinematic Universe’s box office sales.

Considered influencers:

  • Rotten Tomatoes Scores (Critic and Audience)
  • Movie Release
  • Time since last MCU release
  • Solo Movie Releases
  • Was Iron Man in the movie?

Two Key Influencers stand out:

Having Iron Man in an MCU movie drives in $100.5MM.

Being further along in the series drives in at least $216.8MM.  Story development matters, and here’s the statistical proof!
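
To show where dollar figures like these come from, here is a minimal least-squares sketch with a yes/no Iron Man flag and a release-order column. The six "films" and their box office values are synthetic numbers generated from a known formula, not MCU data, so the fitted coefficients simply recover that formula. The model from the panel used more inputs than this.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, with an intercept first."""
    A = [[1.0] + list(r) for r in X]
    p = len(A[0])
    # Augmented matrix [X'X | X'y]
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(p)]
        + [sum(a[i] * yi for a, yi in zip(A, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

# Synthetic films: [iron_man_in_movie (0/1), release_order]
X = [[1, 1], [0, 2], [1, 3], [0, 4], [0, 5], [1, 6]]
# Synthetic box office in $MM, built as 200 + 100 * iron_man + 20 * release_order
y = [320, 240, 360, 280, 300, 420]

intercept, iron_man_effect, order_effect = ols(X, y)
```

The coefficient on the 0/1 flag reads as "dollars associated with Iron Man being in the movie, holding release order fixed", which is the same reading as the $100.5MM figure above.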


Soul and Space Stones: Refitting the Marvel Power Scale


During this panel I walked the crowd through the output of a second machine learning algorithm, a propensity score.

Ingredients in the batter:

  • Marvel Contest of Champions (MCC) Power Index Levels
  • MCC Health
  • MCC Attack
  • Marvel Battle Royale (MBR) Twitter Poll: total votes per round, average total votes

Flipping the pancakes:

Predict the likelihood Twitter would vote for a character

Re-purpose this score by applying it to characters not in the MBR Twitter Poll


Reality and Mind Stones: Perfectly Balancing the Marvel Universe


This approach goes beyond ranking by attack or defense.  It takes all those attributes together, along with fan opinion.

If you only look at attack… you get skewed results

If you only look at defense… you get skewed results

A little bit of good… a little bit of crazy…

Old Man Howard the Duck?

Doctor Octopus the Demi-God?


Marvel Rapid Fire: Marvel Analytics Comparisons


This was one of my all-time favorite segments out of all the comic cons I’ve had the pleasure of paneling at.  I would quickly show the audience an analytics technique, then show them the Marvel equivalent.  I think this technique is very effective in reinforcing our learning and opening up data science to a new audience.

Everything we just went through was a machine learning technique

Machine Learning is the Taskmaster of Data Science

Learns from past data, trains, and attempts to apply this training to new data

When something new is introduced it takes time to catch up


A/B Testing and Incremental ROI is the plot of Civil War



A neural network is Ultron… learns from observational data & figures its own solution



Dr. Strange ran a logistic regression to find out the odds on Titan



Into the Spider-Verse was the perfect implementation of a random forest



Game Time: Marvel Team-Up: Overview



One of the best ways to reinforce learning is through a game.  During this panel I wanted to reinforce the learning from the propensity score.

I asked for 5 volunteers.  On the screen were 3 Marvel characters.  2 of the characters on screen were look-alikes (statistically speaking).  Volunteers did their best to convince the panel which two characters should “Team-Up”, or in other words to identify the 2 statistically closest characters.

For participating, all volunteers received a HeroClix figure of their choice.
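
For anyone who wants to play the game at home, the "statistically closest" pair can be found with a few lines of Python. This sketch assumes closeness is measured as the gap between two propensity scores, which is my simplification; the hero names and scores are placeholders.

```python
from itertools import combinations

def closest_pair(scores):
    """Return the two names whose scores are nearest to each other."""
    return min(
        combinations(sorted(scores), 2),
        key=lambda pair: abs(scores[pair[0]] - scores[pair[1]]),
    )

# Hypothetical propensity scores for the three characters on screen.
scores = {"hero_a": 0.82, "hero_b": 0.35, "hero_c": 0.79}
pair = closest_pair(scores)
```

With more than one attribute per character, the same idea works with a distance across all attributes in place of the single score gap.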


I want to personally thank everyone who attended the panel in Tampa, at the Tampa Bay Comic Convention.  I look forward to meeting again in 2020.



Board Games, Logistic Regression, Regression Modeling

Recipe: 008 Likelihood a Board Game Is Universally Loved


For this week’s analysis I’m taking a different approach to the introduction.  I reached out to @missionboardgame to write the foreword.  They are a couple from Turkey who try their best to inspire people to join the board game community.  Without further ado, here is their overview of the modern board gaming climate:

We think a successful modern board game should include the following features:

✔️Your decisions should have an impact on the game progress.
✔️Minimal randomness.
✔️As little player elimination as possible throughout the game.

In addition to those, theme, artwork and mechanics are also significant for our decisions while purchasing board games. Therefore, our favorite game is Robinson Crusoe: Adventures on the Cursed Island. It is a cooperative survival game where you are trapped on a deserted island. Each decision you have made previously has an outcome afterwards. The harmony between the theme and the rules is perfectly arranged so that you feel very integrated to the game. By this way, every action you take seems meaningful and logical. Also we love feeling the cooperation among us since we are usually 2 players. – Mission Board Game


I’ve spent countless nights playing board games with friends and family.  Every New Year’s Eve my family and I play Monopoly, for a few reasons: the game-play length, the number of players, and the simplified game-play.  I have 5 siblings, so saying it’s difficult to find a game for all of us to play is an understatement.

Why we enjoy board games is an interesting topic.  Is it the theme of the game?  Is it the number of players required?  Has the game received universal praise from critics? Is it a common game most households own, one we grew up playing?

I’ll throw all the above-mentioned variables into a logistic regression model, along with Bayes’ theorem, to estimate the probability that a board game player will rate a game higher than the average score.

During the first read I see the model is statistically significant based on a p-value of less than .05.   A few things stand out to me immediately:

1.) Not all variables have a positive relationship to a highly scored board game

2.) There are some strong social elements going on here (e.g. the longer the play time, the higher the impact, which may imply that games which encourage discussion are rated higher)

3.) Fantasy-themed board games are not ranked highly (I have a D&D and video games impact theory)
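
A quick note on the significance check mentioned before this list: the .05 cutoff is normally applied to a p-value, which is derived from the z statistic that logistic regression output reports for each variable. A minimal sketch of that conversion, assuming a two-sided test against the standard normal distribution:

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

p_at_cutoff = two_sided_p(1.96)  # close to the usual .05 threshold
p_at_zero = two_sided_p(0.0)     # no evidence at all
```

A variable with |z| of roughly 1.96 or more clears the .05 bar; the sign of z (or of the coefficient) tells you whether the relationship is positive or negative.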


Before jumping into the positive relationships, I’d like to touch briefly on the negative relationship independent variables.

1.) Fantasy Theme: I included this variable in the model expecting to see a very high positive correlation, but I was very wrong.  To quote Rick and Morty: “Sometimes science is more art than science.”  In the spirit of the quote, I’ll assume there are threats to the fantasy-themed board game genre, in the form of role-playing video games.  The storytelling in this medium has progressed so much in the last decade that it outpaces anything a board game could offer.

In other words, the target audience is leaving.

2.) Majority of voters:  This variable is all about the number of users who share their ranking.  A rule of thumb for rankings, reviews and ratings is that those who go through the effort of expressing their opinion either love or hate the product.  The upper and lower confidence levels mirror each other because of this skewness.


Next, I’ll discuss the positive relationship independent variables (focusing on those with the highest impact):

1.) Board games with an average game-play of at least two hours have the highest positive impact on a user rating a board game above average.  What makes a game have a long game-play?

Multiple reasons: more players involved, more game-play mechanics, and most importantly more discussion.  The soul of any good board game is bringing people together.

2.) The second highest impact comes from the average score displayed on BoardGameGeek.  The reason behind this is that users see this rating before submitting their own.  Think of it like the Rotten Tomatoes effect: people want to feel like they have universally accepted opinions.  Take the beginning of this data story for example.  I mentioned Monopoly is a family tradition of mine, and this could have swayed your opinion of the game.  Based on this model output, you could possibly rate it higher than a game that is, say, fantasy themed.

For your own reference, this model has an accuracy rate of above 70%.
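
For reference, an accuracy rate like this is typically the share of games where the model's call (above average or not) matches what actually happened. The sketch below assumes a 0.5 probability cutoff, and the ten probabilities and outcomes are invented for illustration, not taken from the board game model.

```python
def accuracy(probs, actual, threshold=0.5):
    """Share of cases where the thresholded prediction matches the actual outcome."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(p == a for p, a in zip(preds, actual)) / len(actual)

# Invented predicted probabilities and outcomes (1 = rated above average).
probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.55, 0.1, 0.8, 0.45]
actual = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0]

rate = accuracy(probs, actual)
```

Accuracy is best judged against the share of the larger class: if 60% of games are rated above average, a model that always says "above average" already scores 60%.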


What have we learned from diving into the Board Game Data? 

Board games are most successful when they encourage the spirit and soul of “game night”: a gathering of friends and family discussing and enjoying each other’s time.  Adventure and exploration themes make up the majority of the top ten highly successful board game genres.  A longer game-play does not mean the game is like pulling teeth or that the pace is slow.

It is more of an indicator of the number of players required and the storytelling that drives a great game night experience.

After you have consumed this meal, I hope you take these findings and enjoy your next game night.  Also, as always, enjoy the featured pancake recipe below!



https://boardgamegeek.com/

