Uncategorized

AFO 2019 Player One, Power Ups, & Probabilities: Panel Recap


Before I share the entire Anime Festival Orlando (AFO) 2019 panel, I’d like to give some insight into the nerves I had going in and how the audience helped me get into the groove.

I opted to go solo on this panel. Normally I have guest panelists join me, so the nerves were at an all-time high.

Could I keep the entire room engaged for a data science panel?  Would the flow drastically change?

I was set up and ready to go early, and had great discussions with those who sat in early about whether or not to pick up Let’s GO Pikachu/Eevee.  One of the attendees had even been referred to the panel by friends who attended my Tampa Comic Convention panels!

This was a first and a good gut check for me: what I’m trying to accomplish with Pancake Analytics is a good thing and is going over well.

I can’t thank the community we’re building here together enough!


028


This panel was held on Saturday, August 10, 2019 at 8:30 PM – 9:30 PM, in Orlando, FL, during AFO 2019.


Our journey begins…

002

The steps on our Pokémon Journey:

  • New Point Of View on Pokémon
  • Field Researchers & Learning from them
  • Pokémon Team Recommendations

A New Point of View on Pokémon : Overview

003

K-means clustering uncovers trends within our Pokémon data so we can understand the relational similarities and differences on key in-game attributes.

Adding clusters sharpens the picture and deepens our understanding of the Pokémon throughout our journey, up to the point where extra clusters stop explaining much more of the variation.


A New Point of View on Pokémon : The Results

004

A brief overview of the approach:

Standardize your variables (rescale each variable to a mean of zero and a standard deviation of one)

Analyze your elbow curve (look for the point where the line plot bends like an elbow)

Validate your clusters (perform a univariate analysis on the core KPIs for each cluster)

3 Distinct Groups:

High – Highest in all categories except for base defense and hp

Medium – Highest on defense, middle ground in everything else

Low – Only high on hp
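The three steps above can be sketched in plain Python. This is only a minimal illustration, not the code behind the slides: the stat rows are made-up placeholders (attack, defense, hp), and the function names `standardize`, `kmeans` and `cluster_profile` are my own. The farthest-point starting centers are an assumption chosen to keep the example deterministic.

```python
import math

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_profile(rows, labels, k):
    """Univariate validation: mean of each raw stat inside each cluster."""
    profile = {}
    for c in range(k):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        profile[c] = [round(sum(col) / len(members), 1) for col in zip(*members)]
    return profile

# Placeholder rows of (attack, defense, hp); NOT real base stats.
stats = [
    [120, 70, 75], [115, 65, 80], [125, 75, 70],
    [70, 130, 70], [75, 125, 65], [65, 135, 75],
    [50, 50, 140], [55, 45, 150], [45, 55, 145],
]
labels = kmeans(standardize(stats), k=3)
profile = cluster_profile(stats, labels, k=3)
print(labels)
print(profile)
```

Reading the per-cluster averages side by side is the univariate check: each group should stand out on at least one stat, the way the three groups above do.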


What does this tell us about the starters?

005

The output of the k-means clusters can be used to help determine your approach from the very beginning.

Reading the pyramid:

Easy path: (Build your team around this Pokemon & steamroll the competition)

Greninja, Swampert, & Sceptile

Hard path: (Need to acquire complementary Pokemon, you learn more about Pokemon this way)

Serperior, Meganium, Torterra, & Chesnaught


How do we implement this scoring?

006

I needed more data to implement this approach.

I reached out to my Instagram followers with a survey, and volunteers were given:

5 Questions:

What’s your ideal team of 6 Pokémon?

What year did you start playing Pokémon?

Do you play Pokémon GO?

How many Pokémon games have you played?

Do you play the Pokémon TCG?
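To show how answers to these five questions could feed a model, here is a small sketch that turns survey rows into a "was this Pokémon picked?" label. The field names (`team`, `start_year`, `plays_go`, `games_played`, `plays_tcg`) and the two responses are hypothetical, not the actual survey data.

```python
from collections import Counter

# Hypothetical survey rows; the schema and answers are assumptions for illustration.
responses = [
    {"team": ["Greninja", "Pikachu", "Gengar", "Snorlax", "Lucario", "Charizard"],
     "start_year": 1998, "plays_go": True, "games_played": 8, "plays_tcg": False},
    {"team": ["Greninja", "Swampert", "Umbreon", "Gengar", "Garchomp", "Togekiss"],
     "start_year": 2006, "plays_go": False, "games_played": 4, "plays_tcg": True},
]

def selection_counts(rows):
    """Count how many respondents put each Pokémon on their ideal team."""
    counts = Counter()
    for row in rows:
        counts.update(set(row["team"]))  # set() so a respondent counts once
    return counts

def selection_labels(candidates, rows):
    """1 if at least one respondent picked the Pokémon, otherwise 0."""
    counts = selection_counts(rows)
    return {name: int(counts[name] > 0) for name in candidates}

counts = selection_counts(responses)
picked = selection_labels(["Greninja", "Gengar", "Meganium"], responses)
print(counts.most_common(2))
print(picked)
```

The 0/1 labels are the kind of target a selection model needs, and the other four answers can be kept as respondent-level features.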


Implementing the scoring: Trust The Process

007

A propensity model is a statistical scorecard that is used to predict the behavior of your customer or prospect base. Propensity models are often used to identify those most likely to respond to an offer, or to focus retention activity on those most likely to churn.

I used this model to predict whether a Pokemon would be selected in the survey, then used the results to recommend Pokemon that a survey participant didn’t select but that would, statistically, give them the same results in play.

This is the whole Pokémon journey coming full circle.

The Pokémon Professor has done their own research and builds a model.

The field research team assists the Pokémon Professor with gathering new data.

The Pokémon Professor uses the model to assist the field research team.
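The Professor's loop can be sketched as a tiny propensity scorer. The post does not say which model type was used, so logistic regression is an assumption here. The feature rows are made-up standardized stats, and `candidate_a` / `candidate_b` are placeholder names for Pokémon nobody selected.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def propensity(w, b, x):
    """Estimated probability that a Pokémon with features x gets selected."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Placeholder training data: two standardized stats per Pokémon, 1 = selected.
X = [[1.2, 0.8], [0.9, 1.1], [0.3, 0.4], [-0.5, -0.3], [-1.0, -0.9], [-1.3, -0.7]]
y = [1, 1, 1, 0, 0, 0]
w, b = fit_logistic(X, y)

# Score Pokémon the participant did NOT pick and recommend the best fit.
unselected = {"candidate_a": [1.0, 0.9], "candidate_b": [-0.8, -1.0]}
ranked = sorted(unselected, key=lambda name: propensity(w, b, unselected[name]), reverse=True)
print(ranked)
```

The recommendation step is just the ranking at the end: score everything the participant left off their team and surface the highest propensities.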


Here are the results of my recommendation model:

008.png

009

010


029

Is Ash getting better with each season?

I’ve analyzed all of Ash’s teams throughout the anime (from Kanto through XYZ).  I want to answer the question… Is Ash getting better with each season?

The first challenge: how do we define success, and what data science methodology do we use?

One area I feel gets overlooked in data science is the performance analytics realm, which uses univariate and multivariate statistical analysis.

Univariate and multivariate represent two approaches to statistical analysis. Univariate analysis involves a single variable, while multivariate analysis examines two or more variables. Most multivariate analysis involves a dependent variable and multiple independent variables.
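As a quick illustration of the difference, the sketch below summarizes one stat on its own (univariate) and then looks at how two stats move together (the simplest multi-variable case). The numbers are placeholders, not Ash's actual teams.

```python
import math
import statistics

# Placeholder stats for five Pokémon.
attack = [50, 65, 80, 95, 110]
speed = [45, 60, 70, 90, 100]

# Univariate: describe a single variable.
attack_mean = statistics.mean(attack)
attack_spread = statistics.pstdev(attack)

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

# Two variables at once: how closely attack and speed move together.
r = pearson(attack, speed)
print(attack_mean, round(attack_spread, 2), round(r, 3))
```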

011.png

How do we determine success?

Base stats seem like a good starting point.

But as you can see one Pokémon can throw off our data… cough…  cough … Greninja cough … cough


Here’s how we do it: use the Pokémon GO approach

012

As much as I feel Pokémon GO has flaws which shouldn’t get a pass, their CP attribute holds the answer to standardizing and scaling Ash’s teams.

What is CP in Pokémon GO?

CP (combat power) is not a direct measure of how much damage a Pokémon deals when attacking gyms; it is a combination of attack, defense and stamina (HP).
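For reference, here is a sketch of a CP-style score. It assumes the formula as commonly documented by the Pokémon GO community (attack times the square roots of defense and stamina, scaled by a level multiplier). Folding level and IVs into a single `cp_multiplier` argument is my simplification, and the slides may have computed it differently.

```python
import math

def combat_power(attack, defense, stamina, cp_multiplier=1.0):
    """CP-style score: attack counts fully, defense and stamina by square root."""
    raw = attack * math.sqrt(defense) * math.sqrt(stamina) * cp_multiplier ** 2 / 10
    return max(10, math.floor(raw))  # CP never drops below 10

def team_cp(team):
    """Average CP-style score across (attack, defense, stamina) tuples."""
    return sum(combat_power(*stats) for stats in team) / len(team)

print(combat_power(100, 100, 100))
print(team_cp([(100, 100, 100), (200, 100, 100)]))
```

Because defense and stamina enter through square roots, a single lopsided stat moves the score less than it would in a straight sum of base stats, which is what makes it handy for comparing teams.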

013

014

Using this approach helps level the field for those teams where Ash was heavy in one attribute, or when he only had one strong Pokemon.


From beginning to end Ash increased his CP by 8%

ash-hat-pikachu-169

015

His best rotation was in Sinnoh

  • He evolved the most Pokémon compared to his other teams.
  • He evolved 3 Pokémon all the way to their final evolution.
  • 3 of his Pokémon fall into our High cluster.

016

His worst rotation was in Johto

  • He evolved only one Pokémon (Noctowl he found).
  • He attempted to build a team similar to the one he had in Kanto.
  • Only 1 of his Pokémon falls into our High cluster.

Game Time: Let’s GO! Wonder Trade: Overview

I personally feel one of the best ways to reinforce learning is through a game.  During all of my panels I like to play a game that reinforces a machine learning technique, in this case the propensity model.

Those who participated received a rare individual Pokemon TCG EX/GX card, an unopened Unified Minds TCG booster pack, and a gift certificate to Burger King (a meal on me).

Food is usually hard to come by at a convention, so I thought back to my younger days: I would have loved to get a free meal at a convention.

017.png

5 Volunteers

On the screen will be 3 of Ash’s Pokémon

2 Pokémon are look-a-likes (statistically speaking)

Volunteers will do their best to convince me which two Pokémon are look-a-likes and who should be wonder traded

For participating, volunteers receive a fabulous prize
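One way to settle the game behind the scenes is to measure which two of the three sit closest together. The sketch below assumes "look-a-like" means the smallest Euclidean distance between standardized attribute vectors. The three profiles are placeholders rather than Ash's real Pokémon, and the panel may have compared model scores instead.

```python
import math
from itertools import combinations

def look_alikes(profiles):
    """Return the closest pair of names and the odd one(s) out."""
    pair = min(
        combinations(sorted(profiles), 2),
        key=lambda names: math.dist(profiles[names[0]], profiles[names[1]]),
    )
    odd_ones = [name for name in profiles if name not in pair]
    return pair, odd_ones

# Placeholder standardized attributes for the three Pokémon on screen.
profiles = {
    "pokemon_a": [1.1, -0.2, 0.4],
    "pokemon_b": [1.0, -0.1, 0.5],
    "pokemon_c": [-0.9, 1.3, -0.6],
}
pair, odd_ones = look_alikes(profiles)
print(pair, odd_ones)
```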




TBCC 2019 The Pokemon Journey Panel

012


Welcome to the first recap of the Comic Con Data Science panels run by the crew at Pancake Analytics.  Before I dive into the recap of The Pokemon Journey panel held at the Tampa Bay Comic Convention 2019, I’d like to give a quick overview of why I’ve chosen this path.

One question I get asked often is where I got the idea to apply the fundamentals of data science to comics, video games and all things fandom.

The answer is simple to me and is a core pillar of Pancake Analytics.  I want to teach, share, engage and learn from the comic con family.

I want to TEACH those who attend my panels or interact with this page an introduction to data science and how it can improve the areas of your life you are passionate about.

I want to SHARE my years of analytics experience with aspiring analysts and those scared of statistics.

I want to ENGAGE with fans of comics, video games, anime, theme parks, all things geek! I’m one of you and love our conversations.

I want to LEARN your point of view of the topics I discuss.  How do we have a high level discussion about data that doesn’t feel like a math class?

If any of these core pillars resonate with you, I hope you enjoy the content I produce and continue to join the discussions.


001


The Pokemon Journey at TBCC2019 was held on Saturday, August 3, 2019 at 7:30 PM – 8:30 PM.
The pitch of the panel was as follows:
Going to Tampa Bay Comic Con⁉️

Join us in the lighthearted data science discussion of Pokémon. Journey from Kanto to the Alola region through machine learning. This panel is more helpful than a Pokédex.

The panelists were myself and Steve (an indie game developer).  Here’s a commissioned piece I got from a comic con artist:
014

002
Above is a visual representation of the Pokemon Journey we are about to embark on.

The steps on our Pokémon Journey:

  • New Point Of View on Pokémon
  • Field Researchers & Learning from them
  • Pokémon Team Recommendations

During the New Point of View on Pokémon section, I walked the audience through a k-means clustering algorithm to reset Pokémon tiers and move us away from only grouping Pokémon together by typing.

During the Field Researchers & Learning from them section, I walked the audience through how to utilize survey data to build a recommendation engine (companies as large as Amazon and Netflix use this technique).

During the Pokémon Team Recommendations section, I walked the audience through the output of the recommendation model and real-life scenarios of recommended teams.


 

003

K-means clustering uncovers trends within our Pokémon data so we can understand the relational similarities and differences on key in-game attributes.

Adding clusters sharpens the picture and deepens our understanding of the Pokémon throughout our journey, up to the point where extra clusters stop explaining much more of the variation.

When you pick up a Pokémon game for the first time ever you are in the left square.  Running this algorithm will get you to the bottom right, a clear picture, sooner.


004


 

A Brief overview of the approach:

Standardize your variables (bring your variables to a mean of zero and a standard deviation of one)

Analyze your elbow curve

Validate your clusters

3 Distinct Groups:

High – Highest in all categories except for base defense and hp

Medium – Highest on defense, middle ground in everything else

Low – Only high on hp


015


005


What does this tell us about the starters?

The output of the k-means clusters can be used to help determine your approach from the very beginning.

Reading the pyramid:

Easy path:

Greninja, Swampert, & Sceptile

Hard path:

Serperior, Meganium, Torterra, & Chesnaught


013


006


How do we implement this scoring?

I needed more data to implement this approach.

5 Questions:

What’s your ideal team of 6 Pokémon?

What year did you start playing Pokémon?

Do you play Pokémon GO?

How many Pokémon games have you played?

Do you play the Pokémon TCG?


007

This approach recommends a new squad of Pokémon to the field researcher!

Implementing the scoring: Trust The Process

A propensity model is a statistical scorecard that is used to predict the behavior of your customer or prospect base. Propensity models are often used to identify those most likely to respond to an offer, or to focus retention activity on those most likely to churn.

This is the whole Pokémon journey coming full circle.

The Pokémon Professor has done their own research and builds a model.

The field research team assists the Pokémon Professor with gathering new data.

The Pokémon Professor uses the model to assist the field research team.


Here’s the model at work, the input and recommendations:

008.png


009


010


011

During my data science panels I like to reinforce the learning through a game, and participants get a prize from my own personal collection.  For this specific panel, participants received an unopened pack of Team Up from the Pokemon TCG and an individual Pokemon TCG EX card.

Here’s an overview of the game:

5 Volunteers

On the screen will be 3 Pokémon

2 Pokémon are look-a-likes (statistically speaking)

Volunteers will do their best to convince the panel which two Pokémon are look-a-likes and who should be wonder traded

For participating, volunteers receive a fabulous prize


I want to personally thank everyone who attended the panel in Tampa, at the Tampa Comic Convention.  I look forward to meeting again in 2020.



Recipe 012: Pokemon Gen 2 K-means Clustering

logo

Thanks for coming for a bite. Let’s dig into some pancakes and the data science behind the Pokemon of the Johto Region.  How do they differ from the Kanto Region?  What’s the importance of introducing two new Pokemon types?  Finally, how will talking about the trends in our data help us understand the relational differences and similarities beyond general Pokemon typing?


012_receipe


 

gyarados_en_265x240

Pokemon Gold and Silver ushered in a new era for the Pokemon series. Listed below are a few of the changes (not every influential game change is covered in this post) which still have a large influence to this day:

The introduction of Shinies (Shiny Gyarados shown above)

Gender types

Eggs, breeding and babies

The experience bar

Two new Pokemon types: Dark and Steel

increase_in_bugs

I want to touch specifically on two items in the above list, how they affect the overall re-balancing of the Pokemon universe (see above the increase in stronger Bug type Pokemon), and how they drive the differences between generation 1 and generation 2.

Eggs, breeding and babies

Two new Pokemon types: Dark and Steel

How do eggs, breeding and babies influence the trends in our data?  For instance, there are more Normal types added to the mix (+1%), but the average base attack (-8%) and base defense (-2%, even with the introduction of Blissey!) have both declined versus generation 1 (Red and Blue).
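The generation-over-generation percentages are plain percent changes of the averages. A minimal sketch follows, using placeholder stat lists chosen only so the arithmetic is easy to follow. They are not the real generation 1 or generation 2 figures.

```python
import statistics

def pct_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100

# Placeholder base attack values, NOT the real generation data.
gen1_attack = [50, 80, 95]
gen2_attack = [45, 72, 90]

change = round(pct_change(statistics.mean(gen1_attack), statistics.mean(gen2_attack)), 1)
print(change)
```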

rebalancing

How does the introduction of two new Pokemon types, Dark and Steel, influence our trends?  Those of you who have played Gold and/or Silver know this is the longest game in the Pokemon series to date, because you also travel back to the Kanto region (where Psychic and Ice types reign supreme!).

Dark type Pokemon are super effective against Psychic and Ghost types.  They’re vulnerable to Fighting, Bug and Fairy types.

Steel type Pokemon are super effective against Rock, Ice and Fairy types.  They’re vulnerable to Fighting, Ground, and Fire.

Bug type Pokemon are super effective against Psychic, Grass, and Dark.  They’re vulnerable to Flying, Rock and Fire.
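These matchups are easy to keep as a small lookup table. The sketch below covers only the three types discussed here and follows the current type chart. The dictionary and function names are my own.

```python
# Which defending types each attacking type hits for super effective damage.
SUPER_EFFECTIVE = {
    "Dark": {"Psychic", "Ghost"},
    "Steel": {"Rock", "Ice", "Fairy"},
    "Bug": {"Psychic", "Grass", "Dark"},
}

# Which attacking types each of these types is vulnerable to.
WEAK_TO = {
    "Dark": {"Fighting", "Bug", "Fairy"},
    "Steel": {"Fighting", "Ground", "Fire"},
    "Bug": {"Flying", "Rock", "Fire"},
}

def counters(defending_type):
    """Attacking types (from this small table) that beat the defending type."""
    return sorted(t for t, targets in SUPER_EFFECTIVE.items() if defending_type in targets)

print(counters("Psychic"))
print("Bug" in WEAK_TO["Dark"])
```

The lookup makes the re-balancing visible: Psychic now has two counters in this table, and one of them (Dark) is itself checked by Bug.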

Dark and Steel types were introduced to re-balance the game and give the player the tools to be prepared for the Kanto region challenges.  In doing so, new and stronger Bug type Pokemon (think Heracross and Shuckle) were introduced as a check on trainers who go all-in against Psychic type Pokemon with Dark and Grass types (counters to Mewtwo).

Now that we’ve dug into how our data differs between generation 1 and generation 2, we can focus on generation 2 and how we can apply a guided machine learning approach to building the best Pokemon Johto team we can!


011_remove_outliers

While training this model, I uncovered a segment made up entirely of legendary Pokemon. Although you can get these Pokemon in the game, I will be removing them from this analysis for a few reasons:

They’re overpowered compared to the rest of the population.

They’re meant as a reward.

It’s not very insightful to know the legendary dogs have more in common with other legendary Pokemon than with a baby Pokemon.

Let’s continue…


010_standardize_vars

In my segmentation I’ll be throwing in several key performance indicators for Pokemon value throughout the game ranging from base attack to experience growth rate.  How do I get these vastly different attributes on the same scale?

Through standardization!  Standardizing my variables to a mean of zero and a standard deviation of one puts the weight on the trends within the data, as opposed to the raw scale of each variable.

002_amount_of_clusters_plot

How do I determine the proper number of clusters?  I’ll analyze this elbow graph and look for the point where my within-cluster sum of squares begins to bend (as an elbow would).

At first glance I begin to see the shift at 4 groups, then a slight change at 5 groups and a vast difference at 6 groups.  What does this tell me? Possibly one of the clusters has high deviation and variability on the attributes selected for clustering.
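To make the elbow idea concrete, here is a sketch that computes the within-cluster sum of squares for a range of cluster counts. It uses a toy 2D dataset with three obvious groups, so its elbow lands at 3 rather than the 4 seen in the Johto data. The k-means inside is a bare-bones version with a deterministic farthest-point start, which is my assumption and not the method behind the plot above.

```python
import math

def kmeans_inertia(points, k, max_iters=100):
    """Run a basic k-means and return the within-cluster sum of squares."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

def elbow_curve(points, max_k):
    """Within-cluster sum of squares for k = 1..max_k."""
    return {k: round(kmeans_inertia(points, k), 2) for k in range(1, max_k + 1)}

# Toy points forming three tight groups (placeholder data).
points = [
    [0, 0], [1, 0], [0, 1],
    [10, 0], [11, 0], [10, 1],
    [5, 9], [6, 9], [5, 10],
]
curve = elbow_curve(points, max_k=4)
for k, wss in curve.items():
    print(k, wss)
```

The number to read off is where the drop between consecutive values collapses: going from 2 to 3 clusters removes most of the remaining error, while going from 3 to 4 barely helps.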


001_comp_plot

Understanding I might have a group with high variability and seeing there isn’t a large difference from 4 groups to 5 groups, I decide to plot a 4 cluster solution.

Visualizing our data in this way (plotting the top two components, which account for 60.33% of the variability in the data) shows me two things:

The relationships between Pokemon beyond general type.

My group to the far right, if I ran a 6 cluster solution would have large overlap and possibly a smaller cluster smack in the middle of it.

Now that we’ve done this let’s learn about the Johto Pokemon…


003_elite_info

My top-tiered Pokemon group is a clustering of elite-scored attributes, which explains the variability.  Above you can see the type breakdown and the top base attack and top base defense Pokemon within the group.  I like this display because it puts the emphasis on how introducing Dark, Steel, and stronger Bug types has influenced the Pokemon universe.  In a previous analysis (which can be found in the kitchen!) I took the same approach for the Kanto Pokemon, and Psychic types were the top attackers.

004_valuable_info

The next tier is the Valuable tier, Pokemon fall in this tier because they are borderline elite in one attribute but overall well balanced.  Think of these Pokemon as the Jack of All Trades.

005_medium_info

The Medium value tier has more variability in Pokemon type, and is made up of Pokemon which evolve in most cases (all three starters fall in this group) but not all (see Dunsparce).  Pokemon in this tier, if left as is and never evolved, will never migrate to the upper tiers.

006_low_info

All Pokemon have value when trained to their full potential and this is why my bottom tier is called Low Value.  Pokemon in this tier will take time and patience but do offer unique attribute scores which can be useful at higher levels.  As seen above Granbull’s family tree begins in this tier.  There’s an opportunity to migrate from the Low value tier to the Valuable tier if you train, train, train!!!


009_shuckle_003

Now that we’ve gone through this exercise what unique findings can we come up with?  Possibly something you didn’t already know.

Shuckle has more in common with Tyranitar than Miltank.

Shuckle’s unique combination of elite base defense and hp outweighs its lower-scored attacks, letting it take its place among the Pokemon powerhouses of the Johto region.

Thank you for reading this data story and if you have follow-ups or would like to continue the discussion direct message me on Instagram @pancake_analytics !

Enjoy your breakfast!

