Recipes

Marvel Comics, Propensity Modeling, Regression Modeling

Recipe 013: Marvel Comics Propensity Score

FerraraTom

How crazy would it be if I told you Howard the Duck and Old Man Logan are closer to each other in skill sets than they are to any other Marvel characters? Or that Thor and Doctor Octopus are lookalikes as well? Let’s answer these questions together by wrangling some readily available data.


 


If I’ve learned anything from my career in data science it’s this: 80% of the work is data gathering and ETL, and 20% is analysis.

Nothing proves this statement better than trying to find data on Marvel characters’ skill sets on a normalized scale. In this data story I’ll be using data from Marvel Contest of Champions (power index levels, health and attack) and the Marvel Battle Royale (a Twitter fan poll of the greatest superheroes).

A few more variables I’ll need to calculate from the results of the Marvel Battle Royale Twitter fan poll:

Total votes per round

Average total votes

A flag for whether a character received more than the average total votes

I’ll use this flag as my dependent variable, and my independent variables will be the Marvel Contest of Champions statistics.

What will this do? It will predict the likelihood that a Marvel character would receive more than the average total votes in the Marvel Battle Royale.

Once this is calculated I’ll receive an output of coefficients, which I can apply to the rest of the Marvel characters who weren’t in the Marvel Battle Royale to create a propensity score.
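To make the recipe concrete, here is a minimal sketch of the flag-then-score steps in plain Python. The character names, stat values, and vote counts are invented for illustration (they are not the Contest of Champions or poll data), and the small gradient-descent fit is just a stand-in I’m assuming for whatever logistic regression tool you prefer.

```python
import math

# Hypothetical polled characters: (name, attack, health, total_votes).
# Stats are assumed to be pre-scaled to the 0-1 range; none of this is real data.
polled = [
    ("Hero A", 0.90, 0.80, 1500),
    ("Hero B", 0.80, 0.90, 1300),
    ("Hero C", 0.70, 0.60, 1100),
    ("Hero D", 0.40, 0.50, 600),
    ("Hero E", 0.30, 0.20, 400),
    ("Hero F", 0.20, 0.30, 500),
]

# Dependent variable: 1 if a character beat the average total votes, else 0.
average_votes = sum(row[3] for row in polled) / len(polled)
flags = [1 if row[3] > average_votes else 0 for row in polled]
features = [(row[1], row[2]) for row in polled]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def propensity(weights, stats):
    """Apply [intercept, coef_1, ...] to one character's stats."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], stats)))


def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit [intercept, coef_1, ...] by batch gradient descent on log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradients = [0.0] * len(weights)
        for stats, label in zip(rows, labels):
            error = propensity(weights, stats) - label
            gradients[0] += error
            for j, value in enumerate(stats):
                gradients[j + 1] += error * value
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, gradients)]
    return weights


coefficients = fit_logistic(features, flags)

# Score characters who were never in the poll (again, invented stats).
unpolled = {"Unpolled X": (0.85, 0.85), "Unpolled Y": (0.25, 0.25)}
scores = {name: propensity(coefficients, stats) for name, stats in unpolled.items()}
```

The key design point is that the poll is only needed to fit the coefficients. Once they exist, any character with the same stats available can be scored.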


 


Now let’s backtrack a little and see why I’m going with a propensity model as opposed to a grouping by opinion, e.g. putting all the top attackers in the same category.

The top 3 characters based on Attack are Rocket Raccoon, Spider-Man (Symbiote), and Blade.

In the above histogram, if you look all the way to the far right you’ll notice they are the data points on their own little island.


 

 


Well, what if I just grouped everyone by Health? This data visualization looks more promising, but most likely there would be overlap on the other attributes and you wouldn’t be able to implement this successfully.


 


The power index could, by definition, be suitable. But from the top 3 selected on power index I can tell this rating isn’t an index in the vein of what I would typically use an index for (time-series forecasting). It looks similar to the Pokemon Go Combat Power (CP) system: a measure of a character’s ability to reach their full potential.


 


One use of a propensity score is to create similar groups, based on the likelihood of performing a behavior.

In this case Doctor Octopus and Thor (Ragnarok) are statistically the same in their Marvel Contest of Champions skill sets. For those of you who want to go down an interesting rabbit hole, you can find YouTube videos on why Doctor Octopus should be in a demi-god tier.

This propensity score approach literally put Doctor Octopus in the same tier as a demi-god!
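As a sketch of how a propensity match like this can be found, the snippet below pairs a character with whoever has the closest score. The scores are placeholders I made up to show the mechanics; they are not the model’s actual output.

```python
# Placeholder propensity scores (illustrative only, not the fitted values).
scores = {
    "Doctor Octopus": 0.71,
    "Thor (Ragnarok)": 0.72,
    "Medusa": 0.55,
    "Thanos": 0.93,
}


def closest_match(scores, name):
    """Return the other character whose propensity score is nearest to name's."""
    target = scores[name]
    candidates = (other for other in scores if other != name)
    return min(candidates, key=lambda other: abs(scores[other] - target))


match = closest_match(scores, "Doctor Octopus")
```

Matching on a single score is what lets characters with very different individual stats land in the same group.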


 


Medusa by power index alone would be close to Thanos but factoring all skill sets, she is statistically closer to Gwenpool, Cable, and Nightcrawler than she is to the Mad Titan.


 


Now for the crazy but statistically significant section.  Howard the Duck (I’m hoping he gets a show on Disney+) and Old Man Logan are a propensity score match.

An example like this is where many in data science begin to argue: when does subject matter expertise come into play? We can argue significance forever, on any topic, but we can agree that all Marvel Champions have value if played correctly.



Uncategorized

Recipe 012: Pokemon Gen 2 K-means Clustering

FerraraTom

Thanks for coming for a bite, let’s dig into some pancakes and the data science behind the Pokemon of the Johto Region. How do they differ from the Kanto Region? What’s the importance of introducing two new Pokemon types? Finally, how will speaking about the trends in our data help us understand the relational differences and similarities beyond general Pokemon typing?


012_receipe


 

gyarados_en_265x240

Pokemon Gold and Silver ushered in a new era for the Pokemon series. Listed below are a few of the changes (not all of the influential ones) which still have a large influence to this day:

The introduction of Shinies (Shiny Gyarados shown above)

Gender types

Eggs, breeding and babies

The experience bar

Two new Pokemon types: Dark and Steel

increase_in_bugs

I want to touch specifically on two items in the above list, how they affect the overall re-balancing of the Pokemon universe (see above the increase in stronger Bug type Pokemon), and how they drive the differences between generation 1 and generation 2.

Eggs, breeding and babies

Two new Pokemon types: Dark and Steel

How do eggs, breeding and babies influence the trends in our data? For instance, there are more Normal types added to the mix (+1%), but the average base attack (-8%) and base defense (-2%, even with the introduction of Blissey!) have both declined versus generation 1 (Red and Blue).

rebalancing

How does the introduction of two new Pokemon types, Dark and Steel, influence our trends? Those of you who have played Gold and/or Silver know this is the longest nameplate in the Pokemon series to date, because you also travel back to the Kanto region (where Psychic and Ice types reign supreme!).

Dark type Pokemon are super effective against Psychic and Ghost types.  They’re vulnerable to Fighting, Bug and Fairy types.

Steel type Pokemon are super effective against Rock, Ice, and Fairy types (and they resist Dragon type attacks). They’re vulnerable to Fighting, Ground, and Fire.

Bug type Pokemon are super effective against Psychic, Grass, and Dark.  They’re vulnerable to Flying, Rock and Fire.

Dark and Steel types were introduced to re-balance the game and give the player the tools to be prepared for the Kanto region challenges. In doing so, new and stronger Bug type Pokemon (think Heracross and Shuckle) were introduced as a check on those trainers who go on a full-on attack against Psychic type Pokemon (Dark and Grass types [counters to Mewtwo]).

Now that we’ve dug into the differences in our data between generation 1 and generation 2, we can begin focusing on generation 2 and how we can apply guided machine learning to building the best Pokemon Johto team we can!


011_remove_outliers

While training this model, I uncovered a segment full of only legendary Pokemon. Although you can get these Pokemon in the game, I will be removing them from this analysis for a few reasons:

They’re overpowered compared to the rest of the population.

They’re meant as a reward.

It’s not very insightful to know the legendary dogs have more in common with other legendary Pokemon than with a baby Pokemon.

Let’s continue…


010_standardize_vars

In my segmentation I’ll be throwing in several key performance indicators for Pokemon value throughout the game ranging from base attack to experience growth rate.  How do I get these vastly different attributes on the same scale?

Through standardization! Standardizing my variables to a mean of zero (and, as is typical, a standard deviation of one) puts a heavier weight on the trends within the data, as opposed to the individual scale of each variable.
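Here is a minimal sketch of that standardization step, using z-scores from Python’s standard library. The attribute values are hypothetical and only chosen to show two very different scales ending up comparable.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical attributes on very different scales.
base_attack = [45, 60, 80, 130]
growth_rate = [600_000, 800_000, 1_000_000, 1_250_000]

attack_z = standardize(base_attack)
growth_z = standardize(growth_rate)
```

After this step a one-unit difference means "one standard deviation" for every attribute, so the clustering distance is not dominated by whichever variable has the largest raw numbers.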

002_amount_of_clusters_plot

How do I determine the proper number of clusters? I’ll analyze this elbow graph and look for an area where my sum of squares begins to bend (as an elbow would).

At first glance I begin to see the shift at 4 groups, then a slight change at 5 groups and a vast difference at 6 groups. What does this tell me? Possibly one of the clusters has high deviations and variability on the attributes selected for clustering.
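For anyone who wants to build their own elbow graph, here is a rough sketch of the numbers behind it: a basic k-means run for several values of k, recording the within-cluster sum of squares each time. The points are toy data with three obvious groups (not the Johto attributes), and the evenly spaced starting centroids are an assumption I made so the result is repeatable.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wcss(points, k, max_iter=100):
    """Run a basic k-means and return the within-cluster sum of squares."""
    # Deterministic start (an assumption for repeatability): k evenly spaced rows.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if it has none).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(
        squared_distance(p, centroids[i])
        for i, members in enumerate(clusters)
        for p in members
    )


# Toy points forming three obvious groups (illustrative only).
points = [
    (1, 1), (1, 2), (2, 1),
    (8, 8), (8, 9), (9, 8),
    (15, 1), (15, 2), (16, 1),
]

wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 6)}
```

On this toy data the sum of squares falls sharply up to k = 3 and barely moves afterwards, which is the bend you look for when reading the graph.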


001_comp_plot

Understanding I might have a group with high variability and seeing there isn’t a large difference from 4 groups to 5 groups, I decide to plot a 4 cluster solution.

Visualizing our data in this way (plotting the top two components, which account for 60.33% of the variability in the data) shows me two things:

The relationships between Pokemon beyond general type.

If I ran a 6 cluster solution, my group to the far right would have large overlap and possibly a smaller cluster smack in the middle of it.

Now that we’ve done this let’s learn about the Johto Pokemon…


003_elite_info

My top tiered Pokemon group is a clustering of elite scored attributes, which explains the variability. Above you can see the type breakdown and the top base attack and top base defense Pokemon within the group. I like this display because it puts the emphasis on how introducing Dark, Steel, and stronger Bug types has influenced the Pokemon universe. During a previous analysis (which can be found in the kitchen!) I took the same approach for the Kanto Pokemon, and Psychic types were the top attackers.

004_valuable_info

The next tier is the Valuable tier. Pokemon fall in this tier because they are borderline elite in one attribute but well balanced overall. Think of these Pokemon as the Jacks of All Trades.

005_medium_info

The Medium value tier has more variability in Pokemon type and is made up of Pokemon which evolve in most cases (all three starters fall in this group) but not all (see Dunsparce). Pokemon in this tier, if left as is and never evolved, will never migrate to the upper tiers.

006_low_info

All Pokemon have value when trained to their full potential and this is why my bottom tier is called Low Value.  Pokemon in this tier will take time and patience but do offer unique attribute scores which can be useful at higher levels.  As seen above Granbull’s family tree begins in this tier.  There’s an opportunity to migrate from the Low value tier to the Valuable tier if you train, train, train!!!


009_shuckle_003

Now that we’ve gone through this exercise what unique findings can we come up with?  Possibly something you didn’t already know.

Shuckle has more in common with Tyranitar than Miltank.

Shuckle’s unique combination of elite base defense and HP outweighs its lower scored attacks, letting it take its place among the Pokemon powerhouses of the Johto region.

Thank you for reading this data story and if you have follow-ups or would like to continue the discussion direct message me on Instagram @pancake_analytics !

Enjoy your breakfast!



DC Comics, K-Means Clustering, Logistic Regression, Propensity Modeling

Recipe 011: DC Super Hero Throw Down: Propensity Modeling

FerraraTom

I want you to remember, Clark…In all the years to come… in your most private moments… I want you to remember my hand at your throat… I want you to remember the one man who beat you.

Chilling quote, isn’t it? That was said by Batman to Superman in The Dark Knight Returns, a comic book miniseries written and drawn by Frank Miller.

One of the greatest debates in comic book lore, and a fun discussion to have, is pitting two superheroes against each other… Who wins and why? The data story below introduces a data science approach to answering this debate. To have fun with it… I’ve thrown characters from the video game Injustice 2 into a Superhero Throw Down Tournament.



Before we dive into the tournament and the results of the throw down, I’d like to touch on the approach: Propensity modeling.

Propensity modeling has been around since 1983 and is a statistical approach to measuring uplift (think return on investment).  The goal is to measure the uplift of similar or matched groups.

The heart of this approach lies within two machine learning approaches (segmentation and probability.)

Why propensity modeling for this exercise?  I wanted to rank my superheroes for the bracket using statistics (i.e. Batman is not getting a number one seed.)

35 characters were segmented on strength, ability, defense and health. For the propensity score I gathered ranking information from crowd-sourced websites and surveys. Using this I was able to give each character an intangible skill score. The reasoning was that I wanted the medium of comics to do the majority of the work for me. Comics are stories, and the narrative drives the inner core of a character. The higher a character ranks on a fan-sourced website, the more I’m assuming they are written well and are timeless.

Next step was to take the mean of the intangible skill score and flag those characters above the average (this will be my dependent variable for my logistic regression to calculate a propensity score).

What was thrown into the propensity model? The skill sets gathered from the Injustice game. The assumption here is that a character with Superman’s skill set would be written much differently than, say, Catwoman.



Now it’s time for our throw down.


The top four characters by propensity score were:

Cyborg

Supergirl

Aquaman

Black Adam

To determine a winner in the throw-downs, characters were put up against each other in 11 categories.
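A match-up like this boils down to a category-by-category tally. The sketch below shows one way to score it; the category names and values are hypothetical, since the 11 categories themselves aren’t listed here.

```python
def head_to_head(stats_a, stats_b):
    """Return (wins, ties, losses) for character A across the shared categories."""
    wins = sum(1 for c in stats_a if stats_a[c] > stats_b[c])
    ties = sum(1 for c in stats_a if stats_a[c] == stats_b[c])
    losses = len(stats_a) - wins - ties
    return wins, ties, losses


# Hypothetical category scores for two fighters (illustrative only).
fighter_a = {"strength": 9, "ability": 7, "defense": 6, "health": 8}
fighter_b = {"strength": 7, "ability": 7, "defense": 8, "health": 5}

record = head_to_head(fighter_a, fighter_b)
a_advances = record[0] > record[2]  # more category wins than losses
```

The record reads wins-ties-losses from the first character’s point of view, the same way the round results are written.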


Round 1 Takeaways:


Our number one seed Cyborg nearly lost to Atrocitus. The result was 6-2-5, read as six wins, two ties and five losses.

There were no upsets in the first round of play.  A few characters did not win a single category in their match-ups:

Harley Quinn (vs. Captain Cold)

Green Arrow (vs. Batman)

Black Manta (vs. Black Canary)

These three characters were ill-equipped to take on their opponents. It is possible they would have advanced given a different opponent.



Round 2 Takeaways:


Cyborg (our number one seed) defeated Captain Cold by a larger difference (+3 winning categories) compared to the previous match-up against Atrocitus, but he scored one win less.

We begin to see upsets in Round 2:

Robin defeated Black Adam by 1 winning category. Wonder Woman defeated Firestorm by 4 winning categories. Batman defeated Supergirl by 3 winning categories.

On propensity scores these were upsets, but from a comic book debate standpoint you could argue for them, e.g. given enough time to prepare, Batman could defeat Supergirl.



Round 3 Takeaways:


Cyborg falls to Superman, losing by 4 categories. This was the biggest fight Superman had been given in this tournament to date (in both previous rounds he had 9 winning categories).

The upsets keep coming in:

Robin sneaks in a win again by 1 winning category (over Brainiac). Wonder Woman defeats the top seed in her region of the bracket (Aquaman) by 4 winning categories.  Batman defeated Green Lantern by 3 winning categories.



Final 4 Takeaways:


Robin’s Cinderella story comes to an end at the hands of Superman (who won in 9 categories). Robin did fare better than those who previously gave Superman 9 category wins… Robin won in 2 categories.

Batman was able to upset Wonder Woman by 2 winning categories. We’re set for a championship round and the original “who wins?” debate… Batman versus Superman!

batman-vs-superman-movie


Our winner is…


Superman defeats Batman. Superman did not win in a landslide. Batman lost by two categories, but he was able to win 5 categories. Previously the most categories anyone had won against Superman was 3.


What did we learn from diving into the DC data? Comic book writing and fan perception go a long way in determining who wins a throw down debate. If we use propensity modeling we can have a more even playing field and limit the number of unfair battles.



nintendo, Regression Modeling, Super Mario

Recipe 010: Mario Kart Game-play Improvement Controller Trials

FerraraTom

Before I dive into this week’s data story, let me state why I love the Nintendo Switch.  I personally feel there’s a need for video games to be a social event, and couch co-op is a must have feature.  The Nintendo Switch offers several games which meet this need.

My family loves playing video games and most of all we love playing video games together.

Of the Nintendo games I’ve grown up on and played over the years, Mario Kart is by far one of my favorites. I’ll admit my wife shows me how it’s done.



What I do find interesting about the Nintendo Switch is the Joy-Con controllers. There’s a learning curve (though they are a huge improvement on the Wii-mote) and most veteran gamers prefer an alternative.

One alternative is the wireless controller, very similar to the Xbox controller format. I did pick up the Yoshi version for my wife; she loves it and personally feels it improves her game-play.

I thought it was time to put this notion to the test: what impact, if any, does a wireless controller have on game-play performance versus using a Joy-Con?

Mario Kart seemed like the logical choice for this experiment: it’s a multiplayer game, you can standardize your users (via ride type and modifications), and performance is measured in a continuous variable of points.


A total of 8 trials were run under these conditions:

-Standard Kart

-Standard Wheels

-Standard Flyer

-Mushroom Cup

-50 cc length race

-2 gamers

Halfway through the trials one gamer switched to the wired controller (Test group) while the other gamer stayed on the single Joy-Con (Control group).

Results were documented and the ETL process began. Points scored each race would be used as the key performance indicator.

I next ran a linear regression (great for evaluating an A/B test). My dependent variable was the points scored after the event (introducing the wired controller), and my two independent variables were Treatment and Pre Points Scored.
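Here is a small sketch of that regression setup in plain Python: post points modeled on a treatment flag plus pre points, solved with ordinary least squares. The race rows are synthetic and built so the treatment effect is exactly 5 points (post = pre - 3 + 5 × treatment), which lets you check that the fit recovers it. They are not my trial results, and the helper names are my own.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs with Gaussian elimination and partial pivoting."""
    n = len(rhs)
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(rows[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (rows[r][n] - known) / rows[r][r]
    return solution


def fit_ols(predictors, outcome):
    """Least-squares fit via the normal equations; returns [intercept, coef_1, ...]."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic rows: (treatment, pre_points, post_points). treatment = 1 is the test group.
races = [
    (0, 40, 37), (0, 44, 41), (0, 50, 47), (0, 46, 43),
    (1, 42, 44), (1, 48, 50), (1, 38, 40), (1, 52, 54),
]

intercept, treatment_effect, pre_coefficient = fit_ols(
    [(treatment, pre) for treatment, pre, _ in races],
    [post for _, _, post in races],
)
```

Including pre points as a predictor is what makes this useful for an A/B test: the treatment coefficient is the difference in post points after accounting for how each gamer was already performing.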


In this model I wasn’t concerned with the r-squared value or the significance level of each variable. The sample was not large enough; this was a closed-circuit, small-market test.

The model itself did prove to be significant, which is a good indicator I can continue with the results. Evaluating my Q-Q graph, I see the model fits well: the trend goes through all the data points.

In my summary fit I notice there is a positive relationship between treatment (group) and post points scores.  At first glance this says you improve your Mario Kart game-play performance if you play with a wireless controller.

To complete this story I want to know my upper confidence level, so I can tell by how many points performance could improve and whether that is enough to move me up the rankings.

Using a wired controller has the potential to increase a gamer’s point performance by over six points each race.

The average points differential between race placement is 1.2 points.  This 6-point increase is enough to move you roughly 4 places, depending on your historic placement.


What have we learned from diving into the Mario Kart Data?

The controller you play with matters. Switching to a traditional wired controller can potentially improve your point score by 6.5 points, which, depending on your average race placement, can move you up 4 places in the final standings.

Observing the CPU controlled racers, Shy Guy performed the best with an average final placement of 2.8.  The heavy class overall was the weakest group but without Bowser, it could have been worse.  Bowser’s average final placement was 4th.

 

After you have consumed this meal, I hope you take these findings and enjoy your next Mario Kart Grand Prix.  Also as always enjoy the featured pancake recipe below!



Classification Tree, Game of Thrones, Tree Based Models

Recipe: 009 Game of Thrones Survival of the Fittest

FerraraTom


“When you play the game of thrones, you win or you die.” — Cersei

Let’s bring this quote to life in what I like to call a survival tree of the fittest. This week’s analysis will focus on character survival in Game of Thrones. Chow down and enjoy!



Winter is coming and you’d like to know your chances of survival in the Game of Thrones universe.

Let’s learn from those who have survived to this point and those who have met their unkindly fate.

To do this I’ll build a classification tree with my event set to whether the character is alive (1 for yes, 0 for no). Classification trees in general test the null hypothesis. When we reach my tree visualization, I’ll assign the color red to instances where a character death is highly probable. Green leaves will indicate it’s highly probable a character survives… as long as all of the criteria are met.
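To show what a tree is doing when it picks a split, here is a minimal sketch using Gini impurity, one common splitting criterion (I’m assuming it here; other criteria exist). The eight character records are invented to illustrate the mechanics and are not the Game of Thrones data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 survival labels (0 = pure group)."""
    if not labels:
        return 0.0
    p_alive = sum(labels) / len(labels)
    return 1.0 - p_alive ** 2 - (1.0 - p_alive) ** 2


def split_impurity(records, feature):
    """Weighted Gini impurity after splitting the records on a 0/1 feature."""
    total = 0.0
    for value in (0, 1):
        labels = [r["alive"] for r in records if r[feature] == value]
        total += len(labels) / len(records) * gini(labels)
    return total


# Invented characters: alive is 1 for yes, 0 for no.
records = [
    {"popular": 1, "male": 0, "alive": 1},
    {"popular": 1, "male": 1, "alive": 0},
    {"popular": 1, "male": 1, "alive": 0},
    {"popular": 1, "male": 0, "alive": 0},
    {"popular": 0, "male": 1, "alive": 1},
    {"popular": 0, "male": 0, "alive": 1},
    {"popular": 0, "male": 1, "alive": 1},
    {"popular": 0, "male": 0, "alive": 0},
]

# The tree splits first on whichever feature leaves the purest groups.
best_feature = min(("popular", "male"), key=lambda f: split_impurity(records, f))
```

A full tree simply repeats this choice inside each resulting group until the leaves are pure enough or too small to split.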

Think of this tree as a really morbid family tree, but since the data is Game of Thrones it fits right in.

The variables I have readily available (hopefully they have importance) are as follows:

  • House Affiliation
  • Member of nobility
  • Marital Status
  • Gender
  • Family history of deaths
  • Popularity


From the initial read I see knowing if a character is popular among fans and if they are male hold the highest importance in determining survival.

Also, a tree built on the variables I have available classifies 75% of characters correctly (a 25% misclassification rate).

Let’s say you moved to Westeros, out of the gate you have a 25.4% chance of meeting your end.  At those odds I’m taking my chances but I should stay under the radar as much as I can, because the data warrants it.

If you become a popular character or are an integral part of the story, your death becomes more meaningful and your probability of survival is worse than a coin flip.

So let’s say you’re a likeable character (you can’t help it). Not all is lost, as long as you’re female. The highest survival rate belongs to the popular female character group. This is a classic tale of high risk, high reward.


A classification tree is a great way to visualize your data, and now I’ll walk us through this Game of Thrones survival tree.

Let’s start at the very top: the tree assumes everyone has a 75% chance of survival. As the tree splits, the interesting part begins and our data story starts to unfold.

If you are a popular character you flow to the left side of the tree, your survival rate of 75% now drops to 48%.

Staying to the left side of the tree there is another important split, are you a male or female?  Female characters have a higher probability of surviving (87% if you’re popular and 76% if you’re under the radar).

If you’re male and you’re popular you have a 42% chance of survival (we’re looking at you, Peter Dinklage).

Now here’s the largest caveat to take with this classification tree: I’m assuming it will no longer be relevant after the final season. Winter is coming and most likely our characters will meet their end at the hands of the White Walkers.



What have we learned from diving into the Game of Thrones Data?

Everyone starts off at a 75% survival rate, and as your popularity grows your survival rate lessens by 27 percentage points (to 48%). If you’re a popular male, your survival rate sits 33 points below the starting rate (42%). If you’re a popular female character, you are 45 percentage points more likely to survive than your male counterparts.
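The arithmetic behind those takeaways is worth spelling out, since the gaps are differences in percentage points between leaves of the tree. This short sketch uses only the survival rates quoted in the walkthrough; the last line adds the same female-versus-male gap as a relative lift for comparison.

```python
# Survival rates quoted from the tree walkthrough, in percent.
baseline = 75
popular = 48
popular_male = 42
popular_female = 87

drop_if_popular = baseline - popular                 # percentage points
drop_if_popular_male = baseline - popular_male       # percentage points
female_vs_male_gap = popular_female - popular_male   # percentage points

# The same female-versus-male gap expressed as a relative lift.
relative_lift = popular_female / popular_male - 1
```

Keeping "points" and "percent" apart matters here: a 45-point gap on a 42% base is more than a doubling of the survival rate.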

An interesting tidbit…If you become popular and you are a female (hopefully the mother of dragons) you boast the highest survival rate of anyone in this universe, 87%.

 

After you have consumed this meal, I hope you take these findings and enjoy your episode of Game of Thrones. Also, as always, enjoy the featured pancake recipe below!



https://gameofthrones.fandom.com/wiki/Game_of_Thrones_Wiki



Board Games, Logistic Regression, Regression Modeling

Recipe: 008 Likelihood a Board Game Is Universally Loved

FerraraTom

For this week’s analysis I’m taking a different approach to the introduction. I reached out to @missionboardgame to write the foreword. They are a couple from Turkey who try their best to inspire people to join the board game community. Without further ado, here is their overview of the modern board gaming climate:

We think a successful modern board game should include the following features:

✔️Your decisions should have an impact on the game progress.
✔️Minimal randomness.
✔️No player elimination as possible as there can be throughout the game.

In addition to those, theme, artwork and mechanics are also significant for our decisions while purchasing board games. Therefore, our favorite game is Robinson Crusoe: Adventures on the Cursed Island. It is a cooperative survival game where you are trapped on a deserted island. Each decision you have made previously has an outcome afterwards. The harmony between the theme and the rules is perfectly arranged so that you feel very integrated to the game. By this way, every action you take seems meaningful and logical. Also we love feeling the cooperation among us since we are usually 2 players. – Mission Board Game


Countless nights I’ve played board games with friends and family. Every New Year’s Eve my family and I play Monopoly, for a few reasons: the game-play length, the number of players, and the simple game-play. I have 5 siblings, so saying it’s difficult to find a game for all of us to play is an understatement.

Why we enjoy board games is an interesting topic. Is it the theme of the game? Is it the number of players required? Has the game received universal praise from critics? Is it a common game most households own, one we grew up playing?

I’ll throw all of the above-mentioned variables into a logistic regression model and use the Bayes theory of probabilities to determine the probability that a board game player will rank a game higher than the average score.

During the first read I see the model is statistically significant based on a p-value of less than .05. A few things stand out to me immediately:

1.) Not all variables have a positive relationship to a highly scored board game

2.) There are some strong social elements going on here (i.e. longer play time having a higher impact may imply that games which encourage discussion are rated higher)

3.) Fantasy themed board games are not ranked high (I have a D&D and video games impact theory)
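To show how positive and negative relationships in a logistic regression translate into probabilities, here is a small sketch. The intercept and coefficients are placeholders I picked so that only their signs mirror the observations above; they are not the fitted values, and the variable names are hypothetical.

```python
import math


def predicted_probability(intercept, coefficients, game):
    """Turn a game's feature values into a probability via the logistic function."""
    log_odds = intercept + sum(coefficients[name] * value for name, value in game.items())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Placeholder values: only the signs follow the text (long play +, fantasy -).
intercept = -0.5
coefficients = {"plays_two_hours_plus": 1.2, "fantasy_theme": -0.8}

long_game = {"plays_two_hours_plus": 1, "fantasy_theme": 0}
fantasy_game = {"plays_two_hours_plus": 0, "fantasy_theme": 1}

p_long = predicted_probability(intercept, coefficients, long_game)
p_fantasy = predicted_probability(intercept, coefficients, fantasy_game)

# exp(coefficient) is the odds ratio: above 1 raises the odds, below 1 lowers them.
odds_ratios = {name: math.exp(value) for name, value in coefficients.items()}
```

Reading coefficients as odds ratios is often the easiest way to explain which variables help or hurt a game’s chances of scoring above average.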


Before jumping into the positive relationships, I’d like to touch briefly on the negative relationship independent variables.

1.) Fantasy Theme: I included this variable in the model expecting to see a very high positive correlation, but I was very wrong. To quote Rick and Morty: “Sometimes science is more art than science.” In the spirit of the quote, I’ll assume there are threats to the fantasy themed board game genre in the form of role-playing video games. The storytelling in this medium has progressed so much in the last decade that it outpaces anything a board game could offer.

In other words the target audience is leaving.

2.) Major of voters: This variable is all about the number of users who share their ranking. A rule of thumb for rankings, reviews and ratings is that those who go through the effort of expressing their opinion either love or hate the product. The upper and lower confidence levels mirror each other because of this skewness.


Next, I’ll discuss the positive relationship independent variables (focusing on those with the highest impact):

1.) Board games with an average game-play of two hours or more have the highest positive impact on a user rating a board game above average. What makes a game have a long game-play?

Multiple reasons: more players involved, more game-play mechanics, and most importantly more discussion. The soul of any good board game is bringing people together.

2.) The second highest impact comes from the average score displayed on BoardGameGeek. The reason behind this is that users see this rating before submitting their own. Think of it like the Rotten Tomatoes effect: people want to feel like they have universally accepted opinions. Take the beginning of this data story, for example. I mentioned Monopoly is a family tradition of mine, and this could have swayed your opinion of this board game. Based on this model output, you could possibly rate it higher than a game that is, say, fantasy themed.

For your own reference, this model has an accuracy rate of above 70%.


What have we learned from diving into the Board Game Data? 

Board games are most successful when they encourage the spirit and soul of “game night”: a gathering of friends and family discussing and enjoying each other’s time. Adventure and exploration themes make up the majority of the top ten highly successful board game genres. Longer game-play does not mean the game is like pulling teeth or the pace is slow.

It is more of an indicator of the amount of players required and the story telling the game has in driving a great game night experience.

 

After you have consumed this meal, I hope you take these findings and enjoy your next game night.  Also as always enjoy the featured pancake recipe below!



https://boardgamegeek.com/



Cosplay

Recipe: 007 Comic Con Cosplay and the Drivers of Instagram Engagement

FerraraTom


Halloween has recently passed, and it’s a good transition into this week’s analysis.

Let’s face it, dressing up on Halloween is the first step to cosplaying at your local comic con.

Cosplay can be a lucrative business if done correctly, and many people do it. As you read through this week’s analysis I urge you to respect and treat cosplayers as you would any other professional. It takes a lot of hard work and dedication to master their craft as they have.


meal_specs_cosplay


 

meal_card


 

cosplay_group_001

A staple at any comic con is the Cosplay culture.  Fans show their appreciation and passion for beloved characters.  Cosplay can also be a lucrative business if you have a strong work ethic, are consistent, and dedicated to your craft.

Get out the hot glue gun and let’s start forming the foam!

I’ve gathered a random selection of cosplay data from Instagram. The cosplayers ranged from more than 3 million followers to fewer than 2K. This alone posed an interesting challenge. How do I normalize and standardize my data to fit into a model?

My solution was to factor in key performance indicators of Instagram success (regardless of being in the realm of cosplay) and implement an engagement score for each cosplayer (like a customer value score).
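One simple way to build a score like this is interactions per follower, which puts a 3-million-follower account and a 2K-follower account on the same footing. To be clear, this formula is an assumption for illustration and not necessarily the exact score I used; the account numbers are invented.

```python
def engagement_score(likes, comments, followers):
    """Interactions per follower, so accounts of any size can be compared."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (likes + comments) / followers


# Invented accounts at opposite ends of the follower range.
accounts = {
    "large_account": {"likes": 90_000, "comments": 1_500, "followers": 3_000_000},
    "small_account": {"likes": 240, "comments": 20, "followers": 2_000},
}

engagement = {name: engagement_score(**stats) for name, stats in accounts.items()}
```

In this made-up example the small account has far fewer likes in absolute terms but a higher score, which is exactly the kind of comparison raw like counts would hide.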

To prevent confounding variables (influencers with a direct correlation to each other), I elected to exclude everything which went into the engagement score.

for_blg_002

My initial read shows this model is very predictive of the data sample gathered from Instagram, and the highest significant influencer is images where the cosplayer is exposed (think NSFW but tasteful). The impact of the number of hashtags was skewed by a correlation: the more followers an account has, the fewer hashtags (down to none) it uses.


cosplay_group_002

If you’re a subscriber to this blog and enjoy the Stacks of Stats, you’ll recognize my preference for Q-Q graphs.

There’s a large curl at both tails but most of the data fits well, so there won’t be a need to run a more complex model.
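For readers who haven’t built one of these graphs, here is a sketch of the points behind it: each standardized residual is paired with the value a normal distribution would expect at the same rank. The residuals are made up, with one extreme value planted in each tail, and the (i + 0.5) / n plotting position is one common convention I’m assuming.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Pair each theoretical normal quantile with the matching standardized residual."""
    n = len(residuals)
    mu, sigma = mean(residuals), pstdev(residuals)
    observed = sorted((r - mu) / sigma for r in residuals)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


# Made-up residuals with one extreme value in each tail.
residuals = [-9.0, -2.0, -1.0, 0.0, 0.5, 1.0, 2.5, 8.0]

points = qq_points(residuals)
lowest_theoretical, lowest_observed = points[0]
highest_theoretical, highest_observed = points[-1]
```

When the observed values at the ends sit further out than the theoretical ones, the plotted line curls away at the tails while the middle still tracks the diagonal.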

for_blg_001

What could be causing these extreme values towards the end of each tail?

While gathering and visualizing my data, I observed an interesting behavior:

The amount of hashtags deviates and almost has no correlation with engagement.

Two factors are driving the skewness:

Newer cosplay accounts use fewer hashtags at the beginning

Well established cosplay accounts use little to zero hashtags with their most recent posts.


for_blg

Our data story isn’t complete. Once we take the exposed variable to the profiling stage and begin to extrapolate the engagement impact, a telling data story begins to form.

For example, this table reads as:

DC Comics themed cosplayers who also happen to be exposed potentially drive nearly 700 more likes than cosplay images that are fully clothed.

What has the highest impact? We can chalk Nintendo up as the champion, and most of it is from the Bowsette trend, potentially driving a whopping +61K likes.

Interestingly enough, the runner-up from a potential engagement impact standpoint is Scooby-Doo (Velma mostly), and the gap is less than 10K likes.

Does being exposed help boost all themes of cosplay? There is one theme in this sample where there was a negative relationship: Anime. The possible reason behind this relationship is the niche fan base and the attention to detail Anime fans have, i.e. it’s hard to go as Sailor Moon without the bow.


 

for_blg_003

What have we learned from diving into the Cosplay Data?

Being a top cosplayer on Instagram is as delicate as any social media fame.  Every post, every composition, every hashtag, every theme… can make or break your brand.  Not all cosplay needs to have a level of exposure to be successful, but it is a huge driver in engagement.

A few uses of this analysis: if you’re going with a Scooby-Doo theme, lean towards Velma, and there’s enough out there for comparison.

If you’re looking for a large impact and are a fan of video games, take a dive at Bowsette (drives a potential +61K likes).

Finally, more hashtags do not mean more likes.

There’s more value in posting a cosplay of a character you are passionate about and using relevant hashtags for more organic likes.

After you have consumed this meal, I hope you take these findings and improve your cosplay engagement.  Also as always enjoy the featured pancake recipe below!



https://www.inquisitr.com/5035455/the-5-sexiest-female-cosplayers-to-follow-on-instagram/


 

 
