Pancake Breakfast: Stacks of Stats

Short Stacks


Hi and welcome! I’m Tom from @pancake_analytics. In our short stacks section you’ll find quick analytics metaphors and comic con references. In other words, here’s the comic con example for everything you’re searching Google for! Just scroll down and broaden your knowledge!


Propensity modeling has been around since 1983 and is a statistical approach to measuring uplift (think return on investment).
🥞
The goal is to measure the uplift of similar or matched groups. The heart of this approach lies within two machine learning approaches (segmentation and probability).
🥞
All in all, Tony Stark has built a propensity score and targeting system into the Avengers’ upgraded armor.
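🥞
Here’s a rough sketch of the idea in Python, not a production model. Everything in it is made up for illustration: the hero records, the `flyer` and `ground` segments, and the function names. It takes a shortcut by estimating the propensity score as the share of heroes in each segment who got the armor upgrade (real projects usually fit a logistic regression or similar), then uses inverse probability weighting to estimate uplift in mission success.

```python
from collections import defaultdict

# Hypothetical records: (segment, got_upgrade, mission_success)
records = [
    ("flyer", 1, 1), ("flyer", 1, 1), ("flyer", 1, 0), ("flyer", 0, 1),
    ("ground", 1, 1), ("ground", 0, 0), ("ground", 0, 1), ("ground", 0, 0),
]

def propensity_by_segment(rows):
    """Estimate P(upgrade | segment) as the treated share in each segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [treated, total]
    for segment, treated, _ in rows:
        counts[segment][0] += treated
        counts[segment][1] += 1
    return {segment: t / n for segment, (t, n) in counts.items()}

def ipw_uplift(rows, scores):
    """Inverse-probability-weighted difference in outcome rates.

    Assumes every score is strictly between 0 and 1, otherwise the
    weights are undefined (division by zero).
    """
    n = len(rows)
    treated_term = sum(t * y / scores[s] for s, t, y in rows) / n
    control_term = sum((1 - t) * y / (1 - scores[s]) for s, t, y in rows) / n
    return treated_term - control_term

def naive_uplift(rows):
    """Raw difference in success rates, ignoring who was likely to be upgraded."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

scores = propensity_by_segment(records)
print("propensity scores:", scores)
print("naive uplift:", round(naive_uplift(records), 3))
print("weighted uplift:", round(ipw_uplift(records, scores), 3))
```

In this toy data the flyers were far more likely to get the upgrade, so the raw comparison overstates the lift. Weighting by the propensity score compares like with like.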


So you’re the newly appointed leader of S.H.I.E.L.D. and have to create multiple new Avengers teams.
🥞
You need to spread your talent and be as representative of the entire superhero community as possible (you don’t know what threats may arise).
🥞
Bruce Banner (when he’s not a big green rage monster) and Hank Pym will agree: try stratified random sampling!
🥞
This approach involves the division of a population into smaller groups known as strata. The strata are formed based on members’ shared attributes or characteristics.
🥞
You can put parameters in place before stratification, through univariate analysis.
🥞
The reason is that you want your Avengers teams to be similar to one another, with only minor deviation on the key saving-the-day components. Think West Coast Avengers, not Great Lakes Avengers 🤣👌
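🥞
To make that concrete, here’s a minimal sketch using only the Python standard library. The roster, the `power` attribute used as the stratum, and the `stratified_sample` helper are all assumptions for illustration. The idea: group the population by a shared attribute, then draw the same fraction at random from every group.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, fraction, seed=0):
    """Draw the same fraction at random from each stratum (at least one member)."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)

    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical roster; "power" is the shared attribute that defines the strata
heroes = [
    {"name": "Iron Man", "power": "tech"},
    {"name": "War Machine", "power": "tech"},
    {"name": "Ant-Man", "power": "tech"},
    {"name": "Wasp", "power": "tech"},
    {"name": "Doctor Strange", "power": "magic"},
    {"name": "Scarlet Witch", "power": "magic"},
    {"name": "Hulk", "power": "strength"},
    {"name": "Thor", "power": "strength"},
    {"name": "She-Hulk", "power": "strength"},
    {"name": "Hercules", "power": "strength"},
]

team = stratified_sample(heroes, key=lambda h: h["power"], fraction=0.5)
for hero in team:
    print(hero["power"], "-", hero["name"])
```

With a fraction of 0.5, the team keeps the same mix of powers as the full roster: two tech, one magic, two strength.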


Let’s enter the data science equivalent of the Spider-Verse: the Random Forest, an ensemble of decision trees.
🥞
A random forest builds many decision trees (often 100), training each one on a random sample of the data, holding the rest out, and randomly choosing which variables each tree gets to consider.
🥞
At the end your output will give you, ranked by importance, the top independent variables used to determine your dependent variable.
🥞
It’s the same as having hundreds of Spider-Men, all slightly different, fighting for the same cause!
🥞
I love using the random forest as a prep step to build better regression models.
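🥞
Here’s a small sketch of that prep step, assuming scikit-learn and NumPy are installed. The data is synthetic and the feature names (`web_strength`, `spider_sense`, `costume_color`) are invented for the example. Only the first feature actually drives the outcome, so it should come out on top of the importance ranking.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: three candidate predictors, 500 rows
rng = np.random.default_rng(42)
n_rows = 500
X = rng.normal(size=(n_rows, 3))
feature_names = ["web_strength", "spider_sense", "costume_color"]

# Only the first column drives the label, plus a little noise
y = (X[:, 0] + 0.1 * rng.normal(size=n_rows) > 0).astype(int)

# 100 trees, each fit on a bootstrap sample with random feature subsets per split
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Rank the independent variables by importance, highest first
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

The variables near the top of the list are the ones I’d carry forward into a regression model.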


Let’s examine one of the techniques I often use in my data stories: Linear Regression 📈 The overall idea of regression is to examine two things:
🥞
1) Does a set of predictor variables do a good job in predicting an outcome (dependent) variable?
🥞
2) Which variables in particular are significant predictors of the outcome variable, and in what way do they (as indicated by the magnitude and sign of the beta estimates) impact the outcome variable? These regression estimates are used to explain the relationship between one dependent variable and one or more independent variables.
🥞
Applying a linear regression, I’m able to predict the impact of cups of coffee ingested by Detective Pikachu on clues found by Detective Pikachu ☕️ ⚡️ 🥞
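
As a quick sketch of the coffee-and-clues case, here’s a one-predictor least squares fit in plain Python. The numbers are made up for illustration, and the `fit_line` and `r_squared` helpers are my own names. The slope is the beta estimate (its sign and size tell you how clues move with each extra cup), and R squared speaks to question 1. It doesn’t compute significance tests; a stats package would be the usual route for those.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical case log: cups of coffee vs clues found
cups = [0, 1, 2, 3, 4]
clues = [1, 3, 4, 6, 8]

intercept, slope = fit_line(cups, clues)
print(f"clues = {intercept:.2f} + {slope:.2f} * cups")
print(f"R squared = {r_squared(cups, clues, intercept, slope):.3f}")
print(f"predicted clues after 5 cups: {intercept + slope * 5:.1f}")
```

On this toy data each extra cup is worth about 1.7 more clues, and the positive sign says more coffee goes with more clues.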

