Decision Science

1. Introduction and Course Overview

Mihai Bizovi

VP of Decision Science @AdoreMe

2024-12-13

What is the purpose of a business?

Understanding and operating modern businesses is complicated

Most startups fail. ¹ One has to make many good decisions over a long period of time to have a chance at success in a competitive environment.

Maximize profits? Compete, grow, and gain market share?
Maximize shareholder value? ² ³
Bring added value to customers by solving a problem?
Contribute to economy and society (tax, employment)?
Create new markets to allocate resources efficiently?

Model = simplified representation

Why do we build models? Mental, mathematical, statistical? ⁵

We need to understand ⁶ a system to make decisions
Reality is overwhelming and combinatorially explosive ⁷
We want to capture the essential aspects of phenomena
While minimizing our biases and foolishness ⁸

All models have assumptions, pressupositions, and limitations:

Models are golemns of Prague – Richard McElreath
All models are wrong, but some are useful – George Box

What will we study?

Fundamentals of business economics and statistical modeling:

\[ Question \longrightarrow Modeling \longrightarrow Insight \longrightarrow Action \longrightarrow Outcome \]

Understand problem space to ask the right questions ⁹
So that we can build custom statistical models
Which bring insight into consequences of actions
So that we’re informed which actions that should be taken
Such that a firm can achieve its ST/LT objectives

Data science is an umbrella term

We have to understand clearly what we mean by AI, ML, and Analytics. Then, pick the right tool for each problem

AI | Decision making under uncertainty at scale
Cybernetics | The science of general regularities of Information Processing and Control in CAS ¹⁰
ML | Expertise from Experience in an automated way ¹¹
Analytics | Exploratory analyses to formulate hypotheses
Statistics | Changing your mind given new evidence

What are some typical applications?

Demand planning, inventory optimization, production ¹²
Revenue management, pricing, and promotion
Advertisement and conversion optimization
Customer behavior and preferences: ¹³
- Churn, repurchase, engagement, LTV, WTP
- Fraud, credit default, insurance risk ¹⁴
- Choice modeling, recommender systems, uplift
Improving products, assortment, merch, processes

Weak or specialized AI

Decision-Making Under Uncertainty at Scale

domain-specific (medicine vs finance vs automotive …)
data-driven (key idea of learning from data)
limited degrees of autonomy and notions of agency
sometimes concerned with networks of agents

Unpacking Cybernetics

Control \(\implies\) goal-directedness. Action to steer to a trajectory or autopoesis (perserve \((S-f)_{org}\))
Information Processing \(\implies\) pattern recoginition, perception, modeling & inference
General regularities \(\implies\) plausible of control and information processing across fields and CAS
Animal refers to applications in biology, machine – in engineering, and human – in our society and behavior.

Analytics vs ML vs Stats

Analysis — *Source: xkcd;* Instead of Stats, I would say we want Causal Inference

Three challenges in statistics

First and foremost, you have to start with a research topic, question and design your study or experiment. It’s hard!

You might’ve heard of internal and external validity

A. Gelman clarifies very well three different aspects of statistical inference. Always remember this when we discuss stats! We want to generalize in terms of:

\[Sample \longrightarrow Population\]

\[Treatment \longrightarrow Control\]

\[Measurement \longrightarrow Construct\]

The holy grail is to build statistical models based on the causal processes informed by theories / hypotheses. Then model how we measured,¹⁵ and collected data.¹⁶

Isn’t causal inference just stats?

The hardest problem, worthy of a Nobel prize in econometrics

How many times have you heard that correlation doesn’t imply causation? Yet, ALL the methods you studied so far are not sufficient to fully justify causality, even Granger!

Theoretical estimand (quantity or effect of interest)
Scientific (causal) models - DAGs of influence
Use (1) and (2) to build a Statistical Model
Simulate from (2) to prove (3) \(\implies\) (1)
Analyze and make inferences with real data

Why do we need these tools?

VUCA: Volatile | Uncertain | Complex | Ambiguous

A problem well formulated is halfway solved
Limited resources, conflicting objectives
- therefore, have to make tradeoffs
Decisions informed by robust predictions and evidence
Do you have what it takes to navigate this environment?
- the tools, understanding and skills
- attitude: learning and growth mindset

To make better decisions!

Think of youself as a business person with superpowers

Engineering in the trenches

Develp skills – a hard, but rewarding path from novice to expert ¹⁷
Pragmatic data-driven software
Modeling workflow and pipelines
Simulations as a safe playground
This course is NOT a bootcamp

Contemplating in the library

Cultivating understanding and insight into fundamental theoretical ideas
Understanding your domain & clients
Apply the right model for the job
Awareness of pitfalls and mistakes
Rigorous, but NOT detail-oriented

Why did you learn all of that?

Linear Algebra - language and geometry of data
Mathematical Analysis - formalism of change
Probability - logic of uncertainty
Statistics - changing your mind and action under evidence
- inference = data + assumptions
Econometrics - does \(X \longrightarrow Y\)? Causes or associations?
Operations Research - optimization with constraints

~whoami: as engineer and leader

Skin in the game: VP of Decision Science @AdoreMe (VS)
AI strategy, decision-making, ML systems design
Statistics, ML/data engineering, Full stack data apps
Critical applications along the value chain:
- demand forecasting & inventory optimization systems
- recommender systems, NLP/LLMs for try-at-home
- marketing: acquisition, CRO experiment design, CRM

~whoami: as teacher and researcher

Maven (yid: meyvn) – experience to help you find your path
- help you develop skills and understanding
Graduate of Cybernetics and Quantitative Economics
- Thesis & Dissertation on Bayesian Microeconometrics
- Research in Probabilistic Methods for Time Series
- Started in systems’ dynamics and economic complexity
Speaking at conferences, meetups, and panels

~whoami: as a person

Cultivating wisdom – philosophy as a way of life
Reading: art history, cognitive science, evolution
Painting in oils and watercolor
Hiking, hipster coffee, 14 years of pro-ish chess
Concerts: opera, metal, jazz, indie, cinema

Course roadmap

Business school from data science perspective
Probability theory: review with lots of simulation
Mathematical statistics: the useful parts of theory
Experiment Design and A/B testing
Bayesian statistics and hierarchical models
R and Python for data science (project)

Covered in year two course

Applied Machine Learning:
- Classification, regression, probabilities
- Clustering and dimensionality reduction
Special topics like:
- Propensity score matching, choice, and conjoint models
- NLP / recommender systems / ML for time series
Operationalizing ML pipelines, data-driven applications

Simplification and Relations

Reality

We see a messy reality, which is the data. We want to get to the essence, underlying structure

Big Picture — This is a **big picture** course: see how it all fits together and how to apply the tools in practice to be useful for firms

Too much content. Where to start?

Moreover, how do you keep up with latest developments?

Master the fundamentals: understanding and skill
Curated selection of excellent resources
- Still, years’ worth of study
Roadmap and conceptual frame to navigate by yourself
- Choose what is relevant for your problem
- Come back to a topic with greater understanding later
No shortcuts that work, patience needed

Admin and grading

Lectures and labs are in-person only
Changes in schedule coordinated with representative
Final exam on 10th Feb: 5p
Coursework 5p, out of which:
- Attendance and participation 1p
- Project in phases* 4p. Final submission on 9 Feb
- Survey completion: bonus 0.25p
Passing grade: round(course+exam) at least 5p

Project report or paper

At most 5 pages + appendix + code attachement

What is your domain and problem / research question / hypothesis? What are your objectives?
Why is it interesting, unsolved, or important (for you)?
What have other people tried or written before?
What is your approach and idea to solve the problem?
What results did your approach yield? Did it work?
Conclusions and future work

Different project flavors

For engineers, if you like programming:

Interactive, data-driven application to support decisions
Should follow the same thinking outlined before

For analysts, statisticians, and data scientists:

Answer an interesting question with data and modeling
If you solve a problem - focus on decisions and outcomes
If you like exploring data - extract insights and hypotheses
Automate and render the report with Quarto (pdf, html)

Not allowed in the project

ML approaches are allowed, however the below aren’t:

PCA, CA, FA, clustering, single decision trees, kNN
ARIMA, GARCH, VAR and other time-series methods
Macroeconomic models and methods

Also not allowed:

Reuse of projects from other courses
Dashboards built via a BI tool like Qlik
Plagiarism and lack of citations / credits

Project phases

Form a team of 2-3 people, pick a preliminary subject
- register the team and topic on the spreadsheet
Submit a 1-2 page document which answers the key questions from project structure
- Pick the dataset(s) you will be working on
- No need to provide the results / conclusions
- Get feedback and make the necessary corrections
Implement your solution / analysis and render project
- Submit the final project before Feb 9th, 2025

The easiest and most difficult course

“It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, it was the winter of despair.” ― A Tale of Two Cities, (Dickens 1859),

No memorization, proofs or solving on paper
You have to really understand, justify choices, write code
No single, unique solution, but many tradeoffs

A long and rewarding journey

Why should you stick with the course?

Effective problem solving & decision-making
Bridge the gap between theory and practice
- between math world and real world
Reframe what you already studied to make it useful – e.g. hypothesis testing, regression
Lots of new methods for your toolbox
Clear understanding and precision in:
- data science terms and concepts
- communication with clients, decision-makers, and stakeholders

Navigate the complexities of the field and choose your path. *(Source: Generated with DALL-E)*

Homework

Reading from course materials and external resources:

Course introduction, philosophy, and study guide
Prerequisites: why study all of that?
Application domains and use-cases
Video about Target’s disaster in Canada
Critique of the idea of shareholder value

Fill in the survey about your interests and prerequisites

References

Dickens, Charles. 1859. A Tale of Two Cities. New York: Chelsea House Publishers.

Hunt, A. 2008. Pragmatic Thinking and Learning: Refactor Your "Wetware". Pragmatic Bookshelf Series. Pragmatic Bookshelf. https://books.google.ro/books?id=Pq3_PAAACAAJ.

Jordan, Michael. 2016. “Stop Calling Everything AI, Machine-Learning Pioneer Says.” https://spectrum.ieee.org/stop-calling-everything-ai-machinelearning-pioneer-says.

Kozyrkov, Cassie. 2021. “Making Friends with Machine Learning.” Google. https://www.youtube.com/watch?v=1vkb7BCMQd0.

Montier, James. 2014. “The World’s Dumbest Idea.” https://www.gmo.com/globalassets/articles/white-paper/2014/jm_the-worlds-dumbest-idea_12-14.pdf.

Stout, Lynn. 2016. “The Myth of Maximizing Shareholder Value.” https://evonomics.com/maximizing-shareholder-value-dumbest-idea/.

Footnotes

How many do you think survive in the first 5 years in the US?
Montier (2014)
Stout (2016)
Then ask yourself, are you open to founding a startup in the next 5 years?
For example, the coin flip is a wonderful abstraction, especially when the flips or events are independent
In a causal sense, that is if we intervene, this is what is going to happen. Or in the predictive sense
Imagine how many possible paths are there if for every minute we have a choice between 20 actions: read, eat, move, watch, etc
This is not to minimize what intuition resulting from deep knowledge and experience can bring to the table
This is what you basically study in Business Analytics classes: ways to think about diagnosis, strategy, business processes, stakeholders, etc
Jordan (2016)
Kozyrkov (2021)
Deep dive in Logistics class. My strong recommendation is to apply it in practice and read N. Vandeput
You will deep dive into customer psychology and decisions under uncertainty in the behavioral economics class
These models are the way you operationalize a risk management strategy at scale (“minimize bleeding”)
We’ll dedicate one lecture on it, but you have whole specialized classes on business metrics
Again, we treat it in the most general sense, but you have BI classes which show you how data is collected and structured in most businesses
Hunt (2008)