Decision Science

1. Introduction and Course Overview

Mihai Bizovi

VP of Decision Science @AdoreMe

2024-12-13

What is the purpose of a business?

Understanding and operating modern businesses is complicated

Most startups fail. 1 One has to make many good decisions over a long period of time to have a chance at success in a competitive environment.

  • Maximize profits? Compete, grow, and gain market share?
  • Maximize shareholder value? 2 3
  • Bring added value to customers by solving a problem?
  • Contribute to economy and society (tax, employment)?
  • Create new markets to allocate resources efficiently?

Model = simplified representation

Why do we build models? Mental, mathematical, statistical? 5

  • We need to understand 6 a system to make decisions
  • Reality is overwhelming and combinatorially explosive 7
  • We want to capture the essential aspects of phenomena
  • While minimizing our biases and foolishness 8

All models have assumptions, pressupositions, and limitations:

  • Models are golemns of Prague – Richard McElreath
  • All models are wrong, but some are useful – George Box

What will we study?

Fundamentals of business economics and statistical modeling:

\[ Question \longrightarrow Modeling \longrightarrow Insight \longrightarrow Action \longrightarrow Outcome \]

  • Understand problem space to ask the right questions 9
  • So that we can build custom statistical models
  • Which bring insight into consequences of actions
  • So that we’re informed which actions that should be taken
  • Such that a firm can achieve its ST/LT objectives

Data science is an umbrella term

We have to understand clearly what we mean by AI, ML, and Analytics. Then, pick the right tool for each problem

  • AI | Decision making under uncertainty at scale
  • Cybernetics | The science of general regularities of Information Processing and Control in CAS 10
  • ML | Expertise from Experience in an automated way 11
  • Analytics | Exploratory analyses to formulate hypotheses
  • Statistics | Changing your mind given new evidence

What are some typical applications?

  • Demand planning, inventory optimization, production 12
  • Revenue management, pricing, and promotion
  • Advertisement and conversion optimization
  • Customer behavior and preferences: 13
    • Churn, repurchase, engagement, LTV, WTP
    • Fraud, credit default, insurance risk 14
    • Choice modeling, recommender systems, uplift
  • Improving products, assortment, merch, processes

Weak or specialized AI

Decision-Making Under Uncertainty at Scale

  • domain-specific (medicine vs finance vs automotive …)
  • data-driven (key idea of learning from data)
  • limited degrees of autonomy and notions of agency
  • sometimes concerned with networks of agents

Unpacking Cybernetics

  • Control \(\implies\) goal-directedness. Action to steer to a trajectory or autopoesis (perserve \((S-f)_{org}\))
  • Information Processing \(\implies\) pattern recoginition, perception, modeling & inference
  • General regularities \(\implies\) plausible of control and information processing across fields and CAS
  • Animal refers to applications in biology, machine – in engineering, and human – in our society and behavior.

Analytics vs ML vs Stats

Source: xkcd; Instead of Stats, I would say we want Causal Inference

Three challenges in statistics

First and foremost, you have to start with a research topic, question and design your study or experiment. It’s hard!

You might’ve heard of internal and external validity

A. Gelman clarifies very well three different aspects of statistical inference. Always remember this when we discuss stats! We want to generalize in terms of:

\[Sample \longrightarrow Population\]

\[Treatment \longrightarrow Control\]

\[Measurement \longrightarrow Construct\]

The holy grail is to build statistical models based on the causal processes informed by theories / hypotheses. Then model how we measured,15 and collected data.16

Isn’t causal inference just stats?

The hardest problem, worthy of a Nobel prize in econometrics

How many times have you heard that correlation doesn’t imply causation? Yet, ALL the methods you studied so far are not sufficient to fully justify causality, even Granger!

  1. Theoretical estimand (quantity or effect of interest)
  2. Scientific (causal) models - DAGs of influence
  3. Use (1) and (2) to build a Statistical Model
  4. Simulate from (2) to prove (3) \(\implies\) (1)
  5. Analyze and make inferences with real data

Why do we need these tools?

VUCA: Volatile | Uncertain | Complex | Ambiguous

  • A problem well formulated is halfway solved
  • Limited resources, conflicting objectives
    • therefore, have to make tradeoffs
  • Decisions informed by robust predictions and evidence
  • Do you have what it takes to navigate this environment?
    • the tools, understanding and skills
    • attitude: learning and growth mindset

To make better decisions!

Think of youself as a business person with superpowers

Engineering in the trenches

  • Develp skills – a hard, but rewarding path from novice to expert 17
  • Pragmatic data-driven software
  • Modeling workflow and pipelines
  • Simulations as a safe playground
  • This course is NOT a bootcamp

Contemplating in the library

  • Cultivating understanding and insight into fundamental theoretical ideas
  • Understanding your domain & clients
  • Apply the right model for the job
  • Awareness of pitfalls and mistakes
  • Rigorous, but NOT detail-oriented

Why did you learn all of that?

  • Linear Algebra - language and geometry of data
  • Mathematical Analysis - formalism of change
  • Probability - logic of uncertainty
  • Statistics - changing your mind and action under evidence
    • inference = data + assumptions
  • Econometrics - does \(X \longrightarrow Y\)? Causes or associations?
  • Operations Research - optimization with constraints

~whoami: as engineer and leader

  • Skin in the game: VP of Decision Science @AdoreMe (VS)
  • AI strategy, decision-making, ML systems design
  • Statistics, ML/data engineering, Full stack data apps
  • Critical applications along the value chain:
    • demand forecasting & inventory optimization systems
    • recommender systems, NLP/LLMs for try-at-home
    • marketing: acquisition, CRO experiment design, CRM

~whoami: as teacher and researcher

  • Maven (yid: meyvn) – experience to help you find your path
    • help you develop skills and understanding
  • Graduate of Cybernetics and Quantitative Economics
    • Thesis & Dissertation on Bayesian Microeconometrics
    • Research in Probabilistic Methods for Time Series
    • Started in systems’ dynamics and economic complexity
  • Speaking at conferences, meetups, and panels

~whoami: as a person

  • Cultivating wisdom – philosophy as a way of life
  • Reading: art history, cognitive science, evolution
  • Painting in oils and watercolor
  • Hiking, hipster coffee, 14 years of pro-ish chess
  • Concerts: opera, metal, jazz, indie, cinema

Course roadmap

  1. Business school from data science perspective
  2. Probability theory: review with lots of simulation
  3. Mathematical statistics: the useful parts of theory
  4. Experiment Design and A/B testing
  5. Bayesian statistics and hierarchical models
  6. R and Python for data science (project)

Covered in year two course

  • Applied Machine Learning:
    • Classification, regression, probabilities
    • Clustering and dimensionality reduction
  • Special topics like:
    • Propensity score matching, choice, and conjoint models
    • NLP / recommender systems / ML for time series
  • Operationalizing ML pipelines, data-driven applications

Simplification and Relations

We see a messy reality, which is the data. We want to get to the essence, underlying structure

 

This is a big picture course: see how it all fits together and how to apply the tools in practice to be useful for firms

Too much content. Where to start?

Moreover, how do you keep up with latest developments?

  • Master the fundamentals: understanding and skill
  • Curated selection of excellent resources
    • Still, years’ worth of study
  • Roadmap and conceptual frame to navigate by yourself
    • Choose what is relevant for your problem
    • Come back to a topic with greater understanding later
  • No shortcuts that work, patience needed

Admin and grading

  • Lectures and labs are in-person only
  • Changes in schedule coordinated with representative
  • Final exam on 10th Feb: 5p
  • Coursework 5p, out of which:
    • Attendance and participation 1p
    • Project in phases* 4p. Final submission on 9 Feb
    • Survey completion: bonus 0.25p
  • Passing grade: round(course+exam) at least 5p

Project report or paper

At most 5 pages + appendix + code attachement

  • What is your domain and problem / research question / hypothesis? What are your objectives?
  • Why is it interesting, unsolved, or important (for you)?
  • What have other people tried or written before?
  • What is your approach and idea to solve the problem?
  • What results did your approach yield? Did it work?
  • Conclusions and future work

Different project flavors

For engineers, if you like programming:

  • Interactive, data-driven application to support decisions
  • Should follow the same thinking outlined before

For analysts, statisticians, and data scientists:

  • Answer an interesting question with data and modeling
  • If you solve a problem - focus on decisions and outcomes
  • If you like exploring data - extract insights and hypotheses
  • Automate and render the report with Quarto (pdf, html)

Not allowed in the project

ML approaches are allowed, however the below aren’t:

  • PCA, CA, FA, clustering, single decision trees, kNN
  • ARIMA, GARCH, VAR and other time-series methods
  • Macroeconomic models and methods

Also not allowed:

  • Reuse of projects from other courses
  • Dashboards built via a BI tool like Qlik
  • Plagiarism and lack of citations / credits

Project phases

  • Form a team of 2-3 people, pick a preliminary subject
    • register the team and topic on the spreadsheet
  • Submit a 1-2 page document which answers the key questions from project structure
    • Pick the dataset(s) you will be working on
    • No need to provide the results / conclusions
    • Get feedback and make the necessary corrections
  • Implement your solution / analysis and render project
    • Submit the final project before Feb 9th, 2025

The easiest and most difficult course

“It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, it was the winter of despair.” ― A Tale of Two Cities, (Dickens 1859),

  • No memorization, proofs or solving on paper
  • You have to really understand, justify choices, write code
  • No single, unique solution, but many tradeoffs

A long and rewarding journey

Why should you stick with the course?

  • Effective problem solving & decision-making
  • Bridge the gap between theory and practice
    • between math world and real world
  • Reframe what you already studied to make it useful – e.g. hypothesis testing, regression
  • Lots of new methods for your toolbox
  • Clear understanding and precision in:
    • data science terms and concepts
    • communication with clients, decision-makers, and stakeholders

Navigate the complexities of the field and choose your path. (Source: Generated with DALL-E)

Homework

Reading from course materials and external resources:

  • Course introduction, philosophy, and study guide
  • Prerequisites: why study all of that?
  • Application domains and use-cases
  • Video about Target’s disaster in Canada
  • Critique of the idea of shareholder value

Fill in the survey about your interests and prerequisites

References

Dickens, Charles. 1859. A Tale of Two Cities. New York: Chelsea House Publishers.
Hunt, A. 2008. Pragmatic Thinking and Learning: Refactor Your "Wetware". Pragmatic Bookshelf Series. Pragmatic Bookshelf. https://books.google.ro/books?id=Pq3_PAAACAAJ.
Jordan, Michael. 2016. “Stop Calling Everything AI, Machine-Learning Pioneer Says.” https://spectrum.ieee.org/stop-calling-everything-ai-machinelearning-pioneer-says.
Kozyrkov, Cassie. 2021. “Making Friends with Machine Learning.” Google. https://www.youtube.com/watch?v=1vkb7BCMQd0.
Montier, James. 2014. “The World’s Dumbest Idea.” https://www.gmo.com/globalassets/articles/white-paper/2014/jm_the-worlds-dumbest-idea_12-14.pdf.
Stout, Lynn. 2016. “The Myth of Maximizing Shareholder Value.” https://evonomics.com/maximizing-shareholder-value-dumbest-idea/.

Footnotes

  1. How many do you think survive in the first 5 years in the US?

  2. Montier (2014)

  3. Stout (2016)

  4. Then ask yourself, are you open to founding a startup in the next 5 years?

  5. For example, the coin flip is a wonderful abstraction, especially when the flips or events are independent

  6. In a causal sense, that is if we intervene, this is what is going to happen. Or in the predictive sense

  7. Imagine how many possible paths are there if for every minute we have a choice between 20 actions: read, eat, move, watch, etc

  8. This is not to minimize what intuition resulting from deep knowledge and experience can bring to the table

  9. This is what you basically study in Business Analytics classes: ways to think about diagnosis, strategy, business processes, stakeholders, etc

  10. Jordan (2016)

  11. Kozyrkov (2021)

  12. Deep dive in Logistics class. My strong recommendation is to apply it in practice and read N. Vandeput

  13. You will deep dive into customer psychology and decisions under uncertainty in the behavioral economics class

  14. These models are the way you operationalize a risk management strategy at scale (“minimize bleeding”)

  15. We’ll dedicate one lecture on it, but you have whole specialized classes on business metrics

  16. Again, we treat it in the most general sense, but you have BI classes which show you how data is collected and structured in most businesses

  17. Hunt (2008)