Python for MBA Students

A Practical Overview

Kerry Back, Rice University

AI Writes the Code. You Read It.

The new reality

  • GitHub Copilot, ChatGPT, Claude generate working Python in seconds
  • You describe what you want — AI writes the code
  • This is already how many analysts and managers work

What this session gives you

  • Recognize what Python code is doing
  • Understand the output well enough to judge whether it’s right
  • Know enough vocabulary to ask AI better questions
  • Follow along when a colleague walks through code

Goal today: Build enough vocabulary to read and understand Python — not to write it from memory.

Here’s What Python Looks Like

# profit_margins.py

products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000,  85_000,  200_000]
costs    = [ 80_000,  60_000,  140_000]

for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")

Running this prints the profit margin for each product. By the end of this session you’ll be able to name every pattern in that code.
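For comparison, AI tools often write the same calculation with zip, which pairs the three lists item by item instead of using an index i. The sketch below is an equivalent form, not a replacement for the script above. The helper name margin_report is hypothetical, introduced only for illustration.

```python
# Same data as profit_margins.py above.
products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000, 85_000, 200_000]
costs = [80_000, 60_000, 140_000]


def margin_report(products, revenues, costs):
    """Return one formatted line per product (hypothetical helper)."""
    lines = []
    # zip walks the three lists in parallel: first items together, then second items...
    for product, revenue, cost in zip(products, revenues, costs):
        margin = (revenue - cost) / revenue
        lines.append(f"{product}: {margin:.1%} margin")
    return lines


for line in margin_report(products, revenues, costs):
    print(line)
# Widget: 33.3% margin
# Gadget: 29.4% margin
# Doohickey: 30.0% margin
```

Both versions print the same three lines. The index-based loop and the zip loop are two spellings of the same idea.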

How Python Runs

[Diagram: Your Python Code (your script) is interpreted by the Python Engine (runs your code), which calls pandas / NumPy (pre-built libraries) that are already compiled and run fast.]

Two paths to execution

  • Interpreted: Python reads your code line by line and runs it — easy to write, slightly slower
  • Pre-built libraries: the performance-critical parts of pandas and NumPy are written in C and compiled (converted to machine code the CPU can run directly) ahead of time. Python calls them, and they run very fast
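A rough way to see the two paths without installing anything: in the standard Python interpreter (CPython), the built-in sum is itself compiled C code, so here it stands in for the pre-built path, while the hand-written loop is run step by step by the interpreter. This is an illustrative sketch only. The function names are hypothetical, and the timings will differ from machine to machine.

```python
import timeit

numbers = list(range(1_000_000))


def total_with_loop(values):
    """Interpreted path: Python executes each step of the loop itself."""
    total = 0
    for v in values:
        total += v
    return total


def total_with_builtin(values):
    """Pre-built path: the built-in sum runs as compiled code inside CPython."""
    return sum(values)


# Both give the same answer; only the speed differs.
assert total_with_loop(numbers) == total_with_builtin(numbers)

# Time five runs of each version.
loop_seconds = timeit.timeit(lambda: total_with_loop(numbers), number=5)
builtin_seconds = timeit.timeit(lambda: total_with_builtin(numbers), number=5)
print(f"loop:    {loop_seconds:.3f} s")
print(f"builtin: {builtin_seconds:.3f} s")
```

pandas and NumPy apply the same idea to whole tables and arrays, which is why a single library call usually beats a hand-written loop over the same data.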

Scripts and Notebooks

Script — a .py file, runs top to bottom

# profit_margins.py
products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000,  85_000,  200_000]
costs    = [ 80_000,  60_000,  140_000]

for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")

Run from terminal: python profit_margins.py

Notebook — cells you run one at a time

Cell 1:

products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000,  85_000,  200_000]
costs    = [ 80_000,  60_000,  140_000]

Cell 2:

for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")

Same code — different containers. Notebooks let you run one piece at a time and inspect the result.

Libraries

What Is a Library?

A library is a collection of pre-built code you can import and use.

Key libraries for business

Library      What it does
pandas       Data tables: read, clean, reshape, summarize
numpy        Fast math on arrays of numbers
matplotlib   Charts and plots
seaborn      Statistical visualization

Hundreds more exist — for machine learning (scikit-learn), optimization (scipy), web APIs, and more. import pandas as pd means: load pandas and call it pd for short.
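The import ... as ... form works the same way for any library. The sketch below uses statistics, which ships with Python, so it runs without installing anything. The alias st and the revenue numbers are made up for illustration.

```python
# "import X as Y" loads library X and lets you refer to it as Y.
import statistics as st

monthly_revenue = [42_000, 31_500, 38_250]  # made-up numbers for illustration

# After the import, every call into the library is prefixed with the short name.
average = st.mean(monthly_revenue)
middle = st.median(monthly_revenue)

print(f"Mean:   {average:,.0f}")   # Mean:   37,250
print(f"Median: {middle:,.0f}")    # Median: 38,250
```

When you see pd.something(...) in generated code, read it the same way: "the something function from pandas".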

The 6 Patterns

Almost All Python Is Built From 6 Patterns

#   Pattern               One-liner example
1   Objects & Types       42 · 3.14 · "hello" · True
2   Variables             price = 182.50
3   Collections           ["AAPL", "GOOGL"] · {"ticker": "AAPL"}
4   Conditionals          if revenue > target: ...
5   Loops                 for company in portfolio: ...
6   Functions & Methods   len(items) · text.upper()

You’ll also encounter

  • List comprehensions — a compact loop on one line: [x * 2 for x in prices] — this is Pattern 5 written in shorthand
  • try / except — error handling: try: ... except: ... — guards code that might fail (bad data, missing file)
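Both extras can be sketched in a few lines. The first half shows a list comprehension next to the full loop it abbreviates. The second half shows try / except guarding a division. The helper name safe_margin is hypothetical, and naming the specific error (ZeroDivisionError) instead of a bare except: is a common style choice, not a requirement.

```python
prices = [182.50, 141.80, 378.90]

# Pattern 5 written out as a full loop...
doubled_loop = []
for x in prices:
    doubled_loop.append(x * 2)

# ...and the same thing as a list comprehension.
doubled_short = [x * 2 for x in prices]
assert doubled_loop == doubled_short


def safe_margin(revenue, cost):
    """Return the margin, or None if revenue is zero (hypothetical helper)."""
    try:
        return (revenue - cost) / revenue
    except ZeroDivisionError:
        # Runs only if the line inside try failed with a division by zero.
        return None


print(safe_margin(120_000, 80_000))  # about 0.333
print(safe_margin(0, 500))           # None
```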

How to Spot Each Pattern

Pattern               What to look for
Objects & Types       Literal values: 42, 3.14, "text", True / False
Variables             name = value (a single = sign)
Collections           [...] = list (ordered) · {key: value, ...} = dictionary
Conditionals          if / elif / else followed by : and an indented block
Loops                 for ... in ...: or while ...: followed by an indented block
Functions & Methods   name(...) = function · object.method(...) = method
List comprehension    [expr for item in collection] (compact loop inside brackets)
try / except          try: + indented block, then except: + error handler

Assignment Statements

A single = sign stores a value under a name:

price = 182.50
company = "Apple"
shares = 100
total = price * shares

How to read it

Read price = 182.50 as “price gets 182.50” — not “price equals 182.50.” The right side is computed first, then stored in the name on the left.
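The "right side first" rule is easiest to see on a line that would make no sense as an equation. A minimal sketch, reusing the names from the example above:

```python
price = 182.50
shares = 100

# The right side is computed first (182.50 * 100), then stored under "total".
total = price * shares

# Not an equation: take the current value of shares, add 50,
# and store the result back under the same name.
shares = shares + 50

print(total)   # 18250.0
print(shares)  # 150
```

Note that total is still 18250.0 after shares changes. Unlike a spreadsheet formula, an assignment is computed once, at the moment the line runs.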

Pattern Spotting

Example 1

revenue = 383_285_000_000
costs   = 223_546_000_000
profit  = revenue - costs
margin  = profit / revenue

if margin > 0.25:
    print(f"High margin: {margin:.1%}")
else:
    print(f"Lower margin: {margin:.1%}")

Patterns

  • ② Variables — lines 1–4, each name = value
  • ① Objects & Types — large integers; 0.25 is a float
  • ④ Conditional: if / else with indented blocks
  • ⑥ Functions: print(...) called twice

Example 2

tickers = ["AAPL", "GOOGL", "MSFT"]
prices  = [182.50, 141.80, 378.90]
shares  = [100, 50, 75]

total = 0
for i in range(len(tickers)):
    value = prices[i] * shares[i]
    total += value
    print(f"{tickers[i]}: ${value:,.0f}")

Patterns

  • ③ Collections — three lists [...]
  • ② Variables: total, value, i
  • ⑤ Loop: for i in range(...):
  • ⑥ Functions: len(...), range(...), print(...)
  • ⑤ Accumulator: total += value inside loop

Example 3

portfolio = [
    {"ticker": "AAPL", "shares": 100, "price": 182.50},
    {"ticker": "MSFT", "shares":  75, "price": 378.90},
    {"ticker": "GOOGL","shares":  50, "price": 141.80},
]

values = [h["shares"] * h["price"] for h in portfolio]
total  = sum(values)
print(f"Portfolio value: ${total:,.2f}")

Patterns

  • ③ Collections — list of dictionaries [{...}, ...]
  • List comprehension — compact for loop in brackets
  • ② Variables: values, total
  • ⑥ Functions: sum(...), print(...)
  • ① Objects — strings and floats as dict values

Example 4

def assess_margin(revenue, costs):
    margin = (revenue - costs) / revenue
    if margin > 0.30:
        return "strong"
    elif margin > 0.15:
        return "adequate"
    else:
        return "weak"

rating = assess_margin(500_000, 320_000)
print(f"Margin rating: {rating}")

Patterns

  • ⑥ Function def: def keyword, parameters, return
  • ② Variables: margin, rating
  • ① Objects — floats 0.30, 0.15; strings
  • ④ Conditional: if / elif / else inside the function
  • ⑥ Function call: assess_margin(500_000, 320_000)

Example 5

companies = ["apple inc.", "microsoft corp.", "alphabet inc."]
suffixes  = ["inc.", "corp.", "ltd."]

for company in companies:
    clean = company.title()
    has_suffix = any(s in company for s in suffixes)
    if has_suffix:
        print(f"  {clean}  ✓")
    else:
        print(f"  {clean}")

Patterns

  • ③ Collections — two lists
  • ⑤ Loop: for company in companies:
  • ② Variables: clean, has_suffix
  • ⑥ Method: .title() on a string object
  • ④ Conditional: if has_suffix:
  • ⑥ Functions: any(...), print(...)
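The any(s in company for s in suffixes) line packs a loop and a test into one expression. The sketch below spells it out as a full loop. The function name has_suffix_loop and the second company name are hypothetical, added only for illustration.

```python
suffixes = ["inc.", "corp.", "ltd."]


def has_suffix_loop(company, suffixes):
    """Long form of: any(s in company for s in suffixes)."""
    for s in suffixes:
        if s in company:  # is this suffix somewhere inside the name?
            return True   # one match is enough
    return False          # no suffix matched


for company in ["apple inc.", "acme holdings"]:  # second name is made up
    short_form = any(s in company for s in suffixes)
    long_form = has_suffix_loop(company, suffixes)
    print(company, short_form, long_form)
# apple inc. True True
# acme holdings False False
```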

Example 6

import pandas as pd

df = pd.read_excel("sales.xlsx")

tech    = df[df["sector"] == "Technology"]
summary = tech.groupby("region")["revenue"].sum()
top     = summary.idxmax()

print(f"Top tech region: {top}")
print(f"Revenue: ${summary[top]:,.0f}")

Patterns

  • Library: import pandas as pd
  • ② Variables: df, tech, summary, top
  • ④ Conditional filter: df[df["sector"] == "Technology"]
  • ⑥ Method chain: .groupby(...)[...].sum()
  • ⑥ Method and library function: .idxmax(), pd.read_excel(...)
  • ⑥ Functions: print(...)

Colab + Gemini

Open the Companion Notebook

Go to this link in your browser

colab.research.google.com/github/kerryback/workshop_python/blob/main/python_overview_exercises.ipynb

Google Colab is a free, browser-based Python notebook — nothing to install.

  • The notebook has one section per pattern — ① through ⑥
  • Run the cells

Colab Basics

Three things you need to know

  1. Add a code cell — click the + Code button at the top, or hover between cells and click + Code
  2. Run a cell — press Shift + Enter (runs the cell and moves to the next)
  3. See the output — it appears directly below the cell

That’s it. Everything else — writing the code, explaining errors, adding new cells — you can ask Gemini.

Using Gemini in Colab

How to get code from Gemini

  1. Click the Gemini sparkle icon (top right) — or open a new code cell
  2. Type your prompt describing what you want Python to do
  3. Gemini generates code — click Insert to add it to a cell
  4. Shift + Enter to run

Example

Copy and paste this into Gemini

“Write a Python function that calculates the future value of an investment given a starting amount, annual return rate, and number of years. Then call the function to show how $10,000 grows at 7% over 10, 20, and 30 years. Print the results in a formatted table.”

After Gemini generates the code, look for:

  • A def block — the function definition (Pattern ⑥)
  • ** — exponentiation for compounding math (Pattern ①)
  • A loop to call the function for each time horizon (Pattern ⑤)
  • print() with f-string formatting (Pattern ⑥)
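Gemini's answer will vary from run to run, but it may look roughly like the sketch below. This is one possible shape, assuming annual compounding. The function and variable names are illustrative, not what Gemini is guaranteed to choose.

```python
def future_value(start_amount, annual_rate, years):
    """Value after compounding once per year: start * (1 + rate) ** years."""
    return start_amount * (1 + annual_rate) ** years


# Header row, then one row per time horizon.
print(f"{'Years':>5}  {'Future value':>14}")
for years in [10, 20, 30]:
    value = future_value(10_000, 0.07, years)
    print(f"{years:>5}  {value:>14,.2f}")
```

Inside the f-strings, >5 and >14 right-align each column to a fixed width, and ,.2f adds thousands separators and two decimal places.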

Working with Data

Uploading Data

One-time setup — two steps

  1. Download this data file as usual (not in Colab): sales.xlsx — save it to your Google Drive
  2. Ask Gemini: “Mount my Google Drive so I can access files from it”

Once your Drive is mounted, your file is at /content/drive/MyDrive/sales.xlsx — that’s the path you’ll pass to pd.read_excel.
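The code Gemini produces for the mount step is usually close to the sketch below. The google.colab module exists only inside Colab, so the try / except is an addition here that lets the same cell run elsewhere without crashing. The file check at the end is also an optional extra, not part of the required setup.

```python
import os

DATA_PATH = "/content/drive/MyDrive/sales.xlsx"  # path given above

try:
    # google.colab is only available inside a Colab notebook.
    from google.colab import drive
    drive.mount("/content/drive")  # asks for permission the first time
except ImportError:
    print("Not running in Colab, so there is no Drive to mount.")

# Quick check before calling pd.read_excel(DATA_PATH).
print("File found:", os.path.exists(DATA_PATH))
```

If the last line prints False inside Colab, the file is probably in a subfolder of your Drive rather than at the top level.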

Load the Sales Data

Run this cell first — then use df in all the prompts below:

import pandas as pd
df = pd.read_excel("/content/drive/MyDrive/sales.xlsx")
df

Think of df as an Excel table.

What’s in the data

10 rows · 4 columns: region, product, revenue, units

region   product   revenue   units
East     Widget    42000     210
West     Gadget    31500     180

Example 1

Copy and paste this into Gemini

“The variable df is a pandas DataFrame with columns: region, product, revenue, and units. Compute the total revenue and total units for each region. Sort by total revenue from highest to lowest and display the result.”

After Gemini generates the code, look for:

  • df.groupby("region") — grouping (Pattern ⑥ method)
  • .agg(...) or .sum() — aggregating (Pattern ⑥)
  • .sort_values(...) — sorting (Pattern ⑥)
  • A new variable storing the result (Pattern ②)
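One shape the generated code might take is sketched below. To keep the example self-contained, it builds a tiny stand-in table instead of reading sales.xlsx. The first two rows match the preview above, and the other two are made up for illustration.

```python
import pandas as pd

# Stand-in for sales.xlsx: same four columns, only four rows (last two made up).
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42_000, 31_500, 18_000, 27_000],
    "units":   [210, 180, 90, 150],
})

by_region = (
    df.groupby("region")[["revenue", "units"]]  # one group per region
      .sum()                                    # add up each column within a group
      .sort_values("revenue", ascending=False)  # highest revenue first
)
print(by_region)
```

Gemini may use .agg(...) instead of .sum(), or different variable names. The pattern to recognize is group, aggregate, sort.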

Example 2: Filter and Enrich

Copy and paste this into Gemini

“Using df (columns: region, product, revenue, units), filter to rows where revenue exceeds 30,000. For those rows, add a new column called price_per_unit equal to revenue divided by units. Display the filtered table sorted by price_per_unit.”

After Gemini generates the code, look for:

  • df[df["revenue"] > 30000] — conditional filter (Pattern ④)
  • df["price_per_unit"] = ... — creating a new column (Pattern ②)
  • Division inside the expression (Pattern ①)
  • .sort_values("price_per_unit") (Pattern ⑥)
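A minimal sketch of the filter-and-enrich steps, again on a small stand-in table (first two rows from the preview above, last two made up). The .copy() call is an addition here: it makes the filtered table independent of df, which avoids a pandas warning when the new column is added.

```python
import pandas as pd

# Stand-in for sales.xlsx (last two rows are made up for illustration).
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42_000, 31_500, 18_000, 27_000],
    "units":   [210, 180, 90, 150],
})

# df["revenue"] > 30_000 is a column of True/False; df[...] keeps the True rows.
high = df[df["revenue"] > 30_000].copy()

# New column: computed row by row from two existing columns.
high["price_per_unit"] = high["revenue"] / high["units"]

high = high.sort_values("price_per_unit")
print(high)
```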

Example 3: Best Product per Region

Copy and paste this into Gemini

“Using df (columns: region, product, revenue, units), for each region find the product with the highest revenue. Show the result as a neat table with one row per region.”

After Gemini generates the code, look for:

  • groupby combined with idxmax() or apply() (Pattern ⑥)
  • .loc[...] to look up rows by index (Pattern ⑥)
  • Variable names storing intermediate results (Pattern ②)
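One common way to answer this prompt combines idxmax() and .loc[...], as sketched below on the same kind of stand-in table (last two rows made up). Gemini may choose a different route, such as sorting and keeping the first row per region.

```python
import pandas as pd

# Stand-in for sales.xlsx (last two rows are made up for illustration).
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42_000, 31_500, 18_000, 27_000],
    "units":   [210, 180, 90, 150],
})

# For each region, the row label where revenue is largest.
best_rows = df.groupby("region")["revenue"].idxmax()

# Look those rows up in the original table, keeping three columns.
best = df.loc[best_rows, ["region", "product", "revenue"]].reset_index(drop=True)
print(best)
```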

Example 4: Method Chaining

Copy and paste this into Gemini

df.groupby("region")["revenue"].sum().sort_values(ascending=False).plot(kind="bar")

Read left to right — each method returns an object the next method acts on.

Step   Code                            Returns
1      df.groupby("region")            grouped data object
2      ["revenue"]                     revenue values per group
3      .sum()                          total revenue per region
4      .sort_values(ascending=False)   sorted Series
5      .plot(kind="bar")               bar chart

Pattern ⑥ in action. Under the hood, pandas runs these operations as precompiled C code — the “already compiled, runs fast” path from the earlier diagram.
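The left-to-right reading is not specific to pandas. A small sketch with plain string methods shows the same chain two ways: all at once, and one step per line so each intermediate result is visible. The sample text is made up.

```python
raw = "  apple inc.  "

# One chain, read left to right.
clean = raw.strip().title().replace("Inc.", "Inc")

# The same chain, one step per line, showing what each method returns.
step1 = raw.strip()                    # "apple inc."
step2 = step1.title()                  # "Apple Inc."
step3 = step2.replace("Inc.", "Inc")   # "Apple Inc"

assert clean == step3
print(clean)  # Apple Inc
```

When a long chain in generated code is hard to follow, asking the AI to split it into one step per line like this is a reasonable request.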

Example 5: Full Pipeline with Chart

Copy and paste this into Gemini

“Using df (columns: region, product, revenue, units): add a column called price_per_unit (revenue ÷ units), then compute total revenue and mean price per unit grouped by product, sort by total revenue highest first, and plot a bar chart of total revenue by product titled ‘Revenue by Product’.”

This is the synthesis prompt. In roughly 10 lines you should find most of the 6 patterns. One possible result, annotated:

df["price_per_unit"] = df["revenue"] / df["units"]   # ② variable  ① math
result = (df.groupby("product")                       # ⑥ method chain
            .agg(total_revenue=("revenue","sum"),
                 mean_price=("price_per_unit","mean"))
            .sort_values("total_revenue", ascending=False))
result["total_revenue"].plot(kind="bar",              # ⑥ method
                             title="Revenue by Product")

Can you name the pattern for every line Gemini produces?

pandas — More Capabilities

A taste of what else pandas can do:

# Combine two tables on a shared key (like VLOOKUP)
df = pd.merge(prices, company_info, on="ticker")

# Reshape: rows → columns (like a pivot table)
df.pivot_table(values="revenue", index="quarter", columns="region")

# Find missing data
df.isnull().sum()

# Summary statistics in one call
df.describe()

Each of these is a method call (Pattern ⑥). When AI generates code for data tasks, it will use pandas. Recognizing the method-chain pattern tells you what each line is doing — even if you don’t know the specific method name.
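The VLOOKUP comparison has one catch worth seeing in code: by default, pd.merge keeps only rows whose key appears in both tables, whereas VLOOKUP keeps the row and shows an error. The sketch below uses two tiny tables. The sector labels are illustrative, and GOOGL is left out of company_info on purpose.

```python
import pandas as pd

prices = pd.DataFrame({
    "ticker": ["AAPL", "MSFT", "GOOGL"],
    "price":  [182.50, 378.90, 141.80],
})
company_info = pd.DataFrame({
    "ticker": ["AAPL", "MSFT"],              # GOOGL deliberately missing
    "sector": ["Technology", "Technology"],  # illustrative labels
})

# Default (how="inner"): rows without a match are dropped.
inner = pd.merge(prices, company_info, on="ticker")

# how="left": keep every row of prices; missing matches become NaN.
left = pd.merge(prices, company_info, on="ticker", how="left")

print(len(inner), len(left))  # 2 3
print(left.isnull().sum())    # sector shows 1 missing value
```

If a merged table has fewer rows than expected, the how= argument is the first thing to check.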

Visualization

The same method-chaining pattern — different kind=:

# Distribution of one variable
df["revenue"].plot(kind="hist", bins=20)

# Comparison across categories
df.groupby("region")["revenue"].sum().plot(kind="bar")

# Relationship between two variables
df.plot(kind="scatter", x="units", y="revenue")

# Trend over time
df.set_index("date")["revenue"].plot(kind="line")

Different charts, same pattern: build a series or DataFrame, call .plot(kind=...). When you see a visualization in AI-generated code, look for this pattern — then look at what kind= is set to.

Another Dataset for Practice

stock_panel.xlsx — download link


30 S&P 500 stocks × 36 months (2022–2024) — 1,080 rows, 13 columns:

Column                              What it is
ticker, month, sector               Stock identifier, year-month, industry sector
return, momentum, lagged_return     Monthly return, 12-month momentum, prior-month return
marketcap_billions, pb              Market cap ($ billions), price-to-book ratio
roe, grossmargin, assetturnover     Return on equity, gross margin, asset turnover
gp_to_assets, asset_growth          Gross profit / assets, year-over-year asset growth

Exercises

  1. Compute summary statistics (mean, median, …) for all variables.
  2. Create a pie chart of market cap by firm for the last month of the sample.
  3. Create a pie chart of market cap by sector for the last month of the sample.
  4. Create a filled density plot for returns.
  5. Create boxplots of returns by sector.
  6. Create a barplot of mean monthly return by sector.

More Exercises

  1. Create boxplots of the returns of all firms in the technology sector.
  2. Create a filled line chart of the cumulative equally weighted portfolio return by month.
  3. Sort into quintiles each month by momentum. Compute the equally weighted return of each quintile each month. Create a barplot of the mean return of each quintile.
  4. Create a scatterplot of returns versus momentum. Overlay a regression line.
  5. Show the regression results (coefficients, \(t\)-statistics, \(R^2\), etc.) in a table.
  6. Compute the total return over the full time period of each stock and show the top 10 in a table.