Python for MBA Students

A Practical Overview

Kerry Back, Rice University

AI Writes the Code. You Read It.

The new reality

GitHub Copilot, ChatGPT, and Claude generate working Python in seconds
You describe what you want — AI writes the code
This is already how many analysts and managers work

What this session gives you

Recognize what Python code is doing
Understand the output well enough to judge whether it’s right
Know enough vocabulary to ask AI better questions
Follow along when a colleague walks through code

Goal today: Build enough vocabulary to read and understand Python — not to write it from memory.

Here’s What Python Looks Like

# profit_margins.py

products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000,  85_000,  200_000]
costs    = [ 80_000,  60_000,  140_000]

for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")

Running this prints the profit margin for each product. By the end of this session you’ll be able to name every pattern in that code.

How Python Runs

Two paths to execution

Interpreted: Python reads your code line by line and runs it — easy to write, slightly slower
Pre-built libraries: the performance-critical parts of pandas and NumPy are written in C or Cython and compiled (converted to machine code the CPU can read directly) ahead of time. Python calls that compiled code, which runs very fast.
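The gap between the two paths can be glimpsed without installing anything. The sketch below assumes the standard CPython interpreter, where the built-in sum is itself compiled C code. The helper python_loop_total is a hypothetical name written for this illustration, and the timings will differ from machine to machine.

```python
import timeit

numbers = list(range(100_000))

def python_loop_total(values):
    """Add the values one at a time in interpreted Python (hypothetical helper)."""
    total = 0
    for v in values:
        total += v
    return total

# Time 20 repetitions of each approach on the same list.
loop_seconds = timeit.timeit(lambda: python_loop_total(numbers), number=20)
builtin_seconds = timeit.timeit(lambda: sum(numbers), number=20)

print(f"Interpreted loop: {loop_seconds:.4f} seconds")
print(f"Built-in sum:     {builtin_seconds:.4f} seconds")
```

Both approaches return the same total; only the time taken differs.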

Scripts and Notebooks

Script — a .py file, runs top to bottom

# profit_margins.py
products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000,  85_000,  200_000]
costs    = [ 80_000,  60_000,  140_000]

for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")

Run from terminal: python profit_margins.py

Notebook — cells you run one at a time

Cell 1:

products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000,  85_000,  200_000]
costs    = [ 80_000,  60_000,  140_000]

Cell 2:

for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")

Same code — different containers. Notebooks let you run one piece at a time and inspect the result.

Libraries

What Is a Library?

A library is a collection of pre-built code you can import and use.

Key libraries for business

Library	What it does
`pandas`	Data tables — read, clean, reshape, summarize
`numpy`	Fast math on arrays of numbers
`matplotlib`	Charts and plots
`seaborn`	Statistical visualization

Hundreds more exist — for machine learning (scikit-learn), optimization (scipy), web APIs, and more. import pandas as pd means: load pandas and call it pd for short.
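The import ... as ... shorthand works the same way for any library. As a small illustration that needs nothing installed, the sketch below uses Python's built-in statistics module. The alias st is an arbitrary choice made for this example, and the revenue figures are reused from the opening slide.

```python
import statistics as st  # load the statistics library and call it st for short

revenues = [120_000, 85_000, 200_000]

# Functions that live inside the library are reached through the alias.
mean_revenue = st.mean(revenues)
median_revenue = st.median(revenues)

print(f"Mean revenue:   {mean_revenue:,.0f}")
print(f"Median revenue: {median_revenue:,.0f}")
```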

The 6 Patterns

Almost All Python Is Built From 6 Patterns

#	Pattern	One-liner example
1	Objects & Types	`42` · `3.14` · `"hello"` · `True`
2	Variables	`price = 182.50`
3	Collections	`["AAPL", "GOOGL"]` · `{"ticker": "AAPL"}`
4	Conditionals	`if revenue > target: ...`
5	Loops	`for company in portfolio: ...`
6	Functions & Methods	`len(items)` · `text.upper()`

You’ll also encounter

List comprehensions — a compact loop on one line: [x * 2 for x in prices] — this is Pattern 5 written in shorthand
try / except — error handling: try: ... except: ... — guards code that might fail (bad data, missing file)
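Both extras can be seen side by side in a short sketch. The prices are made-up numbers, and safe_margin is a hypothetical function name chosen for this illustration.

```python
prices = [10.0, 20.0, 30.0]

# Pattern 5 written the long way: an explicit loop.
doubled_loop = []
for x in prices:
    doubled_loop.append(x * 2)

# The same loop written as a list comprehension.
doubled_short = [x * 2 for x in prices]

def safe_margin(revenue, costs):
    """Return the profit margin, or None when revenue is zero (hypothetical helper)."""
    try:
        return (revenue - costs) / revenue
    except ZeroDivisionError:
        # Dividing by zero would normally stop the program; here we recover.
        return None

print(doubled_loop == doubled_short)
print(safe_margin(100, 60))
print(safe_margin(0, 10))
```

Naming the specific error after except (here ZeroDivisionError) keeps unrelated mistakes from being silently hidden.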

How to Spot Each Pattern

Pattern	What to look for
Objects & Types	Literal values: `42`, `3.14`, `"text"`, `True` / `False`
Variables	`name = value` — a single `=` sign
Collections	`[...]` = list (ordered) · `{key: value, ...}` = dictionary
Conditionals	`if` / `elif` / `else` followed by `:` and an indented block
Loops	`for ... in ...:` or `while ...:` followed by an indented block
Functions & Methods	`name(...)` = function · `object.method(...)` = method
List comprehension	`[expr for item in collection]` — compact loop inside brackets
try / except	`try:` + indented block, then `except:` + error handler

Assignment Statements

A single = sign stores a value under a name:

price = 182.50
company = "Apple"
shares = 100
total = price * shares

How to read it

Read price = 182.50 as “price gets 182.50” — not “price equals 182.50.” The right side is computed first, then stored in the name on the left.
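A short sketch of why "gets" is the better reading: the same name can appear on both sides, and earlier results do not update themselves. The numbers reuse the values above.

```python
price = 182.50
shares = 100
total = price * shares   # right side computed first: 182.50 * 100

# The right side uses the old value of shares (100), then stores 150 in shares.
shares = shares + 50

# total was computed before shares changed, so it still holds the old result.
print(shares)
print(total)
```

Unlike a spreadsheet cell, total is not a live formula. It holds whatever value was computed when its line ran.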

Pattern Spotting

Example 1

revenue = 383_285_000_000
costs   = 223_546_000_000
profit  = revenue - costs
margin  = profit / revenue

if margin > 0.25:
    print(f"High margin: {margin:.1%}")
else:
    print(f"Lower margin: {margin:.1%}")

Patterns

② Variables — lines 1–4, each name = value
① Objects & Types — large integers; 0.25 is a float
④ Conditional — if / else with indented blocks
⑥ Functions — print(...) called twice

Example 2

tickers = ["AAPL", "GOOGL", "MSFT"]
prices  = [182.50, 141.80, 378.90]
shares  = [100, 50, 75]

total = 0
for i in range(len(tickers)):
    value = prices[i] * shares[i]
    total += value
    print(f"{tickers[i]}: ${value:,.0f}")

Patterns

③ Collections — three lists [...]
② Variables — total, value, i
⑤ Loop — for i in range(...):
⑥ Functions — len(...), range(...), print(...)
⑤ Accumulator — total += value inside loop

Example 3

portfolio = [
    {"ticker": "AAPL", "shares": 100, "price": 182.50},
    {"ticker": "MSFT", "shares":  75, "price": 378.90},
    {"ticker": "GOOGL","shares":  50, "price": 141.80},
]

values = [h["shares"] * h["price"] for h in portfolio]
total  = sum(values)
print(f"Portfolio value: ${total:,.2f}")

Patterns

③ Collections — list of dictionaries [{...}, ...]
List comprehension — compact for loop in brackets
② Variables — values, total
⑥ Functions — sum(...), print(...)
① Objects — strings and floats as dict values

Example 4

def assess_margin(revenue, costs):
    margin = (revenue - costs) / revenue
    if margin > 0.30:
        return "strong"
    elif margin > 0.15:
        return "adequate"
    else:
        return "weak"

rating = assess_margin(500_000, 320_000)
print(f"Margin rating: {rating}")

Patterns

⑥ Function def — def keyword, parameters, return
② Variables — margin, rating
① Objects — floats 0.30, 0.15; strings
④ Conditional — if / elif / else inside the function
⑥ Function call — assess_margin(500_000, 320_000)

Example 5

companies = ["apple inc.", "microsoft corp.", "alphabet inc."]
suffixes  = ["inc.", "corp.", "ltd."]

for company in companies:
    clean = company.title()
    has_suffix = any(s in company for s in suffixes)
    if has_suffix:
        print(f"  {clean}  ✓")
    else:
        print(f"  {clean}")

Patterns

③ Collections — two lists
⑤ Loop — for company in companies:
② Variables — clean, has_suffix
⑥ Method — .title() on a string object
④ Conditional — if has_suffix:
⑥ Functions — any(...), print(...)

Example 6

import pandas as pd

df = pd.read_excel("sales.xlsx")

tech    = df[df["sector"] == "Technology"]
summary = tech.groupby("region")["revenue"].sum()
top     = summary.idxmax()

print(f"Top tech region: {top}")
print(f"Revenue: ${summary[top]:,.0f}")

Patterns

Library — import pandas as pd
② Variables — df, tech, summary, top
④ Conditional filter — df[df["sector"] == "Technology"]
⑥ Method chain — .groupby(...)[...].sum()
⑥ Method — .idxmax() on the summary Series
⑥ Functions — pd.read_excel(...) (a function provided by the pandas library), print(...)

Colab + Gemini

Open the Companion Notebook

Go to this link in your browser

colab.research.google.com/github/kerryback/workshop_python/blob/main/python_overview_exercises.ipynb

Google Colab is a free, browser-based Python notebook — nothing to install.

The notebook has one section per pattern — ① through ⑥
Run the cells

Colab Basics

Three things you need to know

Add a code cell — click the + Code button at the top, or hover between cells and click + Code
Run a cell — press Shift + Enter (runs the cell and moves to the next)
See the output — it appears directly below the cell

That’s it. Everything else — writing the code, explaining errors, adding new cells — you can ask Gemini.

Using Gemini in Colab

How to get code from Gemini

Click the Gemini sparkle icon (top right) — or open a new code cell
Type your prompt describing what you want Python to do
Gemini generates code — click Insert to add it to a cell
Shift + Enter to run

Example

Copy and paste this into Gemini

“Write a Python function that calculates the future value of an investment given a starting amount, annual return rate, and number of years. Then call the function to show how $10,000 grows at 7% over 10, 20, and 30 years. Print the results in a formatted table.”

After Gemini generates the code, look for:

A def block — the function definition (Pattern ⑥)
** — exponentiation for compounding math (Pattern ①)
A loop to call the function for each time horizon (Pattern ⑤)
print() with f-string formatting (Pattern ⑥)
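For reference, here is a minimal sketch of the kind of code Gemini might return for this prompt. Its actual output will vary; the function name future_value and the variable names are assumptions made for this illustration, and annual compounding is assumed.

```python
def future_value(amount, rate, years):
    """Grow a starting amount at a fixed annual rate, compounded once a year."""
    return amount * (1 + rate) ** years

horizons = [10, 20, 30]
results = {}

# Header row, then one formatted row per time horizon.
print(f"{'Years':>5} {'Future value':>15}")
for years in horizons:
    fv = future_value(10_000, 0.07, years)
    results[years] = fv
    print(f"{years:>5} {fv:>15,.2f}")
```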

Working with Data

Uploading Data

One-time setup — two steps

Download this data file as usual (not in Colab): sales.xlsx — save it to your Google Drive
Ask Gemini: “Mount my Google Drive so I can access files from it”

Once your Drive is mounted, your file is at /content/drive/MyDrive/sales.xlsx — that’s the path you’ll pass to pd.read_excel.
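The code Gemini suggests for mounting will likely resemble the sketch below. It only does something inside Colab, so the import is wrapped in a guard that lets the cell run harmlessly elsewhere; data_path is a hypothetical variable name.

```python
# google.colab is only available inside a Colab notebook.
try:
    from google.colab import drive
except ImportError:
    drive = None  # running outside Colab: skip the mount step

if drive is not None:
    # Colab will ask for permission to access your Google Drive.
    drive.mount("/content/drive")

# Path to the file once the Drive is mounted (as given above).
data_path = "/content/drive/MyDrive/sales.xlsx"
print(data_path)
```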

Load the Sales Data

Run this cell first — then use df in all the prompts below:

import pandas as pd
df = pd.read_excel("/content/drive/MyDrive/sales.xlsx")
df

Think of df as an Excel table.

What’s in the data

10 rows · 4 columns: region, product, revenue, units

region	product	revenue	units
East	Widget	42000	210
West	Gadget	31500	180
…	…	…	…

Example 1

Copy and paste this into Gemini

“The variable df is a pandas DataFrame with columns: region, product, revenue, and units. Compute the total revenue and total units for each region. Sort by total revenue from highest to lowest and display the result.”

After Gemini generates the code, look for:

df.groupby("region") — grouping (Pattern ⑥ method)
.agg(...) or .sum() — aggregating (Pattern ⑥)
.sort_values(...) — sorting (Pattern ⑥)
A new variable storing the result (Pattern ②)
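One plausible shape for the result is sketched below; Gemini's version may differ. So that the sketch runs on its own, it builds a tiny stand-in table: the first two rows come from the preview above, the other two are invented for illustration, and region_totals is a hypothetical variable name.

```python
import pandas as pd

# Stand-in for the sales data: rows 3 and 4 are invented for this sketch.
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 90, 150],
})

region_totals = (
    df.groupby("region")[["revenue", "units"]]   # group rows by region
      .sum()                                     # add up each group
      .sort_values("revenue", ascending=False)   # highest revenue first
)
print(region_totals)
```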

Example 2: Filter and Enrich

Copy and paste this into Gemini

“Using df (columns: region, product, revenue, units), filter to rows where revenue exceeds 30,000. For those rows, add a new column called price_per_unit equal to revenue divided by units. Display the filtered table sorted by price_per_unit.”

After Gemini generates the code, look for:

df[df["revenue"] > 30000] — conditional filter (Pattern ④)
df["price_per_unit"] = ... — creating a new column (Pattern ②)
Division inside the expression (Pattern ①)
.sort_values("price_per_unit") (Pattern ⑥)
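A minimal sketch of what to expect, run on a tiny stand-in table so it is self-contained. The first two rows come from the data preview, the other two are invented, and the variable name high is an assumption. The .copy() call is a common precaution so that the new column is added to a separate table rather than to a slice of df.

```python
import pandas as pd

# Stand-in for the sales data: rows 3 and 4 are invented for this sketch.
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 90, 150],
})

# Keep only the rows where the condition is True.
high = df[df["revenue"] > 30000].copy()

# Create a new column from two existing ones, then sort by it.
high["price_per_unit"] = high["revenue"] / high["units"]
high = high.sort_values("price_per_unit")
print(high)
```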

Example 3: Best Product per Region

Copy and paste this into Gemini

“Using df (columns: region, product, revenue, units), for each region find the product with the highest revenue. Show the result as a neat table with one row per region.”

After Gemini generates the code, look for:

groupby combined with idxmax() or apply() (Pattern ⑥)
.loc[...] to look up rows by index (Pattern ⑥)
Variable names storing intermediate results (Pattern ②)
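One common way to write this uses idxmax() to find the row label of the largest revenue in each region, then .loc to pull those rows. The sketch below is one possibility, not necessarily what Gemini produces; the table is a small stand-in (two rows from the preview, two invented), and best_rows and best are hypothetical names.

```python
import pandas as pd

# Stand-in for the sales data: rows 3 and 4 are invented for this sketch.
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 90, 150],
})

# For each region, the row label where revenue is largest.
best_rows = df.groupby("region")["revenue"].idxmax()

# Look up those rows and keep the columns of interest.
best = df.loc[best_rows, ["region", "product", "revenue"]].reset_index(drop=True)
print(best)
```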

Example 4: Method Chaining

Copy and paste this into Gemini

df.groupby("region")["revenue"].sum().sort_values(ascending=False).plot(kind="bar")

Read left to right — each method returns an object the next method acts on.

Step	Code	Returns
1	`df.groupby("region")`	grouped data object
2	`["revenue"]`	revenue values per group
3	`.sum()`	total revenue per region
4	`.sort_values(ascending=False)`	sorted Series
5	`.plot(kind="bar")`	bar chart

Pattern ⑥ in action. Under the hood, pandas runs the grouping and summing as precompiled code: the "already compiled, runs fast" path described under How Python Runs.
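To see what each step returns, the chain can be unrolled into one variable per step. This sketch stops before the plot, uses a tiny stand-in table (two rows from the preview, two invented), and the intermediate names are hypothetical.

```python
import pandas as pd

# Stand-in for the sales data: rows 3 and 4 are invented for this sketch.
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 90, 150],
})

grouped = df.groupby("region")                 # step 1: grouped data object
revenue_by_group = grouped["revenue"]          # step 2: revenue values per group
totals = revenue_by_group.sum()                # step 3: total revenue per region
ranked = totals.sort_values(ascending=False)   # step 4: sorted Series

# The one-line chain produces exactly the same result.
chained = df.groupby("region")["revenue"].sum().sort_values(ascending=False)
print(ranked.equals(chained))
```

Inspecting an intermediate variable such as totals in a notebook cell is a quick way to check a chain that AI has written.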

Example 5: Full Pipeline with Chart

Copy and paste this into Gemini

“Using df (columns: region, product, revenue, units): add a column called price_per_unit (revenue ÷ units), then compute total revenue and mean price per unit grouped by product, sort by total revenue highest first, and plot a bar chart of total revenue by product titled ‘Revenue by Product’.”

This is the synthesis prompt. In roughly 10 lines you should find several of the 6 patterns; a typical result, shown below, uses ①, ② and ⑥:

df["price_per_unit"] = df["revenue"] / df["units"]   # ② variable  ① math
result = (df.groupby("product")                       # ⑥ method chain
            .agg(total_revenue=("revenue","sum"),
                 mean_price=("price_per_unit","mean"))
            .sort_values("total_revenue", ascending=False))
result["total_revenue"].plot(kind="bar",              # ⑥ method
                             title="Revenue by Product")

Can you name the pattern for every line Gemini produces?

pandas — More Capabilities

A taste of what else pandas can do:

# Combine two tables on a shared key (like VLOOKUP)
df = pd.merge(prices, company_info, on="ticker")

# Reshape: rows → columns (like a pivot table)
df.pivot_table(values="revenue", index="quarter", columns="region")

# Find missing data
df.isnull().sum()

# Summary statistics in one call
df.describe()

Each of these is a method call (Pattern ⑥). When AI generates code for data tasks, it will use pandas. Recognizing the method-chain pattern tells you what each line is doing — even if you don’t know the specific method name.
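The merge and missing-data calls can be tried on two tiny tables. Everything in this sketch is illustrative: the tables prices and company_info are built by hand, the sector labels are assumptions, and one price is deliberately left blank so the missing-data check has something to find.

```python
import pandas as pd

# Two small hand-built tables that share a "ticker" column.
prices = pd.DataFrame({
    "ticker": ["AAPL", "MSFT", "GOOGL"],
    "price":  [182.50, 378.90, None],   # GOOGL price left missing on purpose
})
company_info = pd.DataFrame({
    "ticker": ["AAPL", "MSFT", "GOOGL"],
    "sector": ["Technology", "Technology", "Communication Services"],
})

# Combine the tables on the shared key (like VLOOKUP).
merged = pd.merge(prices, company_info, on="ticker")

# Count missing values in each column.
missing_counts = merged.isnull().sum()

print(merged)
print(missing_counts)
```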

Visualization

The same method-chaining pattern — different kind=:

# Distribution of one variable
df["revenue"].plot(kind="hist", bins=20)

# Comparison across categories
df.groupby("region")["revenue"].sum().plot(kind="bar")

# Relationship between two variables
df.plot(kind="scatter", x="units", y="revenue")

# Trend over time
df.set_index("date")["revenue"].plot(kind="line")

Different charts, same pattern: build a series or DataFrame, call .plot(kind=...). When you see a visualization in AI-generated code, look for this pattern — then look at what kind= is set to.

Another Dataset for Practice

stock_panel.xlsx — download link

30 S&P 500 stocks × 36 months (2022–2024) — 1,080 rows, 13 columns:

Column	What it is
`ticker`, `month`, `sector`	Stock identifier, year-month, industry sector
`return`, `momentum`, `lagged_return`	Monthly return, 12-month momentum, prior-month return
`marketcap_billions`, `pb`	Market cap ($ billions), price-to-book ratio
`roe`, `grossmargin`, `assetturnover`	Return on equity, gross margin, asset turnover
`gp_to_assets`, `asset_growth`	Gross profit / assets, year-over-year asset growth

Exercises

Compute summary statistics (mean, median, …) for all variables.
Create a pie chart of marketcap by firm for the last month of the sample.
Create a pie chart of marketcap by sector for the last month of the sample.
Create a filled density plot for returns.
Create boxplots of returns by sector.
Create a barplot of mean monthly return by sector.

More Exercises

Create boxplots of the returns of all firms in the technology sector.
Create a filled line chart of the cumulative equally weighted portfolio return by month.
Sort into quintiles each month by momentum. Compute the equally weighted return of each quintile each month. Create a barplot of the mean return of each quintile.
Create a scatterplot of returns versus momentum. Overlay a regression line.
Show the regression results (coefficients, $t$-statistics, $R^2$, etc.) in a table.
Compute the total return over the full time period of each stock and show the top 10 in a table.