A Practical Overview
The new reality
What this session gives you
Goal today: Build enough vocabulary to read and understand Python — not to write it from memory.
# profit_margins.py
products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000, 85_000, 200_000]
costs = [ 80_000, 60_000, 140_000]
for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")
Running this prints the profit margin for each product. By the end of this session you’ll be able to name every pattern in that code.
Two paths to execution
Script — a .py file, runs top to bottom
# profit_margins.py
products = ["Widget", "Gadget", "Doohickey"]
revenues = [120_000, 85_000, 200_000]
costs = [ 80_000, 60_000, 140_000]
for i in range(len(products)):
    margin = (revenues[i] - costs[i]) / revenues[i]
    print(f"{products[i]}: {margin:.1%} margin")
Run from terminal: python profit_margins.py
Same code — different containers. Notebooks let you run one piece at a time and inspect the result.
A library is a collection of pre-built code you can import and use.
Key libraries for business
| Library | What it does |
|---|---|
| pandas | Data tables: read, clean, reshape, summarize |
| numpy | Fast math on arrays of numbers |
| matplotlib | Charts and plots |
| seaborn | Statistical visualization |
Hundreds more exist, for machine learning (scikit-learn), optimization (scipy), web APIs, and more.
import pandas as pd means: load pandas and call it pd for short.
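The aliasing mechanism can be tried without installing anything. A minimal sketch, assuming the standard-library statistics module as a stand-in for pandas; the alias st is an illustrative choice, and the revenues are the ones from profit_margins.py:

```python
# "import X as Y" loads module X and binds it to the shorter name Y.
import statistics as st   # stand-in for: import pandas as pd

revenues = [120_000, 85_000, 200_000]

# Every function in the module is now reached through the alias.
average = st.mean(revenues)
middle = st.median(revenues)

print(f"mean:   {average:,.0f}")
print(f"median: {middle:,.0f}")
```

Reading tip: whenever a line starts with pd., np., or another short prefix, the import line at the top of the file tells you which library it belongs to.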
| # | Pattern | One-liner example |
|---|---|---|
| 1 | Objects & Types | 42 · 3.14 · "hello" · True |
| 2 | Variables | price = 182.50 |
| 3 | Collections | ["AAPL", "GOOGL"] · {"ticker": "AAPL"} |
| 4 | Conditionals | if revenue > target: ... |
| 5 | Loops | for company in portfolio: ... |
| 6 | Functions & Methods | len(items) · text.upper() |
You’ll also encounter
[x * 2 for x in prices] is Pattern 5 written in shorthand
try: ... except: ... guards code that might fail (bad data, missing file)
| Pattern | What to look for |
|---|---|
| Objects & Types | Literal values: 42, 3.14, "text", True / False |
| Variables | name = value — a single = sign |
| Collections | [...] = list (ordered) · {key: value, ...} = dictionary |
| Conditionals | if / elif / else followed by : and an indented block |
| Loops | for ... in ...: or while ...: followed by an indented block |
| Functions & Methods | name(...) = function · object.method(...) = method |
| List comprehension | [expr for item in collection] — compact loop inside brackets |
| try / except | try: + indented block, then except: + error handler |
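The last two rows of the table are easier to recognize after seeing them next to the long form. A small sketch with made-up values; naming ValueError in the except line is a choice made here, since a bare except: would catch every kind of error:

```python
prices = [182.50, 141.80, 99.95]   # illustrative values

# Pattern 5 written out as a full loop...
doubled_loop = []
for x in prices:
    doubled_loop.append(x * 2)

# ...and the same loop as a list comprehension.
doubled = [x * 2 for x in prices]
print(doubled == doubled_loop)

# try / except: guard a conversion that fails on bad data.
raw_values = ["42", "3.14", "n/a"]
cleaned = []
for text in raw_values:
    try:
        cleaned.append(float(text))
    except ValueError:
        # float("n/a") raises ValueError; record the gap and keep going
        cleaned.append(None)
print(cleaned)
```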
A single = sign stores a value under a name: price = 182.50
How to read it
Read price = 182.50 as “price gets 182.50” — not “price equals 182.50.” The right side is computed first, then stored in the name on the left.
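The "right side first" rule matters most when the same name appears on both sides. A short sketch; shares and cost are illustrative names that do not come from the session materials:

```python
price = 182.50          # "price gets 182.50"
shares = 10

# The right side is evaluated first (182.50 * 10), then stored in cost.
cost = price * shares

# The same name can sit on both sides: the old value is read,
# 5 is added, and the result replaces the old value.
price = price + 5

print(cost)    # still based on the old price
print(price)
```

Read as an equation, price = price + 5 would be nonsense; read as "price gets price plus 5", it is an ordinary update.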
Patterns
name = value · 0.25 is a float · if / else with indented blocks · print(...) called twice
Patterns
[...] · total, value, i · for i in range(...): · len(...), range(...), print(...) · total += value inside loop
Patterns
[{...}, ...] · for loop in brackets · values, total · sum(...), print(...)
Patterns
def keyword, parameters, return · margin, rating · 0.30, 0.15; strings · if / elif / else inside the function · assess_margin(500_000, 320_000)
Patterns
for company in companies: · clean, has_suffix · .title() on a string object · if has_suffix: · any(...), print(...)
Patterns
import pandas as pd · df, tech, summary, top · df[df["sector"] == "Technology"] · .groupby(...)[...].sum(), .idxmax(), .read_excel() · print(...)
Go to this link in your browser
colab.research.google.com/github/kerryback/workshop_python/blob/main/python_overview_exercises.ipynb
Google Colab is a free, browser-based Python notebook — nothing to install.
Three things you need to know
That’s it. Everything else — writing the code, explaining errors, adding new cells — you can ask Gemini.
How to get code from Gemini
Copy and paste this into Gemini
“Write a Python function that calculates the future value of an investment given a starting amount, annual return rate, and number of years. Then call the function to show how $10,000 grows at 7% over 10, 20, and 30 years. Print the results in a formatted table.”
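For orientation, here is one plausible shape of the answer. It is a sketch written for these notes, not Gemini's actual output: the name future_value and its parameter names are assumptions, and annual compounding is assumed.

```python
def future_value(amount, annual_rate, years):
    """Value of `amount` after `years` of annual compounding."""
    return amount * (1 + annual_rate) ** years


start = 10_000
rate = 0.07

# Header row, then one formatted row per horizon
print(f"{'Years':>5}  {'Future value':>14}")
for years in [10, 20, 30]:
    value = future_value(start, rate, years)
    print(f"{years:>5}  {value:>14,.2f}")
```

Whatever Gemini returns will differ in names and layout, but the same pieces should be present.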
After Gemini generates the code, look for:
def block: the function definition (Pattern ⑥)
** for exponentiation in the compounding math (Pattern ①)
print() with f-string formatting (Pattern ⑥)
One-time setup: two steps
Once your Drive is mounted, your file is at /content/drive/MyDrive/sales.xlsx — that’s the path you’ll pass to pd.read_excel.
Run this cell first — then use df in all the prompts below:
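The cell itself is not reproduced in these notes. A minimal sketch of what it might contain, assuming Drive is already mounted and the file sits at the path given above. The fallback branch is an addition for illustration only: it rebuilds just the two sample rows listed in the table below so the sketch also runs outside Colab.

```python
from pathlib import Path

import pandas as pd

path = Path("/content/drive/MyDrive/sales.xlsx")

if path.exists():
    # Normal case in Colab once Drive is mounted
    df = pd.read_excel(path)
else:
    # Assumption for illustration: stand-in built from the two sample rows
    df = pd.DataFrame({
        "region":  ["East", "West"],
        "product": ["Widget", "Gadget"],
        "revenue": [42000, 31500],
        "units":   [210, 180],
    })

print(df.head())   # first rows, to confirm the load worked
```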
Think of df as an Excel table.
What’s in the data
10 rows · 4 columns: region, product, revenue, units
| region | product | revenue | units |
|---|---|---|---|
| East | Widget | 42000 | 210 |
| West | Gadget | 31500 | 180 |
| … | … | … | … |
Copy and paste this into Gemini
“The variable df is a pandas DataFrame with columns: region, product, revenue, and units. Compute the total revenue and total units for each region. Sort by total revenue from highest to lowest and display the result.”
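One plausible response, sketched for these notes rather than copied from Gemini. The stand-in df is an assumption: its first two rows match the sample rows shown earlier and the other two are invented so that each region has more than one row.

```python
import pandas as pd

# Stand-in for df (rows 3 and 4 are invented for illustration)
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 100, 150],
})

totals = (
    df.groupby("region")[["revenue", "units"]]   # one group per region
      .sum()                                     # add up both columns
      .sort_values("revenue", ascending=False)   # highest revenue first
)
print(totals)
```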
After Gemini generates the code, look for:
df.groupby("region") for grouping (Pattern ⑥ method)
.agg(...) or .sum() for aggregating (Pattern ⑥)
.sort_values(...) for sorting (Pattern ⑥)
Copy and paste this into Gemini
“Using df (columns: region, product, revenue, units), filter to rows where revenue exceeds 30,000. For those rows, add a new column called price_per_unit equal to revenue divided by units. Display the filtered table sorted by price_per_unit.”
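A sketch of the kind of code this prompt tends to produce, not Gemini's actual output. The stand-in df is an assumption (two rows from the sample shown earlier plus one invented low-revenue row), and high is an illustrative name.

```python
import pandas as pd

# Stand-in for df (the third row is invented for illustration)
df = pd.DataFrame({
    "region":  ["East", "West", "East"],
    "product": ["Widget", "Gadget", "Gadget"],
    "revenue": [42000, 31500, 18000],
    "units":   [210, 180, 100],
})

# Keep rows above the threshold; .copy() makes an independent table
# so the new column is not written into a slice of df.
high = df[df["revenue"] > 30000].copy()

# New column computed from two existing columns, row by row
high["price_per_unit"] = high["revenue"] / high["units"]

high = high.sort_values("price_per_unit")
print(high)
```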
After Gemini generates the code, look for:
df[df["revenue"] > 30000] as a conditional filter (Pattern ④)
df["price_per_unit"] = ... to create a new column (Pattern ②)
.sort_values("price_per_unit") (Pattern ⑥)
Copy and paste this into Gemini
“Using df (columns: region, product, revenue, units), for each region find the product with the highest revenue. Show the result as a neat table with one row per region.”
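One way the answer could look, using the idxmax() plus .loc[...] route; Gemini may choose apply() or a sort instead. The stand-in df is an assumption (first two rows from the sample shown earlier, the rest invented), and top_rows and top are illustrative names.

```python
import pandas as pd

# Stand-in for df (rows 3 and 4 are invented for illustration)
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 100, 150],
})

# For each region, idxmax() returns the row label of the largest revenue
top_rows = df.groupby("region")["revenue"].idxmax()

# .loc uses those labels to pull the matching rows and chosen columns
top = df.loc[top_rows, ["region", "product", "revenue"]].reset_index(drop=True)
print(top)
```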
After Gemini generates the code, look for:
groupby combined with idxmax() or apply() (Pattern ⑥)
.loc[...] to look up rows by index (Pattern ⑥)
Copy and paste this into Gemini
df.groupby("region")["revenue"].sum().sort_values(ascending=False).plot(kind="bar")
Read left to right — each method returns an object the next method acts on.
| Step | Code | Returns |
|---|---|---|
| 1 | df.groupby("region") | grouped data object |
| 2 | ["revenue"] | revenue values per group |
| 3 | .sum() | total revenue per region |
| 4 | .sort_values(ascending=False) | sorted Series |
| 5 | .plot(kind="bar") | bar chart |
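The table can be checked by breaking the chain apart and printing the type of each intermediate object. A sketch on a stand-in df, which is an assumption (first two rows from the sample shown earlier, the rest invented); the plotting step is left as a comment because it needs matplotlib.

```python
import pandas as pd

# Stand-in for df (rows 3 and 4 are invented for illustration)
df = pd.DataFrame({
    "region":  ["East", "West", "East", "West"],
    "product": ["Widget", "Gadget", "Gadget", "Widget"],
    "revenue": [42000, 31500, 18000, 27000],
    "units":   [210, 180, 100, 150],
})

step1 = df.groupby("region")                 # grouped data object
step2 = step1["revenue"]                     # revenue values per group
step3 = step2.sum()                          # total revenue per region
step4 = step3.sort_values(ascending=False)   # sorted Series

steps = [("groupby", step1), ("[revenue]", step2),
         ("sum", step3), ("sort_values", step4)]
for name, obj in steps:
    print(f"{name:<12} -> {type(obj).__name__}")

# step4.plot(kind="bar") would draw the chart (requires matplotlib)
```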
Pattern ⑥ in action. Under the hood, pandas runs these operations as precompiled C code — the “already compiled, runs fast” path from the earlier diagram.
Copy and paste this into Gemini
“Using df (columns: region, product, revenue, units): add a column called price_per_unit (revenue ÷ units), then compute total revenue and mean price per unit grouped by product, sort by total revenue highest first, and plot a bar chart of total revenue by product titled ‘Revenue by Product’.”
This is the synthesis prompt. In roughly 10 lines you should find all 6 patterns:
df["price_per_unit"] = df["revenue"] / df["units"] # ② variable ① math
result = (df.groupby("product")                          # ⑥ method chain
            .agg(total_revenue=("revenue", "sum"),
                 mean_price=("price_per_unit", "mean"))
            .sort_values("total_revenue", ascending=False))
result["total_revenue"].plot(kind="bar",                 # ⑥ method
                             title="Revenue by Product")
Can you name the pattern for every line Gemini produces?
A taste of what else pandas can do:
# Combine two tables on a shared key (like VLOOKUP)
df = pd.merge(prices, company_info, on="ticker")
# Reshape: rows → columns (like a pivot table)
df.pivot_table(values="revenue", index="quarter", columns="region")
# Find missing data
df.isnull().sum()
# Summary statistics in one call
df.describe()
Each of these is a method call (Pattern ⑥). When AI generates code for data tasks, it will usually use pandas. Recognizing the method-chain pattern tells you what each line is doing, even if you don’t know the specific method name.
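The merge and missing-data calls can be tried on two tiny tables. Everything in this sketch is hypothetical: the tickers AAA, BBB, and CCC, the prices, and the sectors are invented for illustration.

```python
import pandas as pd

# Hypothetical tables sharing a "ticker" column
prices = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],
    "price":  [10.0, 20.0, 30.0],
})
company_info = pd.DataFrame({
    "ticker": ["AAA", "BBB"],
    "sector": ["Retail", "Energy"],
})

# Default merge keeps only tickers found in both tables
merged = pd.merge(prices, company_info, on="ticker")

# how="left" keeps every row of prices, like a VLOOKUP that leaves
# a blank where no match exists
left = pd.merge(prices, company_info, on="ticker", how="left")

# Count the blanks per column
missing = left.isnull().sum()

print(merged)
print(left)
print(missing)
```

The only difference between the two merges is the how= argument, which is worth checking whenever a merged table has fewer rows than expected.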
The same method-chaining pattern — different kind=:
# Distribution of one variable
df["revenue"].plot(kind="hist", bins=20)
# Comparison across categories
df.groupby("region")["revenue"].sum().plot(kind="bar")
# Relationship between two variables
df.plot(kind="scatter", x="units", y="revenue")
# Trend over time
df.set_index("date")["revenue"].plot(kind="line")
Different charts, same pattern: build a Series or DataFrame, call .plot(kind=...). When you see a visualization in AI-generated code, look for this pattern, then look at what kind= is set to.
stock_panel.xlsx — download link
30 S&P 500 stocks × 36 months (2022–2024) — 1,080 rows, 13 columns:
| Column | What it is |
|---|---|
| ticker, month, sector | Stock identifier, year-month, industry sector |
| return, momentum, lagged_return | Monthly return, 12-month momentum, prior-month return |
| marketcap_billions, pb | Market cap ($ billions), price-to-book ratio |
| roe, grossmargin, assetturnover | Return on equity, gross margin, asset turnover |
| gp_to_assets, asset_growth | Gross profit / assets, year-over-year asset growth |