import pandas as pd
import plotly.express as px
url = 'https://faculty.utrgv.edu/diego.escobari/teaching/Datasets/WAGE1.xls'
wages = pd.read_excel(url, header=None)

Plotly creates interactive visualizations that allow users to zoom, pan, hover for details, and explore data dynamically. This can be powerful for presentations.
To use an interactive figure in a presentation, we can call fig.write_html to write it to an HTML file and then link to that file from a slide.
This creates a clickable link within your presentation that opens the HTML file in your default browser.
Hover data: Show details when hovering over data points - great for explaining differences between categories
Hide/Show plot elements: Click on legend items to hide/show elements - useful for building up explanations step by step
Animation: Show how data evolves over time with play button controls
Pan/Zoom: All Plotly figures include controls at the top right for panning, zooming, and other interactions
We’ll use the wages data to demonstrate Plotly visualizations.
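Because the file is read with header=None, the columns arrive with integer labels, while the preparation code refers to them by name. The sketch below shows one way to name them. It assumes the file follows the standard Wooldridge WAGE1 column order, which should be checked against the dataset's description file. The helper name_wage_columns is hypothetical and is demonstrated on a placeholder frame.

```python
import pandas as pd

# Assumed column order (standard Wooldridge WAGE1 layout); verify before relying on it.
WAGE1_COLUMNS = [
    "wage", "educ", "exper", "tenure", "nonwhite", "female", "married",
    "numdep", "smsa", "northcen", "south", "west", "construc", "ndurman",
    "trcommpu", "trade", "services", "profserv", "profocc", "clerocc",
    "servocc", "lwage", "expersq", "tenursq",
]


def name_wage_columns(df):
    """Hypothetical helper: return a copy of df with WAGE1 column names."""
    if df.shape[1] != len(WAGE1_COLUMNS):
        raise ValueError(
            f"expected {len(WAGE1_COLUMNS)} columns, got {df.shape[1]}"
        )
    named = df.copy()
    named.columns = WAGE1_COLUMNS
    return named


# Placeholder frame with integer column labels, as header=None would produce.
placeholder = pd.DataFrame([[0] * len(WAGE1_COLUMNS)])
named = name_wage_columns(placeholder)
```

With the real data, this would be wages = name_wage_columns(wages) right after reading the file.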
# Convert binary variables to categorical
wages['Gender'] = wages['female'].map({0: 'Male', 1: 'Female'})
wages['Race'] = wages['nonwhite'].map({0: 'White', 1: 'Non-white'})
wages['Marital_Status'] = wages['married'].map({0: 'Not Married', 1: 'Married'})
wages['Urban'] = wages['smsa'].map({0: 'Rural', 1: 'Urban'})

# Create industry categories
def get_industry(row):
    """Return the label of the first industry dummy equal to 1."""
    if row['construc'] == 1:
        return 'Construction'
    elif row['ndurman'] == 1:
        return 'Non-durable Manufacturing'
    elif row['trcommpu'] == 1:
        return 'Transportation/Communications'
    elif row['trade'] == 1:
        return 'Trade'
    elif row['services'] == 1:
        return 'Services'
    elif row['profserv'] == 1:
        return 'Professional Services'
    else:
        return 'Other'

wages['Industry'] = wages.apply(get_industry, axis=1)

# Create education and experience categories
# include_lowest=True keeps values equal to the first bin edge (0);
# without it, pd.cut would turn those rows into NaN
wages['Education_Level'] = pd.cut(wages['educ'],
                                  bins=[0, 12, 16, 25],
                                  labels=['High School or Less', 'Some College', 'College Plus'],
                                  include_lowest=True)
wages['Experience_Level'] = pd.cut(wages['exper'],
                                   bins=[0, 5, 15, 60],
                                   labels=['Low (0-5)', 'Mid (6-15)', 'High (16+)'],
                                   include_lowest=True)
# Drop unnecessary columns
wages = wages.drop(columns=wages.columns[4:-9])
# Create combined Gender_Race column
wages['Gender_Race'] = [
    x + ' ' + y for x, y in zip(wages['Gender'], wages['Race'])
]

fig = px.box(wages,
             x='Education_Level',
             y='wage',
             title='Wage Distribution by Gender and Race',
             labels={'Gender_Race': 'Demographic Group',
                     'wage': 'Hourly Wage ($)',
                     'Education_Level': ''},
             template='plotly_white',
             color='Gender_Race')
fig.update_layout(
    height=400,
    xaxis_tickangle=-45,
)
fig.show()

Exercise 1 (with Gemini): Ask Gemini to “create an interactive Plotly box plot comparing values across different categories.”
Exercise 2 (on your own): Type the following three lines and run them:
import plotly.express as px
fig = px.scatter(x=[1, 2, 3], y=[4, 5, 6])
fig.show()
To save your Plotly figure as an HTML file for use in presentations:
This creates an interactive HTML file that can be: