diff --git a/lectures/_toc.yml b/lectures/_toc.yml
index 97c429c4..56b5015c 100644
--- a/lectures/_toc.yml
+++ b/lectures/_toc.yml
@@ -30,6 +30,7 @@ parts:
chapters:
- file: pandas
- file: pandas_panel
+ - file: polars
- caption: More Python Programming
numbered: true
chapters:
diff --git a/lectures/pandas.md b/lectures/pandas.md
index cec984bf..0270acd9 100644
--- a/lectures/pandas.md
+++ b/lectures/pandas.md
@@ -78,6 +78,7 @@ You can think of a `Series` as a "column" of data, such as a collection of obser
A `DataFrame` is a two-dimensional object for storing related columns of data.
+(pd-series)=
## Series
```{index} single: Pandas; Series
diff --git a/lectures/polars.md b/lectures/polars.md
new file mode 100644
index 00000000..e0acffca
--- /dev/null
+++ b/lectures/polars.md
@@ -0,0 +1,800 @@
+---
+jupytext:
+ text_representation:
+ extension: .md
+ format_name: myst
+ format_version: 0.13
+ jupytext_version: 1.16.7
+kernelspec:
+ display_name: Python 3 (ipykernel)
+ language: python
+ name: python3
+---
+
+(pl)=
+# Polars
+
+```{index} single: Python; Polars
+```
+
+In addition to what's in Anaconda, this lecture will need the following libraries:
+
+```{code-cell} ipython3
+:tags: [hide-output]
+
+!pip install --upgrade polars yfinance
+```
+
+## Overview
+
+[Polars](https://pola.rs/) is a fast data manipulation library for Python written in Rust.
+
+It has gained significant popularity as a modern alternative to {doc}`pandas` due to its performance advantages.
+
+Polars is designed with performance and memory efficiency in mind, leveraging:
+
+* [Apache Arrow columnar format](https://arrow.apache.org/docs/format/Columnar.html) for fast data access
+* [Lazy evaluation](https://en.wikipedia.org/wiki/Lazy_evaluation) to optimize query execution
+* Parallel processing to utilize all available CPU cores
+* An expressive API built around column expressions
+
+```{tip}
+*Why consider Polars over pandas?*
+
+* **Memory**: pandas typically needs 5--10x your dataset size in RAM, while Polars usually needs only 2--4x
+* **Speed**: Polars can be 10--100x faster than pandas for many common operations
+* **See**: [Polars TPC-H benchmarks](https://www.pola.rs/benchmarks/) for up-to-date performance comparisons
+```
+
+Throughout the lecture, we will assume that the following imports have taken place
+
+```{code-cell} ipython3
+import polars as pl
+import numpy as np
+import matplotlib.pyplot as plt
+```
+
+Like {doc}`pandas`, Polars defines two important data types: `Series` and `DataFrame`.
+
+You can think of a `Series` as a column of data, such as a collection of observations on a single variable.
+
+A `DataFrame` is a two-dimensional object for storing related columns of data.
+
+## Series
+
+```{index} single: Polars; Series
+```
+
+Let's start with Series.
+
+We begin by creating a series of four random observations
+
+```{code-cell} ipython3
+s = pl.Series(name='daily returns', values=np.random.randn(4))
+s
+```
+
+```{note}
+Unlike {doc}`pandas` Series, Polars Series have no row index.
+Polars is column-centric --- data access is managed through column expressions
+and boolean masks rather than row labels.
+See [this blog post](https://medium.com/@luca.basanisi/understand-polars-lack-of-indexes-526ea75e413) for more detail.
+```
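+
+If you do want positional labels, one option (a quick sketch, reusing `s` from above) is to convert the series to a one-column frame and attach an explicit row counter
+
+```{code-cell} ipython3
+# No implicit row index, but an explicit counter is one call away
+s.to_frame().with_row_index()
+```
+
+The counter column is named `index` by default and simply numbers the rows from zero.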
+
+Polars `Series` are built on top of [Apache Arrow](https://arrow.apache.org/) arrays and support many familiar operations
+
+```{code-cell} ipython3
+s * 100
+```
+
+Absolute values are available as a method
+
+```{code-cell} ipython3
+s.abs()
+```
+
+We can also get quick summary statistics
+
+```{code-cell} ipython3
+s.describe()
+```
+
+Since Polars has no row index, labelled data requires a `DataFrame`.
+
+For example, to associate ticker symbols with returns:
+
+```{code-cell} ipython3
+df = pl.DataFrame({
+ 'company': ['AMZN', 'AAPL', 'MSFT', 'GOOG'],
+ 'daily returns': np.random.randn(4)
+})
+df
+```
+
+We access a value by filtering on a column expression
+
+```{code-cell} ipython3
+df.filter(
+ pl.col('company') == 'AMZN'
+).select('daily returns').item()
+```
+
+Updates also use expressions rather than index assignment
+
+```{code-cell} ipython3
+df = df.with_columns(
+ pl.when(pl.col('company') == 'AMZN')
+ .then(0)
+ .otherwise(pl.col('daily returns'))
+ .alias('daily returns')
+)
+df
+```
+
+We can also check membership
+
+```{code-cell} ipython3
+'AAPL' in df['company']
+```
+
+## DataFrames
+
+```{index} single: Polars; DataFrames
+```
+
+While a `Series` is a single column of data, a `DataFrame` is several columns, one for each variable.
+
+As in {doc}`pandas`, let's work with data from the [Penn World Tables](https://www.rug.nl/ggdc/productivity/pwt/pwt-releases/pwt-7.0).
+
+We read this in using `pl.read_csv`
+
+```{code-cell} ipython3
+url = ('https://raw.githubusercontent.com/QuantEcon/'
+ 'lecture-python-programming/main/lectures/_static/'
+ 'lecture_specific/pandas/data/test_pwt.csv')
+df = pl.read_csv(url)
+df
+```
+
+### Selecting data
+
+We can select rows by slicing and columns by name
+
+```{code-cell} ipython3
+df[2:5]
+```
+
+To select specific columns, pass a list of names to `select`
+
+```{code-cell} ipython3
+df.select(['country', 'tcgdp'])
+```
+
+These can be combined
+
+```{code-cell} ipython3
+df[2:5].select(['country', 'tcgdp'])
+```
+
+### Filtering by conditions
+
+The `filter` method accepts boolean expressions built from `pl.col`
+
+```{code-cell} ipython3
+df.filter(pl.col('POP') >= 20000)
+```
+
+Multiple conditions can be combined with `&` (and) and `|` (or)
+
+```{code-cell} ipython3
+df.filter(
+ (pl.col('country').is_in(['Argentina', 'India', 'South Africa'])) &
+ (pl.col('POP') > 40000)
+)
+```
+
+Expressions can involve arithmetic across columns
+
+```{code-cell} ipython3
+df.filter(
+ (pl.col('cc') + pl.col('cg') >= 80) & (pl.col('POP') <= 20000)
+)
+```
+
+Select the country with the largest household consumption share
+
+```{code-cell} ipython3
+df.filter(pl.col('cc') == pl.col('cc').max())
+```
+
+### Column expressions
+
+A key difference from pandas is that Polars uses **column expressions** for transformations rather than element-wise `apply` calls.
+
+Here is an example computing the max of each numeric column
+
+```{code-cell} ipython3
+df.select(
+ pl.col(['year', 'POP', 'XRAT', 'tcgdp', 'cc', 'cg'])
+ .max()
+ .name.suffix('_max')
+)
+```
+
+Expressions can be used inside `with_columns` to add or modify columns
+
+```{code-cell} ipython3
+df.with_columns(
+ (pl.col('XRAT') / 10).alias('XRAT_scaled'),
+ pl.col(pl.Float64).round(2)
+)
+```
+
+Conditional logic uses `pl.when(...).then(...).otherwise(...)`
+
+```{code-cell} ipython3
+df.with_columns(
+ pl.when(pl.col('POP') >= 20000)
+ .then(pl.col('POP'))
+ .otherwise(None)
+ .alias('POP_filtered')
+).select(['country', 'POP', 'POP_filtered'])
+```
+
+```{note}
+Polars provides `map_elements` as an escape hatch for applying arbitrary
+Python functions row-by-row, but it bypasses the optimized expression
+engine and should be avoided when a native expression exists.
+```
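+
+To see the difference concretely, here is a small sketch on a toy frame (`demo` is made up for illustration, not part of the dataset above) that doubles a column both ways
+
+```{code-cell} ipython3
+demo = pl.DataFrame({'x': [1.0, 2.0, 3.0]})
+demo.select(
+    (pl.col('x') * 2).alias('native'),       # vectorized expression
+    pl.col('x').map_elements(lambda v: v * 2,
+                             return_dtype=pl.Float64)
+      .alias('python')                       # row-by-row Python fallback
+)
+```
+
+Both columns hold the same values, but only the native expression runs through the optimized engine.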
+
+### Missing values
+
+Let's insert some null values to demonstrate imputation techniques
+
+```{code-cell} ipython3
+df_nulls = df.with_row_index().with_columns(
+ pl.when(pl.col('index') == 0)
+ .then(None).otherwise(pl.col('XRAT')).alias('XRAT'),
+ pl.when(pl.col('index') == 3)
+ .then(None).otherwise(pl.col('cc')).alias('cc'),
+ pl.when(pl.col('index') == 5)
+ .then(None).otherwise(pl.col('tcgdp')).alias('tcgdp'),
+ pl.when(pl.col('index') == 6)
+ .then(None).otherwise(pl.col('POP')).alias('POP'),
+).drop('index')
+df_nulls
+```
+
+Fill all nulls with zero
+
+```{code-cell} ipython3
+df_nulls.fill_null(0)
+```
+
+Or fill with column means
+
+```{code-cell} ipython3
+cols = ['cc', 'tcgdp', 'POP', 'XRAT']
+df_nulls.with_columns(
+ pl.col(cols).fill_null(pl.col(cols).mean())
+)
+```
+
+Polars also supports forward fill (`fill_null(strategy='forward')`) and interpolation.
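+
+As a small illustration on a toy series (the values are made up), here are both strategies side by side
+
+```{code-cell} ipython3
+s_gap = pl.Series('x', [1.0, None, None, 4.0])
+pl.DataFrame({
+    'forward': s_gap.fill_null(strategy='forward'),  # repeat last seen value
+    'interpolated': s_gap.interpolate(),             # linear interpolation
+})
+```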
+
+There are more [advanced imputation tools](https://scikit-learn.org/stable/modules/impute.html) available in scikit-learn.
+
+### Visualization
+
+Let's build a GDP per capita column and plot it
+
+```{code-cell} ipython3
+df = (df
+ .select(['country', 'POP', 'tcgdp'])
+ .rename({'POP': 'population', 'tcgdp': 'total GDP'})
+ .with_columns(
+ (pl.col('population') * 1e3).alias('population')
+ )
+ .with_columns(
+ (pl.col('total GDP') * 1e6 / pl.col('population'))
+ .alias('GDP percap')
+ )
+ .sort('GDP percap', descending=True)
+)
+df
+```
+
+We can extract columns directly for matplotlib
+
+```{note}
+Polars also provides a built-in [plotting API](https://docs.pola.rs/user-guide/misc/visualization/)
+based on Altair (e.g., `df.plot.bar(x=..., y=...)`).
+We use matplotlib here for consistency with the rest of the lecture series.
+```
+
+```{code-cell} ipython3
+fig, ax = plt.subplots()
+ax.bar(df['country'].to_list(), df['GDP percap'].to_list())
+ax.set_xlabel('country', fontsize=12)
+ax.set_ylabel('GDP per capita', fontsize=12)
+plt.xticks(rotation=45, ha='right')
+plt.tight_layout()
+plt.show()
+```
+
+## Lazy evaluation
+
+```{index} single: Polars; Lazy Evaluation
+```
+
+One of Polars' most powerful features is **lazy evaluation**.
+
+Instead of executing each operation immediately, lazy mode collects the full query plan and optimizes it before running.
+
+### Eager vs lazy
+
+```{code-cell} ipython3
+# Reload the dataset
+url = ('https://raw.githubusercontent.com/QuantEcon/'
+ 'lecture-python-programming/main/lectures/_static/'
+ 'lecture_specific/pandas/data/test_pwt.csv')
+df_full = pl.read_csv(url)
+```
+
+The **eager** API executes immediately (like pandas)
+
+```{code-cell} ipython3
+result_eager = (df_full
+ .filter(pl.col('tcgdp') > 1000)
+ .select(['country', 'year', 'tcgdp'])
+ .sort('tcgdp', descending=True)
+)
+result_eager.head()
+```
+
+The **lazy** API builds a query plan instead
+
+```{code-cell} ipython3
+lazy_query = (df_full.lazy()
+ .filter(pl.col('tcgdp') > 1000)
+ .select(['country', 'year', 'tcgdp'])
+ .sort('tcgdp', descending=True)
+)
+print(lazy_query.explain())
+```
+
+Call `collect` to execute the plan
+
+```{code-cell} ipython3
+result_lazy = lazy_query.collect()
+result_lazy.head()
+```
+
+### Query optimization
+
+The lazy engine applies several optimizations automatically:
+
+* **Predicate pushdown** --- filters are applied as early as possible
+* **Projection pushdown** --- only required columns are read from the source
+* **Common subexpression elimination** --- duplicate calculations are merged
+
+Let's see how Polars rewrites a multi-step query
+
+```{code-cell} ipython3
+optimized = (df_full.lazy()
+ .select(['country', 'year', 'tcgdp', 'POP'])
+ .filter(pl.col('tcgdp') > 500)
+ .with_columns(
+ (pl.col('tcgdp') / pl.col('POP')).alias('gdp_per_capita')
+ )
+ .filter(pl.col('gdp_per_capita') > 10)
+ .select(['country', 'year', 'gdp_per_capita'])
+)
+
+print("Optimized plan:")
+print(optimized.explain())
+```
+
+Executing the plan gives us the final result
+
+```{code-cell} ipython3
+optimized.collect()
+```
+
+### Performance comparison
+
+Let's compare pandas, Polars eager, and Polars lazy on the same task.
+
+We start with a small dataset (the Penn World Tables we used above) to show
+that for small data the differences are negligible
+
+```{code-cell} ipython3
+import pandas as pd
+import time
+
+# Small dataset -- Penn World Tables (~8 rows)
+url = ('https://raw.githubusercontent.com/QuantEcon/'
+ 'lecture-python-programming/main/lectures/_static/'
+ 'lecture_specific/pandas/data/test_pwt.csv')
+small_pd = pd.read_csv(url)
+small_pl = pl.read_csv(url)
+```
+
+Now we time the same filter-select-sort operation in each library
+
+```{code-cell} ipython3
+# pandas
+start = time.perf_counter()
+_ = (small_pd
+ .query('tcgdp > 500')
+ [['country', 'year', 'tcgdp', 'POP']]
+ .assign(gdp_pc=lambda d: d['tcgdp'] / d['POP'])
+ .sort_values('gdp_pc', ascending=False))
+pd_small = time.perf_counter() - start
+
+# Polars eager
+start = time.perf_counter()
+_ = (small_pl
+ .filter(pl.col('tcgdp') > 500)
+ .select(['country', 'year', 'tcgdp', 'POP'])
+ .with_columns((pl.col('tcgdp') / pl.col('POP')).alias('gdp_pc'))
+ .sort('gdp_pc', descending=True))
+pl_small = time.perf_counter() - start
+
+print(f"Small data -- pandas: {pd_small:.4f}s | Polars eager: {pl_small:.4f}s")
+```
+
+On a handful of rows the speed difference is immaterial --- use whichever
+API you find more convenient.
+
+Now let's scale up to 5 million rows, where the difference becomes clear.
+
+The task is: filter rows where `value > 0`, compute a weighted product
+`value * weight`, then take the mean of that product within each group ---
+a grouped weighted average.
+
+```{code-cell} ipython3
+n = 5_000_000
+np.random.seed(42)
+
+groups = np.random.choice(['A', 'B', 'C', 'D'], n)
+values = np.random.randn(n)
+weights = np.random.rand(n)
+extra1 = np.random.randn(n)
+extra2 = np.random.randn(n)
+
+big_pd = pd.DataFrame({
+ 'group': groups, 'value': values,
+ 'weight': weights, 'extra1': extra1, 'extra2': extra2
+})
+big_pl = pl.DataFrame({
+ 'group': groups, 'value': values,
+ 'weight': weights, 'extra1': extra1, 'extra2': extra2
+})
+```
+
+First, the pandas baseline
+
+```{code-cell} ipython3
+start = time.perf_counter()
+tmp = big_pd[big_pd['value'] > 0][['group', 'value', 'weight']].copy()
+tmp['weighted'] = tmp['value'] * tmp['weight']
+_ = tmp.groupby('group')['weighted'].mean()
+pd_time = time.perf_counter() - start
+print(f"pandas: {pd_time:.4f}s")
+```
+
+Next, Polars in eager mode
+
+```{code-cell} ipython3
+start = time.perf_counter()
+_ = (big_pl
+ .filter(pl.col('value') > 0)
+ .select(['group', 'value', 'weight'])
+ .with_columns(
+ (pl.col('value') * pl.col('weight')).alias('weighted'))
+ .group_by('group')
+ .agg(pl.col('weighted').mean()))
+eager_time = time.perf_counter() - start
+print(f"Polars eager: {eager_time:.4f}s")
+```
+
+And finally, Polars in lazy mode
+
+```{code-cell} ipython3
+start = time.perf_counter()
+_ = (big_pl.lazy()
+ .filter(pl.col('value') > 0)
+ .select(['group', 'value', 'weight'])
+ .with_columns(
+ (pl.col('value') * pl.col('weight')).alias('weighted'))
+ .group_by('group')
+ .agg(pl.col('weighted').mean())
+ .collect())
+lazy_time = time.perf_counter() - start
+print(f"Polars lazy: {lazy_time:.4f}s")
+```
+
+The takeaway:
+
+* For **small data** (thousands of rows), pandas and Polars perform
+ similarly --- choose based on API preference and ecosystem fit.
+* For **medium to large data** (hundreds of thousands of rows and above),
+ Polars can be significantly faster thanks to its Rust engine, parallel
+ execution, and (in lazy mode) query optimization.
+
+The lazy API is particularly powerful when reading from disk --- `scan_csv` returns a `LazyFrame` directly, so filters and projections are pushed down to the file reader.
+
+```{tip}
+Use `pl.scan_csv(path)` instead of `pl.read_csv(path)` when working with
+large CSV files.
+Only the columns and rows you actually need will be read from disk.
+See [the Polars I/O documentation](https://docs.pola.rs/user-guide/io/csv/).
+```
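+
+Here is a minimal sketch of the pattern using a small throwaway file (the path and toy frame are made up for illustration)
+
+```{code-cell} ipython3
+import os
+import tempfile
+
+# Write a tiny CSV, then scan it lazily: the filter and the
+# column selection are pushed down into the CSV reader
+path = os.path.join(tempfile.gettempdir(), 'polars_scan_demo.csv')
+pl.DataFrame({'a': [1, 2, 3, 4], 'b': [10, 20, 30, 40]}).write_csv(path)
+
+(pl.scan_csv(path)
+   .filter(pl.col('a') > 2)
+   .select('a')
+   .collect())
+```
+
+Because `scan_csv` returns a `LazyFrame`, column `b` and the rows failing the filter never need to be materialized in memory.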
+
+## On-line data sources
+
+```{index} single: Data Sources
+```
+
+As in {doc}`pandas`, Python makes it straightforward to query online databases.
+
+An important database for economists is [FRED](https://fred.stlouisfed.org/) --- a vast collection of time series data maintained by the St. Louis Fed.
+
+Polars' `read_csv` can fetch data from a URL directly.
+
+We use `try_parse_dates=True` to parse the date column automatically
+
+```{code-cell} ipython3
+fred_url = ('https://fred.stlouisfed.org/graph/fredgraph.csv?'
+ 'bgcolor=%23e1e9f0&chart_type=line&drp=0&'
+ 'fo=open%20sans&graph_bgcolor=%23ffffff&'
+ 'height=450&mode=fred&recession_bars=on&'
+ 'txtcolor=%23444444&ts=12&tts=12&width=1318&'
+ 'nt=0&thu=0&trc=0&show_legend=yes&'
+ 'show_axis_titles=yes&show_tooltip=yes&'
+ 'id=UNRATE&scale=left&cosd=1948-01-01&'
+ 'coed=2024-06-01&line_color=%234572a7&'
+ 'link_values=false&line_style=solid&'
+ 'mark_type=none&mw=3&lw=2&ost=-99999&'
+ 'oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&'
+ 'fgst=lin&fgsnd=2020-02-01&line_index=1&'
+ 'transformation=lin&vintage_date=2024-07-29&'
+ 'revision_date=2024-07-29&nd=1948-01-01')
+data = pl.read_csv(fred_url, try_parse_dates=True)
+```
+
+Let's inspect the first few rows
+
+```{code-cell} ipython3
+data.head()
+```
+
+And get summary statistics
+
+```{code-cell} ipython3
+data.describe()
+```
+
+Plot the unemployment rate from 2006 to 2012
+
+```{code-cell} ipython3
+filtered = data.filter(
+ (pl.col('observation_date') >= pl.date(2006, 1, 1)) &
+ (pl.col('observation_date') <= pl.date(2012, 12, 31))
+)
+
+fig, ax = plt.subplots()
+ax.plot(filtered['observation_date'].to_list(),
+ filtered['UNRATE'].to_list())
+ax.set_title('US Unemployment Rate')
+ax.set_xlabel('year', fontsize=12)
+ax.set_ylabel('%', fontsize=12)
+plt.show()
+```
+
+Polars supports [many file formats](https://docs.pola.rs/user-guide/io/) including Excel, JSON, Parquet, and direct database connections.
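+
+For instance, a frame can be round-tripped through Parquet in two calls (the file path and toy frame below are made up for illustration)
+
+```{code-cell} ipython3
+import os
+import tempfile
+
+path = os.path.join(tempfile.gettempdir(), 'polars_demo.parquet')
+demo = pl.DataFrame({'country': ['A', 'B'], 'gdp': [1.0, 2.0]})
+demo.write_parquet(path)   # columnar, compressed on-disk format
+pl.read_parquet(path)      # reads back with dtypes preserved
+```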
+
+## Exercises
+
+```{exercise-start}
+:label: pl_ex1
+```
+
+With these imports:
+
+```{code-cell} ipython3
+import datetime as dt
+import yfinance as yf
+```
+
+Write a program to calculate the percentage price change over 2021 for the following shares:
+
+```{code-cell} ipython3
+ticker_list = {'INTC': 'Intel',
+ 'MSFT': 'Microsoft',
+ 'IBM': 'IBM',
+ 'BHP': 'BHP',
+ 'TM': 'Toyota',
+ 'AAPL': 'Apple',
+ 'AMZN': 'Amazon',
+ 'C': 'Citigroup',
+ 'QCOM': 'Qualcomm',
+ 'KO': 'Coca-Cola',
+ 'GOOG': 'Google'}
+```
+
+Here's a function that reads closing prices into a Polars DataFrame:
+
+```{code-cell} ipython3
+def read_data_polars(ticker_list,
+ start=dt.datetime(2021, 1, 1),
+ end=dt.datetime(2021, 12, 31)):
+ """
+ Read closing price data from Yahoo Finance
+ and return a Polars DataFrame.
+ """
+ dataframes = []
+
+ for tick in ticker_list:
+ stock = yf.Ticker(tick)
+ prices = stock.history(start=start, end=end)
+ df = pl.DataFrame({
+ 'Date': list(prices.index.date),
+ tick: prices['Close'].values
+ }).with_columns(pl.col('Date').cast(pl.Date))
+ dataframes.append(df)
+
+    result = dataframes[0]
+    for df in dataframes[1:]:
+        result = result.join(
+            df, on='Date', how='full', coalesce=True
+        )
+    # A full join does not guarantee row order, so sort by date
+    # to make first()/last() correspond to the first/last trading day
+    return result.sort('Date')
+
+ticker = read_data_polars(ticker_list)
+```
+
+Complete the program to plot the result as a bar graph.
+
+```{exercise-end}
+```
+
+```{solution-start} pl_ex1
+:class: dropdown
+```
+
+Calculate percentage changes using Polars expressions:
+
+```{code-cell} ipython3
+price_change = ticker.select([
+ ((pl.col(tick).last() / pl.col(tick).first() - 1) * 100)
+ .alias(tick)
+ for tick in ticker_list.keys()
+]).transpose(
+ include_header=True,
+ header_name='ticker',
+ column_names=['pct_change']
+).with_columns(
+ pl.col('ticker')
+ .replace_strict(ticker_list, default=pl.col('ticker'))
+ .alias('company')
+).sort('pct_change')
+
+print(price_change)
+```
+
+Plot the results using matplotlib directly:
+
+```{code-cell} ipython3
+companies = price_change['company'].to_list()
+changes = price_change['pct_change'].to_list()
+colors = ['red' if x < 0 else 'blue' for x in changes]
+
+fig, ax = plt.subplots(figsize=(10, 8))
+ax.bar(companies, changes, color=colors)
+ax.set_xlabel('stock', fontsize=12)
+ax.set_ylabel('percentage change in price', fontsize=12)
+plt.xticks(rotation=45, ha='right')
+plt.tight_layout()
+plt.show()
+```
+
+```{solution-end}
+```
+
+
+```{exercise-start}
+:label: pl_ex2
+```
+
+Using `read_data_polars` from {ref}`pl_ex1`, obtain year-on-year percentage change for these indices:
+
+```{code-cell} ipython3
+indices_list = {'^GSPC': 'S&P 500',
+ '^IXIC': 'NASDAQ',
+ '^DJI': 'Dow Jones',
+ '^N225': 'Nikkei'}
+```
+
+Plot the result as a time series graph.
+
+```{exercise-end}
+```
+
+```{solution-start} pl_ex2
+:class: dropdown
+```
+
+```{code-cell} ipython3
+indices_data = read_data_polars(
+ indices_list,
+ start=dt.datetime(1971, 1, 1),
+ end=dt.datetime(2021, 12, 31)
+)
+
+indices_data = indices_data.with_columns(
+ pl.col('Date').dt.year().alias('year')
+)
+```
+
+Calculate yearly returns using group-by operations:
+
+```{code-cell} ipython3
+yearly_returns = indices_data.group_by('year').agg([
+ *[pl.col(idx).drop_nulls().first().alias(f'{idx}_first')
+ for idx in indices_list],
+ *[pl.col(idx).drop_nulls().last().alias(f'{idx}_last')
+ for idx in indices_list]
+])
+
+for idx, name in indices_list.items():
+ yearly_returns = yearly_returns.with_columns(
+ ((pl.col(f'{idx}_last') - pl.col(f'{idx}_first'))
+ / pl.col(f'{idx}_first') * 100).alias(name)
+ )
+
+yearly_returns = (yearly_returns
+ .select(['year', *indices_list.values()])
+ .sort('year')
+)
+print(yearly_returns)
+```
+
+Summary statistics:
+
+```{code-cell} ipython3
+yearly_returns.select(list(indices_list.values())).describe()
+```
+
+Plot each index in a subplot:
+
+```{code-cell} ipython3
+fig, axes = plt.subplots(2, 2, figsize=(12, 10))
+years = yearly_returns['year'].to_list()
+
+for iter_, ax in enumerate(axes.flatten()):
+ name = list(indices_list.values())[iter_]
+ values = yearly_returns[name].to_list()
+ ax.plot(years, values, 'o-', linewidth=2, markersize=4)
+ ax.axhline(y=0, color='k', linestyle='--', alpha=0.3)
+ ax.set_ylabel('yearly return (%)', fontsize=12)
+ ax.set_xlabel('year', fontsize=12)
+ ax.set_title(name, fontsize=12)
+
+plt.tight_layout()
+plt.show()
+```
+
+```{solution-end}
+```
+