# plot

```python
from hotcoco.plot import (
    report,
    pr_curve, pr_curve_iou_sweep, pr_curve_by_category, pr_curve_top_n,
    confusion_matrix, top_confusions, per_category_ap, tide_errors,
    reliability_diagram, comparison_bar, category_deltas,
    style, SERIES_COLORS, CHROME, SEQUENTIAL,
)
```

Requires `pip install hotcoco[plot]` (matplotlib >= 3.5).
All functions share these common parameters:

| Parameter | Type | Description |
|---|---|---|
| `theme` | `str` | Visual theme: `"cold-brew"` (default), `"warm-slate"`, `"scientific-blue"`, or `"ember"`. |
| `paper_mode` | `bool` | Set both figure and axes backgrounds to white. Useful for LaTeX / PowerPoint. Default `False`. |
| `ax` | `Axes \| None` | Draw on an existing axes. If `None`, creates a new figure. |
| `save_path` | `str \| Path \| None` | Save the figure to this path (150 DPI). |

All functions return `(Figure, Axes)`.
## pr_curve_iou_sweep

```python
pr_curve_iou_sweep(
    coco_eval, *,
    iou_thrs=None, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot one precision-recall curve per IoU threshold, with precision averaged across all categories. The primary line (lowest IoU) gets an under-fill and an F1 peak annotation.
| Parameter | Type | Description |
|---|---|---|
| `coco_eval` | `COCOeval` | Must have `run()` called first. |
| `iou_thrs` | `list[float] \| None` | IoU thresholds to include. Default: all thresholds in `params`. |
| `area_rng` | `str` | Area range: `"all"`, `"small"`, `"medium"`, `"large"`. Default `"all"`. |
| `max_det` | `int \| None` | Max detections. Default: last entry in `params.max_dets`. |
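The F1 peak marks the operating point that maximizes F1 = 2·P·R / (P + R) along the curve. A minimal plain-Python sketch of that computation, using hypothetical PR samples (not the library's internals):

```python
def f1_peak(recalls, precisions):
    """Return (recall, precision, f1) at the point that maximizes F1 along a PR curve."""
    best = (0.0, 0.0, 0.0)
    for r, p in zip(recalls, precisions):
        if p + r > 0:
            f1 = 2 * p * r / (p + r)
            if f1 > best[2]:
                best = (r, p, f1)
    return best

# Hypothetical samples along a PR curve
recalls = [0.2, 0.5, 0.8, 0.95]
precisions = [0.98, 0.9, 0.7, 0.4]
r, p, f1 = f1_peak(recalls, precisions)  # peak at recall 0.8, precision 0.7
```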
## pr_curve_by_category

```python
pr_curve_by_category(
    coco_eval, cat_id, *,
    iou_thr=0.5, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot the precision-recall curve for a single category at a fixed IoU threshold, with an F1 peak annotation.
| Parameter | Type | Description |
|---|---|---|
| `coco_eval` | `COCOeval` | Must have `run()` called first. |
| `cat_id` | `int` | Category ID to plot. |
| `iou_thr` | `float` | IoU threshold. Default 0.5. |
| `area_rng` | `str` | Area range. Default `"all"`. |
| `max_det` | `int \| None` | Max detections. Default: last entry in `params.max_dets`. |
## pr_curve_top_n

```python
pr_curve_top_n(
    coco_eval, *,
    cat_ids=None, top_n=10, iou_thr=0.5, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot precision-recall curves for multiple categories on one axes. When `cat_ids` is omitted, the top `top_n` categories by AP are selected automatically.
| Parameter | Type | Description |
|---|---|---|
| `coco_eval` | `COCOeval` | Must have `run()` called first. |
| `cat_ids` | `list[int] \| None` | Categories to plot. Default: top `top_n` by AP. |
| `top_n` | `int` | Number of top categories when `cat_ids` is omitted. Default 10. |
| `iou_thr` | `float` | IoU threshold. Default 0.5. |
| `area_rng` | `str` | Area range. Default `"all"`. |
| `max_det` | `int \| None` | Max detections. Default: last entry in `params.max_dets`. |
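The automatic selection reduces to ranking categories by AP and truncating. A plain-Python sketch of that rule (the AP dict is hypothetical, not the library's source):

```python
def top_n_by_ap(ap_by_cat, top_n=10):
    """Return category ids sorted by AP descending, truncated to top_n."""
    ranked = sorted(ap_by_cat.items(), key=lambda kv: kv[1], reverse=True)
    return [cat_id for cat_id, _ in ranked[:top_n]]

# Hypothetical per-category AP values
aps = {1: 0.42, 2: 0.71, 3: 0.55, 4: 0.30}
chosen = top_n_by_ap(aps, top_n=2)  # → [2, 3]
```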
## pr_curve

```python
pr_curve(
    coco_eval, *,
    iou_thrs=None, cat_id=None, cat_ids=None,
    iou_thr=None, top_n=10, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Convenience dispatcher: inspects the arguments and calls the appropriate named function. Prefer calling the named functions directly for clarity.

- No `cat_id` or `cat_ids` → `pr_curve_iou_sweep`
- `cat_id` set → `pr_curve_by_category`
- `cat_ids` or `iou_thr` set → `pr_curve_top_n`
| Parameter | Type | Description |
|---|---|---|
| `coco_eval` | `COCOeval` | Must have `run()` called first. |
| `iou_thrs` | `list[float] \| None` | IoU thresholds to plot (IoU sweep mode). |
| `cat_id` | `int \| None` | Single category to plot. |
| `cat_ids` | `list[int] \| None` | Categories to compare. |
| `iou_thr` | `float \| None` | Fixed IoU for single/multi-category modes. Default 0.50. |
| `top_n` | `int` | Top N categories by AP. Default 10. |
| `area_rng` | `str` | Area range: `"all"`, `"small"`, `"medium"`, `"large"`. |
| `max_det` | `int \| None` | Max detections. Default: last in `params`. |
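The dispatch rules amount to a simple precedence check. A hypothetical illustration of that logic (not the library's actual source):

```python
def dispatch_mode(cat_id=None, cat_ids=None, iou_thr=None):
    """Name the pr_curve_* function the dispatcher would call for these arguments."""
    if cat_id is not None:
        return "pr_curve_by_category"
    if cat_ids is not None or iou_thr is not None:
        return "pr_curve_top_n"
    return "pr_curve_iou_sweep"

dispatch_mode()               # → "pr_curve_iou_sweep"
dispatch_mode(cat_id=3)       # → "pr_curve_by_category"
dispatch_mode(cat_ids=[1, 2]) # → "pr_curve_top_n"
```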
## confusion_matrix

```python
confusion_matrix(
    cm_dict, *,
    normalize=True, top_n=None,
    group_by=None, cat_groups=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot a confusion matrix heatmap.
| Parameter | Type | Description |
|---|---|---|
| `cm_dict` | `dict` | Output of `coco_eval.confusion_matrix()`. |
| `normalize` | `bool` | Row-normalize values. Default `True`. |
| `top_n` | `int \| None` | Show only the top N categories by confusion mass. Auto-set to 25 when K > 30. |
| `group_by` | `str \| None` | `"supercategory"` to aggregate by group. |
| `cat_groups` | `dict \| None` | Group name → list of category names. Required with `group_by`. |
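Row normalization divides each row by its sum, so each cell reads as the fraction of that ground-truth class predicted as each class. A minimal sketch of the idea, over a hypothetical 2×2 matrix:

```python
def row_normalize(matrix):
    """Row-normalize a confusion matrix; all-zero rows stay zero."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

cm = [[8, 2], [1, 9]]       # hypothetical raw counts
norm = row_normalize(cm)    # → [[0.8, 0.2], [0.1, 0.9]]
```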
## top_confusions

```python
top_confusions(
    cm_dict, *,
    top_n=20,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot the top N misclassifications as horizontal bars, showing "ground truth → prediction" pairs sorted by count.
| Parameter | Type | Description |
|---|---|---|
| `cm_dict` | `dict` | Output of `coco_eval.confusion_matrix()`. |
| `top_n` | `int` | Number of confusions to show. Default 20. |
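Conceptually the bars come from the off-diagonal cells of the confusion matrix, sorted by count. A hypothetical sketch over a small matrix (not the library's internals):

```python
def top_confusion_pairs(matrix, names, top_n=3):
    """Off-diagonal (gt, pred, count) triples, largest count first."""
    pairs = [
        (names[i], names[j], matrix[i][j])
        for i in range(len(matrix))
        for j in range(len(matrix))
        if i != j and matrix[i][j] > 0
    ]
    return sorted(pairs, key=lambda t: t[2], reverse=True)[:top_n]

cm = [[50, 4, 1], [7, 40, 2], [0, 3, 60]]   # hypothetical counts
names = ["cat", "dog", "bird"]
worst = top_confusion_pairs(cm, names, top_n=2)
# → [("dog", "cat", 7), ("cat", "dog", 4)]
```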
## per_category_ap

```python
per_category_ap(
    results_dict, *,
    top_n=20, bottom_n=5,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot per-category AP as horizontal bars with a mean-AP reference line.
| Parameter | Type | Description |
|---|---|---|
| `results_dict` | `dict` | Output of `coco_eval.results(per_class=True)`. |
| `top_n` | `int` | Top categories to show. Default 20. |
| `bottom_n` | `int` | Bottom categories to show. Default 5. |
## tide_errors

```python
tide_errors(
    tide_dict, *,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot the TIDE error breakdown as horizontal bars.
| Parameter | Type | Description |
|---|---|---|
| `tide_dict` | `dict` | Output of `coco_eval.tide_errors()`. |
## reliability_diagram

```python
reliability_diagram(
    cal_or_eval, *,
    n_bins=10, iou_threshold=0.5,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)
```

Plot a reliability diagram: predicted confidence vs. actual accuracy per bin, with a perfect-calibration diagonal and a gap overlay.
| Parameter | Type | Description |
|---|---|---|
| `cal_or_eval` | `dict \| COCOeval` | Either the output of `ev.calibration()` (a dict) or a `COCOeval` instance. If a `COCOeval` is passed, `calibration()` is called automatically. |
| `n_bins` | `int` | Number of bins (only used when `cal_or_eval` is a `COCOeval`). Default 10. |
| `iou_threshold` | `float` | IoU threshold (only used when `cal_or_eval` is a `COCOeval`). Default 0.5. |
```python
from hotcoco.plot import reliability_diagram

ev = COCOeval(coco_gt, coco_dt, "bbox")
ev.evaluate()

# From a calibration dict
cal = ev.calibration(n_bins=15)
fig, ax = reliability_diagram(cal)

# Or directly from a COCOeval
fig, ax = reliability_diagram(ev, n_bins=15)
```
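Under the hood, a reliability diagram bins detections by confidence and compares each bin's mean confidence to its accuracy; the gap overlay visualizes the difference, whose count-weighted average is the expected calibration error. A self-contained sketch of that binning, with hypothetical scores (not the library's implementation):

```python
def reliability_bins(confidences, correct, n_bins=5):
    """Per-bin (mean confidence, accuracy, count) plus expected calibration error."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # clamp c == 1.0 into the last bin
        bins[idx].append((c, ok))
    stats, ece, total = [], 0.0, len(confidences)
    for b in bins:
        if not b:
            stats.append(None)  # empty bin
            continue
        conf = sum(c for c, _ in b) / len(b)
        acc = sum(ok for _, ok in b) / len(b)
        stats.append((conf, acc, len(b)))
        ece += len(b) / total * abs(conf - acc)
    return stats, ece

# Hypothetical detection scores and match outcomes (1 = true positive)
stats, ece = reliability_bins([0.95, 0.9, 0.1], [1, 0, 0], n_bins=2)
```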
## comparison_bar

```python
comparison_bar(
    compare_result: dict, *,
    theme: str = "cold-brew",
    paper_mode: bool = False,
    ax=None,
    save_path: str | Path | None = None,
) -> tuple[Figure, Axes]
```

Grouped bar chart comparing all metrics between two models. When the compare result includes bootstrap CIs, error bars are drawn on the model B bars.
```python
from hotcoco import compare
from hotcoco.plot import comparison_bar

result = compare(ev_a, ev_b, n_bootstrap=1000)
fig, ax = comparison_bar(result, save_path="comparison.png")
```
## category_deltas

```python
category_deltas(
    compare_result: dict, *,
    top_k: int = 20,
    theme: str = "cold-brew",
    paper_mode: bool = False,
    ax=None,
    save_path: str | Path | None = None,
) -> tuple[Figure, Axes]
```

Horizontal bar chart of per-category AP deltas (B − A), sorted by magnitude. Green bars are improvements, red bars are regressions. Shows the top `top_k` categories from each end.
```python
from hotcoco import compare
from hotcoco.plot import category_deltas

result = compare(ev_a, ev_b)
fig, ax = category_deltas(result, top_k=10, save_path="deltas.png")
```
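The underlying ordering is just the per-category delta sorted by absolute magnitude. A simplified plain-Python sketch with hypothetical AP dicts (it ranks by |delta| overall rather than taking both ends):

```python
def category_delta_order(ap_a, ap_b, top_k=20):
    """Per-category AP deltas (B - A), largest absolute change first."""
    deltas = {cat: ap_b[cat] - ap_a[cat] for cat in ap_a}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]

# Hypothetical per-category APs for models A and B
ap_a = {"cat": 0.50, "dog": 0.40, "bird": 0.62}
ap_b = {"cat": 0.58, "dog": 0.35, "bird": 0.61}
ranked = category_delta_order(ap_a, ap_b)  # "cat" improved most, "bird" barely moved
```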
## report

```python
report(
    coco_eval, *,
    save_path,
    gt_path=None,
    dt_path=None,
    title="COCO Evaluation Report",
)
```

Generate a publication-quality single-page PDF report. Requires `pip install hotcoco[plot]`.
The report contains:
- Header — title and timestamp
- Run context — GT/DT file paths, eval params, and dataset statistics (images, annotations, categories, detections)
- Summary metrics — AP and AR tables with a PR-curve panel and KPI tiles
- Per-category AP — bar chart sorted descending, three columns
The metric rows adapt automatically to the evaluation mode:
| Mode | AP rows | AR rows |
|---|---|---|
| bbox / segm | AP AP50 AP75 APs APm APl | AR1 AR10 AR100 ARs ARm ARl |
| keypoints | AP AP50 AP75 APm APl | AR AR50 AR75 ARm ARl |
| LVIS | AP AP50 AP75 APs APm APl APr APc APf | AR@300 ARs@300 ARm@300 ARl@300 |
| Parameter | Type | Description |
|---|---|---|
| `coco_eval` | `COCOeval` | Must have `run()` called first. |
| `save_path` | `str \| Path` | Output PDF path. |
| `gt_path` | `str \| None` | Ground-truth JSON path shown in the run context block. |
| `dt_path` | `str \| None` | Detections JSON path shown in the run context block. |
| `title` | `str` | Report title shown in the header. Default `"COCO Evaluation Report"`. |
Returns None. Raises on I/O error or if run() was not called first.
## Themes

Four built-in themes:

| Theme | Character |
|---|---|
| `"cold-brew"` | Default. Warm off-white background, 10-color infographic palette (alternating warm/cool). |
| `"warm-slate"` | Warm off-white background, terracotta + slate series colors. |
| `"scientific-blue"` | Cool/academic. Light blue-grey background, navy + red anchor colors. |
| `"ember"` | Warm/editorial. Parchment background, rust + copper + amber palette. |
Pass paper_mode=True to set figure and axes backgrounds to white, keeping all other theme colors intact. Useful when embedding plots in LaTeX documents or PowerPoint slides.
```python
# Academic paper
fig, ax = pr_curve(ev, theme="scientific-blue", paper_mode=True, save_path="pr.pdf")

# Warm editorial style
fig, ax = per_category_ap(results, theme="ember", save_path="ap.png")
```
Use the `style()` context manager to apply a theme to your own matplotlib code:

```python
import matplotlib.pyplot as plt

from hotcoco.plot import style

with style(theme="scientific-blue", paper_mode=True):
    fig, ax = plt.subplots()
    ax.plot(recall, precision)
    fig.savefig("custom.pdf")
```
## Color palette

The cold-brew theme constants are available for custom plots:

```python
from hotcoco.plot import SERIES_COLORS, CHROME, SEQUENTIAL
```

- `SERIES_COLORS`: 10 infographic-optimized data-series colors (fjord, kiln, fern, maize, plum, patina, rose, moss, slate, sienna)
- `CHROME`: non-data element colors (text, label, tick, grid, spine, background)
- `SEQUENTIAL`: 3-stop colormap for heatmaps (stone cream → fjord blue → deep navy)