plot

from hotcoco.plot import (
    report,
    pr_curve, pr_curve_iou_sweep, pr_curve_by_category, pr_curve_top_n,
    confusion_matrix, top_confusions, per_category_ap, tide_errors,
    reliability_diagram, comparison_bar, category_deltas,
    style, SERIES_COLORS, CHROME, SEQUENTIAL,
)

Requires pip install hotcoco[plot] (matplotlib >= 3.5).

All functions share these common parameters:

Parameter Type Description
theme str Visual theme: "cold-brew" (default), "warm-slate", "scientific-blue", or "ember".
paper_mode bool Set both figure and axes background to white. Useful for LaTeX / PowerPoint. Default False.
ax Axes | None Draw on an existing axes. If None, creates a new figure.
save_path str | Path | None Save figure to this path (150 DPI).

All functions return (Figure, Axes).


pr_curve_iou_sweep

pr_curve_iou_sweep(
    coco_eval, *,
    iou_thrs=None, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot one precision-recall curve per IoU threshold, with precision averaged across all categories. The primary line (lowest IoU) is drawn with an under-fill and an F1 peak annotation.
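The F1 peak marks the point on the curve that maximizes F1 = 2PR / (P + R). A minimal sketch of that search (illustrative only, not hotcoco's code; the `f1_peak` helper is hypothetical):

```python
# Hypothetical illustration: the F1 peak is the point on a PR curve
# that maximizes F1 = 2PR / (P + R).
def f1_peak(precision, recall):
    """Return (best_f1, precision, recall) at the F1-maximizing point."""
    best = (0.0, 0.0, 0.0)
    for p, r in zip(precision, recall):
        f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
        if f1 > best[0]:
            best = (f1, p, r)
    return best

f1_peak([1.0, 0.9, 0.6], [0.2, 0.5, 0.9])  # peak at precision 0.6, recall 0.9
```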

Parameter Type Description
coco_eval COCOeval Must have run() called first.
iou_thrs list[float] | None IoU thresholds to include. Default: all thresholds in params.
area_rng str Area range: "all", "small", "medium", "large". Default "all".
max_det int | None Max detections. Default: last entry in params.max_dets.

pr_curve_by_category

pr_curve_by_category(
    coco_eval, cat_id, *,
    iou_thr=0.5, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot the precision-recall curve for a single category at a fixed IoU threshold, with an F1 peak annotation.

Parameter Type Description
coco_eval COCOeval Must have run() called first.
cat_id int Category ID to plot.
iou_thr float IoU threshold. Default 0.5.
area_rng str Area range. Default "all".
max_det int | None Max detections. Default: last entry in params.max_dets.

pr_curve_top_n

pr_curve_top_n(
    coco_eval, *,
    cat_ids=None, top_n=10, iou_thr=0.5, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot precision-recall curves for multiple categories on a single axes. When cat_ids is omitted, the top_n categories with the highest AP are selected automatically.

Parameter Type Description
coco_eval COCOeval Must have run() called first.
cat_ids list[int] | None Categories to plot. Default: top top_n by AP.
top_n int Number of top categories when cat_ids is omitted. Default 10.
iou_thr float IoU threshold. Default 0.5.
area_rng str Area range. Default "all".
max_det int | None Max detections. Default: last entry in params.max_dets.
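The default category selection can be sketched as follows (a hypothetical helper, not hotcoco's source; the dict shape is assumed for illustration):

```python
# Hypothetical sketch of the default selection: with cat_ids omitted, keep
# the top_n categories with the highest per-category AP.
def select_top_n(ap_by_cat, top_n=10):
    """ap_by_cat: {cat_id: AP}. Returns up to top_n IDs, best AP first."""
    return sorted(ap_by_cat, key=ap_by_cat.get, reverse=True)[:top_n]

select_top_n({1: 0.42, 2: 0.88, 3: 0.61}, top_n=2)  # → [2, 3]
```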

pr_curve

pr_curve(
    coco_eval, *,
    iou_thrs=None, cat_id=None, cat_ids=None,
    iou_thr=None, top_n=10, area_rng="all", max_det=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Convenience dispatcher — inspects the arguments and calls the appropriate named function. Prefer calling the named functions directly for clarity.

  • No cat_id or cat_ids → pr_curve_iou_sweep
  • cat_id set → pr_curve_by_category
  • cat_ids or iou_thr set → pr_curve_top_n
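The dispatch rules above can be sketched as plain Python (illustrative, not hotcoco's actual source):

```python
# Sketch of the dispatch rules (illustrative). cat_id is checked first
# because pr_curve_by_category also accepts an iou_thr.
def dispatch(cat_id=None, cat_ids=None, iou_thr=None):
    if cat_id is not None:
        return "pr_curve_by_category"
    if cat_ids is not None or iou_thr is not None:
        return "pr_curve_top_n"
    return "pr_curve_iou_sweep"
```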
Parameter Type Description
coco_eval COCOeval Must have run() called first.
iou_thrs list[float] | None IoU thresholds to plot (IoU sweep mode).
cat_id int | None Single category to plot.
cat_ids list[int] | None Categories to compare.
iou_thr float | None Fixed IoU for single/multi-category modes. Default 0.50.
top_n int Top N categories by AP. Default 10.
area_rng str Area range: "all", "small", "medium", "large".
max_det int | None Max detections. Default: last in params.

confusion_matrix

confusion_matrix(
    cm_dict, *,
    normalize=True, top_n=None,
    group_by=None, cat_groups=None,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot a confusion matrix heatmap.

Parameter Type Description
cm_dict dict Output of coco_eval.confusion_matrix().
normalize bool Row-normalize values. Default True.
top_n int | None Show only top N categories by confusion mass. Auto-set to 25 when the matrix has more than 30 categories.
group_by str | None "supercategory" to aggregate by group.
cat_groups dict | None Group name → list of category names. Required with group_by.
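What normalize=True does can be illustrated with a minimal sketch (hypothetical, operating on a plain list-of-lists rather than the real cm_dict):

```python
# Hypothetical illustration of normalize=True: divide each row by its sum,
# so cell (i, j) reads "fraction of ground-truth class i predicted as j".
def row_normalize(matrix):
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

row_normalize([[8, 2], [1, 3]])  # → [[0.8, 0.2], [0.25, 0.75]]
```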

top_confusions

top_confusions(
    cm_dict, *,
    top_n=20,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot the top N misclassifications as horizontal bars. Shows "ground truth → prediction" pairs sorted by count.

Parameter Type Description
cm_dict dict Output of coco_eval.confusion_matrix().
top_n int Number of confusions to show. Default 20.
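The ranking behind the bars can be sketched as follows (hypothetical; the real cm_dict schema may differ, so the (gt, pred) → count mapping here is assumed):

```python
# Hypothetical sketch of the ranking: keep off-diagonal (gt, pred) cells
# and sort by count, largest first.
def rank_confusions(counts, top_n=20):
    """counts: {(gt_name, pred_name): count}. Returns the top_n worst pairs."""
    pairs = [(gt, pred, n) for (gt, pred), n in counts.items() if gt != pred]
    pairs.sort(key=lambda t: t[2], reverse=True)
    return pairs[:top_n]
```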

per_category_ap

per_category_ap(
    results_dict, *,
    top_n=20, bottom_n=5,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot per-category AP as horizontal bars with a mean AP reference line.

Parameter Type Description
results_dict dict Output of coco_eval.results(per_class=True).
top_n int Top categories to show. Default 20.
bottom_n int Bottom categories to show. Default 5.
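The bar selection can be sketched as follows (a hypothetical helper, not hotcoco's code; the {name: AP} dict shape is assumed):

```python
# Hypothetical sketch of the bar selection: the top_n best and bottom_n
# worst categories, plus the mean AP used for the reference line.
def select_bars(ap_by_name, top_n=20, bottom_n=5):
    ranked = sorted(ap_by_name.items(), key=lambda kv: kv[1], reverse=True)
    mean_ap = sum(ap_by_name.values()) / len(ap_by_name)
    if len(ranked) <= top_n + bottom_n:
        return ranked, mean_ap
    return ranked[:top_n] + ranked[-bottom_n:], mean_ap
```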

tide_errors

tide_errors(
    tide_dict, *,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot TIDE error breakdown as horizontal bars.

Parameter Type Description
tide_dict dict Output of coco_eval.tide_errors().

reliability_diagram

reliability_diagram(
    cal_or_eval, *,
    n_bins=10, iou_threshold=0.5,
    theme="cold-brew", paper_mode=False, ax=None, save_path=None,
)

Plot a reliability diagram — predicted confidence vs actual accuracy per bin, with a perfect calibration diagonal and gap overlay.

Parameter Type Description
cal_or_eval dict | COCOeval Either the output of ev.calibration() (a dict) or a COCOeval instance. If a COCOeval is passed, calibration() is called automatically.
n_bins int Number of bins (only used when cal_or_eval is a COCOeval). Default 10.
iou_threshold float IoU threshold (only used when cal_or_eval is a COCOeval). Default 0.5.
ev = COCOeval(coco_gt, coco_dt, "bbox")
ev.evaluate()

# From a calibration dict
cal = ev.calibration(n_bins=15)
fig, ax = reliability_diagram(cal)

# Or directly from a COCOeval
fig, ax = reliability_diagram(ev, n_bins=15)
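Conceptually, the diagram bins detections by confidence and compares each bin's mean confidence to its empirical accuracy; the gap between the two is the calibration error. A rough sketch of that binning (hypothetical plain Python, not the actual calibration() implementation):

```python
# Rough sketch: detections go into equal-width confidence bins; each bin
# reports mean confidence vs accuracy (fraction of true positives).
def reliability_bins(confidences, is_correct, n_bins=10):
    """Per bin: (mean confidence, accuracy, count), or None if empty."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, is_correct):
        idx = min(int(c * n_bins), n_bins - 1)  # clamp c == 1.0 into last bin
        bins[idx].append((c, ok))
    out = []
    for b in bins:
        if not b:
            out.append(None)
            continue
        mean_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        out.append((mean_conf, accuracy, len(b)))
    return out
```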

comparison_bar

comparison_bar(
    compare_result: dict, *,
    theme: str = "cold-brew",
    paper_mode: bool = False,
    ax=None,
    save_path: str | Path | None = None,
) -> tuple[Figure, Axes]

Grouped bar chart comparing all metrics between two models. When the compare result includes bootstrap CIs, error bars are drawn on the model B bars.

from hotcoco import compare
from hotcoco.plot import comparison_bar

result = compare(ev_a, ev_b, n_bootstrap=1000)
fig, ax = comparison_bar(result, save_path="comparison.png")
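For intuition, a percentile-bootstrap CI for a mean metric can be sketched as follows (illustrative only; hotcoco's resampling details may differ):

```python
import random

# Illustrative percentile bootstrap: resample the values with replacement,
# recompute the mean, and take the alpha/2 and 1 - alpha/2 percentiles of
# the resampled means.
def bootstrap_ci(values, n_bootstrap=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    means = []
    for _ in range(n_bootstrap):
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    means.sort()
    lo = means[int(alpha / 2 * n_bootstrap)]
    hi = means[int((1 - alpha / 2) * n_bootstrap) - 1]
    return lo, hi
```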

category_deltas

category_deltas(
    compare_result: dict, *,
    top_k: int = 20,
    theme: str = "cold-brew",
    paper_mode: bool = False,
    ax=None,
    save_path: str | Path | None = None,
) -> tuple[Figure, Axes]

Horizontal bar chart of per-category AP deltas (B − A), sorted by magnitude. Green bars are improvements, red bars are regressions. Shows top_k categories from each end.

from hotcoco import compare
from hotcoco.plot import category_deltas

result = compare(ev_a, ev_b)
fig, ax = category_deltas(result, top_k=10, save_path="deltas.png")
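The ordering described above (sorted by delta, top_k from each end) can be sketched as a hypothetical helper (not part of hotcoco; {cat: AP} dict shapes are assumed):

```python
# Per-category delta (B - A), best gains first, keeping top_k categories
# from each end: the largest improvements and the largest regressions.
def select_deltas(ap_a, ap_b, top_k=20):
    deltas = sorted(((c, ap_b[c] - ap_a[c]) for c in ap_a if c in ap_b),
                    key=lambda kv: kv[1], reverse=True)
    if len(deltas) <= 2 * top_k:
        return deltas
    return deltas[:top_k] + deltas[-top_k:]
```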

report

report(
    coco_eval, *,
    save_path,
    gt_path=None,
    dt_path=None,
    title="COCO Evaluation Report",
)

Generate a publication-quality single-page PDF report. Requires pip install hotcoco[plot].

The report contains:

  • Header — title and timestamp
  • Run context — GT/DT file paths, eval params, and dataset statistics (images, annotations, categories, detections)
  • Summary metrics — AP and AR tables with a PR-curve panel and KPI tiles
  • Per-category AP — bar chart sorted descending, three columns

The metric rows adapt automatically to the evaluation mode:

Mode AP rows AR rows
bbox / segm AP AP50 AP75 APs APm APl AR1 AR10 AR100 ARs ARm ARl
keypoints AP AP50 AP75 APm APl AR AR50 AR75 ARm ARl
LVIS AP AP50 AP75 APs APm APl APr APc APf AR@300 ARs@300 ARm@300 ARl@300
Parameter Type Description
coco_eval COCOeval Must have run() called first.
save_path str | Path Output PDF path.
gt_path str | None Ground-truth JSON path shown in the run context block.
dt_path str | None Detections JSON path shown in the run context block.
title str Report title shown in the header. Default "COCO Evaluation Report".

Returns None. Raises on I/O error or if run() was not called first.


Themes

Four built-in themes:

Theme Character
"cold-brew" Default. Warm off-white background, 10-color infographic palette (alternating warm/cool).
"warm-slate" Warm off-white background, terracotta + slate series colors.
"scientific-blue" Cool/academic. Light blue-grey background, navy + red anchor colors.
"ember" Warm/editorial. Parchment background, rust + copper + amber palette.

Pass paper_mode=True to set figure and axes backgrounds to white, keeping all other theme colors intact. Useful when embedding plots in LaTeX documents or PowerPoint slides.

# Academic paper
fig, ax = pr_curve(ev, theme="scientific-blue", paper_mode=True, save_path="pr.pdf")

# Warm editorial style
fig, ax = per_category_ap(results, theme="ember", save_path="ap.png")

Use the style() context manager to apply a theme to your own matplotlib code:

from hotcoco.plot import style

with style(theme="scientific-blue", paper_mode=True):
    fig, ax = plt.subplots()
    ax.plot(recall, precision)
    fig.savefig("custom.pdf")

Color palette

The cold-brew theme constants are available for custom plots:

from hotcoco.plot import SERIES_COLORS, CHROME, SEQUENTIAL
  • SERIES_COLORS — 10 infographic-optimized data series colors (fjord, kiln, fern, maize, plum, patina, rose, moss, slate, sienna)
  • CHROME — non-data element colors (text, label, tick, grid, spine, background)
  • SEQUENTIAL — 3-stop colormap for heatmaps (stone cream → fjord blue → deep navy)