error analysis

Name: error analysis
Author: hamelsmu

by hamelsmu · 2mo ago

Research & Data · Claude Code · Medium risk · Open source

Before you install

✓ No special access needs declared

Editor's verdict

Help the user systematically identify and categorize failure modes in an LLM pipeline by reading traces. Use when starting a new eval project, after significant pipeline changes (new features, model switches, prompt rewrites), when production metrics drop, or after incidents.
Editorial team

Install via Skills CLI

Use `npx skills add` to install this skill into the selected agent. Phase 0 commands are generated from source rules and have not been verified.

Codex

npx skills add https://github.com/hamelsmu/evals-skills/blob/main/skills/error-analysis/SKILL.md -g -a codex -y

Drop `-g` to install into the current project only.

✓ Best for

Deep research
Source lookup
Trend analysis

✗ Not for

Workflows that require stronger human review than this catalog entry documents.

vs alternatives

#1 Nuwa Skill

Distill how a person thinks into a reusable skill.

★ 4.7 · 28k stars

Diff · Best when the skill you want to ship is really a captured way of thinking, such as a researcher's framing, a senior engineer's review checklist, or a designer's heuristics. It walks through source material in a research-first loop, extracts mental models, then encodes them into a triggerable skill. Slow and material-hungry on purpose. Skip it for quick automation; use anthropic-skill-creator for that.

#2 xlsx

xlsx: agent skill from anthropics/skills.

132k stars

Diff · Best for the cleanup-validate-analyze loop on real-world spreadsheets: inconsistent date formats, mixed types in "numeric" columns, hidden merged cells, schemas that change row by row. It reads the file and surfaces structural issues before they corrupt downstream analysis. Strongest on messy production-data scenarios. Skip it for clean exported data; that's just pandas.

#3 graphify

graphify: agent skill from safishamsi/graphify.

46k stars

Diff · Best when relationships matter more than rows: knowledge graphs over flat tables, entity-and-edge extraction from text, querying multi-hop connections (who introduced whom, which dependency led where). Strongest on research workflows where you need to trace links across messy sources. Skip it for tabular aggregations; a spreadsheet beats a graph there.

Side-by-side compare

Key differences with same-lane alternatives

	error analysis (this skill)	Nuwa Skill	xlsx	graphify
Rating	n/a	4.7	n/a	n/a
Stars	1.3k	28k	132k	46k
Risk	Medium risk	Medium risk	Low risk	Low risk
Best for	Deep research	Distill how a person thinks into a reusable skill.	xlsx: agent skill from anthropics/skills.	graphify: agent skill from safishamsi/graphify.
Not for	Workflows that require stronger human review than this catalog entry documents.	Workflows that require stronger human review than this catalog entry documents.	Workflows that require stronger human review than this catalog entry documents.	Workflows that require stronger human review than this catalog entry documents.

Audit notes

Not individually audited

Source: open on GitHub ✓

Author: community !

Network: not individually audited !

Filesystem: not individually audited !

Dependencies: not individually audited !

Telemetry: none ✓