RTL Historical Fix Replay v0.1
Generated: 2026-05-29T06:32:16.537844+00:00
Dataset: HWE-bench public RTL repair smoke set (HDL-FixBench-shaped)
Buyer-Safe Claim
Historical-fix replay evaluates whether Ark's ranked review targets overlap with regions that were later modified in public RTL bug-fix commits. It is not proof of bug detection, not formal signoff, and not a claim that Ark would have found the original issue independently.
Summary
| Metric | Value |
|---|---|
| Cases analyzed | 15 |
| Cases with ranked targets | 13 |
| Top-1 identifier overlap | 13 / 13 |
| Top-3 identifier overlap | 13 / 13 |
| Top-5 identifier overlap | 13 / 13 |
| Top-1 exact-signal overlap | 10 / 13 |
| Top-3 exact-signal overlap | 12 / 13 |
| Top-5 exact-signal overlap | 12 / 13 |
| Top-1 blind-structural overlap | 12 / 13 |
| Top-3 blind-structural overlap | 12 / 13 |
| Top-5 blind-structural overlap | 12 / 13 |
| Random baseline mean top-1 rate | 0.1455 |
| Random baseline mean top-3 rate | 0.3219 |
| Random baseline mean top-5 rate | 0.424 |
| Mean cone overlap ratio | 0.3425 |
| Median review-compression ratio | 349.3333 |
| Mean case runtime seconds | 0.7588 |
| Median case runtime seconds | 0.2013 |
| Max case runtime seconds | 3.7657 |
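Top-k denominators include only cases that have both inferred repair identifiers and ranked Ark targets (13 of the 15 cases here). Identifier overlap counts a hit when a ranked target's expression or cone contains a repaired identifier; exact-signal overlap requires the ranked target's own name to match a repaired identifier. Blind-structural overlap removes repaired identifiers from the target tokens and asks whether the remaining ranked-target neighborhood still has internal structural context. Judging from the per-case values in the Cases table, the random-baseline top-1 mean, the mean cone overlap, and the runtime statistics are averaged over all 15 cases, while the median review-compression ratio covers only the 13 cases with ranked targets.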
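The three overlap definitions above can be sketched as small predicates. This is a minimal illustration, not Ark's implementation: the `RankedTarget` type, its `cone_tokens` field, the function names, and the `min_context` threshold are all hypothetical, and the blind-structural check in particular is only one plausible reading of "still has internal structural context" (here: at least `min_context` tokens remain after the repaired identifiers are removed).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedTarget:
    """Hypothetical shape of one ranked review target."""
    name: str                             # the ranked signal name
    cone_tokens: frozenset = frozenset()  # identifiers in its expression/cone


def _tokens(target):
    # A target's token set is its own name plus everything in its cone.
    return {target.name} | set(target.cone_tokens)


def identifier_overlap(targets, repair_ids, k):
    """Hit if any top-k target's name or cone contains a repaired identifier."""
    repair = set(repair_ids)
    return any(repair & _tokens(t) for t in targets[:k])


def exact_signal_overlap(targets, repair_ids, k):
    """Hit only if a top-k target's own name is a repaired identifier."""
    repair = set(repair_ids)
    return any(t.name in repair for t in targets[:k])


def blind_structural_overlap(targets, repair_ids, k, min_context=1):
    """Assumed reading: after removing repaired identifiers, does any top-k
    target still carry at least `min_context` tokens of its own?"""
    repair = set(repair_ids)
    return any(len(_tokens(t) - repair) >= min_context for t in targets[:k])
```

Under this reading, a target whose name is itself the repaired identifier and whose cone holds nothing else scores an exact-signal hit but a blind-structural miss, while a target that only reaches the repaired identifier through its cone scores the reverse. Both patterns appear in the Cases table, although the table alone does not confirm that this is why.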
Cases
| Case | Project | Status | Ranked Targets | Repair IDs | Top-1 ID | Top-1 Exact | Blind Top-1 | Random Top-1 | Cone Overlap | Compression | Runtime s |
|---|---|---|---|---|---|---|---|---|---|---|---|
| lowRISC__ibex__pr_1735__ibex_id_stage | Ibex | CLEAN | 0 | 79 | None | None | None | 0.02 | 0.0 | None | 1.5228 |
| lowRISC__ibex__pr_222__ibex_id_stage | Ibex | SUSPECT | 3 | 54 | True | True | True | 0.18 | 0.2593 | 427.3333 | 3.6357 |
| lowRISC__ibex__pr_54__ibex_compressed_decoder | Ibex | SUSPECT | 2 | 41 | True | True | False | 0.643 | 0.3171 | 262.5 | 0.0667 |
| lowRISC__ibex__pr_167__ibex_controller | Ibex | SUSPECT | 11 | 68 | True | True | True | 0.209 | 0.3088 | 99.2727 | 3.7657 |
| lowRISC__ibex__pr_166__ibex_decoder | Ibex | SUSPECT | 6 | 42 | True | True | True | 0.13 | 0.3571 | 209.1667 | 0.0821 |
| openhwgroup__cva6__pr_2248__cv64a6_mmu_config_pkg | CVA6 | SUSPECT | 1 | 10 | True | False | True | 0.0 | 1.0 | 214.0 | 0.0547 |
| openhwgroup__cva6__pr_2374__csr_regfile | CVA6 | SUSPECT | 7 | 19 | True | False | True | 0.018 | 0.7368 | 787.7143 | 0.2243 |
| openhwgroup__cva6__pr_2916__cva6_mmu | CVA6 | CLEAN | 0 | 14 | None | None | None | 0.047 | 0.0 | None | 0.6873 |
| openhwgroup__cva6__pr_2944__issue_read_operands | CVA6 | SUSPECT | 1 | 8 | True | True | True | 0.013 | 0.5 | 2130.0 | 0.3907 |
| openhwgroup__cva6__pr_2945__cva6_shared_tlb | CVA6 | SUSPECT | 1 | 27 | True | True | True | 0.093 | 0.3333 | 1004.0 | 0.1526 |
| lowRISC__opentitan__pr_23807__usb_fs_nb_out_pe | OpenTitan | SUSPECT | 1 | 26 | True | True | True | 0.09 | 0.2308 | 885.0 | 0.0703 |
| lowRISC__opentitan__pr_16176__hmac | OpenTitan | SUSPECT | 1 | 17 | True | True | True | 0.066 | 0.1176 | 1283.0 | 0.3188 |
| lowRISC__opentitan__pr_6523__rom_ctrl_mux | OpenTitan | REVIEW_CANDIDATE | 1 | 37 | True | True | True | 0.403 | 0.0541 | 130.0 | 0.1233 |
| lowRISC__opentitan__pr_7722__keymgr_ctrl | OpenTitan | SUSPECT | 23 | 26 | True | False | True | 0.063 | 0.3846 | 69.6087 | 0.2013 |
| lowRISC__opentitan__pr_8724__spi_host_fsm | OpenTitan | SUSPECT | 3 | 26 | True | True | True | 0.208 | 0.5385 | 349.3333 | 0.0862 |
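As a cross-check, the summary figures that depend only on top-1 columns can be recomputed from the per-case rows. The sketch below transcribes the relevant columns of the Cases table by hand; the tuple layout and the `summarize` helper are illustrative and not part of the report generator. Cases with no ranked targets carry `None` in the overlap and compression columns.

```python
from statistics import mean, median

# (top1_exact, blind_top1, random_top1, cone_overlap, compression, runtime_s)
ROWS = {
    "lowRISC__ibex__pr_1735__ibex_id_stage": (None, None, 0.02, 0.0, None, 1.5228),
    "lowRISC__ibex__pr_222__ibex_id_stage": (True, True, 0.18, 0.2593, 427.3333, 3.6357),
    "lowRISC__ibex__pr_54__ibex_compressed_decoder": (True, False, 0.643, 0.3171, 262.5, 0.0667),
    "lowRISC__ibex__pr_167__ibex_controller": (True, True, 0.209, 0.3088, 99.2727, 3.7657),
    "lowRISC__ibex__pr_166__ibex_decoder": (True, True, 0.13, 0.3571, 209.1667, 0.0821),
    "openhwgroup__cva6__pr_2248__cv64a6_mmu_config_pkg": (False, True, 0.0, 1.0, 214.0, 0.0547),
    "openhwgroup__cva6__pr_2374__csr_regfile": (False, True, 0.018, 0.7368, 787.7143, 0.2243),
    "openhwgroup__cva6__pr_2916__cva6_mmu": (None, None, 0.047, 0.0, None, 0.6873),
    "openhwgroup__cva6__pr_2944__issue_read_operands": (True, True, 0.013, 0.5, 2130.0, 0.3907),
    "openhwgroup__cva6__pr_2945__cva6_shared_tlb": (True, True, 0.093, 0.3333, 1004.0, 0.1526),
    "lowRISC__opentitan__pr_23807__usb_fs_nb_out_pe": (True, True, 0.09, 0.2308, 885.0, 0.0703),
    "lowRISC__opentitan__pr_16176__hmac": (True, True, 0.066, 0.1176, 1283.0, 0.3188),
    "lowRISC__opentitan__pr_6523__rom_ctrl_mux": (True, True, 0.403, 0.0541, 130.0, 0.1233),
    "lowRISC__opentitan__pr_7722__keymgr_ctrl": (False, True, 0.063, 0.3846, 69.6087, 0.2013),
    "lowRISC__opentitan__pr_8724__spi_host_fsm": (True, True, 0.208, 0.5385, 349.3333, 0.0862),
}


def summarize(rows):
    """Recompute the top-1 and aggregate summary metrics from per-case rows."""
    all_rows = list(rows.values())
    # Top-k denominators: only cases that produced ranked targets.
    ranked = [r for r in all_rows if r[0] is not None]
    return {
        "cases": len(all_rows),
        "cases_with_ranked_targets": len(ranked),
        "top1_exact_hits": sum(1 for r in ranked if r[0]),
        "top1_blind_hits": sum(1 for r in ranked if r[1]),
        # These aggregates run over all cases, including the CLEAN ones.
        "random_top1_mean": round(mean(r[2] for r in all_rows), 4),
        "cone_overlap_mean": round(mean(r[3] for r in all_rows), 4),
        # Compression is undefined when there are no ranked targets.
        "compression_median": median(r[4] for r in ranked),
        "runtime_mean": round(mean(r[5] for r in all_rows), 4),
        "runtime_median": median(r[5] for r in all_rows),
        "runtime_max": max(r[5] for r in all_rows),
    }


SUMMARY = summarize(ROWS)
```

The values in `SUMMARY` agree with the Summary table (10 / 13 exact, 12 / 13 blind, 0.1455, 0.3425, 349.3333, 0.7588, 0.2013, 3.7657). The top-3 and top-5 rows cannot be checked this way because the Cases table reports only top-1 columns.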
Misses And Notes
- lowRISC__ibex__pr_1735__ibex_id_stage (Ibex): cone signals did not intersect repaired signals; Ark produced no ranked review targets; taxonomy: clean_equivalent_or_low_signal_delta, local_cone_context_gap, multi_hunk_or_multi_region_repair, no_ranked_target
- openhwgroup__cva6__pr_2916__cva6_mmu (CVA6): cone signals did not intersect repaired signals; Ark produced no ranked review targets; taxonomy: clean_equivalent_or_low_signal_delta, local_cone_context_gap, no_ranked_target
Cross-Domain Bottleneck Reads
- Cycle-consistency blind spot: clean-equivalent RTL deltas can erase ranked targets even when a historical repair changed local code. Treat these as semantic round-trip-loss cases, not evidence of no review value.
- Bridge-pattern miss: when repaired identifiers do not intersect cone tokens, the repair may be outside the current local bridge. Escalate to file/module-level context before declaring a miss.
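The two reads above amount to a small triage rule, sketched here for illustration only. The function name and the returned action labels are hypothetical; the inputs correspond to the Ranked Targets and Cone Overlap columns of the Cases table.

```python
def bottleneck_actions(num_ranked_targets, cone_overlap):
    """Map one case to follow-up actions (labels are illustrative, not Ark output)."""
    actions = []
    if num_ranked_targets == 0:
        # Cycle-consistency blind spot: ranked targets were erased, which is
        # not evidence that the historical repair had no review value.
        actions.append("treat_as_round_trip_loss")
    if cone_overlap == 0.0:
        # Bridge-pattern miss: repaired identifiers fall outside the local
        # cone, so widen to file/module-level context before calling a miss.
        actions.append("escalate_to_module_context")
    if not actions:
        actions.append("review_ranked_targets")
    return actions
```

Both cases listed under Misses And Notes have zero ranked targets and zero cone overlap, so this rule would flag each of them for both follow-ups.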
Claim Boundaries
- Not proof of bug detection.
- Not final verification closure.
- Not security signoff.
- Not a claim that Ark would have found the original issue independently.
- Repair-region overlap is a triage relevance metric, not correctness proof.
- Public dataset cases should not be framed as upstream vulnerability claims.