Ark Evaluation Packet

RTL Historical Fix Replay v0.1

Generated: 2026-05-29T06:32:16.537844+00:00

Dataset: HWE-bench public RTL repair smoke set (HDL-FixBench-shaped)

Buyer-Safe Claim

Historical-fix replay evaluates whether Ark's ranked review targets overlap with regions that were later modified in public RTL bug-fix commits. It is not proof of bug detection, not formal signoff, and not a claim that Ark would have found the original issue independently.

Summary

| Metric | Value |
| --- | --- |
| Cases analyzed | 15 |
| Cases with ranked targets | 13 |
| Top-1 identifier overlap | 13 / 13 |
| Top-3 identifier overlap | 13 / 13 |
| Top-5 identifier overlap | 13 / 13 |
| Top-1 exact-signal overlap | 10 / 13 |
| Top-3 exact-signal overlap | 12 / 13 |
| Top-5 exact-signal overlap | 12 / 13 |
| Top-1 blind-structural overlap | 12 / 13 |
| Top-3 blind-structural overlap | 12 / 13 |
| Top-5 blind-structural overlap | 12 / 13 |
| Random baseline mean top-1 rate | 0.1455 |
| Random baseline mean top-3 rate | 0.3219 |
| Random baseline mean top-5 rate | 0.424 |
| Mean cone overlap ratio | 0.3425 |
| Median review-compression ratio | 349.3333 |
| Mean case runtime (s) | 0.7588 |
| Median case runtime (s) | 0.2013 |
| Max case runtime (s) | 3.7657 |

Top-k denominators include only cases that have both inferred repair identifiers and ranked Ark targets, so the two CLEAN cases, which produced no ranked targets, are excluded and 13 of 15 cases remain. Identifier overlap counts a hit when a ranked target's expression or cone contains a repaired identifier; exact-signal overlap requires the ranked target's own name to match a repaired identifier. Blind-structural overlap removes repaired identifiers from the target's tokens and asks whether the remaining ranked-target neighborhood still has internal structural context.
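The three overlap definitions above can be sketched as set operations. This is a minimal illustration, not Ark's implementation: the `RankedTarget` type, its `tokens` field, the `top_k_hit` helper, and the signal names in the example are all hypothetical. In particular, reading "internal structural context" as "at least one non-repaired identifier remains" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedTarget:
    """Hypothetical stand-in for one ranked review target."""
    name: str                           # signal the target is reported under
    tokens: frozenset = frozenset()     # identifiers in its expression or cone


def identifier_overlap(target, repair_ids):
    # Hit if the target name or anything in its expression/cone was repaired.
    return target.name in repair_ids or bool(target.tokens & repair_ids)


def exact_signal_overlap(target, repair_ids):
    # Hit only if the target's own name is a repaired identifier.
    return target.name in repair_ids


def blind_structural_overlap(target, repair_ids):
    # Assumption: structural context survives if any identifier is left
    # after the repaired identifiers are removed from the target's tokens.
    return bool(target.tokens - repair_ids)


def top_k_hit(ranked, repair_ids, k, check):
    # A case counts as a top-k hit if any of its first k targets passes.
    return any(check(t, repair_ids) for t in ranked[:k])


# Illustrative data only; these names are not taken from any analyzed case.
ranked = [
    RankedTarget("stall_id", frozenset({"stall_id", "branch_set", "instr_valid"})),
    RankedTarget("pc_mux", frozenset({"pc_mux"})),
]
repair_ids = frozenset({"branch_set"})

top1_identifier = top_k_hit(ranked, repair_ids, 1, identifier_overlap)
top1_exact = top_k_hit(ranked, repair_ids, 1, exact_signal_overlap)
top1_blind = top_k_hit(ranked, repair_ids, 1, blind_structural_overlap)
```

In this example the top-ranked target contains the repaired identifier in its cone but is not itself the repaired signal, so it scores an identifier hit and a blind-structural hit without an exact-signal hit.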

Cases

| Case | Project | Status | Ranked Targets | Repair IDs | Top-1 ID | Top-1 Exact | Blind Top-1 | Random Top-1 | Cone Overlap | Compression | Runtime (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| lowRISC__ibex__pr_1735__ibex_id_stage | Ibex | CLEAN | 0 | 79 | None | None | None | 0.02 | 0.0 | None | 1.5228 |
| lowRISC__ibex__pr_222__ibex_id_stage | Ibex | SUSPECT | 3 | 54 | True | True | True | 0.18 | 0.2593 | 427.3333 | 3.6357 |
| lowRISC__ibex__pr_54__ibex_compressed_decoder | Ibex | SUSPECT | 2 | 41 | True | True | False | 0.643 | 0.3171 | 262.5 | 0.0667 |
| lowRISC__ibex__pr_167__ibex_controller | Ibex | SUSPECT | 11 | 68 | True | True | True | 0.209 | 0.3088 | 99.2727 | 3.7657 |
| lowRISC__ibex__pr_166__ibex_decoder | Ibex | SUSPECT | 6 | 42 | True | True | True | 0.13 | 0.3571 | 209.1667 | 0.0821 |
| openhwgroup__cva6__pr_2248__cv64a6_mmu_config_pkg | CVA6 | SUSPECT | 1 | 10 | True | False | True | 0.0 | 1.0 | 214.0 | 0.0547 |
| openhwgroup__cva6__pr_2374__csr_regfile | CVA6 | SUSPECT | 7 | 19 | True | False | True | 0.018 | 0.7368 | 787.7143 | 0.2243 |
| openhwgroup__cva6__pr_2916__cva6_mmu | CVA6 | CLEAN | 0 | 14 | None | None | None | 0.047 | 0.0 | None | 0.6873 |
| openhwgroup__cva6__pr_2944__issue_read_operands | CVA6 | SUSPECT | 1 | 8 | True | True | True | 0.013 | 0.5 | 2130.0 | 0.3907 |
| openhwgroup__cva6__pr_2945__cva6_shared_tlb | CVA6 | SUSPECT | 1 | 27 | True | True | True | 0.093 | 0.3333 | 1004.0 | 0.1526 |
| lowRISC__opentitan__pr_23807__usb_fs_nb_out_pe | OpenTitan | SUSPECT | 1 | 26 | True | True | True | 0.09 | 0.2308 | 885.0 | 0.0703 |
| lowRISC__opentitan__pr_16176__hmac | OpenTitan | SUSPECT | 1 | 17 | True | True | True | 0.066 | 0.1176 | 1283.0 | 0.3188 |
| lowRISC__opentitan__pr_6523__rom_ctrl_mux | OpenTitan | REVIEW_CANDIDATE | 1 | 37 | True | True | True | 0.403 | 0.0541 | 130.0 | 0.1233 |
| lowRISC__opentitan__pr_7722__keymgr_ctrl | OpenTitan | SUSPECT | 23 | 26 | True | False | True | 0.063 | 0.3846 | 69.6087 | 0.2013 |
| lowRISC__opentitan__pr_8724__spi_host_fsm | OpenTitan | SUSPECT | 3 | 26 | True | True | True | 0.208 | 0.5385 | 349.3333 | 0.0862 |
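As a cross-check, the per-case columns above can be aggregated and compared against the Summary figures. The sketch below transcribes four columns from the Cases table; the aggregation rules are assumptions rather than documented behavior: means are taken over all 15 cases, and the compression median skips the two cases whose compression is None.

```python
from statistics import mean, median

# (random_top1, cone_overlap, compression, runtime_s), one tuple per case,
# in the same order as the Cases table. None marks a case with no ranked targets.
rows = [
    (0.02,  0.0,    None,      1.5228),
    (0.18,  0.2593, 427.3333,  3.6357),
    (0.643, 0.3171, 262.5,     0.0667),
    (0.209, 0.3088, 99.2727,   3.7657),
    (0.13,  0.3571, 209.1667,  0.0821),
    (0.0,   1.0,    214.0,     0.0547),
    (0.018, 0.7368, 787.7143,  0.2243),
    (0.047, 0.0,    None,      0.6873),
    (0.013, 0.5,    2130.0,    0.3907),
    (0.093, 0.3333, 1004.0,    0.1526),
    (0.09,  0.2308, 885.0,     0.0703),
    (0.066, 0.1176, 1283.0,    0.3188),
    (0.403, 0.0541, 130.0,     0.1233),
    (0.063, 0.3846, 69.6087,   0.2013),
    (0.208, 0.5385, 349.3333,  0.0862),
]

random_top1 = [r[0] for r in rows]
cone = [r[1] for r in rows]
compression = [r[2] for r in rows if r[2] is not None]  # assumed: None excluded
runtime = [r[3] for r in rows]

summary = {
    "random_baseline_mean_top1": round(mean(random_top1), 4),
    "mean_cone_overlap": round(mean(cone), 4),
    "median_compression": round(median(compression), 4),
    "mean_runtime_s": round(mean(runtime), 4),
    "median_runtime_s": round(median(runtime), 4),
    "max_runtime_s": max(runtime),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the recomputed values agree with the reported Summary entries for the top-1 random baseline, cone overlap, compression, and runtime. The top-3 and top-5 baselines cannot be rechecked this way because the table lists only per-case top-1 rates.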

Misses And Notes

Cross-Domain Bottleneck Reads

Claim Boundaries