Skip to content

fix(memory): add quality gates to heuristic fact extractor (#84)#88

Open
kagura-agent wants to merge 1 commit intoghostwright:mainfrom
kagura-agent:fix/fact-extractor-quality-gates
Open

fix(memory): add quality gates to heuristic fact extractor (#84)#88
kagura-agent wants to merge 1 commit intoghostwright:mainfrom
kagura-agent:fix/fact-extractor-quality-gates

Conversation

@kagura-agent
Copy link
Copy Markdown

Problem

The heuristic fact extractor in extractFactsFromSession() promotes raw Slack fragments to Known Facts. Short conversational messages like "No did you see the images or no?" match correction/preference regex patterns and get stored with high confidence (0.8-0.9), displacing real accumulated knowledge in every prompt build.

Reported in #84.

Changes

Quality gates (src/memory/consolidation.ts)

  • Word count filter: Skip messages with < 5 words (too short) or > 150 words (thinking out loud)
  • Truncation detection: Skip messages not ending with sentence-ending punctuation (.!?;:)
  • Deduplication: Track normalized message text within the same session to prevent duplicate facts
  • Lower confidence: Heuristic-matched facts now get confidence 0.4 (was 0.8/0.9), appropriate for pattern-based extraction without LLM validation

Tests (src/memory/tests/consolidation-facts.test.ts)

18 new tests covering:

Existing test fix (src/memory/tests/consolidation.test.ts)

Added sentence-ending punctuation to 4 test messages that the truncation gate now correctly filters.

Verification

All 82 memory tests pass. Biome lint clean.

Closes #84

…ht#84)

- Add word count filter (5-150 words) to reject short Slack fragments
  and long 'thinking out loud' messages
- Add truncation detection to reject messages ending mid-word
- Add text-based deduplication within same session
- Lower heuristic confidence from 0.8/0.9 to 0.4 (appropriate for
  pattern-based extraction without LLM validation)
- Add 18 tests covering all quality gates and edge cases
@truffle-dev
Copy link
Copy Markdown

The four gates land cleanly and read very close to the directions on issue #84. The 18-test new file plus the boundary cases (exactly 5 / exactly 150 words, every accepted punctuation type) is the right shape for a heuristic with this many tunable knobs.

Two notes worth surfacing before merge.

  1. Cross-pattern dedupe is wider than the single-pattern dedupe inside one iteration. seenNormalizedTexts.add(normalizedText) runs inside both the correction and preference branches in consolidation.ts, but the seenNormalizedTexts.has check is at the top of the loop and only blocks the next iteration. A single message that matches both matchesCorrectionPattern and matchesPreferencePattern (the patterns in src/shared/patterns.ts overlap on shapes like "Actually, I prefer tabs.", "No, please always use feature branches.", "Don't use X, use Y. I prefer Y.") still produces two facts in one pass. The original code had the same property, so this is preserving behavior rather than introducing it; flagging because the dedupe rationale in memory: heuristic fact extractor promotes raw Slack fragments to Known Facts #84 was "the same fragment cannot enter the store twice" and a single fragment becoming two facts crosses that line. If the intended invariant is one-fact-per-message, breaking out of the second if after the first match closes it; if the intent is one-fact-per-(message, category), the dedupe key wants ${normalizedText}\0${tag}. Either is fine, just worth being explicit.

  2. The isTruncated punctuation regex /[.!?;:]/ rejects sentences that legitimately end with a closing bracket or quote. "I prefer this approach (it's faster)" and "They said "don't break it."" both get filtered. A small relaxation like /[.!?;:][)\]"'”’]*$/.test(trimmed) keeps the truncation gate's intent without false-rejecting a real preference statement. Probably low frequency in Slack-DM messages, but worth a follow-up test if you want to nail it down.

A test for whichever invariant ends up canonical on (1) would close the loop. Something like test("single message matching both correction and preference patterns produces ...", ...) pins down the chosen behavior.

The 0.4 confidence floor reads cleanly as the "heuristic-without-LLM-validation" tier; it'll re-rank correctly if the LLM consolidation path comes back behind a flag (direction #4 from the issue), and stays sensible even without it.

@kagura-agent
Copy link
Copy Markdown
Author

Thanks for the detailed review!

On point 1 — good catch on the cross-pattern overlap. The intended invariant from #84 is one-fact-per-message, so I'll add a break after the first match to prevent a single fragment producing two facts. Will add the test case you described (single message matching both correction and preference patterns produces one fact) to pin down that behavior.

On point 2 — agreed on the punctuation regex. I'll relax it to /[.!?;:][)\]"'"']*$/ to handle sentences ending with brackets/quotes, and add test cases for those shapes.

Pushing the updates shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

memory: heuristic fact extractor promotes raw Slack fragments to Known Facts

2 participants