-
Notifications
You must be signed in to change notification settings - Fork 64
feat(prompts): add capture_errors option for error tracking #427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -56,6 +56,9 @@ class Prompts: | |
| # Or with direct options (no PostHog client needed) | ||
| prompts = Prompts(personal_api_key='phx_xxx', host='https://us.posthog.com') | ||
|
|
||
| # With error tracking: prompt fetch failures are reported to PostHog | ||
| prompts = Prompts(posthog, capture_errors=True) | ||
|
|
||
| # Fetch with caching and fallback | ||
| template = prompts.get('support-system-prompt', fallback='You are a helpful assistant.') | ||
|
|
||
|
|
@@ -74,6 +77,7 @@ def __init__( | |
| personal_api_key: Optional[str] = None, | ||
| host: Optional[str] = None, | ||
| default_cache_ttl_seconds: Optional[int] = None, | ||
| capture_errors: bool = False, | ||
| ): | ||
| """ | ||
| Initialize Prompts. | ||
|
|
@@ -83,11 +87,15 @@ def __init__( | |
| personal_api_key: Direct API key (optional if posthog provided) | ||
| host: PostHog host (defaults to app endpoint) | ||
| default_cache_ttl_seconds: Default cache TTL (defaults to 300) | ||
| capture_errors: If True and a PostHog client is provided, prompt fetch | ||
| failures are reported to PostHog error tracking via capture_exception(). | ||
| """ | ||
| self._default_cache_ttl_seconds = ( | ||
| default_cache_ttl_seconds or DEFAULT_CACHE_TTL_SECONDS | ||
| ) | ||
| self._cache: Dict[str, CachedPrompt] = {} | ||
| self._client = posthog if posthog is not None else None | ||
| self._capture_errors = capture_errors | ||
|
|
||
|
Comment on lines
+97
to
99
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unchecked capture_exception call
Prompt To Fix With AIThis is a comment left during a code review.
Path: posthog/ai/prompts.py
Line: 97:99
Comment:
**Unchecked capture_exception call**
`_maybe_capture_error()` assumes the provided `posthog` object has a `capture_exception` method. With `capture_errors=True`, passing a PostHog client version (or a stub) that lacks `capture_exception` will raise `AttributeError`, which then gets swallowed by the broad `except Exception`, meaning errors silently won’t be reported (and it’s hard to diagnose). Consider guarding the call (e.g., `hasattr(self._client, "capture_exception")`) and/or logging at least once at WARNING when the method is missing so users know why capture isn’t happening.
How can I resolve this? If you propose a fix, please make it concise. |
||
| if posthog is not None: | ||
| self._personal_api_key = getattr(posthog, "personal_api_key", None) or "" | ||
|
|
@@ -152,6 +160,8 @@ def get( | |
| return prompt | ||
|
|
||
| except Exception as error: | ||
| self._maybe_capture_error(error) | ||
|
|
||
| # Fallback order: | ||
| # 1. Return stale cache (with warning) | ||
| if cached is not None: | ||
|
|
@@ -211,6 +221,15 @@ def clear_cache(self, name: Optional[str] = None) -> None: | |
| else: | ||
| self._cache.clear() | ||
|
|
||
| def _maybe_capture_error(self, error: Exception) -> None: | ||
| """Report a prompt fetch error to PostHog error tracking if enabled.""" | ||
| if not self._capture_errors or self._client is None: | ||
| return | ||
| try: | ||
| self._client.capture_exception(error) | ||
| except Exception: | ||
| log.debug("[PostHog Prompts] Failed to capture exception to error tracking") | ||
|
|
||
| def _fetch_prompt_from_api(self, name: str) -> str: | ||
| """ | ||
| Fetch prompt from PostHog API. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -509,6 +509,121 @@ def test_handle_variables_with_dots(self): | |
| self.assertEqual(result, "Company: Acme") | ||
|
|
||
|
|
||
| class TestPromptsCaptureErrors(TestPrompts): | ||
| """Tests for the capture_errors option.""" | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| def test_capture_exception_called_on_fetch_failure_with_fallback( | ||
|
Comment on lines
+512
to
+516
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tests are not parameterized The new Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time! Prompt To Fix With AIThis is a comment left during a code review.
Path: posthog/test/ai/test_prompts.py
Line: 512:516
Comment:
**Tests are not parameterized**
The new `TestPromptsCaptureErrors` cases are all the same shape (setup fetch failure/success → call `get()` → assert on `capture_exception`). This increases repetition and makes it easier to miss scenarios when adding new branches. The repo preference is parameterized tests; these can be collapsed into a single parameterized test that covers (fallback vs stale cache vs re-raise vs success) and toggles (`capture_errors` on/off, client present/absent).
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise. |
||
| self, mock_get_session | ||
| ): | ||
| """Should call capture_exception on fetch failure when capture_errors=True.""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.side_effect = Exception("Network error") | ||
|
|
||
| posthog = self.create_mock_posthog() | ||
| prompts = Prompts(posthog, capture_errors=True) | ||
|
|
||
| result = prompts.get("test-prompt", fallback="fallback prompt") | ||
|
|
||
| self.assertEqual(result, "fallback prompt") | ||
| posthog.capture_exception.assert_called_once() | ||
| captured_exc = posthog.capture_exception.call_args[0][0] | ||
| self.assertIn("Network error", str(captured_exc)) | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| @patch("posthog.ai.prompts.time.time") | ||
| def test_capture_exception_called_on_fetch_failure_with_stale_cache( | ||
| self, mock_time, mock_get_session | ||
| ): | ||
| """Should call capture_exception when falling back to stale cache.""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.side_effect = [ | ||
| MockResponse(json_data=self.mock_prompt_response), | ||
| Exception("Network error"), | ||
| ] | ||
| mock_time.return_value = 1000.0 | ||
|
|
||
| posthog = self.create_mock_posthog() | ||
| prompts = Prompts(posthog, capture_errors=True) | ||
|
|
||
| # First call populates cache | ||
| prompts.get("test-prompt", cache_ttl_seconds=60) | ||
|
|
||
| # Expire cache | ||
| mock_time.return_value = 1061.0 | ||
|
|
||
| # Second call falls back to stale cache | ||
| result = prompts.get("test-prompt", cache_ttl_seconds=60) | ||
| self.assertEqual(result, self.mock_prompt_response["prompt"]) | ||
| posthog.capture_exception.assert_called_once() | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| def test_capture_exception_called_when_error_is_raised(self, mock_get_session): | ||
| """Should call capture_exception even when the error is re-raised (no fallback, no cache).""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.side_effect = Exception("Network error") | ||
|
|
||
| posthog = self.create_mock_posthog() | ||
| prompts = Prompts(posthog, capture_errors=True) | ||
|
|
||
| with self.assertRaises(Exception): | ||
| prompts.get("test-prompt") | ||
|
|
||
| posthog.capture_exception.assert_called_once() | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| def test_no_capture_exception_when_capture_errors_is_false(self, mock_get_session): | ||
| """Should NOT call capture_exception when capture_errors=False (default).""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.side_effect = Exception("Network error") | ||
|
|
||
| posthog = self.create_mock_posthog() | ||
| prompts = Prompts(posthog) | ||
|
|
||
| prompts.get("test-prompt", fallback="fallback prompt") | ||
|
|
||
| posthog.capture_exception.assert_not_called() | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| def test_no_capture_exception_without_client(self, mock_get_session): | ||
| """Should not error when capture_errors=True but no client provided.""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.side_effect = Exception("Network error") | ||
|
|
||
| prompts = Prompts(personal_api_key="phx_test_key", capture_errors=True) | ||
|
|
||
| result = prompts.get("test-prompt", fallback="fallback prompt") | ||
|
|
||
| self.assertEqual(result, "fallback prompt") | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| def test_no_capture_exception_on_successful_fetch(self, mock_get_session): | ||
| """Should NOT call capture_exception on successful fetch.""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.return_value = MockResponse(json_data=self.mock_prompt_response) | ||
|
|
||
| posthog = self.create_mock_posthog() | ||
| prompts = Prompts(posthog, capture_errors=True) | ||
|
|
||
| prompts.get("test-prompt") | ||
|
|
||
| posthog.capture_exception.assert_not_called() | ||
|
|
||
| @patch("posthog.ai.prompts._get_session") | ||
| def test_capture_exception_failure_does_not_affect_fallback(self, mock_get_session): | ||
| """If capture_exception itself throws, the fallback should still be returned.""" | ||
| mock_get = mock_get_session.return_value.get | ||
| mock_get.side_effect = Exception("Network error") | ||
|
|
||
| posthog = self.create_mock_posthog() | ||
| posthog.capture_exception.side_effect = Exception("capture failed") | ||
| prompts = Prompts(posthog, capture_errors=True) | ||
|
|
||
| result = prompts.get("test-prompt", fallback="fallback prompt") | ||
|
|
||
| self.assertEqual(result, "fallback prompt") | ||
|
|
||
|
|
||
| class TestPromptsClearCache(TestPrompts): | ||
| """Tests for the Prompts.clear_cache() method.""" | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,4 @@ | ||
| VERSION = "7.8.3" | ||
| VERSION = "7.8.4" | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you dont need to do this
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
|
||
| if __name__ == "__main__": | ||
| print(VERSION, end="") # noqa: T201 | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we are reporting our own errors?
what's the final user gonna do with this when those errors are in their error tracking tab?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question — I think the framing is actually from the user's perspective, not ours. The errors surfaced here aren't "PostHog internals" — they're 404s (prompt typo / deleted), 403s (key perms), auth failures, and network errors. All actionable by the app developer. If my app's prompts were silently falling back for two weeks and PostHog didn't surface it, I'd be pretty upset. It's opt-in (
capture_errors=Falseby default), runs before the fallback chain so visibility doesn't cost reliability, and swallows capture-side failures so it can't break the caller. WDYT?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm I see, then yeah for a few errors that do make sense eg 404, 403 etc but not everything
eg network errors, timeouts, etc, this would be just noisy, affecting the noise x ratio error tracking quality cus those errors specifically arent actionable
can we at least filter only to capture the errors that are actually actionable? (this will depend on the API contract - http errors)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm if PostHog is down and my latest prompts aren't getting to my app you are saying to not error track that in my PostHog as a user? Do errors have to be actionable? Is that some sort of thing that users are aware or when using error tracking or something?
I'm not super up on error tracking fully tbh but feel a bit uncomfortable making such decisions here on behalf of future users (unless this kinda just is the norm or way error tracking works on posthog? Eg "actionable error tracking" where we do sort of try curate on your behalf, that's a bit different then).
Opt in too here as a gate helps maybe.
The noise problem then is something you manage in error tracking itself via the groupings and rules and AI no?
Or maybe there is some way to make the different error types configurable and use like a sane default more like what you say perhaps.
Although if my prod app was not getting latest prompts and error tracking was not surfacing that and it turned out PostHog prompts was just down and we missed it for a bit say, and then I found this thread where we decided to just not surface it. I'd be very mad.
Or am I being a bit stubborn here? Like I think of an error as an error, but of course it's always more nuanced where you draw the line so maybe that's the point here perhaps of where that is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree with most of your comments, but if posthog is down, or your internet is down, there's not much you can do besides having a fallback on the client
also, capturing errors most likely wont go thru anyway since most likely that endpoint is also down, or you are shooting the server with more requests and causing even more problems
Error tracking should focus on actionable errors I can address on my end.
I wouldn't want to pay for error events that the only solution is "tell PostHog to be more reliable or tell me to have a more stable internet", but I am very opinionated about this, I know
The way I see this is more about "SDK diagnostics", we failed to request the prompts, and that's an SDK issue and not a customer issue.
I'm not familiar with this product, so I'll approve it and leave it with you. I just wanted to share my 2 cents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I know what you mean, I'm kinda thinking of the potential code in here to cherry pick errors as an implicit suppression rule in PostHog error tracking.
Going to research a bit more on how it all works and what other parts of the SDK already do for stuff like this just to make sure I'm at least in line with them and things like that.
Eg does PostHog error tracking itself then maybe need some sort of new template for these new potential prompt errors etc.
Will check all that before merging to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome, thx