[FEATURE] Add support for known speakers in SpeakerDiarizationConfig #44

@nsepehr

Which SDK is this feature request for?

  • speechmatics-rt (Real-Time SDK)
  • speechmatics-batch (Batch SDK)
  • Both SDKs
  • General/Repository

Feature Request: Add support for known speakers in SpeakerDiarizationConfig

Summary

Request to add support for known speaker identification/enrollment in the Python SDK's SpeakerDiarizationConfig to enable persistent speaker identification across sessions.

Context

Currently, the SpeakerDiarizationConfig only supports:

  • max_speakers
  • speaker_sensitivity
  • prefer_current_speaker

However, the API documentation mentions SpeakersResult as a preview feature, and the SDK code contains references to GET_SPEAKERS and SPEAKERS_RESULT message types (marked as "Internal, Speechmatics only").

Use Case

We're building voice AI applications where identifying specific speakers across sessions is critical, such as:

  • Meeting transcription: identifying recurring participants without having to re-match their voices to speaker labels in every session

Currently, every time the same people join a meeting, we have to map the diarization speaker labels to their identities again. Repeating this for every session is tedious and makes for a bad user experience. Being able to enroll speakers beforehand would be incredibly useful.

Proposed Solution

Add a speakers field to SpeakerDiarizationConfig to support known speaker enrollment:

from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class SpeakerDiarizationConfig:
    max_speakers: Optional[int] = None
    speaker_sensitivity: Optional[float] = None
    prefer_current_speaker: Optional[bool] = None
    speakers: Optional[Dict[str, List[str]]] = None  # New field for known speakers

# Usage example:
config = SpeakerDiarizationConfig(
    max_speakers=2,
    speaker_sensitivity=0.5,
    speakers={
        "John": ["speaker_id_john_123"],  # Speaker name -> identifiers
        "Jane": ["speaker_id_jane_456"],
    }
)
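
For illustration, the proposed field could be serialized alongside the existing ones as sketched below. This is a minimal sketch, not the SDK's implementation: the `to_payload` helper and the JSON shape of `speakers` are assumptions, and the class is redefined here only so the snippet runs on its own.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SpeakerDiarizationConfig:
    """Stand-alone copy of the proposed config, not the SDK class."""

    max_speakers: Optional[int] = None
    speaker_sensitivity: Optional[float] = None
    prefer_current_speaker: Optional[bool] = None
    speakers: Optional[Dict[str, List[str]]] = None  # proposed field


def to_payload(config: SpeakerDiarizationConfig) -> Dict[str, Any]:
    """Hypothetical helper: drop unset (None) fields before sending."""
    return {key: value for key, value in asdict(config).items() if value is not None}


config = SpeakerDiarizationConfig(
    max_speakers=2,
    speakers={"John": ["speaker_id_john_123"]},
)
payload = to_payload(config)
print(payload)
```

Because `speakers` defaults to `None` and unset fields are dropped, existing callers that never set it would send the same payload as before.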

Current Workaround

We tested whether the API would accept a speakers field even though it's not in the SDK:

config.speaker_diarization_config = {
    "speakers": {
        "John": ["speaker_id_john_123"],
        "Jane": ["speaker_id_jane_456"],
    }
}

But the API rejects it with:

Error: Additional property speakers is not allowed
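
The wording of this error suggests the server validates the config against a schema that disallows unknown properties. Until the API accepts `speakers`, a client-side check can surface the problem before a session is started. A minimal sketch, assuming the three fields listed under Context are the only ones currently accepted; `ALLOWED_KEYS` and `find_unsupported_keys` are hypothetical names, not part of the SDK:

```python
from typing import Any, Dict, List

# Assumption: taken from the fields listed in this issue, not from the API schema.
ALLOWED_KEYS = {"max_speakers", "speaker_sensitivity", "prefer_current_speaker"}


def find_unsupported_keys(diarization_config: Dict[str, Any]) -> List[str]:
    """Return the keys the API would likely reject, in sorted order."""
    return sorted(set(diarization_config) - ALLOWED_KEYS)


candidate = {
    "max_speakers": 2,
    "speakers": {"John": ["speaker_id_john_123"]},
}
unsupported = find_unsupported_keys(candidate)
if unsupported:
    print(f"Unsupported diarization keys: {', '.join(unsupported)}")
```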

Questions

  1. Is the SpeakersResult preview feature available for early access?
  2. Is there a timeline for when known speaker support will be added to the public API?
  3. Would you accept a PR to add this functionality to the SDK once the API supports it?

Environment

  • speechmatics-rt version: 0.4.0
  • Python version: 3.11
  • Use case: Real-time transcription with speaker identification

Would love to hear if this is on the roadmap or if there's an alternative approach we should consider!

Related issues/PRs
Link any related issues or pull requests:

  • Closes #
  • Related to #

Priority/Impact
How important is this feature to you?

  • Critical - blocking current work
  • High - would significantly improve workflow
  • Medium - nice to have improvement
  • Low - minor enhancement

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)