Pass word_level_timestamps option into whisper API call

Question

Accepted Answer

The current Whisper API implementation does not expose the 'word_timestamps' parameter, which is available in the openai/whisper Python package. As a result, timestamps in the transcription response are less accurate: they are rounded to the nearest second instead of being precise word-level timestamps. To fix this:

1. Modify the API specification. Add the 'word_timestamps' parameter to the request body so that users can specify whether they want word-level timestamps in their transcription results.
2. Implement the backend logic. In the backend service that handles Whisper API requests, check for the 'word_timestamps' parameter in the incoming request. If it is set to true, pass it to the Whisper model during transcription.
3. Update the API documentation. Describe the new 'word_timestamps' parameter, its usage, and the expected output format when it is enabled.
4. Test the implementation. Create unit tests that verify the API processes the 'word_timestamps' parameter correctly and that the output includes accurate timestamps when it is enabled.
5. Deploy the changes. Once the implementation is tested and verified, deploy it to the production environment to make the 'word_timestamps' feature available to users.
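The backend logic described above might look roughly like the sketch below. The helper names (parse_word_timestamps, handle_transcription) and the shape of request_body are assumptions made for illustration, not part of any existing service. The only call taken from the openai/whisper package is transcribe(..., word_timestamps=...).

```python
from typing import Any, Dict, Protocol


class Transcriber(Protocol):
    """Anything with a whisper-style transcribe() method."""

    def transcribe(self, audio: str, **kwargs: Any) -> Dict[str, Any]: ...


def parse_word_timestamps(request_body: Dict[str, Any]) -> bool:
    """Read the optional 'word_timestamps' flag, defaulting to False.

    Hypothetical helper: assumes the request body has already been
    decoded into a dict by the web framework.
    """
    value = request_body.get("word_timestamps", False)
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        # Form-encoded requests usually deliver booleans as strings.
        return value.strip().lower() in {"true", "1", "yes"}
    return False


def handle_transcription(
    model: Transcriber, audio_path: str, request_body: Dict[str, Any]
) -> Dict[str, Any]:
    """Forward the flag from the request to the model's transcribe() call."""
    word_timestamps = parse_word_timestamps(request_body)
    return model.transcribe(audio_path, word_timestamps=word_timestamps)
```

With the real package, the model argument would typically come from whisper.load_model("base"). The Protocol type is only there so that a stub model can stand in during tests.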

Problem

1 Fix

Add word_timestamps Parameter to Whisper API Call

Modify API Specification

Implement Backend Logic

Update API Documentation
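When documenting the output format, it may help to show how a client would read the word-level data. In the openai/whisper package, enabling word_timestamps adds a 'words' list to each segment, with 'word', 'start', 'end' and 'probability' keys. The sketch below assumes the API returns that structure unchanged. The collect_words helper and the sample values are made up for illustration.

```python
from typing import Any, Dict, List


def collect_words(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten per-segment word entries into one list of word timings.

    Segments without a 'words' key (word_timestamps disabled) are skipped.
    """
    words: List[Dict[str, Any]] = []
    for segment in result.get("segments", []):
        for entry in segment.get("words", []):
            words.append(
                {
                    # whisper keeps the leading space on each word token
                    "word": entry["word"].strip(),
                    "start": entry["start"],
                    "end": entry["end"],
                }
            )
    return words


# Hand-written sample in the shape whisper returns; the numbers are invented.
sample_result = {
    "text": " Hello world",
    "segments": [
        {
            "start": 0.0,
            "end": 1.0,
            "text": " Hello world",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.42, "probability": 0.98},
                {"word": " world", "start": 0.42, "end": 0.96, "probability": 0.95},
            ],
        }
    ],
}

for item in collect_words(sample_result):
    print(f"{item['start']:.2f}-{item['end']:.2f} {item['word']}")
```

A response produced without the flag has no 'words' entries, so the helper returns an empty list rather than raising an error.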

Test the Implementation

Deploy Changes
