Output token usage being misreported as 1 when using streaming
Problem
In the final Message object that the Python SDK constructs when streaming completes (`stream.get_final_message()`). `usage.output_tokens` is always reported as 1 regardless of actual output length. I believe this object is supposed to be comparable to the Message object you would work with in non-streaming, so I would expect this to report the total output tokens.
Unverified for your environment
Select your OS to check compatibility.
1 Fix
Fix Output Token Reporting in Streaming Completion
The issue arises because the streaming completion method does not correctly aggregate the total number of output tokens generated during the streaming process. Instead, it initializes the output token count to 1 and fails to update this count based on the actual tokens produced in the final message. This misreporting stems from a lack of proper accumulation logic in the SDK's implementation for streaming responses.
Awaiting Verification
Be the first to verify this fix
- 1
Update Token Counting Logic
Modify the SDK's streaming completion method to correctly count and aggregate the output tokens generated during the streaming process. Ensure that the final message object reflects the total count of tokens produced.
pythondef get_final_message(self): total_tokens = sum(message['tokens'] for message in self.streamed_messages) return {'content': self.final_content, 'usage': {'output_tokens': total_tokens}} - 2
Add Unit Tests for Token Counting
Create unit tests that simulate streaming responses with varying lengths of output to ensure that the output token count is accurately reported in the final message object. This will help validate the fix and prevent future regressions.
pythondef test_token_counting(): stream = StreamingCompletion() stream.add_message({'tokens': 5}) stream.add_message({'tokens': 3}) final_message = stream.get_final_message() assert final_message['usage']['output_tokens'] == 8, 'Token count mismatch' - 3
Review and Update Documentation
Ensure that the SDK documentation accurately reflects the behavior of the streaming completion method, particularly regarding how output tokens are counted and reported. This will help users understand the expected output and avoid confusion.
- 4
Deploy Updated SDK
After implementing the changes and passing all tests, deploy the updated SDK version to the production environment. Ensure that users are informed of the changes and encouraged to update their implementations.
Validation
To confirm the fix worked, run the updated SDK with a streaming completion request and verify that the output token count in the final message matches the total number of tokens generated during the streaming process. Additionally, execute the unit tests to ensure they pass without errors.
Sign in to verify this fix
Environment
Submitted by
Alex Chen
2450 rep