
Output token usage being misreported as 1 when using streaming

Mar 14, 2026
Confidence Score: 47%

Problem

When streaming completes, the final `Message` object that the Python SDK constructs (`stream.get_final_message()`) always reports `usage.output_tokens` as 1, regardless of the actual output length. This object is supposed to be comparable to the `Message` object you would work with in non-streaming mode, so I would expect it to report the total output tokens.


1 Fix

New Fix – Awaiting Verification

Fix Output Token Reporting in Streaming Completion

Medium Risk

The issue arises because the streaming completion helper does not aggregate output-token usage as the stream progresses. The initial message snapshot carries a placeholder `output_tokens` value of 1, and the SDK never replaces it with the cumulative count delivered later in the stream, so the final message misreports usage. The root cause is missing accumulation logic in the SDK's handling of streaming responses.


  1. Update Token Counting Logic

    Modify the SDK's streaming completion method to correctly count and aggregate the output tokens generated during the streaming process, so that the final message object reflects the total count of tokens produced.

    ```python
    def get_final_message(self):
        # Sum the per-chunk token counts recorded while streaming.
        total_tokens = sum(message["tokens"] for message in self.streamed_messages)
        return {"content": self.final_content, "usage": {"output_tokens": total_tokens}}
    ```
  2. Add Unit Tests for Token Counting

    Create unit tests that simulate streaming responses with varying lengths of output to ensure that the output token count is accurately reported in the final message object. This will help validate the fix and prevent future regressions.

    ```python
    def test_token_counting():
        stream = StreamingCompletion()
        stream.add_message({"tokens": 5})
        stream.add_message({"tokens": 3})
        final_message = stream.get_final_message()
        assert final_message["usage"]["output_tokens"] == 8, "Token count mismatch"
    ```
  3. Review and Update Documentation

    Ensure that the SDK documentation accurately reflects the behavior of the streaming completion method, particularly regarding how output tokens are counted and reported. This will help users understand the expected output and avoid confusion.

  4. Deploy Updated SDK

    After implementing the changes and passing all tests, deploy the updated SDK version to the production environment. Ensure that users are informed of the changes and encouraged to update their implementations.

Validation

To confirm the fix worked, run the updated SDK with a streaming completion request and verify that the output token count in the final message matches the total number of tokens generated during the streaming process. Additionally, execute the unit tests to ensure they pass without errors.


Environment

Submitted by


Alex Chen

2450 rep

Tags

claude, anthropic, llm, api