openai-stream-proxy
A transparent proxy for OpenAI-compatible HTTP and WebSocket traffic. For supported HTTP JSON endpoints, it converts
non-streaming OpenAI API requests into upstream SSE streaming requests, aggregates the stream in memory, and sends a
non-streaming response to the downstream client.
Supports the Responses API (/v1/responses) and Chat Completions API (/v1/chat/completions). Requests already
using stream=true are passed through unchanged.
WebSocket requests are proxied to the corresponding upstream ws:// or wss:// URL. Data is forwarded in both
directions as it arrives, with backpressure and without payload rewriting or whole-session buffering. For established
upstream WebSocket sessions, normal close reasons are forwarded and abnormal disconnects are reported to the downstream
client as WebSocket internal errors.
This project only converts between non-streaming and streaming modes. It does not support protocol conversion, including
but not limited to converting the Responses protocol to the Chat Completions protocol, or converting the Responses
protocol to the Anthropic protocol.
Usage
Download executables from the releases page or build from source.
Create config.json in the working directory:
| Field | Description | Default |
|---|---|---|
| timeoutSeconds | Upstream request timeout in seconds | |
Each rule maps a listen port to an upstream base URL. All paths under that port are proxied to the corresponding
upstream. With the config above:
POST http://localhost:8080/v1/responses →
The proxy listens on the configured ports and converts non-streaming requests whose path ends with /responses or
/chat/completions into upstream SSE streams. Streaming requests (stream: true) and requests to any other path are
forwarded unchanged.
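The routing rule above can be sketched as a small pure function. This is an illustration only, not the proxy's actual code: `Action` and `decide` are hypothetical names, and checks such as HTTP method or content type are left out.

```kotlin
/** Outcome of the routing decision for one HTTP request. */
enum class Action { CONVERT, PASSTHROUGH }

/**
 * Decides whether a request should be converted to an upstream SSE stream.
 * `path` is the request path without the query string; `stream` is the value
 * of the top-level "stream" field in the JSON body, or null when it is absent.
 */
fun decide(path: String, stream: Boolean?): Action {
    val convertible = path.endsWith("/responses") || path.endsWith("/chat/completions")
    // Requests that already ask for streaming are forwarded unchanged.
    return if (convertible && stream != true) Action.CONVERT else Action.PASSTHROUGH
}

fun main() {
    println(decide("/v1/chat/completions", null)) // CONVERT
    println(decide("/v1/responses", true))        // PASSTHROUGH
    println(decide("/v1/models", null))           // PASSTHROUGH
}
```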
For example, a non-streaming Chat Completions request from the downstream client:
POST /v1/chat/completions
Content-Type: application/json
{"model": "gpt-5.5", "messages": [{"role": "user", "content": "Hello"}]}
is rewritten to the following before being sent upstream:
POST /v1/chat/completions
Content-Type: application/json
{"model": "gpt-5.5", "messages": [{"role": "user", "content": "Hello"}], "stream": true, "stream_options": {"include_usage": true}}
The upstream SSE stream is then aggregated in memory and returned as a single non-streaming JSON response to the
downstream client.
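As a rough illustration of the first step of aggregation, the sketch below pulls the `data:` payload out of each SSE event in a fully buffered body. `sseDataPayloads` is a hypothetical helper, not part of this project. It assumes the `[DONE]` sentinel that Chat Completions streams end with. Merging the JSON chunks into one response is not shown.

```kotlin
/**
 * Extracts the data payload of each SSE event from a raw event-stream body.
 * Events are separated by a blank line; multiple "data:" lines in one event
 * are joined with "\n". Collection stops at the "[DONE]" sentinel.
 */
fun sseDataPayloads(body: String): List<String> {
    val payloads = mutableListOf<String>()
    val current = mutableListOf<String>()
    // A trailing blank line is appended so the last event is always flushed.
    for (line in body.lines() + "") {
        if (line.isEmpty()) {
            if (current.isNotEmpty()) {
                val data = current.joinToString("\n")
                current.clear()
                if (data == "[DONE]") return payloads
                payloads += data
            }
        } else if (line.startsWith("data:")) {
            // Per the SSE format, one optional space after the colon is dropped.
            current += line.removePrefix("data:").removePrefix(" ")
        }
        // Other fields ("event:", "id:") and comment lines are ignored in this sketch.
    }
    return payloads
}

fun main() {
    val body = "data: {\"a\":1}\n\ndata: {\"b\":2}\n\ndata: [DONE]\n\n"
    println(sseDataPayloads(body)) // [{"a":1}, {"b":2}]
}
```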
For WebSocket requests, the proxy keeps the same path and query string, maps http:// upstream URLs to ws:// and
https:// upstream URLs to wss://, and forwards request headers such as Authorization. WebSocket traffic is
forwarded bidirectionally in streaming mode with backpressure so the proxy can handle long-lived connections with low
memory usage. When the upstream closes the WebSocket normally, the close reason is forwarded downstream. If the upstream
WebSocket session ends without a normal close, the downstream WebSocket is closed with an internal error.
The request path is appended to upstreamUrl, so upstreamUrl typically should not include /v1.
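The URL mapping described above can be sketched as follows. `toWebSocketUrl` is a hypothetical helper written for illustration, and the host and path in the usage example are placeholders. The proxy's own implementation may differ, for instance in how it handles trailing slashes.

```kotlin
/**
 * Builds the upstream WebSocket URL for a request.
 * `upstreamUrl` is the configured base URL; `pathAndQuery` is the incoming
 * request target, for example "/v1/realtime?model=x".
 */
fun toWebSocketUrl(upstreamUrl: String, pathAndQuery: String): String {
    val wsBase = when {
        upstreamUrl.startsWith("https://") -> "wss://" + upstreamUrl.removePrefix("https://")
        upstreamUrl.startsWith("http://") -> "ws://" + upstreamUrl.removePrefix("http://")
        else -> throw IllegalArgumentException("Unsupported upstream scheme: $upstreamUrl")
    }
    // Avoid a double slash when the base URL ends with "/".
    return wsBase.trimEnd('/') + pathAndQuery
}

fun main() {
    // Placeholder host and path, used only to show the mapping.
    println(toWebSocketUrl("https://api.example.com", "/v1/realtime?model=x"))
    // wss://api.example.com/v1/realtime?model=x
}
```

Because the path is appended as-is, a base URL that already ends in /v1 would produce /v1/v1/... for a request to /v1/....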
Then run the executable with the config file in place. For JVM:
java -jar openai-stream-proxy-0.0.6-fat.jar
For native (Linux):
./openai-stream-proxy-0.0.6-linuxX64.kexe
The proxy reads config.json from the working directory by default. To use a different path:
java -jar openai-stream-proxy-0.0.6-fat.jar --config-file /path/to/config.json
java -jar openai-stream-proxy-0.0.6-fat.jar -c /path/to/config.json
./openai-stream-proxy-0.0.6-linuxX64.kexe --config-file /path/to/config.json
Build
./gradlew :cli:shadowJar
./gradlew :cli:linkReleaseExecutableLinuxX64
./gradlew :cli:linkReleaseExecutableMingwX64
./gradlew :cli:linkReleaseExecutableMacosArm64
The fat JAR is output to cli/build/libs/. Native executables are output to
cli/build/bin/<target>/releaseExecutable/.
Proxy Library
The proxy module is a Kotlin Multiplatform library you can embed in your own application. It is engine-agnostic — you
provide the Ktor HttpClientEngine, and the library handles request rewriting, SSE aggregation, and response assembly.
WebSocket passthrough is implemented by the CLI module, not by the proxy library classes.
Gradle Dependency
dependencies {
    implementation("com.hiczp:openai-stream-proxy:0.0.6")
}
The library is published for JVM, mingwX64, linuxX64, linuxArm64, and macosArm64.
Quick Start
How It Works
Conversion proxy classes follow the same flow:
Requests that don't match the conversion criteria are forwarded unchanged (passthrough). This also applies when a
protocol-specific proxy receives traffic outside its path match: for example, ResponsesApiProxy passes through
requests whose path does not end with /responses, and ChatCompletionsApiProxy passes through requests whose path
does not end with /chat/completions.
PassthroughApiProxy skips conversion entirely and forwards every request unchanged.
Proxy Classes
All proxy classes extend AbstractApiProxy, which provides streaming passthrough() for forwarding requests unchanged,
upstream error responses for failed or invalid upstream streams, and stripHopByHopHeaders() for header cleanup.
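For readers unfamiliar with hop-by-hop headers, here is a minimal sketch of what such cleanup usually involves. It is not the library's `stripHopByHopHeaders()` implementation, whose signature is not shown here. `stripHopByHop` and `HOP_BY_HOP` are hypothetical names, and the header list is an assumption taken from RFC 2616 section 13.5.1.

```kotlin
// Hop-by-hop headers named in RFC 2616 section 13.5.1, plus "trailer",
// lowercased for case-insensitive comparison.
val HOP_BY_HOP = setOf(
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
)

/** Returns a copy of `headers` without hop-by-hop headers. */
fun stripHopByHop(headers: Map<String, List<String>>): Map<String, List<String>> {
    // Headers named in the Connection header value are hop-by-hop as well.
    val named = headers
        .filterKeys { it.equals("Connection", ignoreCase = true) }
        .values.flatten()
        .flatMap { it.split(',') }
        .map { it.trim().lowercase() }
        .filter { it.isNotEmpty() }
        .toSet()
    return headers.filterKeys { it.lowercase() !in HOP_BY_HOP && it.lowercase() !in named }
}

fun main() {
    val incoming = mapOf(
        "Connection" to listOf("keep-alive, X-Custom"),
        "X-Custom" to listOf("1"),
        "Transfer-Encoding" to listOf("chunked"),
        "Authorization" to listOf("Bearer token"),
    )
    println(stripHopByHop(incoming)) // {Authorization=[Bearer token]}
}
```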
Resource Lifecycle
The proxy does not own the HttpClientEngine; the caller is responsible for closing it when it is no longer needed
(e.g., on application shutdown). Do not call close() on the internal HttpClient created by the proxy, as this would
shut down the shared engine.
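import io.ktor.client.engine.cio.CIO

// The caller creates the engine and therefore owns it.
val engine = CIO.create()
try {
    val proxy = ResponsesApiProxy(engine, upstreamUrl)
    // ... serve requests with proxy ...
} finally {
    // Close the engine once, when the application no longer needs it.
    engine.close()
}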
Accumulators
SSE event aggregation is handled by SseAccumulator implementations:
Accumulators are not thread-safe — accumulate() must be called from a single coroutine.
Build Requirements
JVM toolchain: 21
License
MIT