LLM Clients
A Kotlin library for interacting with various Large Language Model (LLM) providers through a unified interface. Supports OpenAI and Google Gemini with a simple, consumer-friendly API.
Features
- Unified Interface: Single API for multiple LLM providers (OpenAI, Gemini)
- Security First: No hardcoded API keys - consumers must provide their own keys explicitly
- Explicit HTTP Client: Consumers must provide and pass their own HttpClient instance (no library-provided HTTP client)
- Type-Safe: Kotlin serialization for structured request/response handling
- Production Ready: Built-in timeouts, connection pooling, and error handling
- Factory Pattern: Instantiate once with API keys and HttpClient; create provider clients without repeating secrets
- Extensible: Clean architecture for adding new providers
- Streaming: Stream partial responses via Kotlin Flow; design details in docs/STREAMING_API_DESIGN.md
Installation
Gradle (Kotlin DSL)
repositories {
mavenCentral()
}
dependencies {
implementation("io.github.researchforyounow:llm-clients:0.7.5")
}
Maven
<dependency>
<groupId>io.github.researchforyounow</groupId>
<artifactId>llm-clients</artifactId>
<version>0.7.5</version>
</dependency>
Quick Start
1. Get Your API Keys
SECURITY NOTICE: This library requires you to provide your own API keys explicitly. No hardcoded API keys are included for security reasons.
2. Initialize Library
fun initializeLibrary(): LlmClientFactory {
    val httpClient = ExampleHttpClient.createRecommendedHttpClient()
    return LlmClientFactory(
        httpClient = httpClient,
        // Fail fast when a key is missing instead of sending a placeholder to the provider.
        openAiApiKey = System.getenv("OPENAI_API_KEY") ?: error("OPENAI_API_KEY is not set"),
        geminiApiKey = System.getenv("GEMINI_API_KEY") ?: error("GEMINI_API_KEY is not set")
    )
}
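If keys may come from either environment variables or a secrets manager, the lookup can be pulled into one small helper. The following is a minimal sketch, not part of the library: `requireSecret` and its `lookup` parameter are hypothetical names, and the map in `main` stands in for a real secret source.

```kotlin
// Hypothetical helper (not part of the library): resolves a required secret or fails fast.
fun requireSecret(
    name: String,
    // Defaults to environment variables; pass another lookup for a secrets manager.
    lookup: (String) -> String? = { System.getenv(it) },
): String =
    lookup(name)?.trim()?.takeIf { it.isNotEmpty() }
        ?: error("Missing required secret: $name")

fun main() {
    // A plain map stands in for the real secret source in this sketch.
    val fakeSecrets = mapOf("OPENAI_API_KEY" to "sk-test")
    val lookup: (String) -> String? = { fakeSecrets[it] }

    println(requireSecret("OPENAI_API_KEY", lookup)) // sk-test
    println(runCatching { requireSecret("GEMINI_API_KEY", lookup) }.isFailure) // true
}
```

Failing at startup keeps a misconfigured deployment from reaching the provider with an empty or placeholder key.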
3. Use Clients Anywhere
Note: This library does not use keys.properties. Configure API keys via environment variables or your secrets manager.
The factory holds your keys and shared HttpClient so callers only supply model options.
val factory = initializeLibrary()
val openAiClient = factory.createOpenAiClient(
config = OpenAiConfig(
modelName = Models.GPT_4O_2024_08_06,
temperature = 0.3
)
)
val geminiClient = factory.createGeminiClient(
config = GeminiConfig(
model = GeminiModel.GEMINI_1_5_FLASH_LATEST
)
)
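// Placeholder prompts: substitute your own prompt and instruction text.
val structuredResult = openAiClient.generate(
    request = GenerationRequest.of(
        "Explain photosynthesis",
        "Respond in JSON format with answer, confidence, and reasoning fields."
    ),
    responseType = MyDataClass::class.java
)
val textResult = openAiClient.generateText(
    GenerationRequest.of("Explain photosynthesis")
)
structuredResult.fold(
    onSuccess = { response -> println("Success: $response") },
    onFailure = { error -> println("Error: ${error.message}") }
)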
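The `fold`, `onSuccess`, and `getOrThrow` calls in this README suggest that client methods return Kotlin's standard `Result` type. That is an assumption rather than something stated here. Under it, the usual `Result` operators apply. The sketch below uses a hypothetical `fakeGenerate` function in place of a real client call, so it runs without network access.

```kotlin
// Stand-in for a client call; the real methods are assumed to return kotlin.Result.
fun fakeGenerate(prompt: String): Result<String> = runCatching {
    require(prompt.isNotBlank()) { "prompt must not be blank" }
    "echo: $prompt"
}

fun main() {
    val ok = fakeGenerate("hello")
    val failed = fakeGenerate(" ")

    // fold handles both branches and produces a single value.
    val message = ok.fold(
        onSuccess = { "Success: $it" },
        onFailure = { "Failure: ${it.message}" },
    )
    println(message) // Success: echo: hello

    // getOrElse supplies a fallback instead of throwing.
    println(failed.getOrElse { "fallback" }) // fallback

    // map transforms a success and leaves a failure untouched.
    println(ok.map { it.length }.getOrNull()) // 11
}
```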
Complete Example
For a comprehensive, production-ready example, see ConsumerReadyExample.kt. It covers:
- One-time library initialization
- Multiple client configurations
- Error handling and best practices
- Real-world usage patterns
Supported Models
OpenAI
- GPT-4o (default: gpt-4o-2024-08-06)
- GPT-4.1 family
- Search preview models (gpt-4o-search-preview, gpt-4o-mini-search-preview)
- Custom models via configuration
Google Gemini
- Gemini 1.5 Flash (default: gemini-1.5-flash-latest)
- Gemini 1.5 Pro
- Custom models via configuration
Configuration Options
OpenAI Client
val client = factory.createOpenAiClient(
config = OpenAiConfig(
modelName = Models.GPT_4O_2024_08_06,
temperature = 0.7,
maxTokens = 2000
),
)
Gemini Client
val client = factory.createGeminiClient(
config = GeminiConfig(
model = GeminiModel.GEMINI_1_5_FLASH_LATEST
),
)
Retry Policy
Both OpenAiConfig and GeminiConfig accept an optional retryPolicy parameter.
Retries use exponential backoff with jitter and are only applied when an idempotencyKey
is supplied in the GenerationRequest.
val policy = RetryPolicy(maxAttempts = 3, initialDelayMillis = 200, jitterMillis = 100)
val client = factory.createOpenAiClient(
config = OpenAiConfig(retryPolicy = policy)
)
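To give a feel for what exponential backoff with jitter means for the parameters above, here is a minimal sketch of one common schedule. It is an illustration only: `backoffDelayMillis` is a hypothetical function, and the library's actual formula (doubling factor, jitter distribution, any cap) may differ.

```kotlin
import kotlin.random.Random

// Hypothetical illustration; the library's real schedule may differ.
fun backoffDelayMillis(
    attempt: Int,               // 1-based number of the attempt that just failed
    initialDelayMillis: Long,
    jitterMillis: Long,
    random: Random = Random.Default,
): Long {
    require(attempt >= 1) { "attempt must be >= 1" }
    // Double the base delay for each further attempt; cap the shift to avoid overflow.
    val exponential = initialDelayMillis shl (attempt - 1).coerceAtMost(20)
    // Add uniform jitter in [0, jitterMillis] so concurrent clients do not retry in lockstep.
    val jitter = if (jitterMillis > 0) random.nextLong(jitterMillis + 1) else 0L
    return exponential + jitter
}

fun main() {
    // With initialDelayMillis = 200 and jitterMillis = 100, the base delays are 200, 400, 800 ms.
    for (attempt in 1..3) {
        println("attempt $attempt -> ${backoffDelayMillis(attempt, 200, 100)} ms")
    }
}
```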
Advanced Usage
Custom Response Types
@Serializable
data class StructuredResponse(
val answer: String,
val confidence: Double,
val reasoning: String
)
val result = client.generate(
request = GenerationRequest.of(
"Explain photosynthesis",
"Respond in JSON format with answer, confidence, and reasoning fields."
),
responseType = StructuredResponse::class.java
)
Multiple Configurations
val creativeClient = factory.createOpenAiClient(
config = OpenAiConfig(temperature = 0.9)
)
val factualClient = factory.createOpenAiClient(
config = OpenAiConfig(temperature = 0.1)
)
Usage Metrics Hook
Both OpenAI and Gemini can report token usage information. Supply a usageSink
in the configuration to observe normalized metrics:
val client = factory.createOpenAiClient(
config = OpenAiConfig(
usageSink = { usage ->
println("prompt=${usage.promptTokens} completion=${usage.completionTokens}")
}
)
)
The LlmUsage model normalizes provider-specific fields (e.g., OpenAI
prompt_tokens/completion_tokens). Providers that don't return usage simply
never invoke the sink.
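A sink that only prints is enough for debugging. For cost tracking you will probably want running totals. The sketch below shows one way to accumulate them safely across threads. `UsageSample` is a stand-in for the library's `LlmUsage`, modeling only the two fields used above, and `UsageTotals` is a hypothetical class. The field types are assumed to be `Int`.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Stand-in for the library's LlmUsage; only the two fields shown in this README are modeled.
data class UsageSample(val promptTokens: Int, val completionTokens: Int)

// Hypothetical thread-safe accumulator that can back a usageSink lambda.
class UsageTotals {
    private val prompt = AtomicLong()
    private val completion = AtomicLong()

    fun record(usage: UsageSample) {
        prompt.addAndGet(usage.promptTokens.toLong())
        completion.addAndGet(usage.completionTokens.toLong())
    }

    val promptTokens: Long get() = prompt.get()
    val completionTokens: Long get() = completion.get()
    val totalTokens: Long get() = promptTokens + completionTokens
}

fun main() {
    val totals = UsageTotals()
    // In real code this lambda would be passed as the usageSink.
    val sink: (UsageSample) -> Unit = { totals.record(it) }
    sink(UsageSample(promptTokens = 120, completionTokens = 30))
    sink(UsageSample(promptTokens = 80, completionTokens = 45))
    println("prompt=${totals.promptTokens} completion=${totals.completionTokens} total=${totals.totalTokens}")
    // prints: prompt=200 completion=75 total=275
}
```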
OpenAI Web Search (chat completions)
Search-preview models use chat completions with web_search_options. Enable it
via enableWebSearch = true and omit sampling params (they are ignored).
@Serializable
data class WebSearchResult(
val answer: String,
val sources: List<String>,
)
val client = factory.createOpenAiClient(
config = OpenAiConfig(
modelName = Models.GPT_4O_MINI_SEARCH_PREVIEW,
responseFormat = ResponseFormat.JSON_SCHEMA,
jsonSchemaName = "web_search_result",
jsonSchema = """
{
"type": "object",
"properties": {
"answer": { "type": "string" },
"sources": { "type": "array", "items": { "type": "string" } }
},
"required": ["answer", "sources"]
}
""".trimIndent(),
enableWebSearch = true,
),
)
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
API Stability and Internal Types
This library follows a small, stable public API surface centered on the LlmClient
interfaces and the request/response models shown in the examples. Implementation
details that live in core-api are internal and are not part of that stable surface.
Provider Support Matrix
Image Generation (OpenAI)
- Generate images using OpenAI Images API via OpenAiClient.generateImage.
- Returns either URLs or base64-encoded JSON depending on ImageResponseFormat.
- Models and sizes are constrained by OpenAI (e.g., DALL·E 3 supports 1024x1024, 1024x1792, 1792x1024 with n=1).
- See full example: examples/src/main/kotlin/examples/OpenAiImageExample.kt
Example:
val client = factory.createOpenAiClient(OpenAiConfig.defaultConfig())
val imgReq = ImageGenerationRequest(
prompt = "A watercolor painting of a mountain at sunrise",
n = 1,
size = "1024x1024",
responseFormat = ImageResponseFormat.URL,
model = OpenAiImageModel.DALL_E_3
)
val result = client.generateImage(imgReq)
result.onSuccess { images ->
images.forEach { println(it.url ?: "[base64 image]") }
}
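Because the DALL·E 3 constraints listed above are fixed, a request can be checked locally before it is sent. The following is a minimal sketch of such a pre-flight check. `validateDallE3Request` is a hypothetical helper, not a library function, and it encodes only the sizes and the n=1 rule stated in this section.

```kotlin
// Sizes this README lists for DALL·E 3.
val dallE3Sizes = setOf("1024x1024", "1024x1792", "1792x1024")

// Hypothetical pre-flight check; returns a list of problems, empty when the request looks valid.
fun validateDallE3Request(size: String, n: Int): List<String> = buildList {
    if (size !in dallE3Sizes) add("unsupported size '$size'; expected one of $dallE3Sizes")
    if (n != 1) add("DALL·E 3 supports n=1, got n=$n")
}

fun main() {
    println(validateDallE3Request("1024x1024", 1)) // []
    println(validateDallE3Request("512x512", 2).size) // 2
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every issue at once.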
Configuration Reference
The LlmClientFactory injects API keys from your environment or secrets manager (see Quick Start). You configure per-provider behavior via config objects when creating clients.
OpenAI configuration (OpenAiConfig)
Gemini configuration (GeminiConfig)
Environment variables (suggested)
- OPENAI_API_KEY: OpenAI secret used by LlmClientFactory.
- GEMINI_API_KEY: Gemini secret used by LlmClientFactory.
See examples in examples/ for usage, including streaming and structured responses.
Speech-to-Text (OpenAI)
File transcription
val audioFile = AudioFile(
bytes = File("/path/to/audio.mp3").readBytes(),
fileName = "audio.mp3",
contentType = "audio/mpeg",
)
val request = AudioTranscriptionRequest(
file = audioFile,
model = "gpt-4o-transcribe",
responseFormat = AudioResponseFormat.TEXT,
)
val result = openAiClient.transcribe(request).getOrThrow()
println(result.text)
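The contentType above is written by hand. When file names vary, it can be derived from the extension instead. The sketch below is a hypothetical helper, not part of the library. The table covers a few common audio MIME types and makes no claim about which formats the transcription endpoint accepts.

```kotlin
// Common audio extensions and their MIME types; extend as needed.
val audioContentTypes = mapOf(
    "mp3" to "audio/mpeg",
    "wav" to "audio/wav",
    "m4a" to "audio/mp4",
    "flac" to "audio/flac",
    "ogg" to "audio/ogg",
    "webm" to "audio/webm",
)

// Hypothetical helper: returns the MIME type for a file name, or null when the extension is unknown.
fun audioContentType(fileName: String): String? =
    audioContentTypes[fileName.substringAfterLast('.', "").lowercase()]

fun main() {
    println(audioContentType("audio.mp3"))   // audio/mpeg
    println(audioContentType("Meeting.WAV")) // audio/wav
    println(audioContentType("notes.txt"))   // null
}
```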
File transcription (streaming results)
val request = AudioTranscriptionRequest(
file = audioFile,
model = "gpt-4o-mini-transcribe",
responseFormat = AudioResponseFormat.TEXT,
stream = true,
)
openAiClient.streamTranscription(request).collect { event ->
println(event.type)
}
Realtime transcription (live audio)
val request = RealtimeTranscriptionSessionRequest(
transcriptionModel = "gpt-4o-mini-transcribe",
)
val session = openAiClient.createRealtimeTranscriptionSession(request).getOrThrow()
val connection = openAiClient.openRealtimeTranscriptionConnection(request, session).getOrThrow()
val controller = RealtimeTranscriptionController(connection)
controller.start()