OpenAI-Compatible Client

LeapOpenAIClient / leap-openai-client (introduced in v0.10.0) is a small, dependency-light client for any OpenAI-compatible chat-completions endpoint — OpenAI itself, OpenRouter, vLLM, llama-server, or your own proxy. It ships in the same SDK release as LeapSDK, so you can route requests between an on-device LFM and a cloud model from a single app.

When to use it

Hybrid on-device + cloud routing. Run small / fast models on-device with LeapSDK, fall back to a larger cloud model for hard prompts.
Standardised cloud API. Talk to any OpenAI-compatible backend without pulling in a heavier OpenAI SDK.
Streaming first. SSE streaming is the only mode — non-streaming requests aren’t exposed. streamChatCompletion(...) forces stream = true on the outgoing request regardless of the stream field on the ChatCompletionRequest you pass in.
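
For orientation: OpenAI-compatible servers stream chat completions as server-sent events. Each chunk arrives on a data: line carrying a JSON payload, and the stream ends with data: [DONE]. The sketch below only classifies such lines to illustrate the wire format. SseLine and classifySseLine are hypothetical names, not SDK API, and the SDK's own parser may be structured differently.

```kotlin
// Hypothetical types for illustration; not part of leap-openai-client.
sealed interface SseLine {
    data class Payload(val json: String) : SseLine  // one JSON chunk to decode
    object EndOfStream : SseLine                     // the "[DONE]" sentinel
    object Ignored : SseLine                         // blank separators, comments, other fields
}

fun classifySseLine(line: String): SseLine {
    // Only "data:" fields carry chat-completion chunks.
    if (!line.startsWith("data:")) return SseLine.Ignored
    val data = line.removePrefix("data:").trim()
    return when {
        data == "[DONE]" -> SseLine.EndOfStream
        data.isEmpty() -> SseLine.Ignored
        else -> SseLine.Payload(data)
    }
}

fun main() {
    val lines = listOf(
        "data: {\"choices\":[{\"delta\":{\"content\":\"Hi\"}}]}",
        "",
        ": keep-alive",
        "data: [DONE]",
    )
    lines.map(::classifySseLine).forEach(::println)
}
```

With the SDK you never handle these lines yourself; they surface as the ChatCompletionEvent values described under Response shape.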

Add the dependency

iOS / macOS (SPM)
Android (Gradle)
JVM (Gradle)
Kotlin/Native (Gradle)

Add the LeapOpenAIClient product to your target. See the Quick Start for the full SPM setup.

dependencies: [
    .package(url: "https://github.com/Liquid4All/leap-sdk.git", from: "0.10.7")
]

targets: [
    .target(
        name: "YourApp",
        dependencies: [
            .product(name: "LeapOpenAIClient", package: "leap-sdk"),
        ]
    )
]

In Swift sources, import LeapOpenAIClient. The Darwin (URLSession) Ktor engine is bundled — no extra HTTP setup needed.

Android (Gradle)

dependencies {
    implementation("ai.liquid.leap:leap-sdk:0.10.7")
    implementation("ai.liquid.leap:leap-openai-client:0.10.7")
}

Bundles an OkHttp-engine Ktor client. No extra HTTP setup needed.

JVM (Gradle)

dependencies {
    implementation("ai.liquid.leap:leap-sdk:0.10.7")
    implementation("ai.liquid.leap:leap-openai-client:0.10.7")
}

JVM support landed in v0.10.7; the JVM artifact was not published for v0.10.0 through v0.10.6. Pure-Maven JVM projects, which do not resolve Gradle module metadata, should depend on the -jvm artifact directly: ai.liquid.leap:leap-openai-client-jvm:0.10.7. Bundles the CIO Ktor engine.

Kotlin/Native (Gradle)

dependencies {
    implementation("ai.liquid.leap:leap-sdk:0.10.7")
    implementation("ai.liquid.leap:leap-openai-client:0.10.7")
}

Targets linuxX64, linuxArm64, mingwX64 (Windows native), and wasmJs (browser via Ktor Js engine, added in v0.10.7).

Basic usage

Swift (iOS / macOS)
Kotlin (all platforms)

The leap-sdk-openai-client Kotlin module does not apply the SKIE plugin in v0.10.7 (only leap-sdk, leap-sdk-model-downloader, and leap-ui do). That means Flow<ChatCompletionEvent> is not bridged to a Swift AsyncSequence, and the onEnum(of:) helper is not generated for ChatCompletionEvent. Swift consumers on v0.10.7 must collect the Kotlin Flow through its native collector and downcast each event with as?. For most Swift apps that just need cloud chat completions, an off-the-shelf OpenAI Swift client is more ergonomic; use LeapOpenAIClient from Swift only if you need to share Kotlin code with Android.

Coming in the next release: SKIE will be enabled on leap-sdk-openai-client, adding the same Swift-friendly surface as LeapSDK: for try await event in client.streamChatCompletion(...), exhaustive switching with onEnum(of: event), and nested-class Swift names (ChatCompletionEvent.Delta instead of the current flattened ChatCompletionEventDelta). Swift convenience inits and builders for OpenAiClientConfig are also planned. Pin to v0.10.7 if you need the current behavior frozen; otherwise expect the more ergonomic surface to land soon.

Manual collection pattern (the Flow<ChatCompletionEvent>.collect(...) shape varies by Kotlin/Native version — check the framework header in your Xcode build for the exact label):

import LeapOpenAIClient

// The Kotlin top-level `fun OpenAiClient(config: OpenAiClientConfig)` exports as
// `OpenAiClientKt.OpenAiClient(config:)` (PascalCase preserved from the Kotlin
// function name). Without SKIE the K/N export also flattens Kotlin's nested
// class names — `ChatMessage.User` → `ChatMessageUser`,
// `ChatCompletionEvent.Delta` → `ChatCompletionEventDelta`, etc.
let client = OpenAiClientKt.OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-…",
        baseUrl: "https://api.openai.com/v1"
    )
)

let request = ChatCompletionRequest(
    model: "gpt-4o-mini",
    messages: [
        ChatMessageSystem(content: "You are a helpful assistant."),
        ChatMessageUser(content: "What is the capital of Japan?")
    ],
    temperature: 0.7
)

// Pseudocode — actual collector signature depends on your Kotlin/Native version
// and framework headers. Without SKIE, there is no `for try await` integration.
try await client.streamChatCompletion(request: request).collect(
    collector: FlowCollector { event in
        if let delta = event as? ChatCompletionEventDelta {
            print(delta.content, terminator: "")
        } else if let done = event as? ChatCompletionEventDone {
            if let usage = done.usage { print("\nTokens: \(usage.totalTokens)") }
        } else if let err = event as? ChatCompletionEventError {
            print("\nError: \(err.message)")
        }
    }
)

client.close()  // closes the underlying URLSession-backed HttpClient

import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.OpenAiClientConfig

val client = OpenAiClient(
    config = OpenAiClientConfig(
        apiKey = "sk-…",
        baseUrl = "https://api.openai.com/v1",
    )
)

val request = ChatCompletionRequest(
    model = "gpt-4o-mini",
    messages = listOf(
        ChatMessage.System("You are a helpful assistant."),
        ChatMessage.User("What is the capital of Japan?"),
    ),
    temperature = 0.7,
)

client.streamChatCompletion(request).collect { event ->
    when (event) {
        is ChatCompletionEvent.Delta -> print(event.content)
        is ChatCompletionEvent.Done  -> event.usage?.let { println("\nTokens: ${it.totalTokens}") }
        is ChatCompletionEvent.Error -> println("\nError: ${event.message}")
    }
}

client.close()

Configuration

OpenAiClientConfig is a Kotlin data class bridged identically on every platform.

data class OpenAiClientConfig(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)

Field	Default	Notes
`apiKey`	— (required)	Sent as `Authorization: Bearer <apiKey>`.
`baseUrl`	`https://api.openai.com/v1`	Override for OpenRouter, a self-hosted backend, etc.
`chatCompletionsPath`	`/chat/completions`	Appended to `baseUrl`.
`extraHeaders`	`{}`	Merged into every request — e.g. OpenRouter’s `HTTP-Referer`.
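
As a rough illustration of how these fields combine into an outgoing request, the sketch below uses a stand-in class (ConfigSketch) with the same fields as OpenAiClientConfig. The helpers endpointUrl and requestHeaders are hypothetical. In particular, letting extraHeaders override Authorization on a key clash is an assumption, not documented SDK behavior.

```kotlin
// Stand-in mirroring the documented fields; not the SDK class itself.
data class ConfigSketch(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)

// Plain concatenation, matching "Appended to baseUrl" in the table.
fun endpointUrl(config: ConfigSketch): String =
    config.baseUrl + config.chatCompletionsPath

// Bearer auth plus the extra headers. Later entries win on a key clash (assumption).
fun requestHeaders(config: ConfigSketch): Map<String, String> =
    mapOf("Authorization" to "Bearer ${config.apiKey}") + config.extraHeaders

fun main() {
    val openRouter = ConfigSketch(
        apiKey = "sk-or-…",
        baseUrl = "https://openrouter.ai/api/v1",
        extraHeaders = mapOf("X-Title" to "Your App"),
    )
    println(endpointUrl(openRouter))  // https://openrouter.ai/api/v1/chat/completions
    println(requestHeaders(openRouter)["X-Title"])  // Your App
}
```

Because the path is appended as-is in this sketch, a baseUrl with a trailing slash would produce a double slash; keeping baseUrl free of a trailing slash, as in the defaults, avoids the question.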

OpenRouter

Swift (iOS / macOS)
Kotlin (all platforms)

// The leap-sdk-openai-client module has no SKIE plugin applied, so the
// top-level Kotlin `fun OpenAiClient(config:)` factory is exported as
// `OpenAiClientKt.OpenAiClient(config:)`. See the note under Basic usage
// for the full reasoning.
let client = OpenAiClientKt.OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-or-…",
        baseUrl: "https://openrouter.ai/api/v1",
        extraHeaders: [
            "HTTP-Referer": "https://yourapp.example.com",
            "X-Title": "Your App"
        ]
    )
)

val client = OpenAiClient(
    OpenAiClientConfig(
        apiKey = "sk-or-…",
        baseUrl = "https://openrouter.ai/api/v1",
        extraHeaders = mapOf(
            "HTTP-Referer" to "https://yourapp.example.com",
            "X-Title" to "Your App",
        ),
    )
)

Self-hosted vLLM / llama-server

Swift (iOS / macOS)
Kotlin (all platforms)

let client = OpenAiClientKt.OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "anything",  // Required by config but typically unused
        baseUrl: "http://10.0.0.42:8000/v1"
    )
)

val client = OpenAiClient(
    OpenAiClientConfig(
        apiKey = "anything",
        baseUrl = "http://10.0.0.42:8000/v1",
    )
)

Request shape

ChatCompletionRequest covers standard OpenAI fields plus a few OpenRouter-specific extensions. OpenRouter-only fields are silently ignored by stock OpenAI-compatible APIs.

data class ChatCompletionRequest(
    val model: String,
    val messages: List<ChatMessage>,
    val temperature: Double? = null,
    val topP: Double? = null,
    val maxCompletionTokens: Int? = null,   // Preferred for newer OpenAI versions
    val maxTokens: Int? = null,             // Legacy alias — some custom backends still require it
    val frequencyPenalty: Double? = null,
    val presencePenalty: Double? = null,
    val stop: List<String>? = null,
    val stream: Boolean = true,
    // OpenRouter extensions:
    val topK: Int? = null,
    val repetitionPenalty: Double? = null,
    val minP: Double? = null,
    val topA: Double? = null,
    val transforms: List<String>? = null,
    val models: List<String>? = null,
    val route: String? = null,
    val provider: ProviderPreferences? = null,
)

ChatMessage (the OpenAI-client one, distinct from LeapSDK.ChatMessage) is a sealed type with three cases — System, User, Assistant.

Response shape

streamChatCompletion(request) returns a Flow<ChatCompletionEvent>. In v0.10.7 Swift sees the same Kotlin Flow rather than an AsyncSequence, because SKIE is not yet applied to this module; collect it with the native Flow.collect(...) shape shown under Basic usage. Events:

Variant	Meaning
`Delta(content: String)`	Text chunk from the model. May be empty for role-only deltas.
`Done(usage: Usage?)`	Stream finished. `usage` is non-`null` when the API includes token counts.
`Error(message: String)`	HTTP error or stream parsing failure.

data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)
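
One way to consume these events is to fold them into a single state value, which keeps UI code free of stream bookkeeping. The sketch below is self-contained: it declares stand-ins that mirror the documented event shapes, and ReplyState with its reduce function is a hypothetical helper, not SDK API. Treating Error as terminal is an assumption.

```kotlin
// Stand-ins mirroring the documented shapes so the sketch compiles on its own.
// In an app, import the real types from ai.liquid.leap.openai instead.
data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)

sealed interface ChatCompletionEvent {
    data class Delta(val content: String) : ChatCompletionEvent
    data class Done(val usage: Usage?) : ChatCompletionEvent
    data class Error(val message: String) : ChatCompletionEvent
}

// Hypothetical accumulator for one streamed reply.
data class ReplyState(
    val text: String = "",
    val usage: Usage? = null,
    val error: String? = null,
    val finished: Boolean = false,
)

// Pure reducer: returns the next state for each incoming event.
fun ReplyState.reduce(event: ChatCompletionEvent): ReplyState = when (event) {
    is ChatCompletionEvent.Delta -> copy(text = text + event.content)
    is ChatCompletionEvent.Done -> copy(usage = event.usage, finished = true)
    is ChatCompletionEvent.Error -> copy(error = event.message, finished = true)
}

fun main() {
    val events = listOf(
        ChatCompletionEvent.Delta(""),      // role-only delta with empty content
        ChatCompletionEvent.Delta("Tok"),
        ChatCompletionEvent.Delta("yo."),
        ChatCompletionEvent.Done(Usage(12, 3, 15)),
    )
    val state = events.fold(ReplyState()) { acc, e -> acc.reduce(e) }
    println(state.text)                // Tokyo.
    println(state.usage?.totalTokens)  // 15
}
```

Inside a real collect { event -> ... } block, the same reducer can be applied to a mutable state holder on each event.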

Hybrid routing example

Route simple prompts to a small on-device LFM; escalate harder prompts to a cloud model.

Swift (iOS / macOS)
Kotlin (Android)
Kotlin (JVM / native)

import LeapModelDownloader
import LeapOpenAIClient

@MainActor
final class HybridChatViewModel: ObservableObject {
    private let onDevice: Conversation
    private let cloud: OpenAiClient

    init(onDevice: Conversation, cloud: OpenAiClient) {
        self.onDevice = onDevice
        self.cloud = cloud
    }

    func send(_ text: String, useCloud: Bool) async throws {
        if useCloud {
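            // Cloud path: leap-sdk-openai-client has no SKIE, so collect the Kotlin
            // Flow manually and downcast each event with `as?`. Note the flattened
            // Swift type names (`ChatMessageUser`, `ChatCompletionEventDelta`).
            // Pseudocode, as in Basic usage: the actual collector signature depends
            // on your Kotlin/Native version and framework headers.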
            let request = ChatCompletionRequest(
                model: "gpt-4o-mini",
                messages: [ChatMessageUser(content: text)]
            )
            try await cloud.streamChatCompletion(request: request).collect(
                collector: FlowCollector { event in
                    if let delta = event as? ChatCompletionEventDelta {
                        appendChunk(delta.content)
                    }
                }
            )
        } else {
            // On-device path: leap-sdk has SKIE — `for try await` + `onEnum(of:)`
            // work as written.
            let userMessage = ChatMessage(role: .user, textContent: text)
            for try await response in onDevice.generateResponse(message: userMessage) {
                if case let .chunk(c) = onEnum(of: response) { appendChunk(c.text) }
            }
        }
    }

    private func appendChunk(_ text: String) { /* … */ }

    deinit { cloud.close() }
}

import ai.liquid.leap.Conversation
import ai.liquid.leap.message.MessageResponse
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.message.ChatMessage
import ai.liquid.leap.message.ChatMessageContent
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.launch

class HybridChatViewModel(
    private val onDevice: Conversation,
    private val cloud: OpenAiClient,
) : ViewModel() {

    fun send(text: String, useCloud: Boolean) = viewModelScope.launch {
        if (useCloud) {
            val request = ChatCompletionRequest(
                model = "gpt-4o-mini",
                messages = listOf(CloudChatMessage.User(text)),
            )
            cloud.streamChatCompletion(request).collect { event ->
                if (event is ChatCompletionEvent.Delta) appendChunk(event.content)
            }
        } else {
            val message = ChatMessage(
                role = ChatMessage.Role.USER,
                content = listOf(ChatMessageContent.Text(text)),
            )
            onDevice.generateResponse(message).collect { resp ->
                if (resp is MessageResponse.Chunk) appendChunk(resp.text)
            }
        }
    }

    private fun appendChunk(text: String) { /* … */ }

    override fun onCleared() {
        super.onCleared()
        cloud.close()
    }
}

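// Kotlin (JVM / native)
import ai.liquid.leap.Conversation
import ai.liquid.leap.message.MessageResponse
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
import ai.liquid.leap.openai.OpenAiClient

suspend fun hybridSend(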
    onDevice: Conversation,
    cloud: OpenAiClient,
    text: String,
    useCloud: Boolean,
) {
    if (useCloud) {
        val request = ChatCompletionRequest(
            model = "gpt-4o-mini",
            messages = listOf(CloudChatMessage.User(text)),
        )
        cloud.streamChatCompletion(request).collect { event ->
            if (event is ChatCompletionEvent.Delta) print(event.content)
        }
    } else {
        onDevice.generateResponse(text).collect { resp ->
            if (resp is MessageResponse.Chunk) print(resp.text)
        }
    }
}
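
The examples above take useCloud as a parameter and leave the decision to the caller. A minimal sketch of one possible policy follows: escalate when the prompt is long or contains a hint that it needs heavier reasoning. The function name shouldUseCloud, the 500-character threshold, and the keyword list are all hypothetical placeholders to tune for your own models.

```kotlin
// Hypothetical routing heuristic; not part of either SDK.
fun shouldUseCloud(
    prompt: String,
    maxOnDeviceChars: Int = 500,  // placeholder threshold
    hardHints: List<String> = listOf("prove", "refactor", "step by step"),
): Boolean {
    // Long prompts go to the cloud model.
    if (prompt.length > maxOnDeviceChars) return true
    // Otherwise escalate only when a hint keyword appears (case-insensitive).
    val lower = prompt.lowercase()
    return hardHints.any { it in lower }
}

fun main() {
    println(shouldUseCloud("What is the capital of Japan?"))        // false
    println(shouldUseCloud("Prove that sqrt(2) is irrational."))    // true
}
```

The result can be passed straight through as the useCloud argument of send or hybridSend.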

See Cloud AI Comparison for a side-by-side feature breakdown.

Lifecycle

The platform OpenAiClient(config:) factory (Kotlin fun OpenAiClient(config:) → Swift OpenAiClientKt.OpenAiClient(config:)) creates an HttpClient internally and ties it to the returned client — call close() when you’re done.

Swift (iOS / macOS)
Kotlin (all platforms)

deinit { client.close() }

The lower-level constructor that accepts an externally-managed HttpClient is part of the Kotlin/Ktor surface and isn’t a useful entry point from Swift — the Ktor engine machinery isn’t bridged into the public Swift API. Use OpenAiClientKt.OpenAiClient(config:) and let the SDK own the session. If multiple consumers share a client, share the OpenAiClient instance and close() once at teardown.

override fun onCleared() {
    super.onCleared()
    client.close()
}

If you need to share an HttpClient across multiple clients (e.g., you already manage one for other Ktor-based code), use the lower-level constructor that takes a pre-built HttpClient — you then own its lifetime and shouldn’t call close() on the OpenAiClient:

val shared = HttpClient(OkHttp)  // your own instance
val client = OpenAiClient(config = config, httpClient = shared)
// Don't call client.close() — you own `shared` and decide when it dies
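
For the common case where the factory owns the HttpClient, a try/finally wrapper makes sure close() runs even when a request fails. The sketch below uses a stand-in class (ClientSketch) so it compiles on its own; useThenClose is a hypothetical helper, and whether the real OpenAiClient implements a closeable interface is not assumed here.

```kotlin
// Stand-in exposing the same close() contract as the documented client.
class ClientSketch {
    var closed = false
        private set

    fun close() { closed = true }
}

// Hypothetical helper: runs block, then always closes the client.
inline fun <R> ClientSketch.useThenClose(block: (ClientSketch) -> R): R =
    try {
        block(this)
    } finally {
        close()
    }

fun main() {
    val client = ClientSketch()
    val result = runCatching {
        client.useThenClose { error("network failure") }
    }
    println(result.isFailure)  // true
    println(client.closed)     // true
}
```

Because the helper is inline, the block may call suspending functions such as collect when the helper itself is invoked from a coroutine. Skip this pattern when you passed in your own HttpClient, since you own its lifetime in that case.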
