Deepgram Java SDK

Official Java SDK for Deepgram's automated speech recognition, text-to-speech, and language understanding APIs.

Power your applications with world-class speech and language AI models.


Documentation

You can learn more about the Deepgram API at developers.deepgram.com.

Installation

Gradle

Add the dependency to your build.gradle:

dependencies {
    implementation 'com.deepgram:deepgram-java-sdk:0.1.0'
}

Maven

Add the dependency to your pom.xml:

<dependency>
    <groupId>com.deepgram</groupId>
    <artifactId>deepgram-java-sdk</artifactId>
    <version>0.1.0</version>
</dependency>

Quickstart

Authentication

The SDK authenticates with an API key. If you do not pass one explicitly, it is read from the DEEPGRAM_API_KEY environment variable:

import DeepgramClient;

// Using environment variable (DEEPGRAM_API_KEY)
DeepgramClient client = DeepgramClient.builder().build();

// Using API key directly
DeepgramClient client = DeepgramClient.builder()
    .apiKey("YOUR_DEEPGRAM_API_KEY")
    .build();

Get your API key from the Deepgram Console.

Bearer Token Authentication

Use an access token (JWT) for Bearer authentication. When provided, the access token takes precedence over any API key:

// With access token (Bearer auth)
DeepgramClient client = DeepgramClient.builder()
    .accessToken("your-jwt-token")
    .build();
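
The precedence rule can be pictured with a small sketch that does not depend on the SDK. It assumes the Deepgram HTTP API convention of "Authorization: Bearer <token>" for access tokens and "Authorization: Token <key>" for API keys. AuthHeaders and authorizationValue are hypothetical names used only for illustration; the SDK's internal logic may differ.

```java
final class AuthHeaders {
    private AuthHeaders() {}

    /**
     * Hypothetical helper: picks the Authorization header value.
     * An access token wins over an API key when both are present.
     */
    static String authorizationValue(String accessToken, String apiKey) {
        if (accessToken != null && !accessToken.isBlank()) {
            return "Bearer " + accessToken;
        }
        if (apiKey != null && !apiKey.isBlank()) {
            return "Token " + apiKey;
        }
        throw new IllegalStateException("No access token or API key configured");
    }

    public static void main(String[] args) {
        // Both credentials supplied: the access token is used.
        System.out.println(authorizationValue("your-jwt-token", "your-api-key")); // prints "Bearer your-jwt-token"
        // Only an API key supplied.
        System.out.println(authorizationValue(null, "your-api-key")); // prints "Token your-api-key"
    }
}
```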

Session ID

Attach a custom session identifier; it is sent as the x-deepgram-session-id header with every request and WebSocket connection. If you do not provide one, the SDK generates a UUID:

// With custom session ID
DeepgramClient client = DeepgramClient.builder()
    .apiKey("your-api-key")
    .sessionId("my-session-123")
    .build();
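
The fallback described above can be sketched in plain Java. SessionIds and resolve are hypothetical names, and treating a blank string as "not provided" is an assumption made for this sketch, not documented SDK behavior.

```java
import java.util.UUID;

final class SessionIds {
    private SessionIds() {}

    /** Returns the caller's session ID when one is given, otherwise a random UUID string. */
    static String resolve(String customSessionId) {
        if (customSessionId != null && !customSessionId.isBlank()) {
            return customSessionId;
        }
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("my-session-123")); // prints "my-session-123"
        System.out.println(resolve(null));             // prints a freshly generated UUID
    }
}
```

Supplying your own ID is useful when you want to correlate SDK requests with an identifier that already appears in your application logs.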

Features

Speech-to-Text (Listen)

Transcribe pre-recorded audio from files or URLs.

import DeepgramClient;
import resources.listen.v1.media.requests.ListenV1RequestUrl;
import resources.listen.v1.media.types.MediaTranscribeResponse;

DeepgramClient client = DeepgramClient.builder().build();

// Transcribe from URL
MediaTranscribeResponse result = client.listen().v1().media().transcribeUrl(
    ListenV1RequestUrl.builder()
        .url("https://static.deepgram.com/examples/Bueller-Life-moves-702702070.wav")
        .build()
);

// Access transcription
String transcript = result.getResults()
    .getChannels().get(0)
    .getAlternatives().get(0)
    .getTranscript();
System.out.println(transcript);

Transcribe from File

import java.nio.file.Files;
import java.nio.file.Path;

byte[] audioData = Files.readAllBytes(Path.of("audio.wav"));

// File upload body is byte[]
MediaTranscribeResponse result = client.listen().v1().media().transcribeFile(audioData);

Text-to-Speech (Speak)

Convert text to natural-sounding speech.

import DeepgramClient;
import resources.speak.v1.audio.requests.SpeakV1Request;
import java.io.InputStream;

DeepgramClient client = DeepgramClient.builder().build();

// Generate speech audio
InputStream audioStream = client.speak().v1().audio().generate(
    SpeakV1Request.builder()
        .text("Hello, world! Welcome to Deepgram.")
        .build()
);

// Write audio to file or play it
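
One way to handle the "write audio to file" step is with the JDK's Files.copy. The sketch below uses a hypothetical helper class, AudioFiles, and a stand-in byte stream instead of a real API response; in practice you would pass the InputStream returned by generate(...). The file extension should match whatever audio encoding you requested.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AudioFiles {
    private AudioFiles() {}

    /**
     * Copies the whole stream to the target path, replacing any existing file.
     * Closes the stream when done and returns the number of bytes written.
     */
    static long save(InputStream audio, Path target) throws IOException {
        try (InputStream in = audio) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; replace with the stream returned by the Speak API call.
        InputStream fakeAudio = new ByteArrayInputStream(new byte[] {1, 2, 3});
        Path out = Files.createTempFile("speech", ".bin");
        long written = save(fakeAudio, out);
        System.out.println("Wrote " + written + " bytes to " + out);
    }
}
```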

Text Intelligence (Read)

Analyze text for sentiment, topics, summaries, and intents.

import DeepgramClient;
import types.ReadV1Request;
import types.ReadV1Response;

DeepgramClient client = DeepgramClient.builder().build();

ReadV1Response result = client.read().v1().text().analyze(
    ReadV1Request.builder()
        .text("Deepgram's speech recognition is incredibly accurate and fast.")
        .build()
);

Management

Manage projects, API keys, members, usage, and billing.

import DeepgramClient;

DeepgramClient client = DeepgramClient.builder().build();

// List projects
var projects = client.manage().v1().projects().list();

// List API keys for a project
var keys = client.manage().v1().projects().keys().list("project-id");

// Get usage statistics
var usage = client.manage().v1().projects().usage().get("project-id");

Voice Agent

Manage voice agent configurations and models.

import DeepgramClient;

DeepgramClient client = DeepgramClient.builder().build();

// List available agent think models
var models = client.agent().v1().settings().think().models().list();

WebSocket APIs

The SDK includes built-in WebSocket clients for real-time streaming.

Live Transcription (Listen WebSocket)

Stream audio for real-time speech-to-text.

import DeepgramClient;
import resources.listen.v1.websocket.V1WebSocketClient;
import resources.listen.v1.websocket.V1ConnectOptions;
import types.ListenV1Model;

DeepgramClient client = DeepgramClient.builder().build();

V1WebSocketClient ws = client.listen().v1().websocket();

// Register event handlers
ws.onResults(results -> {
    String transcript = results.getChannel()
        .getAlternatives().get(0)
        .getTranscript();
    System.out.println("Transcript: " + transcript);
});

ws.onMetadata(metadata -> {
    System.out.println("Metadata received");
});

ws.onError(error -> {
    System.err.println("Error: " + error.getMessage());
});

// Connect with options (model is required)
ws.connect(V1ConnectOptions.builder()
    .model(ListenV1Model.NOVA3)
    .build());

// Send audio (audioBytes is audio data you have already read or captured)
ws.send(audioBytes);

// Close when done
ws.close();
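
When streaming a large buffer, it is common to send it in smaller pieces rather than in one call. The sketch below shows one way to split a byte array into fixed-size chunks. AudioChunker is a hypothetical helper, not part of the SDK, and the chunk size is an arbitrary choice for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AudioChunker {
    private AudioChunker() {}

    /** Splits audio into consecutive chunks of at most chunkSize bytes; the last chunk may be shorter. */
    static List<byte[]> split(byte[] audio, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < audio.length; start += chunkSize) {
            int length = Math.min(chunkSize, audio.length - start);
            chunks.add(Arrays.copyOfRange(audio, start, start + length));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<byte[]> chunks = split(new byte[10_000], 4096);
        System.out.println(chunks.size() + " chunks"); // prints "3 chunks"
    }
}
```

Assuming send accepts a byte array as in the example above, each chunk could then be passed to ws.send(chunk) in a loop before calling ws.close().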

Text-to-Speech Streaming (Speak WebSocket)

Stream text for real-time audio generation.

import DeepgramClient;
import resources.speak.v1.websocket.V1WebSocketClient;

DeepgramClient client = DeepgramClient.builder().build();

var ttsWs = client.speak().v1().websocket();

// Register event handlers
ttsWs.onAudioData(audioData -> {
    // Process audio chunks as they arrive
});

ttsWs.onMetadata(metadata -> {
    System.out.println("Metadata received");
});

ttsWs.onError(error -> {
    System.err.println("Error: " + error.getMessage());
});

// Connect and send text
ttsWs.connect();
ttsWs.send("Hello, this is streamed text-to-speech.");

// Close when done
ttsWs.close();

Agent WebSocket

Connect to Deepgram's voice agent for real-time conversational AI.

import DeepgramClient;
import resources.agent.v1.websocket.V1WebSocketClient;

DeepgramClient client = DeepgramClient.builder().build();

var agentWs = client.agent().v1().websocket();

// Register event handlers
agentWs.onWelcome(welcome -> {
    System.out.println("Agent connected");
});

agentWs.onError(error -> {
    System.err.println("Error: " + error.getMessage());
});

// Connect and interact
agentWs.connect();
agentWs.send(audioBytes);

// Close when done
agentWs.close();

Configuration

Custom Timeouts

DeepgramClient client = DeepgramClient.builder()
    .apiKey("YOUR_DEEPGRAM_API_KEY")
    .timeout(30)  // 30 seconds
    .build();

Retry Configuration

DeepgramClient client = DeepgramClient.builder()
    .apiKey("YOUR_DEEPGRAM_API_KEY")
    .maxRetries(3)
    .build();

Custom HTTP Client

import okhttp3.OkHttpClient;
import java.util.concurrent.TimeUnit;

OkHttpClient httpClient = new OkHttpClient.Builder()
    .connectTimeout(30, TimeUnit.SECONDS)
    .readTimeout(30, TimeUnit.SECONDS)
    .build();

DeepgramClient client = DeepgramClient.builder()
    .apiKey("YOUR_DEEPGRAM_API_KEY")
    .httpClient(httpClient)
    .build();

Custom Base URL

For on-premises deployments or custom endpoints:

import core.Environment;

DeepgramClient client = DeepgramClient.builder()
    .apiKey("YOUR_DEEPGRAM_API_KEY")
    .environment(Environment.custom()
        .base("https://your-custom-endpoint.com")
        .build())
    .build();

Environment Variable

Set your API key as an environment variable to avoid passing it in code:

export DEEPGRAM_API_KEY="your-api-key-here"

// API key is loaded automatically from DEEPGRAM_API_KEY
DeepgramClient client = DeepgramClient.builder().build();
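
If you rely on the environment variable, it can help to check at startup that it is actually set, so a missing key surfaces early rather than at the first request. The sketch below is plain Java; ApiKeyEnv and find are hypothetical names, and treating a blank value as missing is an assumption of this sketch.

```java
import java.util.Map;
import java.util.Optional;

final class ApiKeyEnv {
    static final String VARIABLE = "DEEPGRAM_API_KEY";

    private ApiKeyEnv() {}

    /** Looks up the API key in the given environment map; blank values count as missing. */
    static Optional<String> find(Map<String, String> env) {
        String value = env.get(VARIABLE);
        if (value == null || value.isBlank()) {
            return Optional.empty();
        }
        return Optional.of(value);
    }

    public static void main(String[] args) {
        if (find(System.getenv()).isPresent()) {
            System.out.println(VARIABLE + " is set");
        } else {
            System.err.println(VARIABLE + " is not set; export it or pass the key to the builder");
        }
    }
}
```

Taking the environment as a Map parameter keeps the check easy to unit test without modifying the real process environment.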

Custom Headers

DeepgramClient client = DeepgramClient.builder()
    .apiKey("YOUR_DEEPGRAM_API_KEY")
    .addHeader("X-Custom-Header", "custom-value")
    .build();

Async Client

The SDK provides a fully asynchronous client for non-blocking operations:

import AsyncDeepgramClient;
import resources.listen.v1.media.requests.ListenV1RequestUrl;
import resources.listen.v1.media.types.MediaTranscribeResponse;
import java.util.concurrent.CompletableFuture;

AsyncDeepgramClient asyncClient = AsyncDeepgramClient.builder().build();

// Async transcription
CompletableFuture<MediaTranscribeResponse> future = asyncClient.listen().v1().media()
    .transcribeUrl(ListenV1RequestUrl.builder()
        .url("https://static.deepgram.com/examples/Bueller-Life-moves-702702070.wav")
        .build());

future.thenAccept(result -> {
    System.out.println("Transcription complete!");
});
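
Because the async client returns a standard CompletableFuture, the usual JDK combinators apply for timeouts and failures. The sketch below is independent of the SDK: AsyncHandling and summarize are hypothetical names, and any CompletableFuture can stand in for the one returned by transcribeUrl.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;

final class AsyncHandling {
    private AsyncHandling() {}

    /**
     * Maps a future to a short status string.
     * The timeout is applied to the derived stage, so the caller's future is left untouched.
     */
    static <T> CompletableFuture<String> summarize(CompletableFuture<T> future, long timeoutSeconds) {
        return future
            .thenApply(result -> "ok: " + result)
            .orTimeout(timeoutSeconds, TimeUnit.SECONDS)
            .exceptionally(error -> {
                // Upstream failures arrive wrapped in CompletionException; unwrap for a clearer name.
                Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                    ? error.getCause()
                    : error;
                return "failed: " + cause.getClass().getSimpleName();
            });
    }

    public static void main(String[] args) {
        CompletableFuture<String> done = CompletableFuture.completedFuture("transcript text");
        System.out.println(summarize(done, 5).join()); // prints "ok: transcript text"
    }
}
```

Without an exceptionally (or handle) stage, a failure inside a thenAccept chain is easy to miss, since nothing is thrown on the calling thread.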

Error Handling

The SDK provides structured error handling:

import errors.BadRequestError;

try {
    var result = client.listen().v1().media().transcribeUrl(
        ListenV1RequestUrl.builder()
            .url("https://example.com/audio.mp3")
            .build()
    );
} catch (BadRequestError e) {
    System.err.println("Bad request: " + e.getMessage());
} catch (Exception e) {
    System.err.println("Error: " + e.getMessage());
}

Raw Response Access

All client methods support raw response access for advanced use cases:

import core.RawResponse;

// Access raw HTTP response
var rawResponse = client.listen().v1().media()
    .withRawResponse()
    .transcribeUrl(request);

int statusCode = rawResponse.statusCode();
var headers = rawResponse.headers();
var body = rawResponse.body();

Complete SDK Reference

The SDK provides comprehensive access to Deepgram's APIs:

Listen (Speech-to-Text)

client.listen().v1().media().transcribeUrl(request)    // Transcribe audio from URL
client.listen().v1().media().transcribeFile(body)      // Transcribe audio from file bytes
client.listen().v1().websocket()                       // Real-time streaming transcription

Speak (Text-to-Speech)

client.speak().v1().audio().generate(request)          // Generate speech from text
client.speak().v1().websocket()                        // Real-time streaming TTS

Read (Text Intelligence)

client.read().v1().text().analyze(request)             // Analyze text content

Agent (Voice Agent)

client.agent().v1().settings().think().models().list() // List available agent models
client.agent().v1().websocket()                        // Real-time agent WebSocket

Manage (Project Management)

// Projects
client.manage().v1().projects().list()                 // List all projects
client.manage().v1().projects().get(projectId)         // Get project details
client.manage().v1().projects().update(projectId, req) // Update project
client.manage().v1().projects().delete(projectId)      // Delete project

// API Keys
client.manage().v1().projects().keys().list(projectId)          // List API keys
client.manage().v1().projects().keys().create(projectId, req)   // Create new key
client.manage().v1().projects().keys().get(projectId, keyId)    // Get key details
client.manage().v1().projects().keys().delete(projectId, keyId) // Delete key

// Members
client.manage().v1().projects().members().list(projectId)             // List members
client.manage().v1().projects().members().delete(projectId, memberId) // Remove member

// Usage
client.manage().v1().projects().usage().get(projectId) // Get usage summary

// Models
client.manage().v1().projects().models().list(projectId) // List project models
client.manage().v1().models().list()                     // List all models

Auth (Authentication)

client.auth().v1().tokens().grant(request)             // Generate access token

Self-Hosted

client.selfHosted().v1().distributionCredentials().list(id)   // List credentials
client.selfHosted().v1().distributionCredentials().create(req) // Create credentials
client.selfHosted().v1().distributionCredentials().get(p, id)  // Get credentials
client.selfHosted().v1().distributionCredentials().delete(p,i) // Delete credentials

Development

Requirements

  • Java 11 or higher (Java 17 recommended for development)
  • Gradle (wrapper included)
  • Deepgram API key (sign up)

Running Tests

# Unit tests
./gradlew unitTest

# Integration tests (requires DEEPGRAM_API_KEY)
./gradlew integrationTest

# All tests
./gradlew test

Using Make

make check             # lint + build + unit tests
make test-integration  # integration tests only
make test-all          # full test suite
make format            # auto-format code

API Reference

See reference.md for complete API documentation.

Getting Help

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

Community Code of Conduct

Please see our community code of conduct before contributing to this project.

License

This project is licensed under the MIT License - see the LICENSE file for details.


Built by Deepgram
