68 changes: 50 additions & 18 deletions docs/resources/language-release-process.mdx
@@ -1,39 +1,71 @@
---
title: "Language release process"
description: "Here's what API users can expect when DeepL adds translation support for a new language or language variant."
description: >-
Here's what API users can expect when DeepL adds translation support for a new language or language
variant.
mode: "wide"
---

On a regular basis, DeepL adds translation support for new languages or language variants. In this article, we describe the process we'll follow with a new language or variant release.
On a regular basis, DeepL adds translation support for new languages or language variants. In this article,
we describe the process we'll follow with a new language or variant release.

## Language codes follow BCP 47

DeepL language codes follow [BCP 47](https://www.rfc-editor.org/rfc/rfc5646). A language code always includes a base language subtag (e.g. `en`, `zh`), and may include additional subtags for script, region, or variant where needed to distinguish variants. For example:
DeepL language codes follow [BCP 47](https://www.rfc-editor.org/rfc/rfc5646). A language code always
includes a base language subtag (e.g. `en`, `zh`), and may include additional subtags for script, region,
or variant where needed to distinguish variants. For example:

* `EN-US`, `PT-BR` -- region subtag to distinguish regional variants.
* `ZH-HANS`, `ZH-HANT` -- script subtag to distinguish writing systems.

BCP 47 is an expansive standard, and language codes can vary significantly in structure and length. As DeepL adds support for more languages and variants, new codes may use any combination of subtags permitted by the spec. For example, codes like `sr-Cyrl-RS` or `sr-Latn-RS` (Serbian in Cyrillic vs. Latin script, as used in Serbia) are valid BCP 47 codes -- while DeepL does not support these today, your integration should be able to handle codes of this form if they are added in the future.
BCP 47 is an expansive standard, and language codes can vary significantly in structure and length. As DeepL
adds support for more languages and variants, new codes may use any combination of subtags permitted by the
spec. For example, codes like `sr-Cyrl-RS` or `sr-Latn-RS` (Serbian in Cyrillic vs. Latin script, as used in
Serbia) are valid BCP 47 codes -- while DeepL does not support these today, your integration should be able
to handle codes of this form if they are added in the future.

<Warning>
**Do not hardcode assumptions about the format of language codes.** For example, do not assume that language codes will always be exactly two letters, or that a hyphenated code will always be in the format `xx-YY`. Instead, always treat the `lang` codes returned by the [/languages endpoint](/api-reference/languages) as opaque identifiers. If you need to parse language codes, use a BCP 47-compliant library rather than writing custom parsing logic -- the full spec includes subtags for script, region, variant, extensions, and private use, and partial implementations are a common source of bugs.
**Do not hardcode assumptions about the format of language codes.** For example, do not assume that language
codes will always be exactly two letters, or that a hyphenated code will always be in the format `xx-YY`.
Instead, always treat the `lang` codes returned by the [/languages endpoint](/api-reference/languages) as
opaque identifiers. If you need to parse language codes, use a BCP 47-compliant library rather than writing
custom parsing logic -- the full spec includes subtags for script, region, variant, extensions, and private
use, and partial implementations are a common source of bugs.
</Warning>
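
As a rough illustration of the warning above, the following sketch contrasts a hardcoded `xx-YY` check with treating codes as opaque identifiers. The `SUPPORTED` list is a hypothetical stand-in for codes returned by the languages endpoint, and the helper names are invented for this example. The case-folded lookup relies on BCP 47 tags being case-insensitive; it does not parse subtags.

```python
import re

# Hypothetical sample standing in for codes returned by the languages endpoint.
SUPPORTED = ["EN-US", "PT-BR", "ZH", "ZH-HANS", "ZH-HANT"]

# Anti-pattern: assumes every code is either `xx` or `xx-YY`.
NAIVE_PATTERN = re.compile(r"[A-Za-z]{2}(-[A-Za-z]{2})?")


def naive_is_valid(code: str) -> bool:
    """Return True only for codes that fit the hardcoded two-letter shapes."""
    return NAIVE_PATTERN.fullmatch(code) is not None


def build_index(codes):
    """Map a case-folded key to the exact code string as it was supplied."""
    return {code.casefold(): code for code in codes}


def resolve(code: str, index: dict):
    """Look up a code as an opaque identifier; return None if it is not listed."""
    return index.get(code.casefold())


index = build_index(SUPPORTED)

for candidate in ["EN-US", "ZH-HANS", "sr-Cyrl-RS"]:
    print(candidate, naive_is_valid(candidate), resolve(candidate, index))
```

The naive check rejects `ZH-HANS` and `sr-Cyrl-RS` purely because of their shape, while the lookup accepts or rejects a code only according to whether it appears in the supplied list. If a code is added to the list later, the lookup needs no change.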

## What happens when a new language is released

* We will add the language code for the newly supported language or variant to the "Source languages" and "Target languages" lists on the [Supported languages](/docs/getting-started/supported-languages) page in the API documentation. We'll include a note on that page if the language or variant does *not* support both text and document translation.
* If a newly added language or variant supports both text and document translation, we will add the language or variant to the `/languages` endpoint response. The variant code used depends on the characteristics of the variant:
* In some cases, a variant is primarily used in a specific region, and so a region subtag is the best way to identify it (e.g. `EN-US`, `PT-BR`).
* In other cases, a variant is used widely across multiple regions, and so a script subtag is more appropriate (e.g. `ZH-HANS`, `ZH-HANT`). The subtag structure will be selected by DeepL on a case-by-case basis following BCP 47 conventions.
* In cases where a new language code with a variant duplicates the behavior of an existing language code without a variant (e.g. `ZH-HANS` was recently added as a language code for translating into simplified Chinese, along with `ZH`):
* In the `/languages` endpoint response, we will continue to return both language codes in two separate dicts with the same value in the `"name"` field.
* For backwards compatibility, we will continue to support the original language code (in this example, `ZH`) for text and document translation.
* We will add the language code for the newly supported language or variant to our [OpenAPI spec](https://github.com/DeepLcom/openapi/).
### Language release process for v3/languages

<Info>
**Note about the**`/languages`**endpoint:** In the future, we plan to extend the language information returned by the API.
The [`/v3/languages`](/api-reference/languages/retrieve-supported-languages) endpoint provides flexibility
to specify which languages are supported by different products and which features are supported by each
language. Languages are added individually to each API resource, and new languages may initially be flagged
as beta before they are stable.

### Language release process for v2/languages

This will allow us to specify whether a language supports both text and document translation, whether a language code is considered deprecated because it's been duplicated by a variant language code, and so on.
<Info>
The `/v2/languages` endpoint is deprecated, and may not be extended with all new languages we support.
You should build your integration to use `/v3/languages` instead.
</Info>

The additional metadata would also allow us, for example, to add languages like `AR` and `ZH-HANT` to the languages endpoint even before document translation is supported.
</Info>
* We will add the language code for the newly supported language or variant to the list on the
[Supported languages](/docs/getting-started/supported-languages) page in the API documentation. The list
shows support for text and document translation.
* If a newly added language or variant supports both text and document translation, we will add the language
or variant to the [`/v2/languages`](/api-reference/languages) endpoint response. The variant code used
depends on the characteristics of the variant:
* In some cases, a variant is primarily used in a specific region, and so a region subtag is the best way
to identify it (e.g. `EN-US`, `PT-BR`).
* In other cases, a variant is used widely across multiple regions, and so a script subtag is more
appropriate (e.g. `ZH-HANS`, `ZH-HANT`). The subtag structure will be selected by DeepL on a case-by-case
basis following BCP 47 conventions.
* In cases where a new language code with a variant duplicates the behavior of an existing language code
without a variant (e.g. `ZH-HANS` was recently added as a language code for translating into simplified
Chinese, along with `ZH`):
* In the [`/v2/languages`](/api-reference/languages) endpoint response, we will continue to return both
language codes in two separate dicts with the same value in the `"name"` field.
* For backwards compatibility, we will continue to support the original language code (in this example,
`ZH`) for text and document translation.
* We will add the language code for the newly supported language or variant to our
[OpenAPI spec](https://github.com/DeepLcom/openapi/).
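
One way an integration might cope with two codes that share the same `"name"` value, as in the `ZH` / `ZH-HANS` case above, is to group entries by name before building a language picker. This is a minimal sketch, not DeepL's recommended approach: the sample entries, the `"language"` key, and the name strings are assumptions for illustration, and `group_codes_by_name` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical sample shaped like a list of dicts. The "language" key and the
# name strings are assumptions for illustration, not a captured API response.
sample = [
    {"language": "ZH", "name": "Chinese (simplified)"},
    {"language": "ZH-HANS", "name": "Chinese (simplified)"},
    {"language": "ZH-HANT", "name": "Chinese (traditional)"},
]


def group_codes_by_name(entries, code_key="language"):
    """Collect every code that shares the same display name, in input order."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["name"]].append(entry[code_key])
    return dict(groups)


groups = group_codes_by_name(sample)
for name, codes in groups.items():
    print(name, codes)
```

Grouping by name keeps both codes available, so requests that still send the original code continue to work, while a user interface can show a single entry per name.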