Linguistic metadata for human languages: grammatical gender, writing direction, canonical and native names, BCP 47 normalization, and proficiency level systems (CEFR, JLPT, HSK). Curated, compile-time data with zero runtime dependencies.
Sibling library to beamlab_countries — beamlab_countries knows where languages are spoken, beamlab_languages knows what they are like.
- "Does Russian use grammatical gender? If so, what genders?"
- "Is Arabic written right-to-left?"
- "What's the canonical English name of fr? The endonym?"
- "Does the user's locale string en-US collapse to a base I can use as a key?"
- "What CEFR levels exist, in teaching order?" "What about JLPT or HSK?"
defp deps do
[
{:beamlab_languages, "~> 0.3"}
]
end

Then run mix deps.get.
BeamlabLanguages.has_gender?("fr")
# true
BeamlabLanguages.genders("de")
# ["m", "f", "n"]
BeamlabLanguages.direction("ar")
# :rtl
BeamlabLanguages.name("ja")
# "Japanese"
BeamlabLanguages.native_name("ja")
# "日本語"
BeamlabLanguages.normalize("en-US")
# "en"
BeamlabLanguages.get("fr")
# %BeamlabLanguages.Language{
# code: "fr",
# name: "French",
# native_name: "Français",
# direction: :ltr,
# has_gender: true,
# genders: ["m", "f"]
# }
BeamlabLanguages.levels("cefr")
# ["A1", "A2", "B1", "B2", "C1", "C2"]
BeamlabLanguages.level_info("cefr", "A1")
# %{key: "A1", label: "A1", description: "Beginner"}

Every function that takes a language code runs normalize/1 internally, so "en-US", "FR", and " fr " all work. Predicates (has_gender?/1, known?/1) return false for nil or unknown input rather than raising, which is handy in form-validation paths.
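To make the normalization behaviour concrete, here is a minimal, self-contained sketch of how a locale string could be reduced to a base code, and how a predicate can stay total on bad input. It is not the library's implementation: the module name NormalizeSketch, the tiny @known list, and the choice to accept "_" as a separator are all assumptions made for illustration.

```elixir
defmodule NormalizeSketch do
  # Hypothetical stand-in for the library's curated language list.
  @known ~w(en fr ja ar)

  # Reduce a locale-like string to a lowercase base language code.
  # Whitespace is trimmed, case is folded, and anything after the
  # first "-" or "_" is dropped. Empty input yields nil.
  def normalize(code) when is_binary(code) do
    code
    |> String.trim()
    |> String.downcase()
    |> String.split(["-", "_"], parts: 2)
    |> List.first()
    |> case do
      "" -> nil
      base -> base
    end
  end

  # Anything that is not a binary (nil, atoms, numbers) normalizes to nil.
  def normalize(_other), do: nil

  # Predicate that never raises: nil and unknown codes both give false.
  def known?(code), do: normalize(code) in @known
end
```

With this sketch, NormalizeSketch.normalize("en-US") and NormalizeSketch.normalize(" fr ") give "en" and "fr", while NormalizeSketch.known?(nil) gives false. Returning false instead of raising lets a validation step call the predicate directly on untrusted form input.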
Full API docs at HexDocs.
v1 covers 50+ languages: the top-spoken languages worldwide plus all CEFR / JLPT / HSK targets.
Data files:
- priv/data/languages.json: core language metadata
- priv/data/levels.json: proficiency level systems
- priv/data/conjugation/*.json: verb conjugation paradigms
Open a PR to add more or correct an entry.
The following features are intentionally deferred so v1 ships small. The v1 API is shaped to leave room for them:
- Localized language names: BeamlabLanguages.name("fr", in: "es") → "francés"
- Plural rules (CLDR categories: :zero, :one, :two, :few, :many, :other)
- Articles (definite/indefinite, by gender)
- Case marking (Slavic, Finnic, etc.)
- Noun classes (Bantu)
- Scripts / writing systems per language
- IPA inventory
- Honorific levels (Japanese / Korean)
- Not a CLDR wrapper. No locale formatting (numbers, dates, currencies). That belongs elsewhere.
- Not a translation API. Knows what languages are; doesn't translate text.
- No GenServer / Agent / ETS. All data is compile-time.
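The "all data is compile-time" point can be illustrated with a common Elixir pattern: generating one function clause per data entry while the module compiles. The sketch below is only an illustration of that pattern under stated assumptions. The module name CompileTimeSketch and the inline @languages list are hypothetical, and this README does not show how the library actually loads its files under priv/data.

```elixir
defmodule CompileTimeSketch do
  # Hypothetical inline data standing in for curated data files.
  @languages [
    %{code: "fr", name: "French", direction: :ltr},
    %{code: "ar", name: "Arabic", direction: :rtl},
    %{code: "ja", name: "Japanese", direction: :ltr}
  ]

  # One function clause per entry is generated during compilation,
  # so a runtime lookup is plain pattern matching on the argument.
  # No process, Agent, or ETS table is involved.
  for %{code: code, direction: direction} <- @languages do
    def direction(unquote(code)), do: unquote(direction)
  end

  # Fallback clause for codes that are not in the data.
  def direction(_unknown), do: nil
end
```

Here CompileTimeSketch.direction("ar") returns :rtl and an unknown code returns nil. The trade-off of this approach is that changing the data requires recompiling the module, which suits slow-changing reference data.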
- Fork it
- Create a feature branch (git checkout -b my-new-feature)
- Edit priv/data/languages.json, priv/data/levels.json, and/or code
- Run mix test and mix format
- Open a PR
MIT — see LICENSE.md.