Language and Transliteration

Two endpoints for language identification and script transliteration.

Method	Endpoint	Purpose
`POST`	`/v1/text/detect-language`	Detect language and return confidence-ranked candidates
`POST`	`/v1/text/transliterate`	Convert Unicode text into an ASCII approximation

REST API Examples

Detect language

curl -X POST "https://textanalysis.toolkitapi.io/v1/text/detect-language" \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Bonjour tout le monde, comment allez-vous?"}'

const resp = await fetch("https://textanalysis.toolkitapi.io/v1/text/detect-language", {
  method: "POST",
  headers: { "X-API-Key": "YOUR_KEY", "Content-Type": "application/json" },
  body: JSON.stringify({ text: "Bonjour tout le monde" }),
});
const data = await resp.json();
console.log(`Language: ${data.detected} (${Math.round(data.confidence * 100)}%)`);

Transliterate text

curl -X POST "https://textanalysis.toolkitapi.io/v1/text/transliterate" \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Привет мир"}'

Python SDK Examples

Language detection

from toolkitapi import TextAnalysis

samples = [
    "This platform provides deterministic text analysis APIs.",
    "Bonjour tout le monde, comment allez-vous aujourd'hui?",
    "Esto es una prueba de deteccion de idioma.",
]

with TextAnalysis(api_key="tk_...") as ta:
    for text in samples:
        result = ta.language(text=text, top_n=3)
        print(result["detected"], result["confidence"])
        print(result["candidates"])

Transliteration

from toolkitapi import TextAnalysis

text = "Zaz\u00f3\u0142\u0107 g\u0119\u015bl\u0105 ja\u017a\u0144 - \u041f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440 - \u3053\u3093\u306b\u3061\u306f\u4e16\u754c"

with TextAnalysis(api_key="tk_...") as ta:
    result = ta.transliterate(text=text)

print(result["transliterated"])
print(result["non_ascii_characters"])

Request Parameters

POST /v1/text/detect-language

Parameter	Type	Description
`text`	string	Input text, max 1048576 characters
`top_n`	integer	Number of candidates to return, 1 to 20
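
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, assuming that the character limit corresponds to Python string length and that `top_n` is optional. The helper name `build_detect_payload` is hypothetical and not part of the SDK.

```python
MAX_TEXT_CHARS = 1_048_576  # documented maximum length of `text`


def build_detect_payload(text, top_n=None):
    """Build a JSON-ready body for /v1/text/detect-language (hypothetical helper)."""
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    payload = {"text": text}
    # Assumption: `top_n` may be omitted, in which case the service picks a default.
    if top_n is not None:
        if not isinstance(top_n, int) or not 1 <= top_n <= 20:
            raise ValueError("top_n must be an integer between 1 and 20")
        payload["top_n"] = top_n
    return payload


print(build_detect_payload("Bonjour tout le monde", top_n=3))
```

Rejecting out-of-range values locally avoids a round trip for requests the table already rules out.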

POST /v1/text/transliterate

Parameter	Type	Description
`text`	string	Input text, max 1048576 characters

Response Fields

Language detection

Field	Type	Description
`detected`	string	Top detected language code
`name`	string	Top language name
`confidence`	number	Confidence for top candidate
`candidates`	array	Ranked candidate objects with language, name, confidence
`error`	string	Present when text is too short for reliable detection
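
One way to consume these fields is to treat a result as unusable when `error` is present or when `confidence` falls below a threshold. The sketch below assumes the response is a plain dict shaped like the table and that `confidence` lies between 0 and 1. The helper name `pick_language`, the 0.8 threshold, and the sample values (including the language codes) are illustrative assumptions, not service output.

```python
def pick_language(result, min_confidence=0.8):
    """Return the detected language code, or None when the result is unusable."""
    # `error` is present when the text is too short for reliable detection.
    if result.get("error"):
        return None
    # A missing confidence is treated as zero so that it fails the threshold.
    if result.get("confidence", 0.0) < min_confidence:
        return None
    return result.get("detected")


# Hand-written sample shaped like the response table; values are illustrative.
sample = {
    "detected": "fr",
    "name": "French",
    "confidence": 0.97,
    "candidates": [
        {"language": "fr", "name": "French", "confidence": 0.97},
        {"language": "it", "name": "Italian", "confidence": 0.02},
    ],
}

print(pick_language(sample))
```

The threshold is a tuning choice: a stricter value discards more borderline results, which may be preferable when a wrong language label is costly.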

Transliteration

Field	Type	Description
`transliterated`	string	ASCII-approximated output
`original_length`	integer	Length of input text
`result_length`	integer	Length of output text
`non_ascii_characters`	integer	Count of non-ASCII input characters
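
The counts can be combined to gauge how much of the input had to be approximated. This is a rough sketch, assuming `original_length` and `non_ascii_characters` are measured in the same units. The helper name `non_ascii_share` is hypothetical, and the sample dict is hand-written to match the table rather than copied from the service.

```python
def non_ascii_share(result):
    """Fraction of input characters that were non-ASCII (0.0 for empty input)."""
    total = result["original_length"]
    return result["non_ascii_characters"] / total if total else 0.0


# Hand-written sample shaped like the response table; values are illustrative.
sample = {
    "transliterated": "Privet mir",
    "original_length": 10,
    "result_length": 10,
    "non_ascii_characters": 9,
}

# The output is documented as an ASCII approximation, so this check should hold.
assert sample["transliterated"].isascii()
print(f"{non_ascii_share(sample):.0%} of the input was non-ASCII")
```

A high share suggests the output is mostly an approximation and may deserve a manual review before it is used in identifiers or URLs.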

Tip

For short snippets such as titles or labels, language detection may be unstable. Use longer surrounding context when available to improve confidence.
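
The tip above can be applied by appending nearby text to a short snippet before it is sent for detection. This is a minimal sketch: the helper name `with_context` and the 40-character cutoff are assumptions chosen for illustration, not documented thresholds.

```python
def with_context(snippet, context="", min_chars=40):
    """Append surrounding context when the snippet alone is short."""
    snippet = snippet.strip()
    context = context.strip()
    # Long snippets, or snippets without any context, are sent unchanged.
    if len(snippet) >= min_chars or not context:
        return snippet
    return f"{snippet}\n{context}"


title = "Bienvenue"
body = "Ceci est la page d'accueil de notre documentation."
print(with_context(title, body))
```

The combined string is what would be passed as `text`. The detected language then describes the snippet together with its context, so this only helps when both are in the same language.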
