Language and Transliteration
Two endpoints for language identification and script transliteration.
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /v1/text/language | Detect language and return confidence-ranked candidates |
| POST | /v1/text/transliterate | Convert Unicode text into ASCII approximation |
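
For callers who are not using the SDK, the endpoints listed above can in principle be reached over plain HTTP. The sketch below only builds a request object and does not send it. The base URL `https://api.example.com`, the JSON request body, and the `Authorization: Bearer` header are all assumptions made for illustration; this page does not specify the host, body encoding, or authentication scheme.

```python
import json
import urllib.request

# Assumption: placeholder host; the real base URL is not given on this page.
BASE_URL = "https://api.example.com"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one of the text endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=body,
        method="POST",
        headers={
            # Assumption: the service accepts a JSON body.
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the actual header scheme may differ.
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_request(
    "/v1/text/language",
    {"text": "Bonjour tout le monde", "top_n": 3},
    api_key="tk_...",
)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it, once the placeholder host and header are replaced with the real values.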
Python SDK Examples
Language detection
    from toolkitapi import TextAnalysis

    samples = [
        "This platform provides deterministic text analysis APIs.",
        "Bonjour tout le monde, comment allez-vous aujourd'hui?",
        "Esto es una prueba de deteccion de idioma.",
    ]

    # Detect the language of each sample and request the top three candidates.
    with TextAnalysis(api_key="tk_...") as ta:
        for text in samples:
            result = ta.language(text=text, top_n=3)
            print(result["detected"], result["confidence"])
            print(result["candidates"])
Transliteration
    from toolkitapi import TextAnalysis

    # Polish, Russian, and Japanese text written with Unicode escapes:
    # "Zazółć gęślą jaźń - Привет мир - こんにちは世界"
    text = "Zaz\u00f3\u0142\u0107 g\u0119\u015bl\u0105 ja\u017a\u0144 - \u041f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440 - \u3053\u3093\u306b\u3061\u306f\u4e16\u754c"

    with TextAnalysis(api_key="tk_...") as ta:
        result = ta.transliterate(text=text)
        print(result["transliterated"])
        print(result["non_ascii_characters"])
Request Parameters
POST /v1/text/language

| Parameter | Type | Description |
|---|---|---|
| text | string | Input text, max 1048576 characters |
| top_n | integer | Number of candidates to return, 1 to 20 |

POST /v1/text/transliterate

| Parameter | Type | Description |
|---|---|---|
| text | string | Input text, max 1048576 characters |
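
The documented limits can be checked on the client before a request is made, which avoids a round trip for input that would be rejected anyway. The following is a minimal sketch: `language_payload` is a hypothetical helper, not part of the SDK, its default of `top_n=3` simply mirrors the SDK example, and it assumes that "characters" means Unicode code points (what Python's `len` counts), which the service may define differently.

```python
MAX_TEXT_CHARS = 1_048_576  # documented limit for `text`
MIN_TOP_N, MAX_TOP_N = 1, 20  # documented range for `top_n`


def language_payload(text: str, top_n: int = 3) -> dict:
    """Validate arguments locally and return the request payload as a dict."""
    # Assumption: the limit is measured in code points, as len() does.
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; limit is {MAX_TEXT_CHARS}"
        )
    if not isinstance(top_n, int) or not MIN_TOP_N <= top_n <= MAX_TOP_N:
        raise ValueError(
            f"top_n must be an integer from {MIN_TOP_N} to {MAX_TOP_N}, got {top_n!r}"
        )
    return {"text": text, "top_n": top_n}


print(language_payload("Bonjour tout le monde", top_n=5))
```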
Response Fields
Language detection

| Field | Type | Description |
|---|---|---|
| detected | string | Top detected language code |
| name | string | Top language name |
| confidence | number | Confidence for top candidate |
| candidates | array | Ranked candidate objects with language, name, confidence |
| error | string | Present when text is too short for reliable detection |

Transliteration

| Field | Type | Description |
|---|---|---|
| transliterated | string | ASCII approximated output |
| original_length | integer | Length of input text |
| result_length | integer | Length of output text |
| non_ascii_characters | integer | Count of non-ASCII input characters |
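
Because `error` appears only when the text is too short, callers may want to check for it before trusting `detected`. The sketch below shows one possible way to do that. `pick_language` and its `min_confidence` threshold are hypothetical, the sample dictionaries are hand-written to match the fields above rather than real API output, and both the `[0, 1]` confidence scale and the two-letter language codes are assumptions.

```python
from typing import Optional


def pick_language(result: dict, min_confidence: float = 0.5) -> Optional[str]:
    """Return the top language code, or None if the result should not be trusted."""
    # `error` is documented as present when the text is too short.
    if result.get("error"):
        return None
    # Assumption: confidence lies in [0, 1]; the threshold is illustrative only.
    if result.get("confidence", 0.0) < min_confidence:
        return None
    return result.get("detected")


# Hand-written sample responses shaped like the documented fields.
ok = {
    "detected": "fr",
    "name": "French",
    "confidence": 0.97,
    "candidates": [
        {"language": "fr", "name": "French", "confidence": 0.97},
        {"language": "es", "name": "Spanish", "confidence": 0.02},
    ],
}
too_short = {"error": "text too short"}

print(pick_language(ok))         # fr
print(pick_language(too_short))  # None
```

Returning `None` rather than raising keeps batch loops simple; a caller that needs the reason can inspect `result["error"]` directly.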
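
The transliteration fields can be cross-checked against the input on the client side. The sketch below is one way to do so under a stated assumption: lengths and counts are taken to be in Unicode code points, which is what Python's `len` and `ord` operate on. If the service counts bytes or grapheme clusters instead, a mismatch here reflects that convention rather than an error. `count_non_ascii` and `check_transliteration` are hypothetical helpers, and the sample response is hand-written.

```python
def count_non_ascii(text: str) -> int:
    """Count code points outside the ASCII range (above U+007F)."""
    return sum(1 for ch in text if ord(ch) > 0x7F)


def check_transliteration(original: str, result: dict) -> list:
    """Return a list of inconsistencies between the input and a response dict."""
    problems = []
    if not result["transliterated"].isascii():
        problems.append("output contains non-ASCII characters")
    if result["original_length"] != len(original):
        problems.append("original_length differs from len(original)")
    if result["result_length"] != len(result["transliterated"]):
        problems.append("result_length differs from len(transliterated)")
    if result["non_ascii_characters"] != count_non_ascii(original):
        problems.append("non_ascii_characters differs from local count")
    return problems


# Hand-written sample shaped like the documented fields (not real API output).
original = "Привет мир"
sample = {
    "transliterated": "Privet mir",
    "original_length": 10,
    "result_length": 10,
    "non_ascii_characters": 9,
}
print(check_transliteration(original, sample))  # []
```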
Tip
For short snippets such as titles or labels, language detection may be unstable. Use longer surrounding context when available to improve confidence.
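
One way to apply this tip is to append surrounding text to a short snippet before sending it for detection. The helper below is a hypothetical sketch; the 40-character cutoff is an arbitrary illustrative value, not a threshold documented by the service.

```python
def detection_input(snippet: str, context: str = "", min_chars: int = 40) -> str:
    """Return the snippet alone if it is long enough, else snippet plus context."""
    snippet = snippet.strip()
    context = context.strip()
    # Assumption: min_chars is an illustrative cutoff, not a documented limit.
    if len(snippet) >= min_chars or not context:
        return snippet
    return f"{snippet}\n{context}"


title = "Bonjour"
body = "Bonjour tout le monde, comment allez-vous aujourd'hui?"
print(detection_input(title, body))
```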