Text Analysis Toolkit

Run deterministic text analysis across 9 endpoints. Compute readability scores, produce extractive summaries, compare documents, create structured diffs, mask sensitive data, filter profanity, analyze term frequency, detect language, and transliterate Unicode text.

Base URL

https://textanalysis.toolkitapi.io/v1/
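For callers not using the SDK, a request can be assembled with the Python standard library. The sketch below only builds the request and does not send it. It rests on several assumptions: that the endpoint paths listed under Endpoints already carry the `/v1` prefix and resolve against the host, that the body is JSON with a `text` field (mirroring the SDK keyword argument), and that authentication can be left out here because this page does not say which header the API expects. `build_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

BASE_URL = "https://textanalysis.toolkitapi.io/v1/"


def build_request(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for an endpoint path."""
    # A path starting with "/" replaces the base path, so "/v1" is not doubled.
    url = urljoin(BASE_URL, path)
    body = json.dumps(payload).encode("utf-8")
    # Authentication is omitted: the header name is not documented on this page.
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_request("/v1/text/readability", {"text": "A short sample paragraph."})
print(req.get_method(), req.full_url)
```

Joining with `urljoin` rather than string concatenation avoids producing a doubled `/v1/v1/` segment when the path is absolute.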

Endpoints

Readability and Summarization

Method  Endpoint              Description
POST    /v1/text/readability  Compute readability metrics and audience interpretation
POST    /v1/text/summarize    Build an extractive summary from the top-ranked sentences
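To give a sense of what a readability metric measures, here is a minimal local sketch of the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). The syllable counter is a rough vowel-group heuristic chosen for brevity, so scores may differ from what the endpoint returns; the function names are illustrative and not part of the SDK.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Compute Flesch Reading Ease with naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


score = flesch_reading_ease("The cat sat on the mat.")
print(score)
```

Higher scores indicate easier text: short sentences built from one-syllable words land at the top of the scale.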

Similarity and Diff

Method  Endpoint             Description
POST    /v1/text/similarity  Compare two strings using Levenshtein, cosine, or Jaccard
POST    /v1/text/diff        Generate unified and structured line-level differences
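The three similarity methods measure different things, which a small local sketch can make concrete. The code below uses textbook definitions: Levenshtein as character-level edit distance, cosine over word-count vectors, and Jaccard over word sets. The tokenization and function names are assumptions for illustration; the endpoint may tokenize, normalize, or scale its results differently.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list:
    """Lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def jaccard(a: str, b: str) -> float:
    """Shared distinct words divided by all distinct words."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cosine(a: str, b: str) -> float:
    """Cosine of the angle between the two word-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


a = "The quick brown fox jumps over the lazy dog."
b = "A quick brown fox jumped over a lazy dog."
print(jaccard(a, b), cosine(a, b), levenshtein(a, b))
```

Jaccard ignores repeated words, cosine weights them, and Levenshtein is sensitive to character order, so the best choice depends on whether the inputs are short strings or longer passages.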

Filtering and Frequency

Method  Endpoint                 Description
POST    /v1/text/pii-mask        Detect and mask PII types such as email, phone, and SSN
POST    /v1/text/profanity       Detect and optionally mask profane terms
POST    /v1/text/word-frequency  Return top word frequencies with percentages
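As an illustration of what "top word frequencies with percentages" means, the sketch below counts words locally and reports each word's share of the total. The output field names (`word`, `count`, `percent`) and the `top_n` parameter are assumptions for this example and may not match the endpoint's response schema.

```python
import re
from collections import Counter


def word_frequency(text: str, top_n: int = 3) -> list:
    """Return the top_n most common words with counts and percentage of all words."""
    # Letters only: digits, underscores, and punctuation are not counted as words.
    words = re.findall(r"[^\W\d_]+", text.lower())
    total = len(words)
    counts = Counter(words)
    return [
        {"word": word, "count": count, "percent": round(100 * count / total, 2)}
        for word, count in counts.most_common(top_n)
    ]


top = word_frequency("the cat and the hat and the bat", top_n=2)
print(top)
```

Percentages are computed against all words in the input, not only the returned top entries, so they need not sum to 100.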

Language Tools

Method  Endpoint                 Description
POST    /v1/text/language        Detect language and return ranked candidates
POST    /v1/text/transliterate   Convert Unicode text to ASCII approximation
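A simple form of ASCII approximation can be sketched with the standard library: decompose each character with Unicode NFKD normalization, then drop whatever is not ASCII. This is only an illustration of the idea, not the endpoint's algorithm. Characters with no decomposition (for example "ß" or CJK text) are dropped by this approach, whereas a dedicated transliteration service would typically map them to a replacement.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Approximate text in ASCII by stripping accents and other combining marks."""
    # NFKD splits "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" discards every non-ASCII code point.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Café Zürich"))
```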

Quick SDK Example

from toolkitapi import TextAnalysis

with TextAnalysis(api_key="tk_...") as ta:
    # Score a short paragraph for readability.
    readability = ta.readability(text="This is a short sample paragraph for scoring.")
    # Compare two sentences using the cosine method.
    similarity = ta.similarity(
        a="The quick brown fox jumps over the lazy dog.",
        b="A quick brown fox jumped over a lazy dog.",
        method="cosine",
    )

# Responses are indexed like dictionaries.
print(readability["scores"]["flesch_reading_ease"])
print(similarity["similarity"])

Python SDK

Install the SDK with pip:

pip install toolkitapi

Then import the client and call an endpoint:

from toolkitapi import TextAnalysis

with TextAnalysis(api_key="tk_...") as ta:
    result = ta.language(text="Bonjour tout le monde", top_n=3)
    print(result["detected"], result["confidence"])

See the endpoint drilldowns for the request and response fields specific to each endpoint.

Tip

Language detection requires at least 10 characters for reliable scoring. For very short inputs, aggregate adjacent text before calling the endpoint.
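One way to aggregate adjacent text is to merge consecutive snippets until each chunk reaches the minimum length. The helper below is a hypothetical sketch, not part of the SDK; it joins snippets with a single space and folds any too-short remainder into the previous chunk.

```python
MIN_CHARS = 10  # threshold taken from the tip above


def aggregate_short(snippets, min_chars=MIN_CHARS):
    """Merge adjacent snippets so each chunk has at least min_chars characters."""
    chunks = []
    buffer = ""
    for snippet in snippets:
        snippet = snippet.strip()
        if not snippet:
            continue  # skip empty or whitespace-only snippets
        buffer = f"{buffer} {snippet}" if buffer else snippet
        if len(buffer) >= min_chars:
            chunks.append(buffer)
            buffer = ""
    if buffer:
        if chunks:
            # Fold a short trailing remainder into the last full chunk.
            chunks[-1] = f"{chunks[-1]} {buffer}"
        else:
            # Nothing to merge with: return the short text as-is.
            chunks.append(buffer)
    return chunks


chunks = aggregate_short(["Bonjour", "tout", "le", "monde"])
print(chunks)
```

Each resulting chunk can then be passed to `ta.language(text=chunk)`. If the whole input is still shorter than the threshold, the helper returns it unchanged and the caller can decide whether to trust the detection result.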