Keyword Density
Fetch a URL, strip out scripts/styles/markup, and return the page's content shape: total word count, sentence count, average sentence length, Flesch Reading Ease, and the top keywords with raw frequency plus density percentage.
Useful for content audits ("are we accidentally keyword-stuffing?"), readability checks on landing pages and blog posts, and quick comparisons between drafts. Pair with /v1/seo/audit for tag-level signals or /v1/seo/compare for cross-page word-count comparisons.
Endpoint
GET /v1/seo/keyword-density
Base URL: https://seo.toolkitapi.io
Query Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | Absolute URL to analyse. Must be `http://` or `https://`. |
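Because the target URL travels as a query-string value, it needs to be percent-encoded, especially when it carries its own query string. The sketch below shows one way to build the request URL by hand with the standard library; `build_request_url` is an illustrative helper name, not part of the API, and the scheme check only mirrors the rule stated in the table above.

```python
from urllib.parse import urlencode, urlsplit

BASE_URL = "https://seo.toolkitapi.io"
ENDPOINT = "/v1/seo/keyword-density"


def build_request_url(target: str) -> str:
    """Return the full request URL for `target` (hypothetical helper)."""
    parts = urlsplit(target)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {target!r}")
    # urlencode percent-encodes ':', '/', '?', '&' and '=' inside the value,
    # so the target's own query string cannot leak into the API's query string.
    return f"{BASE_URL}{ENDPOINT}?{urlencode({'url': target})}"


print(build_request_url("https://example.com/blog/post?ref=a&lang=en"))
```

HTTP clients that accept a parameter mapping (such as `requests` with `params=`) perform the same encoding for you, so this is only needed when assembling the URL manually.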
Response Fields
| Field | Type | Description |
|---|---|---|
| `url` | string | The URL that was analysed (echoed from the request). |
| `word_count` | integer | Total visible words in the body text. |
| `sentence_count` | integer | Sentence count (split on `.`/`!`/`?` boundaries). |
| `avg_sentence_length` | number | `word_count / sentence_count`, rounded to 2 dp. `0.0` if no sentences. |
| `flesch_reading_ease` | number | Standard Flesch Reading Ease score. Higher = easier (≥ 60 ≈ plain English). |
| `top_keywords` | object[] | Most frequent words, sorted by count desc. Each entry: `{ word, count, density_percent }`. |
`density_percent` is `count / word_count * 100`, so the top-N entries together give you a quick read on whether one term is dominating the page.
The endpoint operates on the page's visible body text. `<script>` and `<style>` blocks are stripped before counting. Common stop words are filtered from the keyword list so the top entries are content words, not "the" / "and" / "of".
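To make the arithmetic concrete, here is a rough local approximation of the calculation described above. It is a sketch under stated assumptions, not the service's implementation: the tokenizer regex, the tiny `STOP_WORDS` set, and the `keyword_density` function name are all illustrative, and the real stop-word list is not documented here.

```python
import re
from collections import Counter

# Illustrative stop-word list (assumption); the service's list is not published here.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is", "on"}


def keyword_density(text: str, top_n: int = 5) -> dict:
    """Approximate the documented response fields for a plain-text string."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Sentences are split on ./!/? boundaries, as the field table describes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_count = len(words)
    # Stop words are excluded from the keyword list but still count as words.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    top_keywords = [
        {"word": w, "count": c, "density_percent": round(c / word_count * 100, 2)}
        for w, c in counts.most_common(top_n)
    ]
    avg = round(word_count / len(sentences), 2) if sentences else 0.0
    return {
        "word_count": word_count,
        "sentence_count": len(sentences),
        "avg_sentence_length": avg,
        "top_keywords": top_keywords,
    }


sample = "Deploy the cluster. Deploy the service on the cluster! Is the cluster ready?"
print(keyword_density(sample, top_n=2))
```

For the sample string this yields 13 words in 3 sentences, with "cluster" at 3 of 13 words (23.08%) and "deploy" at 2 of 13 (15.38%).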
Examples
curl
curl -G "https://seo.toolkitapi.io/v1/seo/keyword-density" \
  -H "x-api-key: $TOOLKIT_API_KEY" \
  --data-urlencode "url=https://example.com/blog/post"
Python
import os

import requests

# Assumes the key is exported as TOOLKIT_API_KEY, as in the curl example.
API_KEY = os.environ["TOOLKIT_API_KEY"]

resp = requests.get(
    "https://seo.toolkitapi.io/v1/seo/keyword-density",
    params={"url": "https://example.com/blog/post"},
    headers={"x-api-key": API_KEY},
    timeout=30,
)
resp.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body
data = resp.json()

print(f"{data['word_count']} words, "
      f"{data['sentence_count']} sentences, "
      f"avg {data['avg_sentence_length']} words/sentence")
print(f"Flesch Reading Ease: {data['flesch_reading_ease']}")

# Flag any single keyword above 4% density (rough stuffing threshold)
for kw in data["top_keywords"][:10]:
    flag = " ⚠️" if kw["density_percent"] > 4 else ""
    print(f" {kw['word']:<20} {kw['count']:>4} {kw['density_percent']:.2f}%{flag}")
JavaScript
const url = new URL("https://seo.toolkitapi.io/v1/seo/keyword-density");
url.searchParams.set("url", "https://example.com/blog/post");

const resp = await fetch(url, {
  headers: { "x-api-key": process.env.TOOLKIT_API_KEY },
});
// Stop on 4xx/5xx instead of parsing an error body as a result.
if (!resp.ok) {
  throw new Error(`keyword-density request failed: HTTP ${resp.status}`);
}
const data = await resp.json();

console.log(
  `${data.word_count} words · ${data.sentence_count} sentences · ` +
    `Flesch ${data.flesch_reading_ease}`,
);

// Flag any keyword above 4% density (rough stuffing threshold)
const stuffing = data.top_keywords.filter((k) => k.density_percent > 4);
if (stuffing.length) {
  console.warn("Possible keyword stuffing:", stuffing);
}
Example Response
{
  "url": "https://example.com/blog/post",
  "word_count": 842,
  "sentence_count": 47,
  "avg_sentence_length": 17.91,
  "flesch_reading_ease": 62.4,
  "top_keywords": [
    { "word": "deploy", "count": 24, "density_percent": 2.85 },
    { "word": "kubernetes", "count": 19, "density_percent": 2.26 },
    { "word": "cluster", "count": 17, "density_percent": 2.02 },
    { "word": "service", "count": 14, "density_percent": 1.66 },
    { "word": "config", "count": 12, "density_percent": 1.43 }
  ]
}
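If you prefer typed access over raw dictionaries, the response can be loaded into small dataclasses. This is one possible client-side shape, not an official SDK: the `Keyword` and `KeywordDensity` class names are hypothetical, and the fields are taken from the response table. The final loop re-derives each `density_percent` from `count / word_count * 100` as a sanity check on the example payload.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Keyword:
    word: str
    count: int
    density_percent: float


@dataclass(frozen=True)
class KeywordDensity:
    url: str
    word_count: int
    sentence_count: int
    avg_sentence_length: float
    flesch_reading_ease: float
    top_keywords: Tuple[Keyword, ...]

    @classmethod
    def from_dict(cls, payload: dict) -> "KeywordDensity":
        # Fields are read by name, so unknown extra keys are simply ignored.
        return cls(
            url=payload["url"],
            word_count=payload["word_count"],
            sentence_count=payload["sentence_count"],
            avg_sentence_length=payload["avg_sentence_length"],
            flesch_reading_ease=payload["flesch_reading_ease"],
            top_keywords=tuple(
                Keyword(kw["word"], kw["count"], kw["density_percent"])
                for kw in payload["top_keywords"]
            ),
        )


EXAMPLE = json.loads("""
{
  "url": "https://example.com/blog/post",
  "word_count": 842,
  "sentence_count": 47,
  "avg_sentence_length": 17.91,
  "flesch_reading_ease": 62.4,
  "top_keywords": [
    {"word": "deploy", "count": 24, "density_percent": 2.85},
    {"word": "kubernetes", "count": 19, "density_percent": 2.26},
    {"word": "cluster", "count": 17, "density_percent": 2.02},
    {"word": "service", "count": 14, "density_percent": 1.66},
    {"word": "config", "count": 12, "density_percent": 1.43}
  ]
}
""")

result = KeywordDensity.from_dict(EXAMPLE)
for kw in result.top_keywords:
    derived = kw.count / result.word_count * 100
    print(f"{kw.word:<12} reported {kw.density_percent:.2f}%  derived {derived:.2f}%")
```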
Reading the Score
Flesch Reading Ease is nominally a 0–100 scale (unusual text can score outside that range); higher is easier:
| Score | Description |
|---|---|
| 90–100 | Very easy (5th grade) |
| 80–89 | Easy |
| 70–79 | Fairly easy |
| 60–69 | Plain English (8th–9th grade) |
| 50–59 | Fairly difficult |
| 30–49 | Difficult (college level) |
| 0–29 | Very confusing (college graduate) |
For a marketing landing page, aim for 60+. For long-form developer content, 30–60 is normal.
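For reference, the standard Flesch Reading Ease formula is 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words). The sketch below applies it to pre-computed counts and maps a score onto the bands in the table. How the endpoint counts syllables is not documented here, so the syllable count is taken as an input; the function names and the `0.0` fallback for empty input are assumptions made for illustration.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch formula applied to pre-computed counts."""
    if words == 0 or sentences == 0:
        return 0.0  # assumption: treat empty input as "no score"
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Lower bound of each band from the table above, highest first.
BANDS = [
    (90, "Very easy (5th grade)"),
    (80, "Easy"),
    (70, "Fairly easy"),
    (60, "Plain English (8th–9th grade)"),
    (50, "Fairly difficult"),
    (30, "Difficult (college level)"),
]


def describe_score(score: float) -> str:
    """Return the band label for a score; anything below 30 is the last band."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    return "Very confusing (college graduate)"


score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(f"{score:.3f} -> {describe_score(score)}")
```

With 100 words, 5 sentences and 150 syllables, the score is 206.835 − 20.3 − 126.9 = 59.635, which falls just under the plain-English threshold.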
Notes
- `top_keywords` is bounded server-side; expect roughly the top 10–25 content words back, not the full distribution.
- Pages with very little visible content (e.g. SPA shells before hydration) will return tiny `word_count` values and a `flesch_reading_ease` that isn't meaningful, because the endpoint does not execute JavaScript.
- Each call goes through the same SSRF guard as `/v1/seo/audit`; private/loopback hosts are rejected with `400`.
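When auditing URLs in bulk, a cheap client-side pre-check can skip targets that would be rejected with `400` anyway. The sketch below is only an approximation: the server's actual SSRF rules are not documented here, the `looks_publicly_routable` name is hypothetical, and DNS names are not resolved, so a hostname that points at a private address would still pass this check and be rejected server-side.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_publicly_routable(url: str) -> bool:
    """Best-effort local filter; the server-side guard remains authoritative."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname  # already lower-cased, IPv6 brackets removed
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; not resolved in this sketch
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)


for candidate in (
    "https://example.com/blog/post",
    "http://127.0.0.1:8080/",
    "http://10.0.0.5/admin",
    "http://[::1]/",
    "ftp://example.com/file",
):
    print(f"{candidate:<32} {looks_publicly_routable(candidate)}")
```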