Keyword Density

Fetch a URL, strip out scripts/styles/markup, and return the page's content shape: total word count, sentence count, average sentence length, Flesch Reading Ease, and the top keywords with raw frequency plus density percentage.

Useful for content audits ("are we accidentally keyword-stuffing?"), readability checks on landing pages and blog posts, and quick comparisons between drafts. Pair with /v1/seo/audit for tag-level signals or /v1/seo/compare for cross-page word-count comparisons.

Endpoint

GET /v1/seo/keyword-density

Base URL: https://seo.toolkitapi.io

Query Parameters

Field	Type	Required	Description
`url`	string	Yes	Absolute URL to analyse. Must be `http://` or `https://`.
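Because `url` must be an absolute `http://` or `https://` URL, a client can reject obviously bad input before spending a request. The following is a minimal sketch using only the Python standard library; the helper name `is_valid_target` is hypothetical, and the server's own validation may be stricter than this check.

```python
from urllib.parse import urlparse


def is_valid_target(url: str) -> bool:
    """Return True if url looks like an absolute http(s) URL.

    Client-side pre-check only; the endpoint performs its own validation.
    """
    parsed = urlparse(url)
    # An absolute URL needs both an accepted scheme and a host part.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(is_valid_target("https://example.com/blog/post"))
print(is_valid_target("/blog/post"))
```

The first call passes the check; the second is a relative path and fails it.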

Response Fields

Field	Type	Description
`url`	string	The URL that was analysed (echoed from the request).
`word_count`	integer	Total visible words in the body text.
`sentence_count`	integer	Sentence count (split on `.`/`!`/`?` boundaries).
`avg_sentence_length`	number	`word_count / sentence_count`, rounded to 2 dp. `0.0` if no sentences.
`flesch_reading_ease`	number	Standard Flesch Reading Ease score. Higher = easier (≥ 60 ≈ plain English).
`top_keywords`	object[]	Most frequent words, sorted by `count` desc. Each entry: `{ word, count, density_percent }`.

density_percent is count / word_count * 100, so the top-N entries together give you a quick read on whether one term is dominating the page.

The endpoint operates on the page's visible body text. <script> and <style> blocks are stripped before counting. Common stop words are filtered from the keyword list so the top entries are content words, not "the" / "and" / "of".
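To make the density calculation concrete, here is a rough local approximation of how `top_keywords` could be derived from already-extracted visible text. This is a sketch under stated assumptions, not the endpoint's implementation: the tokenisation regex, the `STOP_WORDS` set, and the function name `keyword_density` are all illustrative, and the service's real stop-word list is not documented here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the endpoint's actual list may differ.
STOP_WORDS = {"the", "and", "of", "a", "an", "to", "in", "is", "it", "for", "on", "with"}


def keyword_density(text, top_n=5):
    """Approximate top_keywords for a block of visible text."""
    # Assumed tokenisation: lowercase runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    word_count = len(words)
    if word_count == 0:
        return []
    # Stop words are dropped from the keyword list but still count
    # towards word_count, matching count / word_count * 100.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [
        {"word": w, "count": c, "density_percent": round(c / word_count * 100, 2)}
        for w, c in counts.most_common(top_n)
    ]


sample = "Deploy the service. Deploy the cluster and the service config."
for entry in keyword_density(sample):
    print(entry)
```

In the sample, "deploy" appears twice among ten words, so its density comes out at 20%. Real pages will rarely match the endpoint's numbers exactly, since word boundaries and stop words are assumptions here.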

Examples

curl

curl -G "https://seo.toolkitapi.io/v1/seo/keyword-density" \
  -H "x-api-key: $TOOLKIT_API_KEY" \
  --data-urlencode "url=https://example.com/blog/post"

Python

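import os

import requests

# Read the key from the environment (same variable as the curl example)
API_KEY = os.environ["TOOLKIT_API_KEY"]

resp = requests.get(
    "https://seo.toolkitapi.io/v1/seo/keyword-density",
    params={"url": "https://example.com/blog/post"},
    headers={"x-api-key": API_KEY},
    timeout=30,
)
resp.raise_for_status()  # surface 4xx/5xx responses before parsing the body
data = resp.json()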

print(f"{data['word_count']} words, "
      f"{data['sentence_count']} sentences, "
      f"avg {data['avg_sentence_length']} words/sentence")
print(f"Flesch Reading Ease: {data['flesch_reading_ease']}")

# Flag any single keyword above 4% density (rough stuffing threshold)
for kw in data["top_keywords"][:10]:
    flag = "  ⚠️" if kw["density_percent"] > 4 else ""
    print(f"  {kw['word']:<20} {kw['count']:>4}  {kw['density_percent']:.2f}%{flag}")

JavaScript

const url = new URL("https://seo.toolkitapi.io/v1/seo/keyword-density");
url.searchParams.set("url", "https://example.com/blog/post");

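const resp = await fetch(url, {
  headers: { "x-api-key": process.env.TOOLKIT_API_KEY },
});
// Fail early on 4xx/5xx responses instead of parsing an error body
if (!resp.ok) {
  throw new Error(`keyword-density request failed: ${resp.status}`);
}
const data = await resp.json();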

console.log(
  `${data.word_count} words · ${data.sentence_count} sentences · ` +
    `Flesch ${data.flesch_reading_ease}`,
);

const stuffing = data.top_keywords.filter((k) => k.density_percent > 4);
if (stuffing.length) {
  console.warn("Possible keyword stuffing:", stuffing);
}

Example Response

{
  "url": "https://example.com/blog/post",
  "word_count": 842,
  "sentence_count": 47,
  "avg_sentence_length": 17.91,
  "flesch_reading_ease": 62.4,
  "top_keywords": [
    { "word": "deploy",      "count": 24, "density_percent": 2.85 },
    { "word": "kubernetes",  "count": 19, "density_percent": 2.26 },
    { "word": "cluster",     "count": 17, "density_percent": 2.02 },
    { "word": "service",     "count": 14, "density_percent": 1.66 },
    { "word": "config",      "count": 12, "density_percent": 1.43 }
  ]
}
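The derived fields in this response can be recomputed from the raw counts, which is a handy sanity check when consuming the API. The sketch below re-derives `avg_sentence_length` and each `density_percent` using the formulas given under Response Fields; the helper name `check_derived_fields` and the 0.01 tolerance are illustrative choices, not part of the API.

```python
response = {
    "word_count": 842,
    "sentence_count": 47,
    "avg_sentence_length": 17.91,
    "top_keywords": [
        {"word": "deploy", "count": 24, "density_percent": 2.85},
        {"word": "kubernetes", "count": 19, "density_percent": 2.26},
        {"word": "cluster", "count": 17, "density_percent": 2.02},
        {"word": "service", "count": 14, "density_percent": 1.66},
        {"word": "config", "count": 12, "density_percent": 1.43},
    ],
}


def check_derived_fields(data, tol=0.01):
    """Return the names of derived fields that disagree with the raw counts."""
    problems = []
    # avg_sentence_length = word_count / sentence_count, 0.0 if no sentences
    if data["sentence_count"]:
        expected = round(data["word_count"] / data["sentence_count"], 2)
    else:
        expected = 0.0
    if abs(expected - data["avg_sentence_length"]) > tol:
        problems.append("avg_sentence_length")
    # density_percent = count / word_count * 100
    for kw in data["top_keywords"]:
        expected = round(kw["count"] / data["word_count"] * 100, 2)
        if abs(expected - kw["density_percent"]) > tol:
            problems.append(kw["word"])
    return problems


print(check_derived_fields(response))  # prints []
```

For instance, 842 / 47 ≈ 17.91 and 24 / 842 × 100 ≈ 2.85, so the example response is internally consistent.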

Reading the Score

Flesch Reading Ease is nominally a 0–100 scale (the formula can fall outside that range for unusual text); higher is easier:

Score	Description
90–100	Very easy (5th grade)
80–89	Easy
70–79	Fairly easy
60–69	Plain English (8th–9th grade)
50–59	Fairly difficult
30–49	Difficult (college level)
0–29	Very confusing (college graduate)

For a marketing landing page, aim for 60+. For long-form developer content, 30–60 is normal.
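For reference, the standard Flesch Reading Ease formula is 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words). The sketch below applies that formula and maps a score onto the bands in the table above. How the endpoint counts syllables is not documented here, so the syllable total is taken as an input, and returning 0.0 for empty text is an assumption of this sketch; the function names are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease from raw counts."""
    if words == 0 or sentences == 0:
        return 0.0  # assumption: treat empty text as 0.0
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Lower bound of each band, highest first, as in the table above.
BANDS = [
    (90, "Very easy"),
    (80, "Easy"),
    (70, "Fairly easy"),
    (60, "Plain English"),
    (50, "Fairly difficult"),
    (30, "Difficult"),
]


def reading_band(score):
    """Map a score to its description; anything under 30 is 'Very confusing'."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    return "Very confusing"


print(reading_band(62.4))
print(round(flesch_reading_ease(words=100, sentences=10, syllables=150), 3))
```

The score of 62.4 from the example response lands in the "Plain English" band, which meets the 60+ target suggested for landing pages.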

Notes

top_keywords is bounded server-side; expect roughly the top 10–25 content words back, not the full distribution.
Pages with very little visible content (e.g. SPA shells before hydration) will return tiny word_count values and a flesch_reading_ease that isn't meaningful, because the endpoint does not execute JavaScript.
Each call goes through the same SSRF guard as /v1/seo/audit; private/loopback hosts are rejected with 400.
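The caveats above suggest screening a response before trusting its scores. A minimal sketch, assuming you already have the HTTP status code and parsed JSON body in hand: the function name `interpret_result` and the `min_words` cutoff of 100 are illustrative choices, not values defined by the API.

```python
def interpret_result(status_code, data, min_words=100):
    """Classify a keyword-density response before using its scores.

    min_words is an arbitrary illustrative cutoff, not an API value.
    """
    if status_code == 400:
        # Bad request, e.g. a private/loopback host blocked by the SSRF guard
        return "rejected"
    if status_code != 200:
        return "error"
    if data.get("word_count", 0) < min_words:
        # Likely an SPA shell or near-empty page; readability is not meaningful
        return "too_little_content"
    return "ok"


print(interpret_result(200, {"word_count": 842}))
print(interpret_result(200, {"word_count": 12}))
print(interpret_result(400, {}))
```

Treating thin pages as a separate outcome keeps a meaningless flesch_reading_ease from skewing an audit report.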
