Keyword Density

Fetch a URL, strip scripts, styles, and markup, and return a profile of the page's text: total word count, sentence count, average sentence length, Flesch Reading Ease score, and the top keywords with raw frequency and density percentage.

Useful for content audits ("are we accidentally keyword-stuffing?"), readability checks on landing pages and blog posts, and quick comparisons between drafts. Pair with /v1/seo/audit for tag-level signals or /v1/seo/compare for cross-page word-count comparisons.

Endpoint

GET /v1/seo/keyword-density

Base URL: https://seo.toolkitapi.io

Query Parameters

Field  Type    Required  Description
url    string  Yes       Absolute URL to analyse. Must be http:// or https://.
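
A client-side pre-check can catch obviously invalid values before spending a request. The sketch below is only an approximation of the rule stated above (absolute URL, http or https scheme); is_valid_target is a hypothetical helper name, and the server may apply stricter checks.

```python
from urllib.parse import urlsplit


def is_valid_target(url: str) -> bool:
    """Return True if url is absolute and uses the http or https scheme."""
    parts = urlsplit(url)
    # An absolute URL needs both a scheme and a host (netloc).
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_valid_target("https://example.com/blog/post"))  # True
print(is_valid_target("/blog/post"))                     # False
print(is_valid_target("ftp://example.com/file"))         # False
```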

Response Fields

Field                Type      Description
url                  string    The URL that was analysed (echoed from the request).
word_count           integer   Total visible words in the body text.
sentence_count       integer   Number of sentences, split on ., !, and ? boundaries.
avg_sentence_length  number    word_count / sentence_count, rounded to 2 dp. 0.0 if there are no sentences.
flesch_reading_ease  number    Standard Flesch Reading Ease score. Higher = easier (≥ 60 ≈ plain English).
top_keywords         object[]  Most frequent words, sorted by count descending. Each entry: { word, count, density_percent }.

density_percent is count / word_count * 100. For example, a word that appears 24 times on an 842-word page has a density of 2.85%. Scanning the top entries gives a quick read on whether one term is dominating the page.
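
The arithmetic can be reproduced locally to sanity-check a response. This is a minimal sketch: the function names are hypothetical, and both the rounding to 2 dp and the 0.0 result for an empty page are assumptions inferred from the example values, not documented behaviour.

```python
def density_percent(count: int, word_count: int) -> float:
    """count / word_count * 100, rounded to 2 dp (rounding is an assumption)."""
    if word_count == 0:
        return 0.0  # assumed behaviour for an empty page
    return round(count / word_count * 100, 2)


def top_share(counts: list, word_count: int) -> float:
    """Combined density of several keywords, computed from their raw counts."""
    return density_percent(sum(counts), word_count)


print(density_percent(24, 842))                # 2.85
print(top_share([24, 19, 17, 14, 12], 842))    # 10.21
```

Summing raw counts before dividing avoids accumulating rounding error from the individual density_percent values.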

The endpoint operates on the page's visible body text. <script> and <style> blocks are stripped before counting. Common stop words are filtered from the keyword list so the top entries are content words, not "the" / "and" / "of".
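
To make the pipeline concrete, here is a rough local approximation using only the Python standard library. It is not the service's implementation: the stop-word list, the tokenisation regex, the VisibleText class, and the choice to count stop words toward the total are all assumptions made for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Illustrative stop-word list; the service's actual list is not documented here.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is", "it", "for", "on"}


class VisibleText(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def top_keywords(html: str, limit: int = 10):
    """Return (total_words, [(word, count), ...]) for the visible text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z0-9']+", " ".join(parser.chunks).lower())
    # Assumption: stop words count toward the total but not the keyword list.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return len(words), counts.most_common(limit)


page = (
    "<html><head><style>p{color:red}</style><script>var the = 1;</script></head>"
    "<body><p>Deploy the cluster and deploy the service.</p></body></html>"
)
print(top_keywords(page, limit=2))
```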

Examples

curl

curl -G "https://seo.toolkitapi.io/v1/seo/keyword-density" \
  -H "x-api-key: $TOOLKIT_API_KEY" \
  --data-urlencode "url=https://example.com/blog/post"

Python

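import os

import requests

# Read the key from the environment, matching the curl example above
API_KEY = os.environ["TOOLKIT_API_KEY"]

resp = requests.get(
    "https://seo.toolkitapi.io/v1/seo/keyword-density",
    params={"url": "https://example.com/blog/post"},
    headers={"x-api-key": API_KEY},
    timeout=30,
)
resp.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
data = resp.json()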

print(f"{data['word_count']} words, "
      f"{data['sentence_count']} sentences, "
      f"avg {data['avg_sentence_length']} words/sentence")
print(f"Flesch Reading Ease: {data['flesch_reading_ease']}")

# Flag any single keyword above 4% density (rough stuffing threshold)
for kw in data["top_keywords"][:10]:
    flag = "  ⚠️" if kw["density_percent"] > 4 else ""
    print(f"  {kw['word']:<20} {kw['count']:>4}  {kw['density_percent']:.2f}%{flag}")

JavaScript

const url = new URL("https://seo.toolkitapi.io/v1/seo/keyword-density");
url.searchParams.set("url", "https://example.com/blog/post");

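const resp = await fetch(url, {
  headers: { "x-api-key": process.env.TOOLKIT_API_KEY },
});
// fetch does not reject on 4xx/5xx, so check the status explicitly
if (!resp.ok) {
  throw new Error(`Request failed with HTTP ${resp.status}`);
}
const data = await resp.json();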

console.log(
  `${data.word_count} words · ${data.sentence_count} sentences · ` +
    `Flesch ${data.flesch_reading_ease}`,
);

const stuffing = data.top_keywords.filter((k) => k.density_percent > 4);
if (stuffing.length) {
  console.warn("Possible keyword stuffing:", stuffing);
}

Example Response

{
  "url": "https://example.com/blog/post",
  "word_count": 842,
  "sentence_count": 47,
  "avg_sentence_length": 17.91,
  "flesch_reading_ease": 62.4,
  "top_keywords": [
    { "word": "deploy",      "count": 24, "density_percent": 2.85 },
    { "word": "kubernetes",  "count": 19, "density_percent": 2.26 },
    { "word": "cluster",     "count": 17, "density_percent": 2.02 },
    { "word": "service",     "count": 14, "density_percent": 1.66 },
    { "word": "config",      "count": 12, "density_percent": 1.43 }
  ]
}
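
If the response is consumed in typed Python code, one option is to map it onto small dataclasses. The class and method names below (Keyword, DensityReport, from_payload) are hypothetical and not part of any official client; the sketch reads only the fields documented above, so unexpected extra keys are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    word: str
    count: int
    density_percent: float


@dataclass(frozen=True)
class DensityReport:
    url: str
    word_count: int
    sentence_count: int
    avg_sentence_length: float
    flesch_reading_ease: float
    top_keywords: tuple

    @classmethod
    def from_payload(cls, payload: dict) -> "DensityReport":
        # Read only the documented fields so extra keys do not break parsing.
        return cls(
            url=payload["url"],
            word_count=payload["word_count"],
            sentence_count=payload["sentence_count"],
            avg_sentence_length=payload["avg_sentence_length"],
            flesch_reading_ease=payload["flesch_reading_ease"],
            top_keywords=tuple(
                Keyword(kw["word"], kw["count"], kw["density_percent"])
                for kw in payload["top_keywords"]
            ),
        )


sample = {
    "url": "https://example.com/blog/post",
    "word_count": 842,
    "sentence_count": 47,
    "avg_sentence_length": 17.91,
    "flesch_reading_ease": 62.4,
    "top_keywords": [
        {"word": "deploy", "count": 24, "density_percent": 2.85},
        {"word": "kubernetes", "count": 19, "density_percent": 2.26},
    ],
}
report = DensityReport.from_payload(sample)
print(report.top_keywords[0].word, report.word_count)  # deploy 842
```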

Reading the Score

Flesch Reading Ease is nominally a 0–100 scale, where higher is easier (very simple or very dense text can score outside that range):

Score    Description
90–100   Very easy (5th grade)
80–89    Easy
70–79    Fairly easy
60–69    Plain English (8th–9th grade)
50–59    Fairly difficult
30–49    Difficult (college level)
0–29     Very confusing (college graduate)

For a marketing landing page, aim for 60+. For long-form developer content, 30–60 is normal.
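
For reference, the standard Flesch formula and the band lookup from the table can be sketched as follows. The formula itself is the published one; the function names, the 0.0 fallback for empty input, and the use of lower bounds to classify fractional scores such as 89.5 are assumptions. The endpoint does not return a syllable count, so flesch_reading_ease here is only useful with counts you compute yourself.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    if words == 0 or sentences == 0:
        return 0.0  # assumed fallback; the formula is undefined here
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Lower bound of each band, highest first, taken from the table above.
BANDS = [
    (90, "Very easy"),
    (80, "Easy"),
    (70, "Fairly easy"),
    (60, "Plain English"),
    (50, "Fairly difficult"),
    (30, "Difficult"),
]


def band(score: float) -> str:
    """Map a score to its description; anything below 30 is 'Very confusing'."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    return "Very confusing"


print(band(62.4))  # Plain English
print(band(41.0))  # Difficult
```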

Notes

  • top_keywords is bounded server-side; expect roughly the top 10–25 content words back, not the full distribution.
  • Pages with very little visible content (e.g. SPA shells before hydration) will return tiny word_count values and a flesch_reading_ease that isn't meaningful, because the endpoint does not execute JavaScript.
  • Each call goes through the same SSRF guard as /v1/seo/audit; private/loopback hosts are rejected with HTTP 400.
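
A caller might fold these notes into a small triage step before trusting the numbers. The sketch below is one possible approach, not documented client behaviour: interpret is a hypothetical helper, and MIN_WORDS is an arbitrary threshold chosen for illustration.

```python
MIN_WORDS = 100  # hypothetical cut-off below which readability is not trusted


def interpret(status_code: int, payload: dict) -> str:
    """Classify a keyword-density response as rejected, error, thin, or ok."""
    if status_code == 400:
        # Bad request, e.g. a private/loopback host blocked by the SSRF guard.
        return "rejected"
    if status_code != 200:
        return "error"
    if payload.get("word_count", 0) < MIN_WORDS:
        # Likely an SPA shell or near-empty page; scores are not meaningful.
        return "thin"
    return "ok"


print(interpret(400, {}))                   # rejected
print(interpret(200, {"word_count": 12}))   # thin
print(interpret(200, {"word_count": 842}))  # ok
```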