How to Extract Meta Tags from Any URL with an API

February 19, 2026 · 8 min read

Every time you paste a link into Slack, Twitter, or iMessage, something happens behind the scenes: the platform fetches the page, reads its meta tags, and renders a preview card. That card -- the title, description, and thumbnail -- is powered by Open Graph (OG) tags and Twitter Card markup.

If you're building a CMS, link aggregator, social scheduling tool, or SEO dashboard, you need to do exactly the same thing. This guide shows you how to extract meta tags from any URL programmatically using an API, and how to build an OG tag checker into your workflow.

What Are Meta Tags (and Why Should You Care)?

Meta tags are HTML elements in a page's <head> that provide metadata about the page. The ones that matter most for link previews and SEO:

Open Graph tags (og:title, og:description, og:image, og:url) -- used by Facebook, LinkedIn, Slack, Discord, and most chat apps
Twitter Card tags (twitter:card, twitter:title, twitter:image) -- used by X/Twitter for card previews
Standard head tags (<title>, meta description, meta robots, and the canonical link) -- used by search engines
Favicon and icons -- displayed in browser tabs and bookmarks

Missing or broken OG tags mean ugly link previews, which tends to hurt click-through rates when your content is shared on social media. A meta tag extractor lets you audit this at scale.
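
To make the four groups above concrete, here is a minimal sketch that sorts the tags in a page's head into those groups using only Python's standard library. The sample markup, the HeadTagCollector class, and the group_head_tags helper are illustrative names rather than part of any API, and the sketch keeps only the first value it sees for each OG or Twitter key.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, used only for illustration.
SAMPLE_HEAD = """
<head>
  <title>Example post</title>
  <meta name="description" content="A short summary.">
  <meta property="og:title" content="Example post">
  <meta property="og:image" content="https://example.com/og.png">
  <meta name="twitter:card" content="summary_large_image">
  <link rel="canonical" href="https://example.com/post">
  <link rel="icon" href="/favicon.ico">
</head>
"""


class HeadTagCollector(HTMLParser):
    """Groups head tags into OG, Twitter, standard, and icon buckets."""

    def __init__(self):
        super().__init__()
        self.groups = {"og": {}, "twitter": {}, "standard": {}, "icons": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # OG tags normally use property=, Twitter tags usually use name=.
            key = a.get("property") or a.get("name") or ""
            content = a.get("content")
            if content is None:
                return
            if key.startswith("og:"):
                self.groups["og"].setdefault(key[3:], content)
            elif key.startswith("twitter:"):
                self.groups["twitter"].setdefault(key[8:], content)
            elif key in ("description", "robots"):
                self.groups["standard"][key] = content
        elif tag == "link":
            rels = (a.get("rel") or "").lower().split()
            href = a.get("href")
            if href and "canonical" in rels:
                self.groups["standard"]["canonical"] = href
            elif href and "icon" in rels:
                self.groups["icons"].append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.groups["standard"]["title"] = data.strip()


def group_head_tags(html):
    parser = HeadTagCollector()
    parser.feed(html)
    return parser.groups
```

Calling group_head_tags(SAMPLE_HEAD) returns one dictionary per group, which is roughly the shape a structured extractor hands back.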

The DIY Approach (And Its Problems)

The naive approach is to fetch the HTML and parse it yourself:

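# Python - basic meta tag extraction
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.com", timeout=10)
resp.raise_for_status()  # stop early on 4xx/5xx responses
soup = BeautifulSoup(resp.text, "html.parser")

# og:title lives in <meta property="og:title" content="...">
og_title = soup.find("meta", property="og:title")
content = og_title.get("content") if og_title else None
print(content or "No og:title found")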

This works for static pages, but falls apart quickly:

JavaScript-rendered pages -- client-rendered SPAs built with React, Vue, or similar frameworks often inject meta tags in the browser. A simple HTTP fetch won't see them.
Redirects and bot detection -- many sites serve different content to bots, block scrapers, or require cookie consent.
Rate limiting -- if you're checking thousands of URLs, you'll get blocked fast.
Edge cases -- malformed HTML, missing charsets, relative URLs for images, duplicate tags.

For anything beyond a quick script, you want an API that handles the rendering and parsing for you.

Using the MetaPeek API

MetaPeek is a meta tag extraction API built on top of GrabShot's browser infrastructure. It renders pages in a real browser (Chromium), waits for JavaScript, and returns structured metadata.

Quick Start with cURL

# Extract all meta tags from a URL
curl "https://metapeek.grabshot.dev/api/extract?url=https://github.com" \
  -H "X-API-Key: YOUR_API_KEY"

Response:

{
  "url": "https://github.com",
  "title": "GitHub: Let's build from here",
  "description": "GitHub is where over 100 million developers shape the future of software...",
  "og": {
    "title": "GitHub: Let's build from here",
    "description": "GitHub is where over 100 million developers...",
    "image": "https://github.githubassets.com/assets/campaign-social.png",
    "url": "https://github.com",
    "type": "website",
    "site_name": "GitHub"
  },
  "twitter": {
    "card": "summary_large_image",
    "site": "@github",
    "title": "GitHub: Let's build from here",
    "image": "https://github.githubassets.com/assets/campaign-social.png"
  },
  "favicon": "https://github.githubassets.com/favicons/favicon.svg",
  "canonical": "https://github.com/"
}
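
If you consume this response from typed code, it can help to pin the shape down once. The sketch below assumes the field names shown in the sample response above; PageMeta and from_response are hypothetical names, and a real response may carry more fields or omit some of these.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class PageMeta:
    """Typed view of the sample response; field names are assumed from it."""

    url: str
    title: Optional[str] = None
    description: Optional[str] = None
    canonical: Optional[str] = None
    favicon: Optional[str] = None
    og: Dict[str, str] = field(default_factory=dict)
    twitter: Dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_response(cls, payload: Dict[str, Any]) -> "PageMeta":
        # Missing or null groups become empty dicts so callers can use .get().
        return cls(
            url=payload["url"],
            title=payload.get("title"),
            description=payload.get("description"),
            canonical=payload.get("canonical"),
            favicon=payload.get("favicon"),
            og=dict(payload.get("og") or {}),
            twitter=dict(payload.get("twitter") or {}),
        )
```

With this in place, PageMeta.from_response(data).og.get("image") reads the OG image without a chain of nested lookups.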

Node.js Example

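// Extract meta tags with Node.js (18+ for built-in fetch; run as an ES module for top-level await)
const API_KEY = process.env.METAPEEK_API_KEY;

async function extractMeta(url) {
  const res = await fetch(
    `https://metapeek.grabshot.dev/api/extract?url=${encodeURIComponent(url)}`,
    { headers: { "X-API-Key": API_KEY } }
  );
  // Surface HTTP errors instead of parsing an error body as metadata
  if (!res.ok) throw new Error(`MetaPeek request failed: ${res.status}`);
  return res.json();
}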

// Check OG tags for a list of URLs
const urls = [
  "https://yoursite.com/blog/post-1",
  "https://yoursite.com/blog/post-2",
  "https://yoursite.com/about",
];

for (const url of urls) {
  const meta = await extractMeta(url);
  const issues = [];

  if (!meta.og?.title) issues.push("Missing og:title");
  if (!meta.og?.image) issues.push("Missing og:image");
  if (!meta.og?.description) issues.push("Missing og:description");
  if (meta.og?.description?.length > 160) issues.push("og:description too long");

  if (issues.length) {
    console.log(`\n${url}:`);
    issues.forEach(i => console.log(`  - ${i}`));
  } else {
    console.log(`${url}: All OG tags present`);
  }
}

Python Example

import requests
import os

API_KEY = os.environ["METAPEEK_API_KEY"]
BASE = "https://metapeek.grabshot.dev/api/extract"

def check_og_tags(url: str) -> dict:
    """Extract and validate OG tags for a URL."""
    resp = requests.get(
        BASE,
        params={"url": url},
        headers={"X-API-Key": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()

    # Treat absent, null, and empty values the same way
    og = data.get("og") or {}
    twitter = data.get("twitter") or {}

    return {
        "url": url,
        "title": og.get("title") or "MISSING",
        "description": og.get("description") or "MISSING",
        "image": og.get("image") or "MISSING",
        "twitter_card": twitter.get("card") or "MISSING",
    }

# Audit your entire sitemap
import xml.etree.ElementTree as ET

sitemap = requests.get("https://yoursite.com/sitemap.xml", timeout=30)
sitemap.raise_for_status()
root = ET.fromstring(sitemap.content)  # bytes let the parser honor the XML encoding
ns = {"s": "http://www.sitemaps.org/schemas/sitemap/0.9"}

for loc in root.findall(".//s:loc", ns):
    result = check_og_tags(loc.text)
    missing = [k for k, v in result.items() if v == "MISSING"]
    if missing:
        print(f"  {loc.text} -- missing: {', '.join(missing)}")

Building an OG Tag Checker

A practical OG tag checker validates more than just "does the tag exist." Here's what to look for:

Check	Rule	Why It Matters
`og:title`	Present, 30-60 chars	Too long gets truncated in previews
`og:description`	Present, 50-160 chars	Same truncation issue
`og:image`	Present, absolute URL, loads successfully	Broken images = no preview card
`og:image` size	At least 1200x630px	Facebook/LinkedIn recommended minimum
`twitter:card`	Present, valid type	Controls X/Twitter preview format
`canonical`	Matches actual URL	Prevents duplicate content issues
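
The rules in the table can be sketched as a small validator. This version assumes metadata shaped like the sample API response shown earlier; validate_meta and its helpers are hypothetical names. It covers only the checks that need no network access, so confirming that og:image loads and measuring its pixel size are left out.

```python
from urllib.parse import urlparse

# The four card types defined for X/Twitter cards.
VALID_TWITTER_CARDS = {"summary", "summary_large_image", "app", "player"}


def check_length(label, value, lo, hi):
    """Return a list with one issue if value is missing or outside lo..hi chars."""
    if not value:
        return [f"{label} is missing"]
    n = len(value)
    if n < lo:
        return [f"{label} is short ({n} chars, want {lo}-{hi})"]
    if n > hi:
        return [f"{label} is long ({n} chars, want {lo}-{hi})"]
    return []


def is_absolute_http_url(value):
    parts = urlparse(value or "")
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def validate_meta(meta, page_url):
    """Apply the table's offline checks and return a list of issue strings."""
    og = meta.get("og") or {}
    twitter = meta.get("twitter") or {}
    issues = []

    issues += check_length("og:title", og.get("title"), 30, 60)
    issues += check_length("og:description", og.get("description"), 50, 160)

    image = og.get("image")
    if not image:
        issues.append("og:image is missing")
    elif not is_absolute_http_url(image):
        issues.append("og:image is not an absolute URL")

    card = twitter.get("card")
    if not card:
        issues.append("twitter:card is missing")
    elif card not in VALID_TWITTER_CARDS:
        issues.append(f"twitter:card has unknown type '{card}'")

    canonical = meta.get("canonical")
    if not canonical:
        issues.append("canonical is missing")
    elif canonical.rstrip("/") != page_url.rstrip("/"):
        # Only a trailing slash is ignored; any other difference is reported.
        issues.append("canonical does not match the page URL")

    return issues
```

An empty list means the page passed every offline check. The length limits come straight from the table and are rules of thumb, so treat them as tunable defaults.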

You can combine MetaPeek for tag extraction with GrabShot's screenshot API to also capture what the page actually looks like -- useful for visual verification that the OG image matches the page content.

Real-World Use Cases

1. CMS Publishing Workflow

Before publishing a blog post, automatically check that all required meta tags are present. Reject drafts that are missing OG images or have descriptions that are too long. This prevents embarrassing link previews when the post gets shared.

2. Social Media Scheduling

If you're building a Buffer or Hootsuite competitor, show users a preview of how their link will appear when shared. Extract the OG data, render the card, and let them fix issues before posting.
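
As a rough sketch of the "render the card" step, the function below turns extracted metadata into the three fields a preview card needs, truncating long text the way a preview would. It assumes metadata shaped like the sample API response; build_preview_card is a hypothetical helper, and the 60 and 160 character limits are rule-of-thumb defaults, since each platform truncates differently.

```python
import textwrap


def build_preview_card(meta, title_limit=60, description_limit=160):
    """Build display fields for a link preview, preferring OG values."""
    og = meta.get("og") or {}
    title = og.get("title") or meta.get("title") or ""
    description = og.get("description") or meta.get("description") or ""
    return {
        # shorten() collapses whitespace and cuts at a word boundary.
        "title": textwrap.shorten(title, width=title_limit, placeholder="..."),
        "description": textwrap.shorten(
            description, width=description_limit, placeholder="..."
        ),
        # None here means the card would render without a thumbnail.
        "image": og.get("image"),
    }
```

Comparing the card's title with the original value tells you whether to warn the user that their text will be cut off.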

3. SEO Auditing

Crawl an entire site's sitemap and audit every page's meta tags in bulk. Flag pages with missing descriptions, duplicate titles, or broken OG images. This is the same kind of audit that tools like Ahrefs and Screaming Frog run -- but you can build a custom version with a simple API call per URL.
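
Duplicate titles only show up when you compare pages against each other, so that check runs after the crawl. A minimal sketch, assuming you have already collected (url, title) pairs; find_duplicate_titles is a hypothetical helper, and it treats titles that differ only in case or spacing as the same.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """pages: iterable of (url, title). Returns {normalized title: [urls]} for repeats."""
    by_title = defaultdict(list)
    for url, title in pages:
        if not title:
            continue  # missing titles are a separate check
        key = " ".join(title.split()).casefold()
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

Each entry in the result lists every URL that shares a title, which is usually enough to decide which page needs rewording.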

4. Link Aggregators and Bookmarking Apps

When a user saves a URL, extract the title, description, and thumbnail automatically. No need to ask users to fill in metadata manually. Products like Pocket, Raindrop, and Notion all do this.

Combining with Screenshots

Meta tags tell you what a page claims to look like. A screenshot shows you what it actually looks like. Combining both gives you the full picture:

# Get meta tags AND a screenshot in parallel
curl "https://metapeek.grabshot.dev/api/extract?url=https://example.com" \
  -H "X-API-Key: YOUR_KEY" &

curl "https://grabshot.dev/api/screenshot?url=https://example.com&width=1200&height=630" \
  -H "X-API-Key: YOUR_KEY" \
  -o screenshot.png &

wait
echo "Both done"
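
The same pair of parallel requests can be sketched in Python with a thread pool. The endpoints, query parameters, and X-API-Key header are taken from the curl commands above. The function names, the injectable get parameter, and the assumption that the screenshot endpoint returns raw image bytes are illustrative.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

METAPEEK_URL = "https://metapeek.grabshot.dev/api/extract"
GRABSHOT_URL = "https://grabshot.dev/api/screenshot"


def http_get(url, api_key, timeout=30):
    """Fetch a URL with the API key header and return the raw body bytes."""
    req = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def build_urls(page_url, width=1200, height=630):
    """Build both request URLs, percent-encoding the target page URL."""
    meta_q = urllib.parse.urlencode({"url": page_url})
    shot_q = urllib.parse.urlencode(
        {"url": page_url, "width": width, "height": height}
    )
    return f"{METAPEEK_URL}?{meta_q}", f"{GRABSHOT_URL}?{shot_q}"


def fetch_meta_and_screenshot(page_url, api_key, get=http_get):
    """Run both requests concurrently; returns (metadata dict, image bytes)."""
    meta_url, shot_url = build_urls(page_url)
    with ThreadPoolExecutor(max_workers=2) as pool:
        meta_future = pool.submit(get, meta_url, api_key)
        shot_future = pool.submit(get, shot_url, api_key)
        meta = json.loads(meta_future.result())
        image = shot_future.result()
    return meta, image
```

Passing a fake get function makes the helper testable without network access, and encoding the page URL avoids problems when it carries its own query string.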

The screenshot endpoint is also useful for generating dynamic OG images: you can screenshot a custom HTML template and use the result as the og:image for each page on your site.

Try MetaPeek Free

Extract meta tags from any URL. 25 free requests per month, no credit card required.

Handling Edge Cases

A few things to watch out for when building a meta tag extractor into your pipeline:

Relative OG image URLs -- some sites use /images/og.png instead of full URLs. Always resolve these against the page's base URL.
Multiple OG images -- the spec allows multiple og:image tags. Most platforms use the first one, but your checker should note all of them.
Fallback chains -- if og:title is missing, platforms fall back to <title>. Your checker should distinguish between "present via OG" and "falling back to HTML."
Encoding issues -- non-UTF-8 pages, HTML entities in tag values, and emoji in titles can all cause display issues. The API normalizes these for you.
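
The first three items above can be sketched with the standard library. Both helpers are hypothetical names. resolve_og_images resolves against the page URL and ignores any <base href> the page might declare, and effective_title assumes metadata shaped like the sample API response.

```python
from urllib.parse import urljoin


def resolve_og_images(page_url, image_values):
    """Resolve every og:image value to an absolute URL, in order, without repeats."""
    seen = set()
    resolved = []
    for value in image_values:
        if not value or not value.strip():
            continue
        absolute = urljoin(page_url, value.strip())
        if absolute not in seen:
            seen.add(absolute)
            resolved.append(absolute)
    return resolved


def effective_title(meta):
    """Return (title, source) so a report can tell OG values from fallbacks."""
    og_title = (meta.get("og") or {}).get("title")
    if og_title:
        return og_title, "og:title"
    if meta.get("title"):
        return meta["title"], "<title>"
    return None, "missing"
```

The first resolved image is the one most platforms would pick, while the full list lets a checker report the rest.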

Wrapping Up

Meta tag extraction is one of those problems that seems simple until you try to do it reliably across the entire web. JavaScript rendering, bot detection, encoding issues, and edge cases in the OG spec all add complexity.

Using an API like MetaPeek lets you skip the infrastructure headaches and focus on what you're actually building -- whether that's an SEO tool, a social scheduler, or a link preview component. Combine it with GrabShot for screenshots and you have the full toolkit for understanding any URL on the web.