Extract Content API

Interactive Tutorial

Welcome to the Tutorial! 👋

Learn how to extract news from Bold.dk using the Extract Content API. This step-by-step tutorial will show you how to build a complete news reader.

Vanilla JavaScript
Web Components
Container Queries

Step 1: Fetch Basic Headlines

Let's start by extracting article headlines from Bold.dk using CSS selectors

example-1.js Vanilla JavaScript
// Step 1: Define API endpoint and extract parameters
const API_URL = 'https://extract-content.deno.dev';
const extractParams = {
  "overskrifter": ".article-headline"
};

// Step 2: Build the request URL
const url = `${API_URL}/?from=https://bold.dk&extract=${encodeURIComponent(JSON.stringify(extractParams))}`;

// Step 3: Fetch data from API
const response = await fetch(url);
const data = await response.json();

// Step 4: Display headlines
data.overskrifter.forEach(headline => {
  console.log(headline);
});
Extract Parameter (JSON)
{
  "overskrifter": ".article-headline"
}

💡 How it works:

The CSS selector .article-headline targets all elements with the class "article-headline". The API returns an array of text content from these elements.
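
As a variation on the template string above, the request URL can also be built with the standard URL API, which percent-encodes both query parameters. This is a minimal sketch: the helper name buildExtractUrl is hypothetical and not part of the Extract Content API, and it assumes the service decodes query strings in the usual way.

```javascript
// Hypothetical helper: builds a request URL for the Extract Content API.
const API_URL = 'https://extract-content.deno.dev';

function buildExtractUrl(from, selectors, apiUrl = API_URL) {
  const url = new URL(apiUrl);
  // searchParams.set() encodes each value, so neither the target URL
  // nor the JSON needs a manual encodeURIComponent() call.
  url.searchParams.set('from', from);
  url.searchParams.set('extract', JSON.stringify(selectors));
  return url.toString();
}

const requestUrl = buildExtractUrl('https://bold.dk', {
  overskrifter: '.article-headline'
});
console.log(requestUrl);
```

Keeping the URL construction in one function means the selectors can change from step to step without repeating the encoding logic.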

Step 2: Add Categories & Timestamps

Extract multiple pieces of data by using multiple CSS selectors

example-2.js Enhanced Extraction
// Extract multiple data points with different selectors
const extractParams = {
  "overskrifter": ".article-headline",
  "kategorier": ".ArticleListItem__tag",
  "tidspunkter": ".ArticleListItem__timestamp"
};

const url = `https://extract-content.deno.dev/?from=https://bold.dk&extract=${encodeURIComponent(JSON.stringify(extractParams))}`;

const response = await fetch(url);
const data = await response.json();

// Combine arrays into structured article objects
const articles = data.overskrifter.map((title, i) => ({
  title,
  category: data.kategorier[i] || 'N/A',
  time: data.tidspunkter[i] || 'N/A'
}));
Extract Parameter (JSON)
{
  "overskrifter": ".article-headline",
  "kategorier": ".ArticleListItem__tag",
  "tidspunkter": ".ArticleListItem__timestamp"
}

💡 Multiple Selectors:

  • .ArticleListItem__tag extracts category tags
  • .ArticleListItem__timestamp extracts post times
  • Each selector returns its own array, and the arrays share the same order, so index i in each array is meant to refer to the same article
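
Pairing by index only works when all arrays have the same length. The following is a minimal sketch of a helper that combines the arrays and reports a length mismatch instead of silently filling in 'N/A'. The name zipArticles and the sample data are made up for illustration; the keys match the extract parameter above.

```javascript
// Hypothetical helper: pairs up the parallel arrays returned by the API.
function zipArticles(data) {
  const titles = data.overskrifter ?? [];
  const categories = data.kategorier ?? [];
  const times = data.tidspunkter ?? [];

  // If the lengths differ, at least one article lacked a tag or timestamp,
  // and every index after that point may be paired with the wrong article.
  const aligned =
    titles.length === categories.length && titles.length === times.length;

  const articles = titles.map((title, i) => ({
    title,
    category: categories[i] ?? 'N/A',
    time: times[i] ?? 'N/A'
  }));

  return { articles, aligned };
}

// Made-up sample data, not real output from Bold.dk
const sample = {
  overskrifter: ['Headline 1', 'Headline 2'],
  kategorier: ['Superligaen'],
  tidspunkter: ['12:47', '12:30']
};
const result = zipArticles(sample);
console.log(result.aligned, result.articles);
```

When aligned is false, it is probably safer to show headlines only than to risk attaching a category to the wrong article.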

Step 3: Build Beautiful News Cards

Create reusable Web Components for a professional news layout

news-card.js Web Component
// Define a custom Web Component
class NewsCard extends HTMLElement {
  connectedCallback() {
    const title = this.getAttribute('title');
    const category = this.getAttribute('category');
    const time = this.getAttribute('time');

    this.innerHTML = `
      <div class="card">
        <span class="category">${category}</span>
        <h3>${title}</h3>
        <span class="time">${time}</span>
      </div>
    `;
  }
}

// Register the custom element
customElements.define('news-card', NewsCard);
Usage in HTML
<!-- Use the custom element -->
<news-card
  title="Article headline"
  category="VM kval. UEFA"
  time="12:47"
></news-card>

🎯 Benefits:

  • Reusable across projects
  • Encapsulated logic
  • Native browser support
  • Responsive with container queries
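
Because NewsCard writes its attribute values into innerHTML, a scraped headline containing characters such as < or & would be parsed as markup. The following is a minimal sketch of an escaping helper that could be applied to each value before interpolation. escapeHtml is a hypothetical name, not a browser built-in.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(value) {
  // Treat null/undefined (for example a missing attribute) as an empty string
  return String(value ?? '').replace(/[&<>"']/g, ch => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('Bold+ <b>live</b> & more'));
```

Inside connectedCallback, each interpolated value would then be wrapped, for example escapeHtml(title). Building the card with document.createElement and textContent is an alternative that avoids string escaping entirely.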

📚 API Reference

Complete reference for the Extract Content API

Endpoint

GET https://extract-content.deno.dev/?from={URL}&extract={JSON}

Parameters

from - Target URL to scrape (e.g., https://bold.dk)
extract - URL-encoded JSON object with CSS selectors

CSS Selectors for Bold.dk

.article-headline - Article titles
.ArticleListItem__tag - Category tags
.ArticleListItem__timestamp - Publication time
.ArticleListItem__commentsCount - Comment count

Example Request

const extractParams = {
  "overskrifter": ".article-headline",
  "kategorier": ".ArticleListItem__tag"
};

const url = `https://extract-content.deno.dev/?from=https://bold.dk&extract=${
  encodeURIComponent(JSON.stringify(extractParams))
}`;

const response = await fetch(url);
const data = await response.json();

console.log(data.overskrifter); // Array of headlines
console.log(data.kategorier);   // Array of categories
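
The example request assumes the call succeeds. The following is a minimal sketch that adds a status check, assuming the service signals failures with a non-2xx HTTP status (the reference above does not document its error responses). The name fetchExtractJson and the fetchImpl parameter are hypothetical.

```javascript
// Hypothetical helper: fetches a prepared request URL and fails loudly on
// HTTP errors. fetchImpl is injectable so it can be tried without a network.
async function fetchExtractJson(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);

  // fetch() resolves even for 4xx/5xx responses, so check the status here
  if (!response.ok) {
    throw new Error(`Extract request failed with status ${response.status}`);
  }
  return response.json();
}
```

With the url from the example request, the call becomes const data = await fetchExtractJson(url); wrapped in try/catch wherever the page needs to show an error state.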

Response Format

{
  "overskrifter": [
    "Headline 1",
    "Headline 2",
    "Headline 3"
  ],
  "kategorier": [
    "VM kval. UEFA",
    "Bold+",
    "Superligaen"
  ]
}
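
Given this response format, it can be useful to confirm that every requested key came back as an array of strings before rendering. The following is a minimal sketch; missingOrInvalidKeys is a hypothetical name, the sample body is made up, and how the service responds when a selector matches nothing is an assumption to verify against the live API.

```javascript
// Hypothetical check: every requested key should map to an array of strings.
function missingOrInvalidKeys(data, selectors) {
  return Object.keys(selectors).filter(key => {
    const value = data?.[key];
    return !Array.isArray(value) || value.some(item => typeof item !== 'string');
  });
}

const selectors = {
  overskrifter: '.article-headline',
  kategorier: '.ArticleListItem__tag'
};
// Made-up response body in which one requested key is absent
const responseBody = { overskrifter: ['Headline 1', 'Headline 2'] };
const problems = missingOrInvalidKeys(responseBody, selectors);
console.log(problems);
```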