DataDistill API Documentation (v1)

Welcome to the DataDistill API. This guide documents the v1 REST API, which provides programmatic access to DataDistill's text processing and artifact management services.

API Base URL: https://api.datadistill.co/api/v1

Authentication

All requests to the /api/v1 gateway (excluding the public health check endpoint) must be authenticated using HTTP Basic Authentication.

Username: Your API Key
Password: Your API Secret (You must use the original, unhashed secret provided when you first created the key.)

You can generate and manage your API credentials from the “API Keys” section within your project settings on the DataDistill dashboard.

The Authorization header must be formatted as Basic <credentials>, where <credentials> is the Base64-encoded string of your-api-key:your-api-secret.
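
To make the header format concrete, here is a minimal Python sketch that builds the value by hand with the standard library. The function name `build_basic_auth_header` and the sample credentials are illustrative placeholders, not part of the DataDistill API. Most HTTP clients can do this encoding for you.

```python
import base64


def build_basic_auth_header(api_key: str, api_secret: str) -> dict:
    """Return an Authorization header for HTTP Basic Authentication."""
    # Join key and secret with a colon, then Base64-encode the UTF-8 bytes.
    raw = f"{api_key}:{api_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, for illustration only.
headers = build_basic_auth_header("my-key", "my-secret")
print(headers["Authorization"])
```

Building the header manually is mainly useful when a client library has no built-in Basic Authentication support.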

Authentication & Dynamic URL Examples

Python


import requests
import os
import json
 
# --- Configuration ---
# Best practice: Store credentials as environment variables
API_KEY = os.environ.get("DATADISTILL_API_KEY", "YOUR_API_KEY")
API_SECRET = os.environ.get("DATADISTILL_API_SECRET", "YOUR_API_SECRET")
 
# Base URL for all API v1 endpoints
BASE_URL = "https://api.datadistill.co/api/v1"
 
# The library handles the Base64 encoding for you.
auth_credentials = (API_KEY, API_SECRET)
 
# --- Dynamic Request Function ---
def make_api_request(method, endpoint_path, params=None, json_payload=None, files=None):
    """A helper function to make authenticated API requests."""
    full_url = f"{BASE_URL}{endpoint_path}"
    print(f"Making {method} request to: {full_url}")
 
    try:
        response = requests.request(
            method,
            full_url,
            auth=auth_credentials,
            params=params,
            json=json_payload,
            files=files,
            timeout=30  # Avoid hanging indefinitely if the server does not respond
        )
        response.raise_for_status() # Raises an exception for bad status codes (4xx or 5xx)
 
        print(f"Status Code: {response.status_code}")
        # For 204 No Content, there is no body
        if response.status_code != 204:
            print("Response JSON:")
            print(json.dumps(response.json(), indent=2))
 
    except requests.exceptions.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
        print(f"Response body: {response.text}")
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")

JavaScript


// Best practice: Store credentials securely on your server
const API_KEY = process.env.DATADISTILL_API_KEY || "YOUR_API_KEY";
const API_SECRET = process.env.DATADISTILL_API_SECRET || "YOUR_API_SECRET";
 
// Base URL for all API v1 endpoints
const BASE_URL = "https://api.datadistill.co/api/v1";
 
// Manually encode the credentials for the Authorization header
const credentials = btoa(`${API_KEY}:${API_SECRET}`);
 
async function makeApiRequest(method, endpointPath, body = null, isJson = true) {
  const fullUrl = `${BASE_URL}${endpointPath}`;
  console.log(`Making ${method} request to: ${fullUrl}`);
 
  const headers = { 'Authorization': `Basic ${credentials}` };
  const options = { method, headers };
 
  if (body) {
    if (isJson) {
      headers['Content-Type'] = 'application/json';
      options.body = JSON.stringify(body);
    } else {
      // For file uploads (FormData), don't set Content-Type; the browser will handle it.
      options.body = body;
    }
  }
 
  try {
    const response = await fetch(fullUrl, options);
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}, message: ${await response.text()}`);
    }
 
    console.log(`Status Code: ${response.status}`);
    if (response.status !== 204) {
      console.log('Success:', await response.json());
    }
  } catch (error) {
    console.error('Error fetching data:', error);
  }
}

cURL


# --- Configuration ---
API_KEY="YOUR_API_KEY"
API_SECRET="YOUR_API_SECRET"
BASE_URL="https://api.datadistill.co/api/v1"
 
# --- Example Usage ---
# The -u flag formats the credentials and the Authorization header for you
# ENDPOINT_PATH="/api-testing/test/protected_data"
# curl -u "${API_KEY}:${API_SECRET}" "${BASE_URL}${ENDPOINT_PATH}"

Metering and Credit System

API usage is metered through a credit-based system. Each request consumes credits, which are automatically debited from your account. If your balance is insufficient, the API responds with HTTP 402 Payment Required. You can manage your credits at https://app.datadistill.co.

General Error Response Format (for 4xx/5xx errors):


{
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "Your account does not have enough credits to process this request.",
    "details": "Current balance: 0 credits. Required: 5 credits."
  }
}
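
As one way to consume this format, the following Python sketch turns an error body into an exception so that callers can branch on the status code or error code. The `DataDistillError` class, the `parse_error` function, and the `UNKNOWN_ERROR` fallback are hypothetical names introduced for illustration. Only the `error.code`, `error.message`, and `error.details` fields come from the format above.

```python
class DataDistillError(Exception):
    """Hypothetical exception type for illustration; not part of an official SDK."""

    def __init__(self, status_code, code, message, details=None):
        super().__init__(f"[{status_code}] {code}: {message}")
        self.status_code = status_code
        self.code = code
        self.message = message
        self.details = details


def parse_error(status_code: int, body: dict) -> DataDistillError:
    """Convert an error body in the documented format into an exception."""
    # Fall back to generic values (an assumption) if a field is missing.
    err = body.get("error", {}) if isinstance(body, dict) else {}
    return DataDistillError(
        status_code,
        err.get("code", "UNKNOWN_ERROR"),
        err.get("message", "No error message provided."),
        err.get("details"),
    )


sample_body = {
    "error": {
        "code": "INSUFFICIENT_CREDITS",
        "message": "Your account does not have enough credits to process this request.",
        "details": "Current balance: 0 credits. Required: 5 credits.",
    }
}

exc = parse_error(402, sample_body)
if exc.status_code == 402:
    print("Out of credits:", exc.details)
```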

API Health Check

GET / - Check API Status

Description: Confirms that the API is running and accessible. This is the only endpoint that does not require authentication.
Credit Cost: 0
Parameters: None
Use Case: Use this as a basic health check in your monitoring systems to ensure the API is reachable before attempting authenticated requests.

Python


make_api_request("GET", "/")

JavaScript


makeApiRequest("GET", "/")

cURL


curl "${BASE_URL}/"

Success Response (200 OK):


{
  "status": "API is running",
  "timestamp": "2025-09-22T14:00:00.123456Z",
  "version": "v1"
}

Text Processing API

Base Path: /api/v1/text-processing

These endpoints stream responses using Server-Sent Events (SSE). The stream begins with one or more progress events, continues with a data event containing the final JSON payload, and ends with a data: [DONE] marker. Your client should read events until it receives that marker.

General Streaming Response Format:

Events: data: {"progress": "processing", "eta": "2s"}
Final: data: {"result": {...}, "status": "completed"}
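
The `make_api_request` helper shown earlier parses the whole response body as JSON, so it is probably not a good fit for a streamed response. Below is a minimal sketch of a parser for the format above. It assumes each event arrives as a single `data:` line containing JSON and does not handle multi-line SSE data fields. The function names `parse_sse_lines` and `final_result` are illustrative.

```python
import json


def parse_sse_lines(lines):
    """Yield decoded JSON payloads from `data:` lines until the [DONE] marker."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


def final_result(lines):
    """Return the last event whose status is "completed", or None."""
    result = None
    for event in parse_sse_lines(lines):
        if event.get("status") == "completed":
            result = event
    return result


# Simulated stream, so the sketch runs without network access.
stream = [
    'data: {"progress": "processing", "eta": "2s"}',
    "",
    'data: {"result": {"ai_probability": 0.85}, "status": "completed"}',
    "",
    "data: [DONE]",
]
print(final_result(stream))
```

With the `requests` library, the same functions could be fed from `response.iter_lines()` on a request made with `stream=True`.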

POST /ai-detect

Description: Analyzes text to determine the probability that it was generated by an AI model.
Credit Cost: 1
Parameters:
- text (string, required): The text to analyze for AI generation.
Use Case: Integrate into a content management system to flag submissions that may be AI-generated for editorial review, helping to maintain content authenticity.

Python


make_api_request("POST", "/text-processing/ai-detect", json_payload={"text": "This text might be AI generated."})

JavaScript


makeApiRequest("POST", "/text-processing/ai-detect", {text: "This text might be AI generated."})

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -H "Content-Type: application/json" -d '{"text": "This text might be AI generated."}' "${BASE_URL}/text-processing/ai-detect"

Success Response (200 OK, streamed final data):


{
  "result": {
    "ai_probability": 0.85,
    "verdict": "likely_ai",
    "reasoning": "Repetitive structure and unnatural phrasing detected."
  },
  "status": "completed",
  "credits_consumed": 1
}

POST /clone-writing-style

Description: Analyzes a sample text and generates new content in the same style.
Credit Cost: 5
Parameters:
- sample_text (string, required): The reference text exemplifying the desired style.
- new_text_prompt (string, required): The prompt for generating new content.
Use Case: Create marketing copy that matches your brand’s established tone of voice by providing a sample of past successful content.

Python


make_api_request("POST", "/text-processing/clone-writing-style", json_payload={"sample_text": "The quick brown fox jumps over the lazy dog in a whimsical manner.", "new_text_prompt": "Write a short story about a cat."})

JavaScript


makeApiRequest("POST", "/text-processing/clone-writing-style", {sample_text: "The quick brown fox jumps over the lazy dog in a whimsical manner.", new_text_prompt: "Write a short story about a cat."})

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -H "Content-Type: application/json" -d '{"sample_text": "The quick brown fox jumps over the lazy dog in a whimsical manner.", "new_text_prompt": "Write a short story about a cat."}' "${BASE_URL}/text-processing/clone-writing-style"

Success Response (200 OK, streamed final data):


{
  "result": {
    "cloned_text": "The sly tabby cat leaps across the moonlit fence with a playful twirl.",
    "style_match_score": 0.92
  },
  "status": "completed",
  "credits_consumed": 5
}

POST /humanize

Description: Makes AI-generated text sound more natural and less robotic.
Credit Cost: 3
Parameters:
- text (string, required): The AI-generated text to humanize.
- target_audience (string, optional, default: “general”): The intended audience (e.g., “general”, “professional”, “casual”).
Use Case: Refine AI-generated blog post drafts or product descriptions to make them more engaging and relatable to a human audience.

Python


make_api_request("POST", "/text-processing/humanize", json_payload={"text": "The product is very efficient and performs optimally.", "target_audience": "general"})

JavaScript


makeApiRequest("POST", "/text-processing/humanize", {text: "The product is very efficient and performs optimally.", target_audience: "general"})

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -H "Content-Type: application/json" -d '{"text": "The product is very efficient and performs optimally.", "target_audience": "general"}' "${BASE_URL}/text-processing/humanize"

Success Response (200 OK, streamed final data):


{
  "result": {
    "humanized_text": "This product works like a charm—super efficient and gets the job done without any fuss.",
    "changes_made": 3
  },
  "status": "completed",
  "credits_consumed": 3
}

POST /grammar-check

Description: Corrects grammatical errors in the provided text.
Credit Cost: 2
Parameters:
- text (string, required): The text to check and correct for grammar.
Use Case: Build a writing assistant application that provides real-time grammar and spelling suggestions to users as they type.

Python


make_api_request("POST", "/text-processing/grammar-check", json_payload={"text": "he walk to store."})

JavaScript


makeApiRequest("POST", "/text-processing/grammar-check", {text: "he walk to store."})

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -H "Content-Type: application/json" -d '{"text": "he walk to store."}' "${BASE_URL}/text-processing/grammar-check"

Success Response (200 OK, streamed final data):


{
  "result": {
    "corrected_text": "He walks to the store.",
    "corrections": [
      {
        "original": "he walk",
        "corrected": "He walks",
        "reason": "Subject-verb agreement and capitalization"
      }
    ]
  },
  "status": "completed",
  "credits_consumed": 2
}

POST /summarize

Description: Creates a concise summary of a longer piece of text.
Credit Cost: 4
Parameters:
- text (string, required): The long text to summarize.
- max_length (integer, optional, default: 100): Maximum length of the summary in words.
Use Case: Automatically generate executive summaries for long business reports or abstracts for academic papers.

Python


make_api_request("POST", "/text-processing/summarize", json_payload={"text": "A long article about climate change...", "max_length": 150})

JavaScript


makeApiRequest("POST", "/text-processing/summarize", {text: "A long article about climate change...", max_length: 150})

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -H "Content-Type: application/json" -d '{"text": "A long article about climate change...", "max_length": 150}' "${BASE_URL}/text-processing/summarize"

Success Response (200 OK, streamed final data):


{
  "result": {
    "summary": "Climate change poses significant risks to global ecosystems, requiring immediate action.",
    "word_count": 11,
    "coverage_score": 0.95
  },
  "status": "completed",
  "credits_consumed": 4
}

POST /paraphrase

Description: Rewrites text while retaining the original meaning.
Credit Cost: 3
Parameters:
- text (string, required): The text to paraphrase.
- creativity (float, optional, default: 0.5, range: 0.0-1.0): Level of creative rephrasing (0.0: minimal change, 1.0: highly creative).
Use Case: Avoid plagiarism by rephrasing source material for research papers or generate multiple unique versions of a product description for A/B testing.

Python


make_api_request("POST", "/text-processing/paraphrase", json_payload={"text": "The weather is nice today.", "creativity": 0.8})

JavaScript


makeApiRequest("POST", "/text-processing/paraphrase", {text: "The weather is nice today.", creativity: 0.8})

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -H "Content-Type: application/json" -d '{"text": "The weather is nice today.", "creativity": 0.8}' "${BASE_URL}/text-processing/paraphrase"

Success Response (200 OK, streamed final data):


{
  "result": {
    "paraphrased_text": "Today's climate is delightfully pleasant.",
    "similarity_score": 0.88
  },
  "status": "completed",
  "credits_consumed": 3
}

Artifact Management (API)

Base Path: /api/v1/artifacts

GET / - List Artifacts

Description: Lists all artifacts for the user with pagination and filtering.
Credit Cost: 1
Parameters (query):
- status (string, optional): Filter by status (e.g., “processing”, “ready”, “failed”).
- limit (integer, optional, default: 10, max: 100): Number of artifacts per page.
- page (integer, optional, default: 1): Page number.
Use Case: Populate a dashboard in your application that shows a user all the files they have previously uploaded.

Python


make_api_request("GET", "/artifacts", params={"status": "ready", "limit": 10})

JavaScript


makeApiRequest("GET", "/artifacts?status=ready&limit=10")

cURL


curl -u "${API_KEY}:${API_SECRET}" "${BASE_URL}/artifacts?status=ready&limit=10"

Success Response (200 OK):


{
  "artifacts": [
    {
      "id": "art_12345678-1234-1234-1234-123456789abc",
      "filename": "document.pdf",
      "status": "ready",
      "size_bytes": 102400,
      "upload_date": "2025-09-30T10:00:00Z",
      "content_type": "application/pdf"
    }
  ],
  "pagination": {
    "page": 1,
    "limit": 10,
    "total": 50,
    "pages": 5
  },
  "credits_consumed": 1
}
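
To retrieve more than one page, a client can keep requesting pages until it reaches `pagination.pages`. The sketch below shows one way to do that. `iter_artifacts` and the `fetch_page` callback are hypothetical names, and the fake fetcher stands in for a real call to GET /artifacts. Note that each page request would consume 1 credit.

```python
def iter_artifacts(fetch_page, limit=10):
    """Yield every artifact by walking pages until `pagination.pages` is reached.

    `fetch_page(page, limit)` is a caller-supplied function that returns the
    parsed JSON body of the list endpoint for the given page.
    """
    page = 1
    while True:
        body = fetch_page(page, limit)
        yield from body.get("artifacts", [])
        # If pagination info is missing, stop after the current page.
        total_pages = body.get("pagination", {}).get("pages", page)
        if page >= total_pages:
            break
        page += 1


def fake_fetch(page, limit):
    """Stand-in for a real API call: serves 25 fake artifacts in pages."""
    items = [{"id": f"art_{i}"} for i in range(25)]
    start = (page - 1) * limit
    return {
        "artifacts": items[start:start + limit],
        "pagination": {"page": page, "limit": limit, "total": 25, "pages": 3},
    }


ids = [artifact["id"] for artifact in iter_artifacts(fake_fetch, limit=10)]
print(len(ids))
```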

GET /search - Search Artifacts

Description: Provides advanced search for a user’s artifacts.
Credit Cost: 2
Parameters (query):
- q (string, required): Search query (filename, metadata, or content keywords).
- limit (integer, optional, default: 10): Number of results.
Use Case: Implement a search bar in your application that allows users to find specific documents by filename, metadata, or content keywords.

Python


make_api_request("GET", "/artifacts/search", params={"q": "invoice"})

JavaScript


makeApiRequest("GET", "/artifacts/search?q=invoice")

cURL


curl -G -u "${API_KEY}:${API_SECRET}" --data-urlencode "q=invoice" "${BASE_URL}/artifacts/search"

Success Response (200 OK):


{
  "results": [
    {
      "id": "art_87654321-4321-4321-4321-cba987654321",
      "filename": "invoice_2025.pdf",
      "relevance_score": 0.95,
      "snippet": "Invoice #INV-123 for services rendered."
    }
  ],
  "total": 3,
  "credits_consumed": 2
}

GET /{artifact_id} - Get Artifact Details

Description: Retrieves all metadata and job history for a specific artifact.
Credit Cost: 1
Path Parameters:
- artifact_id (string, required): The unique ID of the artifact.
Use Case: Display a detailed view of a selected file, showing its status, type, size, and a history of all processing jobs performed on it.

Python


ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc"
make_api_request("GET", f"/artifacts/{ARTIFACT_ID}")

JavaScript


const ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc";
makeApiRequest("GET", `/artifacts/${ARTIFACT_ID}`);

cURL


curl -u "${API_KEY}:${API_SECRET}" "${BASE_URL}/artifacts/art_12345678-1234-1234-1234-123456789abc"

Success Response (200 OK):


{
  "artifact": {
    "id": "art_12345678-1234-1234-1234-123456789abc",
    "filename": "document.pdf",
    "status": "ready",
    "size_bytes": 102400,
    "upload_date": "2025-09-30T10:00:00Z",
    "content_type": "application/pdf",
    "metadata": {
      "description": "Sample invoice"
    },
    "job_history": [
      {
        "job_id": "job_abc123",
        "type": "extraction",
        "status": "completed",
        "timestamp": "2025-09-30T10:05:00Z"
      }
    ]
  },
  "credits_consumed": 1
}

PATCH /{artifact_id} - Update Artifact Metadata

Description: Updates an artifact’s mutable metadata.
Credit Cost: 1
Path Parameters:
- artifact_id (string, required): The unique ID of the artifact.
Request Body:
- description (string, optional): Updated description.
- tags (array of strings, optional): Updated tags.
Use Case: Allow users to add descriptive notes or tags to their uploaded files for better organization within your application.

Python


ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc"
make_api_request("PATCH", f"/artifacts/{ARTIFACT_ID}", json_payload={"description": "Updated invoice description."})

JavaScript


const ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc";
makeApiRequest("PATCH", `/artifacts/${ARTIFACT_ID}`, {description: "Updated invoice description."});

cURL


curl -u "${API_KEY}:${API_SECRET}" -X PATCH -H "Content-Type: application/json" -d '{"description": "Updated invoice description."}' "${BASE_URL}/artifacts/art_12345678-1234-1234-1234-123456789abc"

Success Response (200 OK):


{
  "updated_artifact": {
    "id": "art_12345678-1234-1234-1234-123456789abc",
    "description": "Updated invoice description."
  },
  "credits_consumed": 1
}

DELETE /{artifact_id} - Delete Artifact

Description: Deletes an artifact and its associated data.
Credit Cost: 2
Path Parameters:
- artifact_id (string, required): The unique ID of the artifact.
Use Case: Provide a “delete” button for users to permanently remove their files and associated data from the system.

Python


ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc"
make_api_request("DELETE", f"/artifacts/{ARTIFACT_ID}")

JavaScript


const ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc";
makeApiRequest("DELETE", `/artifacts/${ARTIFACT_ID}`);

cURL


curl -u "${API_KEY}:${API_SECRET}" -X DELETE "${BASE_URL}/artifacts/art_12345678-1234-1234-1234-123456789abc"

Success Response (204 No Content)

POST /upload - Upload a Single File

Description: Uploads a single file directly and begins processing.
Credit Cost: 5 (includes initial upload and processing start)
Parameters (multipart/form-data):
- file (file, required): The file to upload (supports PDF, DOCX, images, etc.).
Use Case: The primary method for getting user files into the DataDistill system for further processing and extraction.

Python


with open("path/to/doc.pdf", "rb") as f:
    files = {"file": ("doc.pdf", f, "application/pdf")}
    make_api_request("POST", "/artifacts/upload", files=files)

JavaScript


const formData = new FormData();
formData.append('file', fileInput.files[0]); // Assuming fileInput is an <input type="file">
makeApiRequest("POST", "/artifacts/upload", formData, false); // isJson = false

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST -F "file=@path/to/doc.pdf" "${BASE_URL}/artifacts/upload"

Success Response (201 Created):


{
  "artifact": {
    "id": "art_12345678-1234-1234-1234-123456789abc",
    "filename": "doc.pdf",
    "status": "processing",
    "upload_date": "2025-09-30T10:00:00Z"
  },
  "credits_consumed": 5
}
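
Because the upload response reports a status of "processing", a client will usually want to wait until the artifact reaches another status, such as "ready" or "failed". One possible approach is to poll the Get Artifact Details endpoint, as sketched below. `wait_until_done`, the `get_artifact` callback, and the default interval and attempt limit are assumptions made for illustration. Each poll would consume 1 credit, so a longer interval may be preferable.

```python
import time


def wait_until_done(get_artifact, artifact_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the artifact's status is no longer "processing".

    `get_artifact(artifact_id)` is a caller-supplied function that returns the
    parsed JSON body of the Get Artifact Details endpoint.
    """
    for attempt in range(max_attempts):
        body = get_artifact(artifact_id)
        status = body["artifact"]["status"]
        if status != "processing":
            return status
        if attempt < max_attempts - 1:
            sleep(interval)  # Wait before the next check
    raise TimeoutError(
        f"Artifact {artifact_id} still processing after {max_attempts} checks"
    )


# Simulated responses, so the sketch runs without network access.
responses = iter(["processing", "processing", "ready"])


def fake_get(artifact_id):
    return {"artifact": {"id": artifact_id, "status": next(responses)}}


print(wait_until_done(fake_get, "art_123", sleep=lambda seconds: None))
```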

POST /{artifact_id}/cancel - Cancel Artifact Processing

Description: Cancels an artifact stuck in the ‘processing’ state.
Credit Cost: 1 (partial refund may apply for unused processing)
Path Parameters:
- artifact_id (string, required): The unique ID of the artifact.
Use Case: Provide a way for users or administrators to stop a processing job that seems to be taking too long, preventing it from consuming further resources.

Python


ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc"
make_api_request("POST", f"/artifacts/{ARTIFACT_ID}/cancel")

JavaScript


const ARTIFACT_ID = "art_12345678-1234-1234-1234-123456789abc";
makeApiRequest("POST", `/artifacts/${ARTIFACT_ID}/cancel`);

cURL


curl -u "${API_KEY}:${API_SECRET}" -X POST "${BASE_URL}/artifacts/art_12345678-1234-1234-1234-123456789abc/cancel"

Success Response (200 OK):


{
  "status": "cancelled",
  "message": "Processing job has been cancelled.",
  "credits_consumed": 1,
  "refunded_credits": 2
}
