Author: Daryle De Silva

  • Explicit Formatting Constraints for Laravel AI SDK Structured Output

    When using Laravel AI SDK’s structured output, you might notice the AI produces different formatting styles across multiple runs—even with the same prompt. One run might return markdown with **bold** and bullet lists, while another returns plain text.

    Why This Happens

    Large language models interpret prompts probabilistically. Without explicit constraints, the AI makes formatting decisions based on context and training data patterns. This leads to output variance that’s hard to predict.

    The Solution: Explicit Format Constraints

    Add formatting instructions directly in your schema field descriptions:

    use Illuminate\Contracts\JsonSchema\JsonSchema;
    use Laravel\Ai\Contracts\HasStructuredOutput;
    
    class ReportGenerator implements HasStructuredOutput
    {
        public function schema(JsonSchema $schema): array
        {
            return [
                'summary' => $schema->string()
                    ->description('
                        Write the summary naturally.
                        
                        Format: One paragraph per section.
                        Plain text only - no markdown or formatting.
                        Use newlines for structure.
                        
                        Express times in 24-hour format (0600, 1400, 2000).
                    ')
                    ->required(),
            ];
        }
    }
    

    Key Constraints to Specify

    Format type:

    • “Plain text only – no markdown”
    • “Use markdown formatting”
    • “Return as HTML”

    Structure:

    • “Use newlines for structure”
    • “One item per line”
    • “Separate sections with double newlines”

    Consistency rules:

    • “Express times in HHMM format”
    • “Use sentence case for headings”
    • “No bullet points or numbered lists”

    Real-World Example

    In a data extraction system, the schema description evolved from generic guidance to explicit constraints:

    // Before (inconsistent)
    'schedule' => $schema->string()
        ->description('Extract the itinerary from the source.')
        ->required(),
    
    // After (consistent)
    'schedule' => $schema->string()
        ->description('
            Extract the itinerary from the source.
            
            Format: List each day with activities one per line.
            Show time (HHMM) and description.
            Plain text only - no markdown or formatting.
            Use newlines for structure.
        ')
        ->required(),
    

    The Result

    After adding explicit formatting constraints:

    • Output became consistent across multiple runs
    • No more surprise markdown in plain-text fields
    • Easier to parse and display in templates
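
    Even with explicit constraints, the model can occasionally drift, so a cheap check before display or parsing may be worth having. The following is a minimal sketch in plain PHP, not part of the Laravel AI SDK; looksLikeMarkdown() and extractTimes() are hypothetical helper names, and the patterns cover only the most common markdown markers.

    ```php
    <?php

    /**
     * Hypothetical helper: returns true when the text contains common
     * markdown markers (bold, headings, bullet or numbered lists).
     */
    function looksLikeMarkdown(string $text): bool
    {
        $patterns = [
            '/\*\*[^*\n]+\*\*/',   // **bold**
            '/^#{1,6}\s+\S/m',     // # heading
            '/^\s*[-*+]\s+\S/m',   // - bullet item
            '/^\s*\d+\.\s+\S/m',   // 1. numbered item
        ];

        foreach ($patterns as $pattern) {
            if (preg_match($pattern, $text) === 1) {
                return true;
            }
        }

        return false;
    }

    /**
     * Hypothetical helper: collects every four-digit HHMM time (0000 to 2359).
     * Note that a bare year such as 2024 would also match.
     */
    function extractTimes(string $text): array
    {
        preg_match_all('/\b(?:[01]\d|2[0-3])[0-5]\d\b/', $text, $matches);

        return $matches[0];
    }

    $plain = "Day 1\n0600 Breakfast at camp\n1400 Summit push";
    $markdown = "**Day 1**\n- 6:00 AM Breakfast";

    var_dump(looksLikeMarkdown($plain));    // bool(false)
    var_dump(looksLikeMarkdown($markdown)); // bool(true)
    print_r(extractTimes($plain));          // 0600 and 1400
    ```

    A check like this fits best right after the structured response is decoded, where a failure can trigger a retry or a log entry.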

    When to Use This

    Always specify format constraints when:

    • Output will be displayed directly to users
    • You’re parsing the output programmatically
    • Consistency matters more than creativity
    • Multiple AI runs need identical formatting

    You can be more lenient when:

    • The AI is generating creative content
    • Formatting flexibility is desired
    • You’re post-processing the output anyway

    Bottom Line

    Don’t assume the AI will infer your formatting preferences. Explicit beats implicit—especially when dealing with probabilistic systems. Two sentences in your schema description can save hours of debugging inconsistent output.

  • Touch Related Model Timestamps Automatically

    When you have related models in your Laravel app and need their timestamps to stay in sync, manually updating them can be tedious and error-prone.

    Say you have a User model with a related Profile model. When a user updates their email, you want the profile’s updated_at to reflect that change for cache invalidation or change tracking.

    Instead of manually touching the profile every time:

    // Manual approach (tedious)
    $user->update(['email' => 'new@example.com']);
    $user->profile->touch(); // Have to remember this everywhere
    

    Use Eloquent’s $touches property to do it automatically:

    class User extends Model
    {
        protected $touches = ['profile'];
        
        public function profile()
        {
            return $this->hasOne(Profile::class);
        }
    }
    
    // Now updates automatically touch the related model
    $user->update(['email' => 'new@example.com']);
    // Profile.updated_at is automatically updated!
    

    The $touches property accepts an array of relationship method names. Eloquent calls touch() on each of those relationships whenever the model that declares $touches is saved, which bumps updated_at on the related records.

    Perfect for:

    • APIs that use updated_at for change tracking
    • Cache invalidation based on timestamps
    • Audit trails that need accurate modification times
    • Keeping related models in sync without manual intervention
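
    To illustrate the cache invalidation case, here is a minimal sketch in plain PHP. It assumes a hypothetical versionedCacheKey() helper that folds updated_at into the cache key, so a touched record naturally stops matching its old cache entry.

    ```php
    <?php

    /**
     * Hypothetical helper: builds a cache key that changes whenever
     * the record's updated_at timestamp changes.
     */
    function versionedCacheKey(string $type, int $id, DateTimeInterface $updatedAt): string
    {
        return sprintf('%s:%d:%d', $type, $id, $updatedAt->getTimestamp());
    }

    $before = new DateTimeImmutable('2024-01-01 10:00:00', new DateTimeZone('UTC'));
    $after = $before->modify('+5 minutes'); // simulates the profile being touched

    $oldKey = versionedCacheKey('profile', 42, $before);
    $newKey = versionedCacheKey('profile', 42, $after);

    echo $oldKey, PHP_EOL; // profile:42:1704103200
    echo $newKey, PHP_EOL; // profile:42:1704103500
    ```

    Because the key changes as soon as the timestamp does, stale entries simply stop being read and can expire on their own.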

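    Works with any relationship type: hasOne, hasMany, belongsTo, belongsToMany, although the official documentation only describes the belongsTo and belongsToMany cases, where a child touches its parent. Just name the relationship method, and Eloquent handles the rest.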

  • Inline Template Strings in Vue: When to Use Them

    Vue components can define their HTML in multiple ways, but one often-overlooked option is inline template strings using the template property. While Single-File Components (.vue files) are the standard in modern Vue apps, inline templates have legitimate use cases in Laravel projects.

    When to Use Inline Templates

    1. Simple, self-contained components
    For dashboard widgets, modals, or small interactive elements that don’t need build-time compilation.

    2. Legacy compatibility
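    Inline templates work in both Vue 1.x and 2.x without SFC build setup, which makes them a good fit for gradual migrations. They do require a Vue build that includes the template compiler, because the string is compiled in the browser.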

    3. Blade-Vue hybrid pages
    When you want Vue reactivity on specific parts of a traditional Blade view without committing to a full SPA architecture.

    4. Quick prototypes
    Admin dashboard widgets or internal tools where build complexity isn’t justified.

    When NOT to Use Them

    • Complex components with lots of markup (becomes unmaintainable)
    • When you have a proper build pipeline (Webpack/Vite) set up
    • Components needing scoped CSS
    • Production SPAs (use .vue files instead)

    Example: Dashboard Widget

    export default {
        name: 'OrderSummary',
        props: ['orderId', 'apiUrl'],
        data() {
            return {
                order: null,
                loading: true
            };
        },
        template: `
            <div class="order-summary">
                <div v-if="loading">Loading...</div>
                <div v-else-if="order">
                    <h3>Order #{{ order.id }}</h3>
                    <p>Status: <span :class="'badge-' + order.status">{{ order.status }}</span></p>
                    <p>Total: {{ order.total_formatted }}</p>
                </div>
                <div v-else>Order not found</div>
            </div>
        `,
        mounted() {
            this.fetchOrder();
        },
        methods: {
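            fetchOrder() {
                fetch(this.apiUrl)
                    .then(r => {
                        // Treat HTTP errors (404, 500, ...) as "order not found"
                        if (!r.ok) throw new Error(`HTTP ${r.status}`);
                        return r.json();
                    })
                    .then(data => { this.order = data; })
                    .catch(() => { this.order = null; })
                    // Always clear the loading state, even when the request fails
                    .finally(() => { this.loading = false; });
            }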
        }
    };
    

    Pro Tips

    Use template literals (backticks) for multi-line templates. This makes the HTML readable with proper indentation.

    Keep it focused. If your template exceeds ~30 lines, it’s a sign you should move to a .vue SFC file.

    Escape carefully. When embedding inline templates in Blade files, watch out for conflicts between Vue’s {{ }} and Blade’s syntax. Use @{{ }} or the @verbatim directive.

    Inline templates aren’t a replacement for proper SFC architecture, but they’re a valuable tool for hybrid Laravel/Vue applications where you need reactivity without the overhead of a full build pipeline.

  • Cleaner API Routes with Consistent Grouping

    When your routes file starts getting messy, apply these refactoring patterns to bring order back:

    1. Group by middleware first, then prefix and name

    // Before: scattered middleware
    Route::post('logout', [AuthController::class, 'logout'])->middleware('auth:sanctum');
    Route::get('profile/votes', [VoteController::class, 'index'])->middleware('auth:sanctum');
    
    // After: group by middleware
    Route::middleware('auth:sanctum')->group(function () {
        Route::prefix('auth')->name('auth.')->group(function () {
            Route::post('logout', [AuthController::class, 'logout'])->name('logout');
        });
    
        Route::prefix('profile')->name('profile.')->group(function () {
            // All profile routes here
        });
    });

    2. Consolidate multiple apiResource calls

    When several resources share the same options, use apiResources():

    // Before: verbose
    Route::apiResource('regions', RegionController::class)->except(['store', 'update', 'destroy']);
    Route::apiResource('provinces', ProvinceController::class)->except(['store', 'update', 'destroy']);
    Route::apiResource('mountains', MountainController::class)->except(['store', 'update', 'destroy']);
    
    // After: consolidated
    Route::apiResources([
        'regions' => RegionController::class,
        'provinces' => ProvinceController::class,
        'mountains' => MountainController::class,
    ], ['except' => ['store', 'update', 'destroy']]);

    3. Use prefix for route parameters

    Instead of repeating the same parameter in every URI, move it to the prefix:

    // Before: repeated {trail} in every URI
    Route::post('trails/{trail}/comments', [TrailController::class, 'storeComment']);
    Route::post('trails/{trail}/climbs', [TrailController::class, 'storeClimb']);
    
    // After: parameter in prefix
    Route::prefix('trails/{trail}')->name('trail.')->middleware('auth:sanctum')->group(function () {
        Route::post('comments', [TrailController::class, 'storeComment'])->name('comment');
        Route::post('climbs', [TrailController::class, 'storeClimb'])->name('climb');
    });

    The pattern: Group by middleware → prefix → name, in that order. Your routes file becomes easier to scan, modify, and maintain.

  • Cleaner API Routes with Consistent Grouping

    When your Laravel routes file starts getting messy with scattered middleware, repeated patterns, and verbose resource declarations, it’s time to refactor. Here are three patterns that will make your API routes cleaner and more maintainable.

    1. Group by Middleware, Then by Prefix and Name

    Instead of scattering ->middleware('auth:sanctum') across individual routes, group authenticated routes together first:

    // ❌ Before: middleware scattered everywhere
    Route::post('logout', [AuthController::class, 'logout'])->middleware('auth:sanctum');
    Route::get('profile/notifications', [NotificationController::class, 'index'])->middleware('auth:sanctum');
    Route::post('reports/{report}/approve', [ReportController::class, 'approve'])->middleware('auth:sanctum');
    
    // ✅ After: group by middleware first
    Route::middleware('auth:sanctum')->group(function () {
        Route::prefix('auth')->name('auth.')->group(function () {
            Route::post('logout', [AuthController::class, 'logout'])->name('logout');
        });
        
        Route::prefix('profile')->name('profile.')->group(function () {
            Route::apiResource('notifications', NotificationController::class)->only(['index', 'show']);
        });
        
        Route::prefix('reports/{report}')->name('report.')->group(function () {
            Route::post('approve', [ReportController::class, 'approve'])->name('approve');
        });
    });
    

    This groups all protected routes in one middleware wrapper, then organizes them by prefix and route name. Much easier to scan and understand the structure.

    2. Consolidate Multiple apiResource Calls

    If you have several resources that share the same configuration, use apiResources() (plural) instead of repeating the same options:

    // ❌ Before: repetitive
    Route::apiResource('categories', CategoryController::class)->except(['store', 'update', 'destroy']);
    Route::apiResource('tags', TagController::class)->except(['store', 'update', 'destroy']);
    Route::apiResource('products', ProductController::class)->except(['store', 'update', 'destroy']);
    Route::apiResource('reviews', ReviewController::class)->except(['store', 'update', 'destroy']);
    
    // ✅ After: consolidated
    Route::apiResources([
        'categories' => CategoryController::class,
        'tags' => TagController::class,
        'products' => ProductController::class,
        'reviews' => ReviewController::class,
    ], ['except' => ['store', 'update', 'destroy']]);
    

    The shared options are now declared once instead of four times, which makes it clear that these are all public read-only resources with the same access rules.

    3. Move Route Parameters to the Prefix

    When multiple routes operate on the same parent resource, move the parameter to the prefix instead of repeating it in every URI:

    // ❌ Before: {task} repeated in every route
    Route::post('tasks/{task}/comments', [TaskController::class, 'comment']);
    Route::post('tasks/{task}/assign', [TaskController::class, 'assign']);
    Route::post('tasks/{task}/complete', [TaskController::class, 'complete']);
    
    // ✅ After: parameter in prefix
    Route::prefix('tasks/{task}')->name('task.')->middleware('auth:sanctum')->group(function () {
        Route::post('comments', [TaskController::class, 'comment'])->name('comment');
        Route::post('assign', [TaskController::class, 'assign'])->name('assign');
        Route::post('complete', [TaskController::class, 'complete'])->name('complete');
    });
    

    The resolved URIs are unchanged (/tasks/123/comments), but each definition is shorter, the middleware is declared once, and the route names follow a consistent pattern (task.comment, task.assign, etc.).

    The Grouping Order That Works

    Apply grouping in this order for maximum readability:

    1. Middleware — auth, guest, throttle, etc.
    2. Prefix — URL segment grouping
    3. Name — route name prefix

    Following this pattern consistently across your routes file makes it much easier to scan, modify, and maintain as your API grows.
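
    To see how group prefixes and name prefixes combine into the final URI and route name, here is a toy model in plain PHP. It is a sketch for building intuition, not Laravel's router: resolveRoute() is a hypothetical function, and the orders example is made up for illustration.

    ```php
    <?php

    /**
     * Toy model of nested route groups (not Laravel's implementation).
     * Joins group prefixes into a URI and group name prefixes into a route name.
     *
     * @param array $groups Group attributes, outermost first
     */
    function resolveRoute(array $groups, string $uri, string $name): array
    {
        $segments = [];
        $fullName = '';

        foreach ($groups as $group) {
            $segments[] = trim($group['prefix'] ?? '', '/');
            $fullName .= $group['name'] ?? '';
        }

        $segments[] = trim($uri, '/');

        return [
            // Drop empty segments left by groups that declare no prefix
            'uri' => '/' . implode('/', array_filter($segments, 'strlen')),
            'name' => $fullName . $name,
        ];
    }

    // Middleware group (no prefix or name), then the auth prefix group
    $groups = [
        ['middleware' => 'auth:sanctum'],
        ['prefix' => 'auth', 'name' => 'auth.'],
    ];

    print_r(resolveRoute($groups, 'logout', 'logout'));
    // uri: /auth/logout, name: auth.logout

    // Hypothetical parameterized prefix
    print_r(resolveRoute([['prefix' => 'orders/{order}', 'name' => 'order.']], 'refund', 'refund'));
    // uri: /orders/{order}/refund, name: order.refund
    ```

    In a real application, php artisan route:list shows the URIs and names that Laravel actually registered.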

  • Structured Data Extraction with Laravel AI JsonSchema

    Use Laravel AI’s JsonSchema builder with agents to extract structured data from unstructured HTML. Define your schema with detailed descriptions, constraints, and nested objects for reliable extraction.

    The Problem

    Traditional regex/DOM parsing is brittle and breaks when HTML structure changes. You need to extract structured data from web pages reliably.

    The Solution

    Laravel AI’s JsonSchema + agent pattern provides LLM-based extraction with strict type validation:

    use Laravel\Ai\Files\Document;
    use Illuminate\Contracts\JsonSchema\JsonSchema;
    use function Laravel\Ai\agent;
    
    $document = Document::fromString($htmlContent, 'text/plain');
    
    $response = agent(
        instructions: 'You are a data extraction assistant. Extract structured information from the attached document.',
        schema: fn (JsonSchema $schema) => [
            'title' => $schema->string()
                ->description('Main title of the content')
                ->required(),
            'items' => $schema->array()
                ->description('List of related items')
                ->min(1)
                ->items(
                    $schema->object([
                        'name' => $schema->string()->required(),
                        'value' => $schema->integer()->min(0)->nullable(),
                        'metadata' => $schema->object([
                            'lat' => $schema->number()->min(-90)->max(90),
                            'lng' => $schema->number()->min(-180)->max(180),
                        ])->withoutAdditionalProperties()->nullable(),
                    ])->withoutAdditionalProperties()
                ),
            'category' => $schema->string()
                ->enum(['Type1', 'Type2'])
                ->required(),
        ],
    )->prompt(
        'Extract data from the attached document.',
        attachments: [$document],
    );
    
    $data = $response->toArray();

    Key Techniques

    1. Use withoutAdditionalProperties() to Block Unexpected Fields

    $schema->object([
        'name' => $schema->string()->required(),
        'count' => $schema->integer()->min(1)->nullable(),
    ])->withoutAdditionalProperties()  // Prevents adding unexpected fields

    2. Add Min/Max Constraints for Numbers

    'elevation' => $schema->integer()->min(0)->max(10000),
    'latitude' => $schema->number()->min(-90)->max(90),

    3. Provide Detailed Descriptions

    'difficulty' => $schema->string()
        ->description('Difficulty level: beginner, intermediate, or expert')
        ->enum(['beginner', 'intermediate', 'expert'])
        ->required(),

    Why This Matters

    • Resilient to format variations: LLMs understand content semantically, not just structure
    • Type-safe output: JsonSchema ensures validated, structured data
    • Fewer surprises: withoutAdditionalProperties() blocks invented fields and min/max constraints bound numeric values, although neither guarantees that an extracted value is correct
    • Self-documenting: Schema descriptions double as documentation
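
    If you want a second line of defense in application code, the same constraints can be re-checked after extraction. This is a minimal sketch in plain PHP; coordinateErrors() is a hypothetical helper that mirrors the lat/lng rules from the schema above and is not part of Laravel AI.

    ```php
    <?php

    /**
     * Hypothetical post-check: returns a list of problems found in an
     * extracted coordinate pair, or an empty array when it is valid.
     */
    function coordinateErrors(array $metadata): array
    {
        $rules = ['lat' => [-90, 90], 'lng' => [-180, 180]];
        $errors = [];

        foreach ($rules as $key => [$min, $max]) {
            $value = $metadata[$key] ?? null;

            if (!is_int($value) && !is_float($value)) {
                $errors[] = "$key must be a number";
            } elseif ($value < $min || $value > $max) {
                $errors[] = "$key must be between $min and $max";
            }
        }

        // Mirror withoutAdditionalProperties(): reject keys the schema does not define
        foreach (array_diff(array_keys($metadata), array_keys($rules)) as $extra) {
            $errors[] = "unexpected key: $extra";
        }

        return $errors;
    }

    print_r(coordinateErrors(['lat' => 14.6, 'lng' => 121.0]));            // empty array
    print_r(coordinateErrors(['lat' => 123.4, 'lng' => 121.0, 'x' => 1])); // two errors
    ```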

    Real-World Example

    Extracting product data from an e-commerce site:

    $schema->object([
        'title' => $schema->string()->required(),
        'price' => $schema->number()->min(0)->required(),
        'stock_status' => $schema->string()
            ->enum(['in_stock', 'out_of_stock', 'pre_order'])
            ->required(),
        'specs' => $schema->object([
            'brand' => $schema->string()->nullable(),
            'model' => $schema->string()->nullable(),
            'dimensions' => $schema->object([
                'length' => $schema->number()->min(0),
                'width' => $schema->number()->min(0),
                'height' => $schema->number()->min(0),
            ])->withoutAdditionalProperties()->nullable(),
        ])->withoutAdditionalProperties(),
    ])->withoutAdditionalProperties()

    The payoff: When the HTML changes (and it will), your extraction continues working because the LLM understands the meaning of the content, not just its structure.

  • Use Laravel HTTP Global Middleware to Transparently Modify API Requests

    Laravel’s Http facade provides globalRequestMiddleware() and globalResponseMiddleware() methods that intercept ALL outgoing HTTP requests made through the facade. This is perfect for transparently modifying third-party API calls without changing application code.

    The Problem

    You’re using a third-party Laravel package that makes HTTP calls, but you need to:

    • Swap authentication methods (API key → OAuth)
    • Inject custom headers
    • Log all requests
    • Modify request bodies

    …without forking the package or wrapping every HTTP call.

    The Solution

    Register global middleware in AppServiceProvider::boot():

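    use GuzzleHttp\Psr7\Utils;
    use Illuminate\Support\Facades\Http;
    use Psr\Http\Message\RequestInterface;
    
    public function boot(): void
    {
        // The callback receives an immutable PSR-7 request: every with*()
        // call returns a new instance, so the result must be reassigned.
        Http::globalRequestMiddleware(function (RequestInterface $request) {
            // Only modify requests to specific API
            if (str_contains((string) $request->getUri(), 'api.example.com')) {
                // Swap authentication method
                $apiKey = $request->getHeaderLine('X-API-Key');
                if (str_starts_with($apiKey, 'key-oauth-')) {
                    // Remove API key header, add OAuth Bearer token instead
                    $request = $request
                        ->withoutHeader('X-API-Key')
                        ->withHeader('Authorization', 'Bearer ' . $apiKey);
                }
                
                // Inject custom headers
                $request = $request
                    ->withHeader('User-Agent', 'MyApp/1.0')
                    ->withHeader('X-Custom-Header', 'value');
                
                // Modify request body (for JSON requests)
                if (str_contains($request->getHeaderLine('Content-Type'), 'application/json')) {
                    $data = json_decode((string) $request->getBody(), true);
                    if (is_array($data)) {
                        $data['extra_param'] = 'injected_value';
                        $json = json_encode($data);
                        // Keep Content-Length in sync with the new body
                        $request = $request
                            ->withBody(Utils::streamFor($json))
                            ->withHeader('Content-Length', (string) strlen($json));
                    }
                }
            }
            
            return $request;
        });
    }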

    Response Middleware Too

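    use GuzzleHttp\Psr7\Utils;
    use Psr\Http\Message\ResponseInterface;
    
    Http::globalResponseMiddleware(function (ResponseInterface $response) {
        // The callback receives a PSR-7 response, so decode the body manually
        $payload = json_decode((string) $response->getBody(), true);
    
        // Transform response data globally: wrap legacy payloads in a "data" key
        if (is_array($payload) && ($payload['status'] ?? null) === 'legacy_format') {
            return $response->withBody(Utils::streamFor(json_encode(['data' => $payload])));
        }
    
        return $response;
    });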

    Why This Matters

    Enables transparent API modification without forking packages or wrapping every HTTP call. Perfect for:

    • Authentication adaptation: Add auth headers packages don’t support
    • Logging: Track all outgoing requests in one place
    • Header injection: Add tracking IDs, custom user agents
    • Rate limiting: Add delays globally

    This approach operates at the HTTP layer (not application layer), making it transparent to packages that use Http:: internally.
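
    One caution when deciding which requests to modify: a substring check on the URL, as used in this tip's examples, also matches hosts such as api.example.com.evil.test or a URL that merely mentions the domain in its query string. Since the middleware may attach credentials, an exact host comparison is safer. A minimal sketch in plain PHP, assuming a hypothetical targetsHost() helper:

    ```php
    <?php

    /**
     * Hypothetical helper: true only when the URL's host is exactly $host
     * (compared case-insensitively).
     */
    function targetsHost(string $url, string $host): bool
    {
        $actual = parse_url($url, PHP_URL_HOST);

        return is_string($actual) && strcasecmp($actual, $host) === 0;
    }

    var_dump(targetsHost('https://api.example.com/v1/users', 'api.example.com'));        // bool(true)
    var_dump(targetsHost('https://evil.test/?next=api.example.com', 'api.example.com')); // bool(false)
    var_dump(targetsHost('https://api.example.com.evil.test/', 'api.example.com'));      // bool(false)
    ```

    Inside the middleware, the URL string to pass in would be the request URI cast to a string.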

    Real-World Example

    Adapting a third-party SDK that only supports API key auth, but your API uses OAuth:

    use Psr\Http\Message\RequestInterface;
    
    Http::globalRequestMiddleware(function (RequestInterface $request) {
        if (str_contains((string) $request->getUri(), 'api.vendor.com')) {
            // Intercept their API key, swap for OAuth token
            $apiKey = $request->getHeaderLine('X-Vendor-API-Key');
            $oauthToken = $this->exchangeKeyForToken($apiKey);
            
            // PSR-7 requests are immutable, so reassign the modified copy
            $request = $request
                ->withoutHeader('X-Vendor-API-Key')
                ->withHeader('Authorization', "Bearer {$oauthToken}");
        }
        
        return $request;
    });

    Zero code changes to the vendor package. Zero maintenance burden.

  • Query WordPress from Laravel with Corcel — No HTTP Overhead

    Instead of calling the WordPress REST API from Laravel, which adds an HTTP round trip to every read, use the Corcel package (jgrossi/corcel) to query the WordPress tables directly via Eloquent. Both applications share the same MySQL database.

    Why This Matters

    Eliminates HTTP overhead between Laravel and WordPress. Direct database access is 10-50x faster than REST API calls. Enables real-time data access with zero sync lag.

    The Setup

    1. Install Corcel

    composer require jgrossi/corcel

    2. Add a separate database connection in config/database.php

    'connections' => [
        'mysql' => [
            'driver' => 'mysql',
            'host' => env('DB_HOST'),
            'database' => env('DB_DATABASE'),
            'username' => env('DB_USERNAME'),
            'password' => env('DB_PASSWORD'),
            'prefix' => '',  // Laravel tables have no prefix
        ],
        
        'wordpress' => [
            'driver' => 'mysql',
            'host' => env('DB_HOST'),
            'database' => env('DB_DATABASE'),  // Same database!
            'username' => env('DB_USERNAME'),
            'password' => env('DB_PASSWORD'),
            'prefix' => 'wp_',  // WordPress tables use wp_ prefix
        ],
    ],

    3. Use Corcel models

    use Corcel\Model\Post;
    use Corcel\Model\User;
    
    // Query WordPress posts
    $posts = Post::type('product')
        ->status('publish')
        ->taxonomy('category', 'featured')
        ->get();
    
    // Access custom fields stored as post meta (e.g. ACF values)
    $post = $posts->first();
    $price = $post->meta->price;
    $sku = $post->meta->sku;
    
    // Query users
    $wpUsers = User::all();

    Division of Labor

    • WordPress: Content management, ACF fields, admin UI
    • Laravel: API endpoints, business logic, scheduled tasks
    • Shared MySQL: Single source of truth, zero sync lag

    Advanced: Custom OAuth Guard with Corcel

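    use Corcel\Model\User;
    use Illuminate\Contracts\Auth\Guard;
    use Illuminate\Support\Facades\DB;
    
    // Direct Corcel query - no HTTP!
    // Sketch only: the other methods required by the Guard contract are omitted.
    class WpOAuthGuard implements Guard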
    {
        public function user()
        {
            $token = request()->bearerToken();
            
            // Direct database access via Corcel connection
            $tokenData = DB::connection('wordpress')
                ->table('oauth_tokens')
                ->where('access_token', $token)
                ->where('expires', '>', time())
                ->first();
                
            if (!$tokenData) return null;
            
            return User::find($tokenData->user_id);
        }
    }

    The payoff: WordPress handles content management (what it’s good at), Laravel handles API/logic (what it’s good at), and Corcel makes them work together at database speed.

  • Capture Final URLs After Redirects with Guzzle TransferStats

    Need to know the final URL after following redirects? Don’t parse the response—use Guzzle’s TransferStats callback to capture the effective URI automatically.

    The Use Case

    You’re processing shortened URLs (like bit.ly links) and need to extract the final slug or canonical URL after all redirects resolve.

    The Solution

    use GuzzleHttp\TransferStats;
    use Illuminate\Support\Facades\Http;
    
    $effectiveUrl = null;
    
    Http::withOptions([
        'on_stats' => function (TransferStats $stats) use (&$effectiveUrl) {
            $effectiveUrl = (string) $stats->getEffectiveUri();
        },
    ])->head($shortenedUrl);
    
    // $effectiveUrl is now 'https://example.com/articles/full-guide'
    $slug = basename(parse_url($effectiveUrl, PHP_URL_PATH));
    // $slug = 'full-guide'

    How It Works

    1. on_stats is a Guzzle option that registers a callback
    2. The callback runs after each transfer completes. Guzzle follows redirects itself, so it fires once per hop
    3. TransferStats contains metadata about that transfer, including the URI that was requested
    4. After the last hop, getEffectiveUri() holds the URL reached after all redirects

    Why HEAD Instead of GET?

    Using head() instead of get() makes this efficient: you only need the final destination, not the response body. Guzzle follows the redirects on the client side, and each server responds with headers only, without sending the content.

    Result: Fast, bandwidth-efficient redirect resolution.

    What Else Is in TransferStats?

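    $hops = 0;
    
    Http::withOptions([
        // Guzzle follows redirects itself, so this callback fires once per hop
        'on_stats' => function (TransferStats $stats) use (&$hops) {
            $hops++;
            dump([
                'effective_uri' => (string) $stats->getEffectiveUri(),
                'transfer_time' => $stats->getTransferTime(),  // seconds
                'status' => $stats->hasResponse() ? $stats->getResponse()->getStatusCode() : null,
            ]);
        },
    ])->head($url);
    
    // TransferStats has no redirect counter; $hops - 1 is the number of redirects followed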

    You can also access:

    • getHandlerStats() — low-level curl stats (DNS lookup time, connect time, etc.)
    • getRequest() — the original request object
    • getResponse() — the final response object (if available)

    Real-World Applications

    • Canonical URL discovery: Resolve short links to full URLs for storage
    • Slug extraction: Extract clean slugs from user-submitted URLs
    • Redirect chain analysis: Track how many hops it took to reach the final destination
    • Performance monitoring: Log transfer times and redirect counts for external APIs
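
    For the slug extraction case, basename() on the parsed path covers typical URLs. The following is a slightly more defensive sketch in plain PHP that also handles URLs with no path and percent-encoded segments; slugFromUrl() is a hypothetical helper name.

    ```php
    <?php

    /**
     * Hypothetical helper: returns the last non-empty path segment of a URL,
     * or null when the URL has no path segments.
     */
    function slugFromUrl(string $url): ?string
    {
        $path = parse_url($url, PHP_URL_PATH);

        if (!is_string($path)) {
            return null;
        }

        // Split on slashes and drop the empty pieces left by leading/trailing slashes
        $segments = array_values(array_filter(explode('/', $path), 'strlen'));

        return $segments === [] ? null : rawurldecode(end($segments));
    }

    var_dump(slugFromUrl('https://example.com/articles/full-guide'));         // string "full-guide"
    var_dump(slugFromUrl('https://example.com/articles/full-guide/?ref=x'));  // string "full-guide"
    var_dump(slugFromUrl('https://example.com'));                             // NULL
    ```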

    Bonus: Caching Redirects

    Since a short link's destination rarely changes, you can cache the effective URL:

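    $cacheKey = 'redirect_'.md5($shortenedUrl);
    
    return Cache::remember($cacheKey, now()->addDay(), function () use ($shortenedUrl) {
        $effectiveUrl = null;
        Http::withOptions([
            // Arrow functions capture variables by value, so use a closure
            // with a reference to write the result back to $effectiveUrl
            'on_stats' => function (TransferStats $stats) use (&$effectiveUrl) {
                $effectiveUrl = (string) $stats->getEffectiveUri();
            },
        ])->head($shortenedUrl);
        
        return $effectiveUrl;
    });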

    Now each shortened URL is resolved at most once a day, no matter how many requests ask for it.

  • Extract Structured Data with Laravel AI SDK and JsonSchema

    Need to extract structured data from messy HTML, PDFs, or unstructured text? Laravel’s AI SDK has a killer feature: structured output that lets you define exactly what data you want and get back a validated, type-safe array.

    The Problem

    You’re scraping product data from old supplier websites. Some pages have tables, others have random HTML layouts. You need to extract product names, prices, features, and locations consistently.

    The Solution: JsonSchema + AI SDK

    use function Laravel\Ai\agent;
    use Illuminate\Contracts\JsonSchema\JsonSchema;
    use Laravel\Ai\Files\Document;
    
    $html = file_get_contents($productUrl);
    $document = Document::fromString($html, 'text/plain');
    
    $response = agent(
        instructions: 'Extract product information from the HTML',
        schema: fn (JsonSchema $schema) => [
            'name' => $schema->string()->required(),
            'price' => $schema->number()->min(0)->required(),
            'features' => $schema->array()->items($schema->string()),
            'location' => $schema->object([
                'city' => $schema->string()->required(),
                'state' => $schema->string()->required(),
            ])->withoutAdditionalProperties(),
        ],
    )->prompt(
        'Extract all product details from this page.',
        attachments: [$document],
    );
    
    $data = $response->toArray();
    // ['name' => 'Pro Widget', 'price' => 29.99, ...]

    What’s Happening Here

    1. You define a JsonSchema describing your exact data structure
    2. The LLM reads the messy HTML and extracts the data
    3. The AI SDK validates the response against your schema
    4. You get back a clean, type-safe array matching your spec

    The schema acts as a contract. If the LLM tries to return invalid data (missing required fields, wrong types, extra keys), the SDK catches it.
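
    On the application side, the same contract can be carried forward by mapping the array into a typed object before the rest of the code touches it. This is a minimal sketch in plain PHP (8.1+ for readonly properties); the Product class is hypothetical and not part of the AI SDK.

    ```php
    <?php

    /**
     * Hypothetical value object for an extracted product. It re-checks the
     * rules the schema is expected to enforce before the data is used.
     */
    final class Product
    {
        public function __construct(
            public readonly string $name,
            public readonly float $price,
            public readonly array $features = [],
        ) {
            if ($name === '') {
                throw new InvalidArgumentException('name must not be empty');
            }
            if ($price < 0) {
                throw new InvalidArgumentException('price must not be negative');
            }
        }

        public static function fromArray(array $data): self
        {
            if (!is_string($data['name'] ?? null)) {
                throw new InvalidArgumentException('name must be a string');
            }
            if (!is_int($data['price'] ?? null) && !is_float($data['price'] ?? null)) {
                throw new InvalidArgumentException('price must be a number');
            }

            // Keep only string features; the key is optional in the schema
            $features = array_values(array_filter($data['features'] ?? [], 'is_string'));

            return new self($data['name'], (float) $data['price'], $features);
        }
    }

    $product = Product::fromArray([
        'name' => 'Pro Widget',
        'price' => 29.99,
        'features' => ['Waterproof'],
    ]);

    echo $product->name, ' costs ', $product->price, PHP_EOL; // Pro Widget costs 29.99
    ```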

    Schema Features

    You can define:

    • Types: string(), number(), integer(), boolean(), array(), object()
    • Constraints: min(), max(), enum(['Option1', 'Option2'])
    • Requirements: required(), nullable()
    • Nested structures: Objects within arrays, arrays within objects
    • Descriptions: description('...') to guide the LLM

    Real-World Use Cases

    • Data migration: Extract structured records from legacy system exports
    • Content extraction: Pull article metadata from blog posts
    • Invoice parsing: Extract line items, totals, dates from PDF invoices
    • Form filling: Auto-populate forms from uploaded documents

    Why This Beats Regex or DOM Parsing

    Traditional scraping breaks when layouts change. LLMs understand meaning, not structure. They can find “the product price” even if it moves from a table to a div to a JSON blob embedded in a script tag.

    The JsonSchema ensures you still get reliable, validated output—even when the source format is chaos.