
Best Practices

Key patterns and principles for building robust, production-ready AI applications with PHP LLM.


Caching

Why Caching Matters

Caching is essential for building efficient and cost-effective LLM applications:

  • Cost savings: Serve repeated identical requests from cache instead of paying for new API calls
  • Performance: Return cached responses instantly instead of waiting for API roundtrips
  • Reliability: Reduce dependency on external API availability and rate limits
  • Development efficiency: Speed up testing and development cycles with instant cached responses

When to Use Caching

Use caching when:

  • You have repeated identical requests (same model, same conversation, same parameters)
  • You're working in development/testing environments with repetitive queries
  • Cost optimization is a priority for your application
  • You have predictable query patterns that are likely to repeat

Basic Implementation

<?php
use Soukicz\Llm\Cache\FileCache;
use Soukicz\Llm\Client\Anthropic\AnthropicClient;

// Enable caching for identical requests
$cache = new FileCache(sys_get_temp_dir());
$client = new AnthropicClient('sk-xxxxx', $cache);

// Identical requests will automatically use cache, saving API calls
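
A request is treated as identical only when everything that affects the output matches: the model, the full conversation, and the request parameters. As a rough illustration of what that implies (this is not the library's actual key derivation), a cache key can be built by hashing those inputs together:

<?php
// Illustration only: a deterministic cache key derived from every input
// that affects the response. The library's real implementation may differ.
function deriveCacheKey(string $model, array $messages, array $params): string
{
    return hash('sha256', json_encode([$model, $messages, $params]));
}

$keyA = deriveCacheKey('claude-sonnet', [['role' => 'user', 'content' => 'Hi']], ['temperature' => 0]);
$keyB = deriveCacheKey('claude-sonnet', [['role' => 'user', 'content' => 'Hi']], ['temperature' => 1]);
// $keyA !== $keyB: changing any parameter produces a cache miss.

The practical consequence: even a small prompt change, such as an embedded timestamp or extra whitespace, defeats the cache.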

Learn More

For detailed information on cache implementations, custom cache backends, cache warming strategies, and monitoring, see the comprehensive Caching Guide.


Feedback Loops

Why Feedback Loops Matter

Feedback loops enable self-correcting AI agents that can validate and improve their own outputs:

  • Validate outputs: Automatically check responses against quality criteria before accepting them
  • Self-improve: Request corrections from the LLM without manual intervention
  • Meet requirements: Ensure responses match your exact specifications (format, completeness, accuracy)
  • Build reliability: Create consistent, validated outputs essential for production systems

When to Use Feedback Loops

Use feedback loops when:

  • Output format validation is critical (JSON, XML, specific schemas)
  • Content must meet specific criteria (length, completeness, accuracy requirements)
  • You need guaranteed compliance with business rules
  • Building agentic systems that must produce reliable, consistent results

Key Principle: Loop Counter

Always implement a loop counter to prevent infinite loops. This is a critical safeguard that prevents runaway costs and ensures your application remains responsive even when the LLM struggles to meet validation criteria.

<?php
$maxIterations = 5;
$iteration = 0;

$response = $chainClient->run(
    client: $client,
    request: new LLMRequest(
        model: $model,
        conversation: $conversation
    ),
    feedbackCallback: function (LLMResponse $response) use (&$iteration, $maxIterations): ?LLMMessage {
        $iteration++;

        // CRITICAL: Stop after max attempts to prevent infinite loops
        if ($iteration >= $maxIterations) {
            return null; // Stop iteration
        }

        // Your validation logic here
        $text = $response->getLastText();
        if (!isValidJson($text)) {
            return LLMMessage::createFromUserString(
                'The response was not valid JSON. Please provide a valid JSON response.'
            );
        }

        return null; // Validation passed
    }
);
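
The isValidJson() helper above is not part of the library; a minimal implementation could be:

<?php
// Minimal JSON validation helper assumed by the example above.
// On PHP 8.3+ the built-in json_validate() can be used instead.
function isValidJson(string $text): bool
{
    json_decode($text);

    return json_last_error() === JSON_ERROR_NONE;
}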

Without a loop counter, a feedback loop can continue indefinitely if the LLM cannot satisfy the validation criteria, leading to excessive API costs and application hangs.

Learn More

For complete examples of validation patterns, nested LLM validation, progressive feedback strategies, and combining feedback loops with tools, see the Feedback Loops Guide.


Async Operations for Parallel Tool Calls

Why Async Operations Matter

Async operations are crucial for performance and efficiency in LLM applications:

  • Performance: Process multiple requests concurrently instead of sequentially
  • Efficiency: Reduce total execution time when handling multiple independent operations
  • Scalability: Handle higher throughput with the same resources
  • Tool calls: Execute multiple independent tool calls in parallel, dramatically speeding up agentic workflows

When to Use Async Operations

Use async operations when:

  • You have multiple independent LLM requests to process
  • Tool calls can be executed in parallel (no dependencies between them)
  • Processing large batches of items
  • Building real-time applications that need low latency

Parallel Tool Call Pattern

The most important use case for async operations is parallel tool execution. When an LLM agent needs to call multiple tools that don't depend on each other's results, async operations allow them to execute simultaneously rather than waiting for each to complete sequentially.

For example, if an agent needs to fetch data from three different sources, running them in parallel can reduce execution time from 9 seconds (3 × 3 seconds) to just 3 seconds.

<?php
use GuzzleHttp\Promise\Utils;

// Process multiple requests concurrently
$promises = [];

foreach ($items as $item) {
    $promises[] = $chainClient->runAsync(
        client: $client,
        request: new LLMRequest(
            model: $model,
            conversation: new LLMConversation([
                LLMMessage::createFromUserString("Analyze: {$item}")
            ])
        )
    );
}

// Wait for all to complete
$responses = Utils::all($promises)->wait();

// Process results
foreach ($responses as $response) {
    echo $response->getLastText() . "\n";
}
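
The same promise-based approach applies to the tools themselves. As a sketch (the tool functions below are hypothetical stand-ins, not part of the library), independent tool calls can return promises that start immediately and are awaited together:

<?php
use GuzzleHttp\Promise\Create;
use GuzzleHttp\Promise\Utils;

// Hypothetical async tools. In a real agent these would wrap non-blocking
// HTTP calls; here they are stubs that resolve immediately.
$fetchWeather = fn (string $city) => Create::promiseFor("Sunny in {$city}");
$fetchNews = fn (string $topic) => Create::promiseFor("Headlines about {$topic}");

// Both calls are in flight at once; the total wait equals the slowest
// call, not the sum of all calls.
$results = Utils::all([
    'weather' => $fetchWeather('Prague'),
    'news'    => $fetchNews('AI'),
])->wait();

echo $results['weather'] . "\n";
echo $results['news'] . "\n";

With real HTTP-backed tools, this structure is what turns three sequential 3-second calls into a single 3-second wait.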

Learn More

For advanced async patterns, batch processing strategies, handling async tool execution, and error handling in concurrent operations, see the Async Operations Guide.

