Batch Processing¶
Process high volumes of LLM requests efficiently using batch operations. Batch processing is ideal for offline workloads where immediate responses aren't required.
Overview¶
Batch processing allows you to:

- Submit multiple requests at once
- Process them asynchronously
- Retrieve results later
- Save costs (often 50% cheaper than real-time)
- Handle large-scale operations
Note: Batch processing support varies by provider. Check provider-specific documentation.
LLMBatchClient Interface¶
Clients implementing batch operations use the LLMBatchClient interface:
<?php
use Soukicz\Llm\Client\LLMBatchClient;
interface LLMBatchClient {
    /** Submit a batch of LLMRequest objects; returns the provider's batch ID. */
    public function createBatch(array $requests): string;

    /** Returns null while the batch is still processing, or an array with status and results when complete. */
    public function retrieveBatch(string $batchId): ?array;

    /** Returns the provider code identifying this client. */
    public function getCode(): string;
}
Basic Usage¶
Submit Batch¶
<?php
use Soukicz\Llm\Client\OpenAI\OpenAIClient;
use Soukicz\Llm\Client\OpenAI\Model\GPT5;
use Soukicz\Llm\LLMConversation;
use Soukicz\Llm\LLMRequest;
use Soukicz\Llm\Message\LLMMessage;
/** @var LLMBatchClient $client */
$client = new OpenAIClient('sk-xxxxx', 'org-xxxxx');
// Prepare multiple requests
$requests = [];
for ($i = 0; $i < 1000; $i++) {
    $requests[] = new LLMRequest(
        model: new GPT5(GPT5::VERSION_2025_08_07),
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Summarize document $i")
        ])
    );
}
// Submit batch
$batchId = $client->createBatch($requests);
echo "Batch created: $batchId\n";
Retrieve Batch¶
<?php
// Retrieve batch information (returns null if not ready, array with status and results when complete)
$batch = $client->retrieveBatch($batchId);
if ($batch !== null) {
    // Batch information available
    // Check provider-specific documentation for exact response format
    var_dump($batch);
}
Complete Example¶
<?php
use Soukicz\Llm\Client\OpenAI\OpenAIClient;
use Soukicz\Llm\Client\OpenAI\Model\GPT5;
use Soukicz\Llm\LLMConversation;
use Soukicz\Llm\LLMRequest;
use Soukicz\Llm\Message\LLMMessage;
$client = new OpenAIClient('sk-xxxxx', 'org-xxxxx');
// Prepare batch of classification tasks
$texts = [
    'This product is amazing!',
    'Terrible service, would not recommend.',
    'It\'s okay, nothing special.',
    // ... 1000s more
];

$requests = array_map(
    fn($text) => new LLMRequest(
        model: new GPT5(GPT5::VERSION_2025_08_07),
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Classify sentiment (positive/negative/neutral): $text")
        ])
    ),
    $texts
);
// Submit batch
$batchId = $client->createBatch($requests);
// Poll until complete
do {
    sleep(60); // Wait 1 minute between polls
    $batch = $client->retrieveBatch($batchId);
    if ($batch !== null) {
        // Check provider-specific response format for status
        echo "Batch retrieved\n";
        break;
    }
} while (true);
// Process batch results
// Note: Exact format depends on provider implementation
var_dump($batch);
Async Polling¶
Use async operations for efficient polling:
<?php
use React\EventLoop\Loop;
$batchId = $client->createBatch($requests);
// Check every 60 seconds; keep the timer handle so it can be cancelled
$timer = Loop::addPeriodicTimer(60, function () use ($client, $batchId, &$timer) {
    $batch = $client->retrieveBatch($batchId);
    if ($batch !== null) {
        // Batch is available, process results
        processResults($batch);
        Loop::cancelTimer($timer);
    }
});
Loop::run();
Use Cases¶
Data Processing¶
Process large datasets:
<?php
// Classify 100k customer reviews
$reviews = loadReviews(); // 100,000 reviews
$batches = array_chunk($reviews, 1000); // Batch size of 1000

$batchIds = [];
foreach ($batches as $batchReviews) {
    $requests = array_map(
        fn($review) => createClassificationRequest($review),
        $batchReviews
    );
    $batchIds[] = $client->createBatch($requests);
}

// Wait for all batches to complete (helper sketched below)
waitForBatches($batchIds, $client);
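The createClassificationRequest() and waitForBatches() helpers above are not part of the library. A minimal sketch of both, under the assumption that retrieveBatch() returns null until a batch completes (names and signatures are illustrative; the client is passed in explicitly):
<?php
use Soukicz\Llm\Client\LLMBatchClient;
use Soukicz\Llm\Client\OpenAI\Model\GPT5;
use Soukicz\Llm\LLMConversation;
use Soukicz\Llm\LLMRequest;
use Soukicz\Llm\Message\LLMMessage;

// Hypothetical helper: wraps a single review in a classification request
function createClassificationRequest(string $review): LLMRequest {
    return new LLMRequest(
        model: new GPT5(GPT5::VERSION_2025_08_07),
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Classify sentiment (positive/negative/neutral): $review")
        ])
    );
}

// Hypothetical helper: polls every batch ID until all have completed
function waitForBatches(array $batchIds, LLMBatchClient $client): array {
    $results = [];
    while ($batchIds !== []) {
        sleep(60); // Stay within the recommended 60-300 second polling interval
        foreach ($batchIds as $key => $batchId) {
            $batch = $client->retrieveBatch($batchId);
            if ($batch !== null) {
                $results[$batchId] = $batch;
                unset($batchIds[$key]);
            }
        }
    }
    return $results;
}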
Content Generation¶
Generate content at scale:
<?php
// Generate product descriptions for 10k products
$products = loadProducts();
$model = new GPT5(GPT5::VERSION_2025_08_07); // any batch-capable model

$requests = array_map(
    fn($product) => new LLMRequest(
        model: $model,
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Write a compelling product description for: {$product->name}")
        ])
    ),
    $products
);

$batchId = $client->createBatch($requests);
Translation¶
Batch translate documents:
<?php
// Translate 1000 documents to 5 languages
$documents = loadDocuments();
$languages = ['es', 'fr', 'de', 'it', 'pt'];

$requests = [];
foreach ($documents as $doc) {
    foreach ($languages as $lang) {
        $requests[] = new LLMRequest(
            model: $model,
            conversation: new LLMConversation([
                LLMMessage::createFromUserString("Translate to $lang: {$doc->content}")
            ])
        );
    }
}

$batchId = $client->createBatch($requests);
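When a single batch fans out across two dimensions (document × language), keep a map from request position to its inputs so each result can be attributed after retrieval. A minimal sketch, assuming results can be correlated with requests by submission order (verify this against your provider's batch format):
<?php
// Record which (document, language) pair each request position corresponds to.
// Correlating by position is an assumption; some providers use custom IDs.
$requestMeta = [];
foreach ($documents as $docIndex => $doc) {
    foreach ($languages as $lang) {
        $requestMeta[] = ['document' => $docIndex, 'language' => $lang];
    }
}

// After retrieval, position $i in the results pairs with $requestMeta[$i].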
Best Practices¶
- Batch sizing - Keep batches at 1000-10000 requests for optimal processing
- Polling interval - Poll every 60-300 seconds, not more frequently
- Error handling - Handle failed batches gracefully
- Cost monitoring - Track batch costs across operations
- Result storage - Save results immediately after retrieval
- Timeout handling - Set reasonable timeouts for batch completion (see the sketch after this list)
- Rate limits - Respect provider rate limits on batch creation
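A minimal sketch combining three of these practices (polling interval, timeout handling, and immediate result storage); the 24-hour deadline and the storage path are illustrative assumptions:
<?php
$deadline = time() + 24 * 3600; // Illustrative timeout for batch completion

while (($batch = $client->retrieveBatch($batchId)) === null) {
    if (time() >= $deadline) {
        throw new RuntimeException("Batch $batchId did not complete in time");
    }
    sleep(120); // Poll every 2 minutes, within the 60-300 second guideline
}

// Save results immediately; some providers expire them after 24-48 hours
file_put_contents("batch-$batchId.json", json_encode($batch));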
Error Handling¶
<?php
try {
    $batchId = $client->createBatch($requests);
    $batchIds = [$batchId];
} catch (BatchCreationException $e) {
    // Handle batch creation error
    echo "Failed to create batch: " . $e->getMessage();

    // Retry with smaller batches, collecting every ID instead of overwriting
    $batchIds = [];
    foreach (array_chunk($requests, 500) as $chunk) {
        $batchIds[] = $client->createBatch($chunk);
    }
}

// Retrieve results for each submitted batch
foreach ($batchIds as $batchId) {
    $batch = $client->retrieveBatch($batchId);
    if ($batch !== null) {
        // Process batch results according to provider-specific format
        // Check provider documentation for exact structure
        processResults($batch);
    }
}
Cost Comparison¶
Batch processing typically offers 50% cost savings:
<?php
// Illustrative pricing: real-time at $0.01 per request × 10,000 = $100
$realTimeCost = 10000 * 0.01;

// Batch at half price: $0.005 per request × 10,000 = $50
$batchCost = 10000 * 0.005;

echo "Savings: $" . ($realTimeCost - $batchCost); // $50
Provider Support¶
- ✅ OpenAI - Full batch API support
- ⚠️ Anthropic - Check current API documentation
- ⚠️ Google Gemini - Check current API documentation
- ❌ OpenAI-compatible - Not guaranteed; support varies by provider
Limitations¶
- Latency - Results may take minutes to hours
- No streaming - Batch responses don't support streaming
- No cancellation - Some providers don't allow batch cancellation
- Result expiration - Results may expire after 24-48 hours
- Size limits - Maximum batch size varies by provider
See Also¶
- Configuration Guide - Request configuration
- Provider Documentation - Provider-specific batch features