
As far as I can see, the only rate-limit handling currently implemented is a fallback to another platform for high availability (HA).
In my use case I need to stay on a single platform and, when a 429 is hit, have the request retried instead of switching to another platform.

Is this currently possible, or am I missing something?

Use case:
I'm using Voyage AI on Tier 1 (rate limits: https://docs.voyageai.com/docs/rate-limits).
With a chunkSize of 1, the first document is successful and afterwards I get a 429; with a chunkSize of 500, I get a 429 on the first request.
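
A possible stopgap, independent of any retry support in the component, is to send smaller batches and pause between them so the per-minute limits are less likely to be exceeded. The following is only a minimal sketch: `processInPacedBatches` and its parameters are hypothetical names, not part of any library, and suitable values for the batch size and pause depend on the actual Tier 1 limits, which are not assumed here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: hand $items to $handler in batches of $batchSize,
 * pausing $pauseMs milliseconds between consecutive batches.
 *
 * @param callable(array): void $handler receives one batch at a time
 * @param callable(int): void   $sleep   receives the pause in milliseconds
 *
 * @return int number of items handed to $handler
 */
function processInPacedBatches(
    iterable $items,
    int $batchSize,
    callable $handler,
    int $pauseMs,
    ?callable $sleep = null,
): int {
    if ($batchSize < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    // Default to a real sleep; a custom callable makes the helper testable.
    $sleep ??= static function (int $ms): void {
        usleep($ms * 1000);
    };

    $batch = [];
    $batchesSent = 0;
    $processed = 0;

    $send = static function (array $batch) use ($handler, $sleep, $pauseMs, &$batchesSent, &$processed): void {
        // No pause before the very first batch.
        if ($batchesSent > 0) {
            $sleep($pauseMs);
        }
        $handler($batch);
        ++$batchesSent;
        $processed += \count($batch);
    };

    foreach ($items as $item) {
        $batch[] = $item;
        if (\count($batch) === $batchSize) {
            $send($batch);
            $batch = [];
        }
    }

    // Send whatever is left over as a final, smaller batch.
    if ([] !== $batch) {
        $send($batch);
    }

    return $processed;
}
```

Pacing alone does not guarantee that a 429 never occurs, so it complements a retry rather than replacing it.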

    vectorizer:
        voyage:
            platform: 'ai.platform.voyage'
            model: 'voyage-multimodal-3.5'
    indexer:
        tcgCards:
            loader: 'App\Service\TcgCardLoader'
            vectorizer: 'ai.vectorizer.voyage'
            store: 'App\Store\DoctrineODMStore'

VectorizeCommand:

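use Doctrine\ODM\MongoDB\DocumentManager;
use Doctrine\ODM\MongoDB\Query\Builder;
use Psr\Log\LogLevel;
use Symfony\AI\Store\Document\VectorizerInterface;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Logger\ConsoleLogger;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\DependencyInjection\Attribute\Autowire;
// TcgCard is the application's own document class; its import is omitted here.

final class VectorizeCommand extends Command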
{
    public function __construct(
        private readonly DocumentManager $dm,
        #[Autowire(service: 'ai.vectorizer.voyage')]
        private readonly VectorizerInterface $vectorizer,
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $verbosityLevelMap = [
            LogLevel::NOTICE => OutputInterface::VERBOSITY_NORMAL,
            LogLevel::INFO   => OutputInterface::VERBOSITY_NORMAL,
        ];

        $qb = $this->dm->createQueryBuilder(TcgCard::class);

        $logger = new ConsoleLogger($output, $verbosityLevelMap);

        $tcgCards = $qb->field('embeddingVector')->equals(null)->getQuery()->execute();

        $chunkSize = 500;
        $counter = 0;
        $chunk = [];

        $logger->info('Starting vectorizer: ' . count($tcgCards), ['total_documents' => count($tcgCards)]);

        foreach ($tcgCards as $document) {
            $chunk[] = $document;
            ++$counter;

            if ($chunkSize === \count($chunk)) {
                $logger->info("Processing chunk, at {$counter}", ['processed_documents' => $counter]);
                $this->vectorizeChunk($chunk, $qb);

                $chunk = [];
            }
        }

        if ([] !== $chunk) {
            $this->vectorizeChunk($chunk, $qb);
        }

        $logger->info('Document processing completed', ['total_documents' => $counter]);
        $this->dm->flush();

        return Command::SUCCESS;
    }

    private function vectorizeChunk(array $chunk, Builder $qb): void
    {
        $vectorDocuments = $this->vectorizer->vectorize(
            $chunk,
            ['dimensions' => 512]
        );

        foreach ($vectorDocuments as $document) {
            $qb->findAndUpdate()
                ->field('id')->equals($document->id)
                ->field('embeddingVector')->set($document->vector->getData())
                ->getQuery()
                ->execute();
        }
    }
}
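
For illustration, the behaviour asked about (retrying on the same platform after a 429 instead of falling back) could be approximated in userland by wrapping the `vectorize()` call. This is a sketch under assumptions, not an existing feature of the component: `retryOnRateLimit` is a hypothetical helper, and how a 429 surfaces as an exception depends on the platform bridge, so the detection is passed in as a predicate rather than hard-coded.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: run $operation and retry it with exponential backoff
 * for as long as $isRateLimited reports that the thrown exception is a 429.
 *
 * @param callable(): mixed          $operation
 * @param callable(\Throwable): bool $isRateLimited
 * @param callable(int): void        $sleep receives the delay in milliseconds
 */
function retryOnRateLimit(
    callable $operation,
    callable $isRateLimited,
    int $maxAttempts = 5,
    int $baseDelayMs = 1000,
    ?callable $sleep = null,
): mixed {
    // Default to a real sleep; a custom callable makes the helper testable.
    $sleep ??= static function (int $ms): void {
        usleep($ms * 1000);
    };

    for ($attempt = 1; ; ++$attempt) {
        try {
            return $operation();
        } catch (\Throwable $e) {
            // Rethrow anything that is not a rate limit, or when attempts run out.
            if (!$isRateLimited($e) || $attempt >= $maxAttempts) {
                throw $e;
            }

            // Exponential backoff: base delay, then 2x, 4x, ...
            $sleep($baseDelayMs * 2 ** ($attempt - 1));
        }
    }
}
```

Inside `vectorizeChunk()` this could be used as `retryOnRateLimit(fn () => $this->vectorizer->vectorize($chunk, ['dimensions' => 512]), $isRateLimited)`, where `$isRateLimited` is a placeholder for whatever check matches the exception actually thrown on a 429 (for example, inspecting the exception class or its code).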