This page provides recommendations and best practices for preparing your Pinecone indexes for production, anticipating production issues, and enabling reliability and growth.
One of the first steps towards building a production-ready Pinecone index is configuring your project correctly.
Consider creating separate projects for your development and production indexes, so that you can test changes to an index before deploying them to production.
Ensure that you have properly configured user access within your production environment, so that only those users who need to access the production index can do so.
Consider how best to manage the API key(s) associated with your production project.
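One common way to manage a key is to keep it out of source code and read it from the environment at startup. The following is a minimal sketch of that approach, not a prescribed method; the function name `load_api_key` and the environment variable name `PINECONE_API_KEY` are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var: str = "PINECONE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError rather than falling back to a hard-coded key, so a
    misconfigured deployment fails at startup instead of at query time.
    The variable name is an assumption; use whatever your deployment defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

Using a different variable (or a different secret store) for the development and production projects keeps a development key from reaching the production index by accident.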
Before you move your index to production, make sure that your index is returning accurate results in the context of your application by identifying the appropriate metrics for evaluating your results.
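Which metric is appropriate depends on your application. As one illustration, recall@k measures how many of the known-relevant items appear in the top k results. The sketch below assumes you have a labeled set of relevant IDs for each test query; `recall_at_k` is a hypothetical helper, not part of any Pinecone client.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant IDs that appear in the top-k retrieved IDs."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


# IDs returned by a query, best match first, and the IDs a reviewer marked relevant.
retrieved = ["a", "b", "c", "d"]
relevant = ["a", "d", "e"]

print(recall_at_k(retrieved, relevant, k=3))  # only "a" is in the top 3: 1 of 3 relevant
print(recall_at_k(retrieved, relevant, k=4))  # "a" and "d" are in the top 4: 2 of 3 relevant
```

Averaging this value over a representative set of queries gives a single number you can track as you change the index configuration.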
Before moving your project to production, determine whether your index configuration can serve the query load you anticipate from your application. You can write load tests in Python from scratch or with a load testing framework such as Locust.
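A from-scratch load test can be as small as a thread pool that times each call. The sketch below uses only the standard library; `run_load_test` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that sleeps instead of querying a real index. Replace it with a function that issues one query through your client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(query_fn, num_queries, concurrency):
    """Call query_fn num_queries times across a thread pool and summarize latency."""
    if num_queries < 1 or concurrency < 1:
        raise ValueError("num_queries and concurrency must be at least 1")

    def timed_call(_):
        start = time.perf_counter()
        query_fn()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(num_queries)))
    wall_time = time.perf_counter() - wall_start

    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "queries": len(latencies),
        "qps": len(latencies) / wall_time,
        "p50_s": latencies[len(latencies) // 2],
        "p95_s": latencies[p95_index],
    }


def fake_query():
    # Stand-in for a real index query (assumption): sleeps for about 5 ms.
    time.sleep(0.005)


if __name__ == "__main__":
    print(run_load_test(fake_query, num_queries=200, concurrency=8))
```

Raising `concurrency` until the measured queries per second stops increasing, or until tail latency becomes unacceptable, gives a rough picture of what the current configuration can serve.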
To enable long-term retention, compliance archiving, and deployment of new indexes, consider backing up your production indexes by creating collections.
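A backup routine might wrap collection creation in a small helper that generates a timestamped name. The sketch below is an illustration under stated assumptions: `backup_index` is a hypothetical helper, and the `client` argument is assumed to expose a `create_collection(name=..., source=...)` method, as the Pinecone Python client does for pod-based indexes. Check the signature against the client version you use.

```python
from datetime import datetime, timezone


def backup_index(client, index_name, now=None):
    """Create a timestamped collection from index_name and return its name.

    `client` is assumed to provide create_collection(name=..., source=...).
    `now` can be passed in to make the generated name deterministic in tests.
    """
    now = now or datetime.now(timezone.utc)
    collection_name = f"{index_name}-backup-{now:%Y%m%d-%H%M%S}"
    client.create_collection(name=collection_name, source=index_name)
    return collection_name
```

Taking the client as a parameter keeps the helper free of credentials and lets you exercise it with a stub before pointing it at production.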
This guidance applies to pod-based indexes only. With serverless indexes, you don’t configure any compute or storage resources, and you don’t manually manage those resources to meet demand, save on cost, or ensure high availability. Instead, serverless indexes scale automatically based on usage.
Depending on your data and the types of workloads you intend to run, your pod-based index may require a different number and size of pods and replicas. Factors to consider include the number of vectors, the dimensions per vector, the amount and cardinality of metadata, and the acceptable queries per second (QPS). Use the index fullness metric to identify how much of your current resources your indexes are using. You can use collections to create indexes with different pod types and sizes to experiment.
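As a rough starting point for sizing, you can estimate the raw size of your vectors and compare a fullness reading against a threshold. The sketch below makes several assumptions: each dimension is stored as a 4-byte float, metadata and index overhead are ignored, index fullness is reported as a fraction between 0 and 1, and the 0.8 threshold is an arbitrary example rather than a recommended value. Both function names are hypothetical.

```python
def raw_vector_bytes(num_vectors, dimension, bytes_per_value=4):
    """Raw size of the vector values alone, excluding metadata and index overhead."""
    return num_vectors * dimension * bytes_per_value


def needs_more_capacity(index_fullness, threshold=0.8):
    """True when a fullness reading (0.0 to 1.0) has reached the chosen threshold."""
    return index_fullness >= threshold


# One million 768-dimensional vectors: 1,000,000 * 768 * 4 bytes.
size_gb = raw_vector_bytes(1_000_000, 768) / 1e9
print(f"{size_gb:.3f} GB of raw vector data")  # 3.072 GB
print(needs_more_capacity(0.85))  # True
```

The estimate is a lower bound only; actual pod requirements also depend on metadata size and cardinality, so confirm any sizing decision against the index fullness metric on a test index.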
If you need help, visit support.pinecone.io, or talk to the Pinecone community. Ensure that your plan tier matches the support and availability SLAs you need. This may require you to upgrade to Enterprise.