Moving to production

This page provides recommendations and best practices for preparing your Pinecone indexes for production, anticipating production issues, and enabling reliability and growth.

For high-scale use cases, consider using the Pinecone AWS Reference Architecture as a starting point.

Prepare your project structure

One of the first steps towards a production-ready Pinecone index is configuring your project correctly. Consider creating separate projects for your development and production indexes so that you can test changes to an index before deploying them to production. Configure user access to your production environment so that only the users who need the production index can reach it. Consider how best to manage the API key associated with your production project.
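One common way to handle the production API key is to keep it out of source code and read it from the environment at startup. The following is a minimal sketch, not a Pinecone-provided helper: the function name load_api_key and the variable name PINECONE_API_KEY are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var="PINECONE_API_KEY", environ=None):
    """Return the API key stored in an environment variable.

    `env_var` is an assumed variable name; use whatever name your
    deployment or secrets manager actually injects.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        # Fail fast so a misconfigured deployment never runs without a key.
        raise RuntimeError(f"{env_var} is not set; refusing to start without an API key")
    return key
```

Using a different variable value in each project keeps development code from touching the production index by accident.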

Test your query results

Before you move your index to production, make sure that your index is returning accurate results in the context of your application. Consider identifying the appropriate metrics for evaluating your results.
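As one example of such a metric, recall@k measures how many of the results you know to be relevant actually appear in the top k results returned for a query. The sketch below assumes you have a small labeled set of queries with known relevant IDs; the function names are illustrative and are not part of any Pinecone client.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant IDs that appear in the top-k retrieved IDs."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(results, k):
    """Average recall@k over a list of (retrieved_ids, relevant_ids) pairs."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in results]
    return sum(scores) / len(scores)
```

Here retrieved_ids would be the IDs returned by a query, in ranked order, and relevant_ids the IDs you expect for that query. Other metrics may suit your application better; recall@k is only one option.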

Estimate the appropriate number and size of pods and replicas

Depending on your data and the types of workloads you intend to run, your project may require a different number and size of pods and replicas. Factors to consider include the number of vectors, the dimensions per vector, the amount and cardinality of metadata, and the queries per second (QPS) you need to sustain. Use the index fullness metric to identify how much of your current resources your indexes are using. To experiment, you can use collections to create indexes with different pod types and sizes.
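Before experimenting, a back-of-the-envelope calculation can give a rough lower bound on the data you need to hold. The sketch below assumes each vector component is stored as a 4-byte float and ignores index overhead. The usable_bytes_per_pod argument is a hypothetical figure that you would need to measure yourself, for example by loading a sample and watching index fullness; it is not a published Pinecone number.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vector components are 4-byte floats


def estimate_raw_vector_bytes(num_vectors, dimensions, metadata_bytes_per_vector=0):
    """Raw size of the vectors plus an assumed average metadata size per vector."""
    return num_vectors * (dimensions * BYTES_PER_FLOAT32 + metadata_bytes_per_vector)


def estimate_pods_needed(raw_bytes, usable_bytes_per_pod):
    """Round up to whole pods; usable_bytes_per_pod is a measured, hypothetical input."""
    if usable_bytes_per_pod <= 0:
        raise ValueError("usable_bytes_per_pod must be positive")
    return -(-raw_bytes // usable_bytes_per_pod)  # integer ceiling division
```

Treat the result as a starting point for experiments, not as a sizing guarantee, because metadata cardinality and pod type also affect capacity.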

Load test your indexes

Before moving your project to production, determine whether your index configuration can serve the query load you anticipate from your application. You can write load tests in Python from scratch or use a load testing framework such as Locust.
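A from-scratch load test can be as small as a thread pool that issues queries concurrently and records latencies. The following is a minimal sketch using only the Python standard library. The names run_load_test and fake_query are hypothetical, and fake_query is a stand-in that only sleeps; in a real test you would replace it with a function that sends a query to your index.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(query_fn, num_requests, concurrency):
    """Call query_fn num_requests times across `concurrency` threads and report latencies."""

    def timed_call(_):
        start = time.perf_counter()
        query_fn()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(num_requests)))
    wall_elapsed = time.perf_counter() - wall_start

    def percentile(p):
        # Nearest-rank percentile over the sorted latencies.
        rank = max(1, math.ceil(p / 100 * len(latencies)))
        return latencies[rank - 1]

    return {
        "requests": len(latencies),
        "qps": num_requests / wall_elapsed,
        "p50_s": percentile(50),
        "p95_s": percentile(95),
        "max_s": latencies[-1],
    }


def fake_query():
    """Placeholder for a real query call; sleeps to simulate network latency."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(run_load_test(fake_query, num_requests=200, concurrency=10))
```

Run the test against a non-production index that has the same configuration as production, and compare the reported QPS and tail latency with what your application needs.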

Back up your indexes

To enable long-term retention, compliance archiving, and deployment of new indexes, consider backing up your production indexes by creating collections.

Tune for performance

Before serving production workloads, identify ways to improve latency by making changes to your deployment, project configuration, or client.

Configure monitoring

Prepare to monitor the production performance and availability of your indexes.

Plan for scaling

Before going to production, plan how you will scale your indexes when the need arises. Identify metrics that may indicate the need to scale, such as index fullness and average request latency. Plan for increasing the number of pods, changing to a more performant pod type, vertically scaling the size of your pods, increasing the number of replicas, or increasing storage capacity with a storage-optimized pod type.
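One way to make such a plan concrete is to write the thresholds down as code that your monitoring can evaluate. The sketch below is illustrative only: the function name and both default thresholds are assumptions rather than Pinecone recommendations, and index fullness is assumed to be reported as a fraction between 0 and 1.

```python
def scaling_signals(index_fullness, avg_latency_ms,
                    fullness_threshold=0.8, latency_threshold_ms=200.0):
    """Return the names of the metrics that have crossed their thresholds.

    Both thresholds are placeholder values; choose your own from load tests.
    """
    signals = []
    if index_fullness >= fullness_threshold:
        signals.append("index_fullness")
    if avg_latency_ms >= latency_threshold_ms:
        signals.append("avg_latency")
    return signals
```

An empty list means no threshold has been crossed. A non-empty list is a prompt to review the scaling options listed above, not an automatic decision.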

Know how to get support

If you need help, visit support.pinecone.io, or talk to the Pinecone community. Ensure that your plan tier matches the support and availability SLAs you need. This may require you to upgrade to Enterprise.
