[eland](https://github.com/elastic/eland)
Elasticsearch client. We’ll need to clone the eland repository and build its Docker image before running it:
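Following the eland README, the clone-and-build step looks roughly like this:

```shell
# Clone the eland repository and build its Docker image (per the eland README)
git clone https://github.com/elastic/eland.git
cd eland
docker build -t elastic/eland .
```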
[sentence-transformers/msmarco-MiniLM-L-12-v3](https://huggingface.co/sentence-transformers/msmarco-MiniLM-L-12-v3)
model from Hugging Face, though you could use any model you’d like. To upload the model to your Elasticsearch deployment, run the following command:
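Run through the eland Docker image, the upload command takes roughly this shape; the URL, credentials, and port are placeholders for your own deployment:

```shell
# Upload the Hugging Face model into Elasticsearch's ML node via eland
docker run -it --rm --network host elastic/eland \
    eland_import_hub_model \
      --url https://<user>:<password>@<elasticsearch-host>:9243 \
      --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 \
      --task-type text_embedding \
      --start
```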
`text_embeddings` under the field `predicted_value`.
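As a quick sanity check, you could pull back a single document and inspect that field with the official Python client. The host, credentials, and client usage below are illustrative assumptions, not part of the original walkthrough:

```python
from elasticsearch import Elasticsearch

# Placeholder connection details; substitute your own deployment's.
es = Elasticsearch("https://<elasticsearch-host>:9243", api_key="<api-key>")

resp = es.search(index="text_embeddings", size=1)
doc = resp["hits"]["hits"][0]["_source"]
print(len(doc["predicted_value"]))  # dimensionality of the stored embedding
```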
To make the loading process a bit easier, we’re going to pluck the `predicted_value` field and add it as its own column:
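A sketch of that step in PySpark. It assumes `df` already holds documents read from the `text_embeddings` index and that the embedding sits at a nested path such as `ml.inference.predicted_value`; use whatever path your ingest pipeline actually writes:

```python
from pyspark.sql import functions as F

# Copy the embedding out of the nested inference output into its own
# top-level column (the source path is an assumption; adjust to your mapping).
df = df.withColumn("predicted_value", F.col("ml.inference.predicted_value"))
```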
We’ll install the Elasticsearch Spark connector on the cluster via its Maven coordinates:

`org.elasticsearch:elasticsearch-spark-30_2.12:8.5.2`
We’ll add the Pinecone Databricks connector from S3:

`s3://pinecone-jars/spark-pinecone-uberjar.jar`
Restart the cluster if needed. Next, we’ll create a new notebook, attach it to the cluster and import the required dependencies:
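A minimal sketch of those imports, assuming a PySpark-based notebook:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession already exists as `spark`;
# getOrCreate() simply returns it when run there.
spark = SparkSession.builder.getOrCreate()
```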
`predicted_value` field is an array with a depth of 1, as shown below:
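For illustration, a depth-1 array is just a flat list of floats. The values below are invented; real vectors from this model have 384 entries:

```python
# Invented sample values; a real msmarco-MiniLM-L-12-v3 vector has 384 entries.
predicted_value = [0.0123, -0.0456, 0.0789]

# Depth 1: every element is a scalar float, with no nested lists.
assert all(isinstance(x, float) for x in predicted_value)
assert not any(isinstance(x, list) for x in predicted_value)
```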
`sentence-transformers/msmarco-MiniLM-L-12-v3` model. Then, we’ll use the Pinecone client to issue the query. We’ll do this in a Python notebook.
We’ll start by installing the required dependencies:
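For example, with pip (package names assumed: `pinecone-client` for the Pinecone client and `sentence-transformers` for the embedding model):

```shell
pip install pinecone-client sentence-transformers
```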