The rapid growth of technology and the need to manage vast amounts of unstructured data, such as text, video, and audio, pose real challenges. Traditional databases are increasingly proving inadequate for this kind of data, prompting a search for alternatives. Vector databases are one such alternative.
Vector databases are proving to be a breakthrough in data analysis and processing. They can store huge collections of information and also support efficient similarity search, that is, finding items with similar features. This property opens the door to advanced search engines and recommendation systems. It is not their only advantage: vector databases now support applications that seemed unrealistic not long ago.
This article introduces the possibilities of vector databases. You will learn about their advantages and practical applications, and walk through an example implementation of a recommendation system built on Qdrant and AI technology.
Table of Contents:
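1. What is a vector database?
2. Why use vector databases?
3. What are vectors?
4. Key applications of vector databases
5. Example of implementation of a recommendation system
6. Key implementation steps
7. Qdrant client setup
8. Creating collections in Qdrant
9. Embedding data
10. Inserting data into the database
11. Recommending accessories
12. Summary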
What is a vector database?
A vector database is a type of database that stores data in the form of vectors. Whereas traditional databases store data as text or human-readable numbers, a vector database stores it as arrays of numbers, such as [0.5, 0.4, -0.2].
This type of database is designed specifically to index and query data in that form, which makes it possible to perform advanced analysis and search based on data similarity.
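To make "similarity" concrete, here is a minimal sketch of cosine similarity, one common way to compare two vectors. The function name cosineSimilarity and the sample vectors are illustrative assumptions, not part of any database API.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 when the lengths differ, the input is empty,
// or either vector has zero length (norm).
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Hypothetical three-dimensional vectors, for illustration only.
	helmet := []float32{0.5, 0.4, -0.2}
	similar := []float32{0.45, 0.5, -0.1}
	unrelated := []float32{-0.4, 0.1, 0.9}

	fmt.Printf("helmet vs similar:   %.3f\n", cosineSimilarity(helmet, similar))
	fmt.Printf("helmet vs unrelated: %.3f\n", cosineSimilarity(helmet, unrelated))
}
```

Values close to 1 mean the vectors point in nearly the same direction, while values near 0 or below mean the items have little in common.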
Why use vector databases?
Vector databases are especially useful when you need to understand and process unstructured data such as text, images, and sound. Because they represent data as vectors, they make it possible to analyze and search data by similarity. A typical use is a recommendation system. A person who is planning to buy a bicycle can receive accessory suggestions from such a system. A cyclist might, for example, be shown helmets or lights, ranked by how similar they are to the chosen bicycle.
What are vectors?
Now that we know what vector databases are, let's look at vectors themselves. A vector is a numerical representation of complex information. Creating one requires embedding, a process that transforms data, for example text, into a numerical vector. Embedding is performed by an AI model suited to the task.
Staying with the cyclist example, imagine that the name “classic helmet” is transformed into the vector [0.5, -0.3, 0.4, -0.3]. An embedding model such as nomic-embed-text performs this kind of operation. Real embeddings are much longer (nomic-embed-text produces 768 numbers per text), so the four values here are only illustrative.
Key applications of vector databases
Vector databases provide a number of functions that can be applied to a wide range of industries and scenarios, including:
Similar data search - lets you quickly and efficiently find items that are similar to your query. A multimedia search engine, for example, can find images whose content resembles a sample image.
Recommendation systems - help suggest products or services based on available information about the user, such as purchase history or preferences. In the cyclist example, a person buying a bicycle may receive suggestions for helmets, water bottles (bidons), or maintenance tools. Personalized recommendations delivered in real time improve the user experience.
Example of implementation of a recommendation system
To make the cyclist's story concrete, we implemented an accessory recommendation system using the Qdrant vector database and an embedding step. For embedding, we used Ollama. The system itself was implemented in Go, using the Qdrant Go client.
Key implementation steps:
1. Qdrant client setup
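The Qdrant client connects to our cloud database. The credentials are stored in a .env file.
import (
	"log"
	"os"

	"github.com/joho/godotenv"
	"github.com/qdrant/go-client/qdrant"
)

// CloudClient loads credentials from .env and returns a client for the cloud instance.
func CloudClient() *qdrant.Client {
	if err := godotenv.Load(); err != nil {
		log.Fatal("Error loading .env file")
	}
	apiKey := os.Getenv("API")
	host := os.Getenv("HOST")
	// 6334 is Qdrant's gRPC port; the Go client communicates over gRPC.
	client, err := qdrant.NewClient(&qdrant.Config{
		Host:   host,
		Port:   6334,
		APIKey: apiKey,
		UseTLS: true,
	})
	if err != nil {
		log.Fatalf("Error creating Qdrant client: %v", err)
	}
	return client
}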
2. Creating collections in Qdrant
A collection is a place where vector data is stored. All vectors in the collection must have the same dimension.
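if err := client.CreateCollection(context.Background(), &qdrant.CreateCollection{
	CollectionName: CollectionName,
	VectorsConfig: qdrant.NewVectorsConfig(&qdrant.VectorParams{
		Size:     768, // must match the embedding model's output dimension
		Distance: qdrant.Distance_Cosine,
	}),
}); err != nil {
	log.Fatalf("Error creating collection: %v", err)
}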
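Because every vector in a collection must share one dimension, it can help to check vector lengths before sending them to the database. The following is a minimal sketch and not part of the original implementation. The names checkDimensions, vectorSize, and ErrDimensionMismatch are hypothetical, and 768 simply mirrors the Size configured for the collection.

```go
package main

import (
	"errors"
	"fmt"
)

// vectorSize mirrors the Size configured for the collection (assumed: 768).
const vectorSize = 768

// ErrDimensionMismatch marks a vector whose length differs from the expected one.
var ErrDimensionMismatch = errors.New("vector dimension mismatch")

// checkDimensions returns an error describing the first vector
// whose length differs from want, or nil if all vectors match.
func checkDimensions(vectors [][]float32, want int) error {
	for i, v := range vectors {
		if len(v) != want {
			return fmt.Errorf("%w: vector %d has %d dimensions, want %d",
				ErrDimensionMismatch, i, len(v), want)
		}
	}
	return nil
}

func main() {
	// The second vector is deliberately too short.
	batch := [][]float32{
		make([]float32, vectorSize),
		make([]float32, 4),
	}
	if err := checkDimensions(batch, vectorSize); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Running a check like this on the client side gives an error that names the offending vector, rather than relying on the database to reject the whole request.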
3. Embedding data
Data, such as accessory descriptions, are transformed into vectors using an embedding model. In our solution, we use a locally running instance of Ollama for embedding, which allows us to quickly and efficiently transform data into vectors with minimal latency.
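// embed converts each text into an embedding using a local Ollama instance.
// The ollama and types packages are assumed to come from the chroma-go module.
func embed(text []string) []*types.Embedding {
	ef, err := ollama.NewOllamaEmbeddingFunction(
		ollama.WithBaseURL("http://127.0.0.1:11434"),
		ollama.WithModel("nomic-embed-text"))
	if err != nil {
		// Stop here: without a valid embedding function the next call would panic.
		log.Fatalf("Error creating Ollama embedding function: %s", err)
	}
	embedding, err := ef.EmbedDocuments(context.Background(), text)
	if err != nil {
		log.Fatalf("Error embedding documents: %s", err)
	}
	return embedding
}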
4. Inserting data into the database
Each vector is stored as a point, together with a generated UUID and a payload, that is, metadata describing the item.
// Insert stores one vector together with its payload as a new point.
func Insert(vectors []float32, payload map[string]any, collectionName string, client *qdrant.Client) {
	operationInfo, err := client.Upsert(context.Background(), &qdrant.UpsertPoints{
		CollectionName: collectionName,
		Points: []*qdrant.PointStruct{
			{
				// A random UUID serves as the point ID.
				Id:      qdrant.NewID(uuid.New().String()),
				Vectors: qdrant.NewVectorsDense(vectors),
				Payload: qdrant.NewValueMap(payload),
			},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(operationInfo)
}
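The payload passed to Insert is a plain map. As a rough sketch of how such a map might be built, the example below converts a catalog record into a payload. The Accessory type, its fields, and the payloadFor helper are hypothetical. The only key taken from this article is "type", which the recommendation step filters on.

```go
package main

import "fmt"

// Accessory is a hypothetical catalog record; the fields are assumptions.
type Accessory struct {
	Name        string
	Type        string
	Description string
	PriceEUR    float64
}

// payloadFor converts an Accessory into the map stored next to its vector.
func payloadFor(a Accessory) map[string]any {
	return map[string]any{
		"name":        a.Name,
		"type":        a.Type, // the key the recommendation filter matches on
		"description": a.Description,
		"price_eur":   a.PriceEUR,
	}
}

func main() {
	p := payloadFor(Accessory{
		Name:        "Classic helmet",
		Type:        "helmet",
		Description: "Lightweight road helmet",
		PriceEUR:    49.9,
	})
	fmt.Println(p["name"], p["type"])
}
```

Keeping the payload keys in one helper makes it harder for the key used when inserting to drift from the key used when filtering.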
5. Recommending accessories
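The system searches for the products most similar to the one the user selected, restricted to a given accessory type.
// recommendAccessory returns points of the given accessory type that are
// most similar to the point identified by id.
func recommendAccessory(id string, accessoryType string, client *qdrant.Client) []*qdrant.ScoredPoint {
	searchResult, err := client.Query(context.Background(), &qdrant.QueryPoints{
		CollectionName: CollectionName,
		// Use the stored point as the positive example.
		Query: qdrant.NewQueryRecommend(&qdrant.RecommendInput{
			Positive: []*qdrant.VectorInput{
				qdrant.NewVectorInputID(qdrant.NewID(id)),
			},
		}),
		// Only consider points whose payload "type" matches.
		Filter: &qdrant.Filter{
			Must: []*qdrant.Condition{
				qdrant.NewMatch("type", accessoryType),
			},
		},
		// Return payloads too, so the caller sees more than IDs and scores.
		WithPayload: qdrant.NewWithPayload(true),
	})
	if err != nil {
		panic(err)
	}
	return searchResult
}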
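To illustrate what such a query does conceptually, here is a small in-memory sketch. It is not how Qdrant works internally, which relies on indexes rather than a full scan. It only shows the idea of taking a seed item, keeping candidates of one type, and ranking them by cosine similarity. The item type, the recommend function, and the catalog data are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// item is a hypothetical in-memory stand-in for a stored point.
type item struct {
	ID     string
	Type   string
	Vector []float32
}

// cosine returns the cosine similarity of a and b, or 0 if it is undefined.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// recommend returns up to limit IDs of items of type wantType,
// ordered from most to least similar to the seed item.
// A negative limit means no limit. It returns nil if the seed is unknown.
func recommend(items []item, seedID, wantType string, limit int) []string {
	var seed []float32
	for _, it := range items {
		if it.ID == seedID {
			seed = it.Vector
			break
		}
	}
	if seed == nil {
		return nil
	}
	type scored struct {
		id    string
		score float64
	}
	var hits []scored
	for _, it := range items {
		// Skip the seed itself and anything outside the requested type.
		if it.ID == seedID || it.Type != wantType {
			continue
		}
		hits = append(hits, scored{id: it.ID, score: cosine(seed, it.Vector)})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].score > hits[j].score })
	if limit >= 0 && len(hits) > limit {
		hits = hits[:limit]
	}
	ids := make([]string, len(hits))
	for i, h := range hits {
		ids[i] = h.id
	}
	return ids
}

func main() {
	// Made-up three-dimensional vectors, for illustration only.
	catalog := []item{
		{ID: "bike-1", Type: "bicycle", Vector: []float32{0.9, 0.1, 0.0}},
		{ID: "helmet-road", Type: "helmet", Vector: []float32{0.8, 0.2, 0.1}},
		{ID: "helmet-kids", Type: "helmet", Vector: []float32{0.1, 0.9, 0.2}},
		{ID: "light-front", Type: "lighting", Vector: []float32{0.7, 0.1, 0.3}},
	}
	fmt.Println(recommend(catalog, "bike-1", "helmet", 1))
}
```

The type filter plays the same role as the "type" condition in the query: the front light is close to the bicycle in this toy data, but it is never returned when helmets are requested.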
Summary
Vector databases, such as Qdrant, open up new possibilities for analyzing and processing unstructured data. A recommendation system built on one makes it easier to tailor offerings to customers' needs. The combination of AI technologies, such as embedding models, with the flexibility of a vector database makes these databases valuable tools in modern business applications.