Faiss is a library dedicated to efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM.
Project Address: https://github.com/facebookresearch/faiss
Development Team: Facebook AI Research (Meta AI)
Development Language: C++, with complete wrappers for Python and C
Some of the most useful algorithms are also implemented on the GPU using CUDA.
Faiss indexes vectors using sophisticated algorithms (such as k-means clustering and product quantization) that make nearest neighbor search fast.
Faiss is organized as a toolbox that contains a variety of indexing methods. It generally involves a chain of components (preprocessing, compression, non-exhaustive search).
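On the CPU side, Faiss makes extensive use of multi-threading to exploit multiple cores, BLAS libraries for efficient exact distance computations via matrix-matrix multiplication, and SIMD vectorization and popcount to speed up distance computations for isolated vectors.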
Faiss provides reference brute-force algorithms that compute all similarities exactly and exhaustively, and return a list of the most similar elements. This provides a "gold standard" reference result list against which approximate indexes can be evaluated.
conda install -c pytorch faiss-gpu
pip install faiss-cpu
pip install faiss-gpu
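import faiss
import numpy as np
# Problem size: 64-dimensional vectors, 10,000 in the database, 100 queries.
dimension = 64
database_size = 10000
query_size = 100
# Faiss expects float32 arrays of shape (n, dimension).
database_vectors = np.random.random((database_size, dimension)).astype('float32')
query_vectors = np.random.random((query_size, dimension)).astype('float32')
# IndexFlatL2 performs exact (brute-force) L2 search and needs no training.
index = faiss.IndexFlatL2(dimension)
index.add(database_vectors)
# For each query, retrieve the k nearest database vectors.
k = 5
distances, indices = index.search(query_vectors, k)
# Both results have shape (query_size, k); IndexFlatL2 returns squared L2 distances.
print(f"indices: {indices.shape}")
print(f"distances: {distances.shape}")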
The Facebook AI Research team started developing Faiss in 2015, based on research results and a significant amount of engineering effort. The project has now become one of the standard tools in the field of vector similarity search.
Faiss is a high-performance library for vector similarity search, particularly suited to large-scale, high-dimensional data. Its range of indexing methods, from exact brute-force search to compressed, non-exhaustive indexes, together with GPU implementations of key algorithms, makes it a common choice in machine learning, information retrieval, and recommendation systems, in both academic research and industrial applications.