• sentence-transformers is a specialized library built on top of Hugging Face’s transformers for tasks that involve sentence embeddings, such as semantic search, clustering, and sentence similarity.

  • The transformers library provides a broad set of models for various NLP tasks, including language generation and classification, while sentence-transformers focuses on models that efficiently generate sentence embeddings.

  • Useful for sentence similarity

    • Similarity scores support sentence retrieval, clustering, and grouping
  • Sentence similarity models convert input text into vectors (embeddings), which can then be compared numerically, for example with cosine similarity
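
To make the comparison of embeddings concrete, here is a minimal sketch of cosine similarity in plain Python. The helper name cosine_similarity and the three-dimensional toy vectors are assumptions for illustration only; real embeddings from a sentence-transformers model have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings (hypothetical values).
vec_cat = [1.0, 2.0, 0.0]
vec_dog = [2.0, 4.0, 0.0]    # points in the same direction as vec_cat
vec_film = [0.0, 0.0, 3.0]   # orthogonal to vec_cat

print(cosine_similarity(vec_cat, vec_dog))   # close to 1.0: same direction
print(cosine_similarity(vec_cat, vec_film))  # 0.0: no shared direction
```

Scores range from -1 to 1, and a higher score means the two vectors point in a more similar direction.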

Code

# Install the library first (run in a shell, not in Python):
# pip install sentence-transformers
 
from sentence_transformers import SentenceTransformer
from sentence_transformers import util
 
# Load a small pretrained sentence-embedding model (downloaded on first use)
model = SentenceTransformer("all-MiniLM-L6-v2")
 
sentences1 = [
    'The cat sits outside',
    'A man is playing guitar',
    'The movies are awesome'
]
 
# Encode each sentence into a fixed-size vector (one row per sentence)
embeddings1 = model.encode(sentences1, convert_to_tensor=True)
print(embeddings1)
"""
tensor([[ 0.1392,  0.0030,  0.0470,  ...,  0.0641, -0.0163,  0.0636],
        [ 0.0227, -0.0014, -0.0056,  ..., -0.0225,  0.0846, -0.0283],
        [-0.1043, -0.0628,  0.0093,  ...,  0.0020,  0.0653, -0.0150]])
"""
 
sentences2 = [
    'The dog plays in the garden',
    'A woman watches TV',
    'The new movie is so great'
]
embeddings2 = model.encode(sentences2, convert_to_tensor=True)
 
# cosine_scores[i][j] is the similarity between sentences1[i] and sentences2[j]
cosine_scores = util.cos_sim(embeddings1, embeddings2)
print(cosine_scores)
""" cosine similarity for each pair of sentences1 and sentences2
tensor([[ 0.2838,  0.1310, -0.0029],
        [ 0.2277, -0.0327, -0.0136],
        [-0.0124, -0.0465,  0.6571]])
"""