1. Explore pre-trained word vectors. Explore word relationships using vector arithmetic: perform arithmetic operations on the vectors and analyze the results.
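As a rough illustration of what "vector arithmetic" on word vectors means, the sketch below runs the king - man + woman query on a handful of made-up 3-dimensional vectors. The vectors, the `unit` helper, and the `toy_analogy` function are all hypothetical and exist only for this example. The sketch assumes each vector is scaled to unit length before being combined and that candidates are ranked by cosine similarity, which is the general idea behind analogy queries, not a copy of any library's implementation.

```python
import numpy as np

# Hypothetical 3-dimensional "embeddings", invented for illustration only.
# The axes loosely stand for (royalty, male, female).
toy_vectors = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "man":   np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "child": np.array([0.0, 0.5, 0.5]),
}


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def toy_analogy(positive, negative, vectors):
    """Rank the remaining words by cosine similarity to sum(positive) - sum(negative)."""
    target = sum(unit(vectors[w]) for w in positive) - sum(unit(vectors[w]) for w in negative)
    target = unit(target)
    exclude = set(positive) | set(negative)  # query words are not valid answers
    scores = [
        (w, float(unit(v) @ target))  # dot product of unit vectors = cosine similarity
        for w, v in vectors.items()
        if w not in exclude
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = toy_analogy(positive=["king", "woman"], negative=["man"], vectors=toy_vectors)
print(ranking[0][0])  # prints: queen
```

Real embeddings have hundreds of dimensions with no such clean interpretation, so the nearest neighbour of the combined vector is only approximately the expected word.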
PROGRAM:
# Module or library install command (run this in terminal before running the script)
# pip install gensim scipy
# Import required libraries
import gensim.downloader as api # For downloading pre-trained word vectors
from scipy.spatial.distance import cosine # Cosine distance (1 - cosine similarity)
# Load pre-trained Word2Vec model (Google News, 300 dimensions)
print("Loading Word2Vec model...")
model = api.load("word2vec-google-news-300")
print("Model loaded successfully.\n")
# Get and print the first 10 dimensions of the word vector for 'king'
vector = model['king']
print("First 10 dimensions of 'king' vector:")
print(vector[:10], "\n")
# Print top 10 most similar words to 'king'
print("Top 10 words most similar to 'king':")
for word, similarity in model.most_similar('king'):
    print(f"{word}: {similarity:.4f}")
print()
# Perform word analogy: king - man + woman ≈ queen
result = model.most_similar(positive=['king', 'woman'], negative=['man'], topn=1)
print("Analogy - 'king' - 'man' + 'woman' ≈ ?")
print(f"Result: {result[0][0]} (Similarity: {result[0][1]:.4f})\n")
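# Analogy: paris + italy - france ≈ ? (expected: rome)
# Note: this vocabulary is case-sensitive, so the lowercase tokens used here
# are distinct entries from 'Paris', 'Italy', and 'France'.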
print("Analogy - 'paris' + 'italy' - 'france' ≈ ?")
for word, similarity in model.most_similar(positive=['paris', 'italy'], negative=['france']):
    print(f"{word}: {similarity:.4f}")
print()
# Analogy: walking + swimming - walk ≈ ? (expected: swim)
# Note: the conventional form for this relation is swimming - walking + walk ≈ swim.
print("Analogy - 'walking' + 'swimming' - 'walk' ≈ ?")
for word, similarity in model.most_similar(positive=['walking', 'swimming'], negative=['walk']):
    print(f"{word}: {similarity:.4f}")
print()
# Calculate cosine similarity between 'king' and 'queen'
similarity = 1 - cosine(model['king'], model['queen'])
print(f"Cosine similarity between 'king' and 'queen': {similarity:.4f}")
OUTPUT:
Loading Word2Vec model...
[==================================================] 100.0% 1662.8/1662.8MB downloaded
Model loaded successfully.
First 10 dimensions of 'king' vector:
[ 0.12597656 0.02978516 0.00860596 0.13964844 -0.02563477 -0.03613281
0.11181641 -0.19824219 0.05126953 0.36328125]
Top 10 words most similar to 'king':
kings: 0.7138
queen: 0.6511
monarch: 0.6413
crown_prince: 0.6204
prince: 0.6160
sultan: 0.5865
ruler: 0.5798
princes: 0.5647
Prince_Paras: 0.5433
throne: 0.5422
Analogy - 'king' - 'man' + 'woman' ≈ ?
Result: queen (Similarity: 0.7118)
Analogy - 'paris' + 'italy' - 'france' ≈ ?
lohan: 0.5070
madrid: 0.4818
heidi: 0.4800
real_madrid: 0.4753
florence: 0.4682
diego: 0.4673
ronnie: 0.4672
juventus: 0.4672
joel: 0.4654
huntelaar: 0.4636
Analogy - 'walking' + 'swimming' - 'walk' ≈ ?
Swimming: 0.6006
swim: 0.5949
swimmers: 0.5835
swimmer: 0.5819
paddling: 0.5744
kayaking: 0.5662
swam: 0.5506
rowing: 0.5436
swims: 0.5371
canoeing: 0.5140
Cosine similarity between 'king' and 'queen': 0.6511
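The 'paris' + 'italy' - 'france' query above does not return 'rome'. One plausible reason is that the vocabulary is case-sensitive (the output lists 'Swimming' and 'swim' as separate tokens), so the lowercase tokens may not be the main entries for the city and country names. The sketch below shows one way to check which casings of a word are present before building a query. The `casing_variants` helper and the `toy_vocab` set are hypothetical and are not part of gensim; with the loaded model, passing `model.key_to_index` as `vocab` is assumed to work under gensim 4.x.

```python
def casing_variants(vocab, word):
    """Return the casing variants of `word` found in `vocab`, in a fixed order.

    `vocab` can be any container that supports the `in` operator.
    """
    candidates = [word, word.lower(), word.capitalize(), word.upper()]
    found = []
    for candidate in candidates:
        # Skip duplicates, e.g. when `word` is already lowercase.
        if candidate in vocab and candidate not in found:
            found.append(candidate)
    return found


# Made-up vocabulary for demonstration; it is NOT the Google News vocabulary.
toy_vocab = {"paris", "Paris", "italy", "Italy", "france", "king"}

for token in ["paris", "italy", "france", "king"]:
    print(token, "->", casing_variants(toy_vocab, token))
```

If the capitalized forms turn out to be present in the real vocabulary, rerunning the analogy with those tokens would be a natural follow-up experiment; no results for that variant are recorded here.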