BAIL657C Program 1

PROGRAM:

# Module or library install command (run this in terminal before running the script)
# pip install gensim scipy

# Import required libraries
import gensim.downloader as api  # For downloading pre-trained word vectors
from scipy.spatial.distance import cosine  # Cosine distance, i.e. 1 - cosine similarity

# Load pre-trained Word2Vec model (Google News, 300 dimensions)
print("Loading Word2Vec model...")
model = api.load("word2vec-google-news-300")
print("Model loaded successfully.\n")

# Get and print the first 10 dimensions of the word vector for 'king'
vector = model['king']
print("First 10 dimensions of 'king' vector:")
print(vector[:10], "\n")

# Print top 10 most similar words to 'king'
print("Top 10 words most similar to 'king':")
for word, similarity in model.most_similar('king'):
    print(f"{word}: {similarity:.4f}")
print()

# Perform word analogy: king - man + woman ≈ queen
result = model.most_similar(positive=['king', 'woman'], negative=['man'], topn=1)
print("Analogy - 'king' - 'man' + 'woman' ≈ ?")
print(f"Result: {result[0][0]} (Similarity: {result[0][1]:.4f})\n")

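# Analogy: paris + italy - france (the intended answer is rome)
# Note: the Google News vocabulary is case-sensitive, so these lowercase tokens
# are different entries from 'Paris', 'Italy' and 'France', and the results are noisy.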
print("Analogy - 'paris' + 'italy' - 'france' ≈ ?")
for word, similarity in model.most_similar(positive=['paris', 'italy'], negative=['france']):
    print(f"{word}: {similarity:.4f}")
print()

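# Related-word query: walking + swimming - walk
# Note: a well-formed analogy for 'swim' (walk : walking :: swim : swimming)
# would be walk + swimming - walking.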
print("Analogy - 'walking' + 'swimming' - 'walk' ≈ ?")
for word, similarity in model.most_similar(positive=['walking', 'swimming'], negative=['walk']):
    print(f"{word}: {similarity:.4f}")
print()

# Calculate cosine similarity between 'king' and 'queen'
similarity = 1 - cosine(model['king'], model['queen'])
print(f"Cosine similarity between 'king' and 'queen': {similarity:.4f}")

OUTPUT:

Loading Word2Vec model...
[==================================================] 100.0% 1662.8/1662.8MB downloaded
Model loaded successfully.

First 10 dimensions of 'king' vector:
[ 0.12597656  0.02978516  0.00860596  0.13964844 -0.02563477 -0.03613281
  0.11181641 -0.19824219  0.05126953  0.36328125] 

Top 10 words most similar to 'king':
kings: 0.7138
queen: 0.6511
monarch: 0.6413
crown_prince: 0.6204
prince: 0.6160
sultan: 0.5865
ruler: 0.5798
princes: 0.5647
Prince_Paras: 0.5433
throne: 0.5422

Analogy - 'king' - 'man' + 'woman' ≈ ?
Result: queen (Similarity: 0.7118)

Analogy - 'paris' + 'italy' - 'france' ≈ ?
lohan: 0.5070
madrid: 0.4818
heidi: 0.4800
real_madrid: 0.4753
florence: 0.4682
diego: 0.4673
ronnie: 0.4672
juventus: 0.4672
joel: 0.4654
huntelaar: 0.4636

Analogy - 'walking' + 'swimming' - 'walk' ≈ ?
Swimming: 0.6006
swim: 0.5949
swimmers: 0.5835
swimmer: 0.5819
paddling: 0.5744
kayaking: 0.5662
swam: 0.5506
rowing: 0.5436
swims: 0.5371
canoeing: 0.5140

Cosine similarity between 'king' and 'queen': 0.6511
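
SKETCH:

To make the arithmetic behind the analogy queries and the final cosine similarity concrete, here is a minimal sketch in plain Python. The two-dimensional toy_vectors are made-up values chosen for illustration and are not taken from the Word2Vec model; cosine_similarity, unit and analogy are hypothetical helper names. The sketch assumes that most_similar unit-normalizes each input vector, adds the positive ones, subtracts the negative ones, and ranks the remaining words by cosine similarity. That approximates gensim's behavior and is not a copy of its implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def analogy(vectors, positive, negative, topn=1):
    """Rank words by cosine similarity to sum(positive) - sum(negative).

    Query words are excluded from the candidates (assumed to mirror
    what most_similar does).
    """
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + x for t, x in zip(target, unit(vectors[word]))]
    for word in negative:
        target = [t - x for t, x in zip(target, unit(vectors[word]))]

    query_words = set(positive) | set(negative)
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word not in query_words
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Made-up 2-D vectors: axis 0 is roughly "royalty", axis 1 is roughly "gender".
toy_vectors = {
    "king": [0.9, 0.8],
    "queen": [0.9, -0.8],
    "man": [0.1, 0.8],
    "woman": [0.1, -0.8],
    "prince": [0.8, 0.7],
    "apple": [-0.5, 0.1],
}

# king - man + woman on the toy vectors
print(analogy(toy_vectors, positive=["king", "woman"], negative=["man"]))

# Similarity and the corresponding distance (distance = 1 - similarity)
sim = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
print(f"similarity: {sim:.4f}, distance: {1 - sim:.4f}")
```

On these toy vectors 'queen' ranks first for king - man + woman. scipy's cosine(u, v) returns 1 - cosine_similarity(u, v), which is why the program subtracts it from 1 to recover the similarity.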