
Error in plotting circles for a similarity measure using matplotlib in Python

I am working on a project to find the similarity between two sentences/documents using the tf-idf measure.

My question is: how can I show the similarity in a graphical format? I am thinking of something like a Venn diagram where the size of the intersection represents the similarity measure, or any other suitable plot available in matplotlib or another Python library.

I tried the following code:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity  

documents = (
"The sky is blue",
"The sun is bright"

)
tfidf_vectorizer = TfidfVectorizer()
tfidf_matrix = tfidf_vectorizer.fit_transform(documents)
print(tfidf_matrix)
cosine = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix)
print(cosine)
import matplotlib.pyplot as plt
r=25
d1 = 2 * r * (1 - cosine[0][0])
circle1=plt.Circle((0,0),d1/2,color='r')
d2 = 2 * r * (1 - cosine[0][1])
circle2=plt.Circle((r,0),d2/2,color="b")
fig = plt.gcf()
fig.gca().add_artist(circle1)
fig.gca().add_artist(circle2)
fig.savefig('plotcircles.png')
plt.show()

But the plot I got was empty. Can someone explain what the error might be?

plotting circle source: plot a circle

Just to explain what's going on, here's a stand-alone example of your problem (a circle that lies entirely outside the axes limits is not shown at all):

import matplotlib.pyplot as plt
from matplotlib.patches import Circle

fig, ax = plt.subplots()
circ = Circle((1, 1), 0.5)
ax.add_artist(circ)
plt.show()
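To connect this to the code in the question, here is a rough check of whether each circle overlaps the default view, which spans 0 to 1 on both axes. The helper `circle_hits_view` is a hypothetical name introduced for this sketch, and `sim = 0.34` is only an assumed stand-in for `cosine[0][1]`; substitute the value you actually get.

```python
import math


def circle_hits_view(center, radius, xlim=(0.0, 1.0), ylim=(0.0, 1.0)):
    """Return True if a circle overlaps the rectangular view xlim x ylim."""
    cx, cy = center
    # Closest point of the view rectangle to the circle's center.
    nx = min(max(cx, xlim[0]), xlim[1])
    ny = min(max(cy, ylim[0]), ylim[1])
    return math.hypot(cx - nx, cy - ny) < radius


r = 25
sim = 0.34  # assumed value standing in for cosine[0][1]

# Radii as computed in the question: d/2 = r * (1 - cosine).
radius1 = r * (1 - 1.0)  # cosine[0][0] is a document compared with itself
radius2 = r * (1 - sim)

print(circle_hits_view((0, 0), radius1))  # first circle from the question
print(circle_hits_view((r, 0), radius2))  # second circle from the question
print(circle_hits_view((1, 1), 0.5))      # the stand-alone example above
```

Under these assumptions the first circle has zero radius and the second sits too far to the right to reach the view, so neither is drawn, while the stand-alone example only clips the corner of the view.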


When you manually add an artist through add_artist, add_patch, etc., autoscaling isn't applied unless you explicitly request it. You're accessing the lower-level interface of matplotlib that the higher-level functions (e.g. plot) are built on. However, this is also the easiest way to add a single circle in data coordinates, so the lower-level interface is what you want in this case.

Furthermore, add_artist is too general for this. You actually want add_patch (plt.Circle is matplotlib.patches.Circle). The difference between add_artist and add_patch may seem arbitrary, but add_patch has extra logic to calculate the extent of a patch for autoscaling, whereas add_artist is the "bare" lower-level function that can take any artist but doesn't do anything special. Autoscaling won't work correctly for a patch if you add it with add_artist.
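One way to see the difference is to inspect the axes' data limits (`ax.dataLim`) after each call. This is only a sketch: the `Agg` backend is chosen so that it runs without a display, and the names `ax_artist` and `ax_patch` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Circle

fig, (ax_artist, ax_patch) = plt.subplots(1, 2)

# The same circle, added through the generic and the patch-specific method.
ax_artist.add_artist(Circle((1, 1), 0.5))
ax_patch.add_patch(Circle((1, 1), 0.5))

# add_artist leaves the data limits empty (non-finite), so autoscaling
# has nothing to work with.
artist_has_limits = bool(np.isfinite(ax_artist.dataLim.get_points()).all())

# add_patch records the circle's extent as (x0, y0, width, height).
patch_bounds = ax_patch.dataLim.bounds

print(artist_has_limits)
print(patch_bounds)
```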

To autoscale the plot based on the artists that you've added, call ax.autoscale(). As a quick example of autoscaling a manually added patch:

import matplotlib.pyplot as plt
from matplotlib.patches import Circle

fig, ax = plt.subplots()
circ = Circle((1, 1), 0.5)
ax.add_patch(circ)
ax.autoscale()
plt.show()


Your next question might be "why isn't the circle round?". It is, in data coordinates. However, the x and y scales of the plot currently differ (their ratio is the aspect ratio, in matplotlib terminology). To force them to be the same, call ax.axis('equal') or ax.axis('scaled'). (We can actually leave out the call to autoscale in this case, as both ax.axis('scaled') and ax.axis('equal') will effectively call it for us.)

import matplotlib.pyplot as plt
from matplotlib.patches import Circle

fig, ax = plt.subplots()
circ = Circle((1, 1), 0.5)
ax.add_patch(circ)
ax.axis('scaled')
plt.show()
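If you want to confirm the effect numerically rather than by eye, you can compare how many pixels one data unit covers along each axis. This is a sketch only: `pixels_per_unit` is a hypothetical helper, and the `Agg` backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
from matplotlib.patches import Circle


def pixels_per_unit(ax):
    """Return (x, y) pixels covered by one data unit after drawing."""
    ax.figure.canvas.draw()  # applies pending autoscaling and aspect changes
    origin = ax.transData.transform((0.0, 0.0))
    unit = ax.transData.transform((1.0, 1.0))
    return unit[0] - origin[0], unit[1] - origin[1]


# Autoscaled only: the circle is inside the view but looks like an ellipse.
fig1, ax1 = plt.subplots()
ax1.add_patch(Circle((1, 1), 0.5))
ax1.autoscale()
default_scales = pixels_per_unit(ax1)

# With axis('scaled'): one data unit covers the same length in x and y.
fig2, ax2 = plt.subplots()
ax2.add_patch(Circle((1, 1), 0.5))
ax2.axis('scaled')
scaled_scales = pixels_per_unit(ax2)

print(default_scales)
print(scaled_scales)
```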


The plots are not empty, but I guess your circles are too big!

I don't have sklearn installed, so I start at the point where you print cosine:

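import matplotlib.pyplot as plt

## set constants
## (cosine is the array returned by cosine_similarity in the question)
r = 1
d = 2 * r * (1 - cosine[0][1])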

## draw circles
circle1=plt.Circle((0, 0), r, alpha=.5)
circle2=plt.Circle((d, 0), r, alpha=.5)
## set axis limits
plt.ylim([-1.1, 1.1])
plt.xlim([-1.1, 1.1 + d])
fig = plt.gcf()
fig.gca().add_artist(circle1)
fig.gca().add_artist(circle2)
## hide axes if you like
# fig.gca().get_xaxis().set_visible(False)
# fig.gca().get_yaxis().set_visible(False)
fig.savefig('venn_diagramm.png')

That also answers your other question, where I also added this piece of code!
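As a rough sketch of how the same idea could be wrapped up with add_patch and axis('scaled') instead of hand-set limits: two circles of equal radius are placed `d = 2 * r * (1 - sim)` apart, so they coincide at a similarity of 1 and only touch at a similarity of 0. The function name `draw_similarity_circles`, the sample value `0.34`, and the `Agg` backend are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
from matplotlib.patches import Circle


def draw_similarity_circles(sim, r=1.0):
    """Draw two overlapping circles for a similarity value in [0, 1].

    Returns the figure, the axes, and the distance between the centers.
    """
    if not 0.0 <= sim <= 1.0:
        raise ValueError("sim is expected to lie in [0, 1]")
    d = 2 * r * (1 - sim)  # same center distance as in the code above

    fig, ax = plt.subplots()
    ax.add_patch(Circle((0, 0), r, alpha=0.5, color='r'))
    ax.add_patch(Circle((d, 0), r, alpha=0.5, color='b'))
    ax.axis('scaled')  # autoscale to the patches and keep the circles round
    ax.set_title("cosine similarity = {:.2f}".format(sim))
    return fig, ax, d


fig, ax, d = draw_similarity_circles(0.34)  # 0.34 is an assumed sample value
# fig.savefig('venn_diagramm.png')  # or plt.show() with an interactive backend
```

Note that the overlap area does not grow linearly with the similarity, so this is a visual cue rather than an exact Venn diagram.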

