After a helpful answer to my original question, I decided to use internal data (which I can't share here). The internal data follows the same format as the mock data. I simply copied the data over to the same working directory, making sure that the new data has the same format, i.e. the same column headers, etc. I used DiffChecker to make sure that app.py (from my original post) matches the proof of concept (appPOC.py). The internal data has more than 600 nodes and more than 3000 edges.
The code to make the interactive dashboard is the same as the one I used for my original post. However, this time I run into this KeyError:
Traceback (most recent call last):
File "appPOC.py", line 75, in <module>
hovertext = "Document Description: " + str(G.nodes[node]['Description']) + "<br>" + "Document Name: " + str(G.nodes[node]['DocName']) + "<br>" + "Document ID: " + str(G.nodes[node]['DocumentID'])
KeyError: 'Description'
The data itself should be fine, as I can plot the network as long as I leave out the hover text next to each node.
To summarize: app.py can plot the mock data, but appPOC.py (which is identical apart from the file name) can't plot the internal data. This leads me to believe that there is something wrong with the internal data in the CSV file.
Edit: I figured out that if the target is not listed in the elements, the graph fails to be drawn. Is there any way to create a node automatically (like in Gephi) if the (target) node is not defined in the elements?
NetworkX creates a node for the source and the target of each edge. Hence, with
G = nx.from_pandas_edgelist(edges, 'Source', 'Target')
your graph contains every node that appears in an edge. However, with
nx.set_node_attributes(G, nodes.set_index('Doc')['Description'].to_dict(), 'Description')
nx.set_node_attributes(G, nodes.set_index('Doc')['DocumentID'].to_dict(), 'DocumentID')
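you only fill the node attributes 'Description' and 'DocumentID' for the nodes listed in your nodes data frame. Any node that appears only in the edge list has no such attributes, which is what raises the KeyError. A simple workaround is to replace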
str(G.nodes[node]['Description'])
with
str(G.nodes[node].get('Description', ''))
and similarly for 'DocName' and 'DocumentID'. More information on the get method can be found at: Why dict.get(key) instead of dict[key]? Basically, we exploit the fact that networkx stores node attributes in a dict, so we can use its get method, which allows us to supply a default value.
import networkx as nx
g = nx.karate_club_graph()
# all nodes in this graph have the node attribute 'club' filled
# we add a node without this node attribute
g.add_node("Test")
print(g.nodes[0]["club"])
# 'Mr. Hi'
# print(g.nodes["Test"]["club"])
# results in KeyError: 'club'
print(g.nodes["Test"].get("club", ""))
# ''
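To check which nodes are affected in your own data, a minimal sketch along the following lines may help. The two data frames are hypothetical stand-ins for the CSV files (the column names 'Source', 'Target', 'Doc', 'Description' and 'DocumentID' are taken from the code above, the values are made up), and node "D" deliberately appears only as an edge target. Filling placeholders with setdefault is just one possible alternative to using get at lookup time.

```python
import networkx as nx
import pandas as pd

# Hypothetical stand-ins for the CSV files; "D" appears only as an edge target.
edges = pd.DataFrame({"Source": ["A", "A", "B"], "Target": ["B", "C", "D"]})
nodes = pd.DataFrame({
    "Doc": ["A", "B", "C"],
    "Description": ["first", "second", "third"],
    "DocumentID": [1, 2, 3],
})

# Every source and target becomes a node, including "D".
G = nx.from_pandas_edgelist(edges, "Source", "Target")

# Attributes are only set for nodes listed in the nodes data frame.
for attr in ("Description", "DocumentID"):
    nx.set_node_attributes(G, nodes.set_index("Doc")[attr].to_dict(), attr)

# Nodes that exist in the graph but carry no 'Description' attribute.
missing = sorted(n for n, data in G.nodes(data=True) if "Description" not in data)
print(missing)

# Optional: give those nodes an explicit placeholder so plain lookups succeed.
for n in missing:
    G.nodes[n].setdefault("Description", "")
    G.nodes[n].setdefault("DocumentID", "")

# Plain indexing no longer raises a KeyError for "D".
hovertext = {
    n: "Document Description: " + str(G.nodes[n]["Description"]) for n in G.nodes
}
```

If the list of missing nodes is long, that usually points to targets in the edge file that have no row in the nodes file, which matches the observation in the edit to the question.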