How do I print out only certain rows from a large CSV file uploaded using pandas?
How would I output certain rows, in no particular order (i.e. I want to print the rows with (rep1, rep2), (rep10, rep12), (rep12, rep16))? My CSV file is:
Node1 Node2 Trail Time color estimate
0 rep1 rep2 rep_1 1811 red 0
1 rep2 rep4 rep_1 1811 red 0
2 rep4 rep5 rep_1 1135 red 0
3 rep5 rep7 rep_1 2000 red 0
4 rep7 rep8 rep_1 885 red 0
5 rep8 rep10 rep_1 1010 red 0
6 rep10 rep12 rep_1 1310 red 0
7 rep12 rep13 rep_1 1135 red 0
8 rep13 rep16 rep_1 1435 red 0
9 rep16 rep17 rep_1 885 red 0
10 rep17 rep19 rep_1 1435 red 0
11 rep19 rep26 rep_1 1000 red 0
12 rep26 rep27 rep_1 850 red 0
13 rep2 rep1 rep_2 1811 blue 0
14 rep1 rep4 rep_2 1811 blue 0
15 rep4 rep5 rep_2 1135 blue 0
16 rep5 rep7 rep_2 2000 blue 0
17 rep7 rep8 rep_2 885 blue 0
18 rep8 rep10 rep_2 1010 blue 0
19 rep10 rep12 rep_2 1310 blue 0
.. ... ... ... ... ... ...
159 rep5 rep7 rep_26 2000 brown 0
160 rep7 rep8 rep_26 885 brown 0
161 rep8 rep10 rep_26 1010 brown 0
162 rep10 rep12 rep_26 1310 brown 0
163 rep12 rep13 rep_26 1135 brown 0
164 rep13 rep16 rep_26 1435 brown 0
165 rep16 rep17 rep_26 885 brown 0
166 rep17 rep19 rep_26 1435 brown 0
167 rep19 rep27 rep_26 1000 brown 0
168 rep27 rep1 rep_27 885 blue 0
169 rep1 rep2 rep_27 1181 blue 0
170 rep2 rep4 rep_27 1811 blue 0
171 rep4 rep5 rep_27 1135 blue 0
172 rep5 rep7 rep_27 2000 blue 0
173 rep7 rep8 rep_27 885 blue 0
174 rep8 rep10 rep_27 1010 blue 0
175 rep10 rep12 rep_27 1310 blue 0
176 rep12 rep13 rep_27 1135 blue 0
177 rep13 rep16 rep_27 1435 blue 0
178 rep16 rep17 rep_27 885 blue 0
179 rep17 rep19 rep_27 1435 blue 0
180 rep19 rep26 rep_27 850 blue 0
[181 rows x 6 columns]
This is what I have tried to use for the output:
print(df3(odd_matching))
I've also tried:
tcount=0
bcount=0
for cols in df3.iterrows():
    tcount += 1
    if cols ['Node1'] == odd_matching
        bcount +=1
        print(Node1, 'This', Node2, 'This', )
The rest of my code:
import itertools
import copy
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import csv
df3=pd.read_csv(r"U:\\user\edge_list_4.csv")
print(df3)
df4=pd.read_csv(r"U:\\user\nodes_fixed_2.csv",error_bad_lines=False)
df4.dropna()
print(df4)
# Compute min weight matching.
# Note: max_weight_matching uses the 'weight' attribute by default as the attribute to maximize.
odd_matching_dupes= nx.algorithms.max_weight_matching(g_odd_complete, True)
print('Number of edges in matching: {}'.format(len(odd_matching_dupes)))
# Preview of matching with dupes
odd_matching_dupes
# Convert matching to list of deduped tuples
odd_matching = list(pd.unique([tuple(sorted([k, v])) for k, v in
odd_matching_dupes]))
#Counts
print('Number of edges in matching (deduped): {}'.format(len(odd_matching)))
# Preview of deduped matching
odd_matching
g_odd_complete_min_edges = nx.Graph(odd_matching)
def add_augmenting_path_to_graph(graph, min_weight_pairs):
    """
    Add the min weight matching edges to the original graph
    Parameters:
        graph: NetworkX graph (original graph from trailmap)
        min_weight_pairs: list[tuples] of node pairs from min weight matching
    Returns:
        augmented NetworkX graph
    """
    # We need to make the augmented graph a MultiGraph so we can add parallel edges
    graph_aug=nx.MultiGraph(graph.copy())
    for pair in min_weight_pairs:
        graph_aug.add_edge(pair[0],
                           pair[1],
                           **{'Time': nx.dijkstra_path_length(graph, pair[0], pair[1]), 'Trail': 'augmented'}
                           # attr_dict={'distance': nx.dijkstra_path_length(graph, pair[0], pair[1]),
                           #            'trail': 'augmented'}  # deprecated after 1.11
                           )
    return graph_aug
#Create augmented graph: add the min weight matching edges to g
g_aug=add_augmenting_path_to_graph(g, odd_matching)
odd_matching is:
[('rep19', 'rep27'), ('rep2', 'rep5'), ('rep10', 'rep7'), ('rep1', 'rep8'), ('rep12', 'rep13'), ('rep16', 'rep17')]
The error I get is:
TypeError Traceback (most recent call last) in 271 272 --> 273 print("this sample", df3(odd_matching))
TypeError: 'DataFrame' object is not callable
Suppose the DataFrame you have given is 'df'; then:
df1 = df[(df['Node1']=='rep1') & (df['Node2']=='rep2')]
df2 = df[(df['Node1']=='rep10') & (df['Node2']=='rep12')]
df3 = df[(df['Node1']=='rep12') & (df['Node2']=='rep16')]
Concatenate the above DataFrames to get the desired output.
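As a minimal sketch of that concatenation step, the filtered frames can be stacked with pd.concat. The small DataFrame below is made-up stand-in data with the same Node1/Node2 columns as the question, and the names first, second, third and wanted are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Stand-in data (an assumption): a few rows shaped like the question's CSV.
df = pd.DataFrame({
    "Node1": ["rep1", "rep2", "rep10", "rep12", "rep12"],
    "Node2": ["rep2", "rep4", "rep12", "rep13", "rep16"],
    "Time":  [1811, 1811, 1310, 1135, 1435],
})

# One boolean mask per wanted (Node1, Node2) pair.
first = df[(df["Node1"] == "rep1") & (df["Node2"] == "rep2")]
second = df[(df["Node1"] == "rep10") & (df["Node2"] == "rep12")]
third = df[(df["Node1"] == "rep12") & (df["Node2"] == "rep16")]

# Stack the pieces; the original row labels are kept.
wanted = pd.concat([first, second, third])
print(wanted)
```

The row labels of the result come from the original frame, so they still point back to the lines of the CSV.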
Starting from odd_matching is awkward because, due to the sorting, the first element of each pair can be either Node1 or Node2, with the second element of the pair being the "other" node.
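To illustrate the orientation problem, here is a sketch that sorts both the matching pairs and each row's (Node1, Node2) pair before comparing them, so that ('rep10', 'rep7') also matches a row stored as rep7, rep10. This is one possible workaround rather than the approach this answer goes on to recommend; the stand-in DataFrame and the names wanted, row_keys and mask are assumptions made for the example.

```python
import pandas as pd

# Stand-in data (an assumption), shaped like the question's edge list.
df3 = pd.DataFrame({
    "Node1": ["rep7", "rep12", "rep16", "rep1"],
    "Node2": ["rep10", "rep13", "rep17", "rep2"],
    "Time":  [900, 1135, 885, 1811],
})

# Sorted pairs, in the same style as the question's odd_matching.
odd_matching = [("rep10", "rep7"), ("rep12", "rep13"), ("rep16", "rep17")]
wanted = {tuple(sorted(pair)) for pair in odd_matching}

# Build an order-independent key for every row, then test membership.
row_keys = [tuple(sorted(p)) for p in zip(df3["Node1"], df3["Node2"])]
mask = pd.Series([key in wanted for key in row_keys], index=df3.index)

print(df3[mask])
```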
Start from odd_matching_dupes instead, as here the keys are Node1 and the values are Node2. Don't worry about repetitions; you will deal with them later.
Note that:
[df3[(df3.Node1 == k) & (df3.Node2 == v)].index for k, v in odd_matching_dupes]
gives you a list of indices for all rows included in odd_matching_dupes.
Then:
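One possible way to continue from that index list, sketched under assumptions rather than taken from the answer itself: merge the per-pair indices, drop repeated labels, and select the rows with .loc. The stand-in DataFrame and the names per_pair and idx are hypothetical, and odd_matching_dupes is assumed to be an iterable of (Node1, Node2) pairs.

```python
import pandas as pd

# Stand-in data (an assumption), shaped like the question's edge list.
df3 = pd.DataFrame({
    "Node1": ["rep1", "rep2", "rep12", "rep16", "rep12"],
    "Node2": ["rep2", "rep4", "rep13", "rep17", "rep13"],
    "Trail": ["rep_1", "rep_1", "rep_1", "rep_1", "rep_2"],
})

# Assumed shape: pairs that may appear in both orientations.
odd_matching_dupes = [("rep12", "rep13"), ("rep13", "rep12"), ("rep16", "rep17")]

# One Index of matching row labels per pair (empty where nothing matches).
per_pair = [
    df3[(df3.Node1 == k) & (df3.Node2 == v)].index
    for k, v in odd_matching_dupes
]

# Flatten, drop repeated labels, and restore file order.
idx = sorted({label for index in per_pair for label in index})

print(df3.loc[idx])
```

Using & with parentheses around each comparison matters here: Python's plain `and` cannot combine two boolean Series and raises a ValueError.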