
How do I print out only certain rows from a large CSV file loaded with pandas

How would I output certain rows, in no particular order (i.e. I want to print the rows with (rep1, rep2), (rep10, rep12), (rep12, rep16))? My CSV file is:

 Node1  Node2   Trail  Time  color  estimate
0     rep1   rep2   rep_1  1811    red         0
1     rep2   rep4   rep_1  1811    red         0
2     rep4   rep5   rep_1  1135    red         0
3     rep5   rep7   rep_1  2000    red         0
4     rep7   rep8   rep_1   885    red         0
5     rep8  rep10   rep_1  1010    red         0
6    rep10  rep12   rep_1  1310    red         0
7    rep12  rep13   rep_1  1135    red         0
8    rep13  rep16   rep_1  1435    red         0
9    rep16  rep17   rep_1   885    red         0
10   rep17  rep19   rep_1  1435    red         0
11   rep19  rep26   rep_1  1000    red         0
12   rep26  rep27   rep_1   850    red         0
13    rep2   rep1   rep_2  1811   blue         0
14    rep1   rep4   rep_2  1811   blue         0
15    rep4   rep5   rep_2  1135   blue         0
16    rep5   rep7   rep_2  2000   blue         0
17    rep7   rep8   rep_2   885   blue         0
18    rep8  rep10   rep_2  1010   blue         0
19   rep10  rep12   rep_2  1310   blue         0
 ..     ...    ...     ...   ...    ...       ...
159   rep5   rep7  rep_26  2000  brown         0
160   rep7   rep8  rep_26   885  brown         0
161   rep8  rep10  rep_26  1010  brown         0
162  rep10  rep12  rep_26  1310  brown         0
163  rep12  rep13  rep_26  1135  brown         0
164  rep13  rep16  rep_26  1435  brown         0
165  rep16  rep17  rep_26   885  brown         0
166  rep17  rep19  rep_26  1435  brown         0
167  rep19  rep27  rep_26  1000  brown         0
168  rep27   rep1  rep_27   885   blue         0
169   rep1   rep2  rep_27  1181   blue         0
170   rep2   rep4  rep_27  1811   blue         0
171   rep4   rep5  rep_27  1135   blue         0
172   rep5   rep7  rep_27  2000   blue         0
173   rep7   rep8  rep_27   885   blue         0
174   rep8  rep10  rep_27  1010   blue         0
175  rep10  rep12  rep_27  1310   blue         0
176  rep12  rep13  rep_27  1135   blue         0
177  rep13  rep16  rep_27  1435   blue         0
178  rep16  rep17  rep_27   885   blue         0
179  rep17  rep19  rep_27  1435   blue         0
180  rep19  rep26  rep_27   850   blue         0

[181 rows x 6 columns]

This is what I have tried to use for an output:

print(df3(odd_matching))

I've also tried:

tcount=0
bcount=0
for cols in df3.iterrows():
    tcount += 1
    if cols ['Node1'] == odd_matching
        bcount +=1

print(Node1, 'This', Node2, 'This', )

The rest of my code:

import itertools
import copy
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import csv

df3=pd.read_csv(r"U:\\user\edge_list_4.csv")
print(df3)

df4=pd.read_csv(r"U:\\user\nodes_fixed_2.csv",error_bad_lines=False)
df4.dropna() 
print(df4)

# Compute min weight matching.
# Note: max_weight_matching uses the 'weight' attribute by default as the attribute to maximize.
odd_matching_dupes= nx.algorithms.max_weight_matching(g_odd_complete, True)

print('Number of edges in matching: {}'.format(len(odd_matching_dupes)))

# Preview of matching with dupes
odd_matching_dupes

# Convert matching to list of deduped tuples
odd_matching = list(pd.unique([tuple(sorted([k, v])) for k, v in odd_matching_dupes]))

#Counts
print('Number of edges in matching (deduped): {}'.format(len(odd_matching)))

# Preview of deduped matching
odd_matching

g_odd_complete_min_edges = nx.Graph(odd_matching)

def add_augmenting_path_to_graph(graph, min_weight_pairs):
    """
    Add the min weight matching edges to the original graph
    Parameters:
        graph: NetworkX graph (original graph from trailmap)
        min_weight_pairs: list[tuples] of node pairs from min weight matching
    Returns:
        augmented NetworkX graph
    """

    # We need to make the augmented graph a MultiGraph so we can add parallel edges
    graph_aug=nx.MultiGraph(graph.copy())
    for pair in min_weight_pairs:
        graph_aug.add_edge(pair[0],
                           pair[1],
                           **{'Time': nx.dijkstra_path_length(graph, pair[0], pair[1]), 'Trail': 'augmented'}
                           # attr_dict={'distance': nx.dijkstra_path_length(graph, pair[0], pair[1]),
                           #            'trail': 'augmented'}  # deprecated after 1.11
                          )
    return graph_aug

#Create augmented graph: add the min weight matching edges to g
g_aug=add_augmenting_path_to_graph(g, odd_matching)

odd_matching is:

[('rep19', 'rep27'), ('rep2', 'rep5'), ('rep10', 'rep7'), ('rep1', 'rep8'), ('rep12', 'rep13'), ('rep16', 'rep17')]

The error I get is:

TypeError Traceback (most recent call last) in
    271
    272
--> 273 print("this sample", df3(odd_matching))

  TypeError: 'DataFrame' object is not callable

Suppose the DataFrame you have given is named df. Then:

part1 = df[(df['Node1']=='rep1') & (df['Node2']=='rep2')]

part2 = df[(df['Node1']=='rep10') & (df['Node2']=='rep12')]

part3 = df[(df['Node1']=='rep12') & (df['Node2']=='rep16')]

Concatenate the three DataFrames above, for example with pd.concat([part1, part2, part3]), to get the desired output.
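The filter-then-concatenate approach can be sketched as below. This is only an illustration: the small frame stands in for the real CSV, and the name `wanted` and the sample rows are assumptions, not part of the original answer. A pair with no matching row (here rep12, rep16) simply contributes nothing.

```python
import pandas as pd

# Small stand-in for the real edge list (assumed sample data).
df = pd.DataFrame({
    "Node1": ["rep1", "rep2", "rep10", "rep12", "rep1"],
    "Node2": ["rep2", "rep4", "rep12", "rep13", "rep2"],
    "Time":  [1811, 1811, 1310, 1135, 1181],
})

# Hypothetical list of (Node1, Node2) pairs to keep.
wanted = [("rep1", "rep2"), ("rep10", "rep12"), ("rep12", "rep16")]

# One boolean filter per pair, then stack the pieces back together.
pieces = [df[(df["Node1"] == a) & (df["Node2"] == b)] for a, b in wanted]
result = pd.concat(pieces)

print(result)
```

Each comparison must be wrapped in parentheses and combined with `&`, because Python's `and` does not work element-wise on pandas Series.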

Starting from odd_matching is awkward: because each pair was sorted, the first element of a pair can be either Node1 or Node2, with the second element being the "other" node.

Start from odd_matching_dupes instead, since there the keys are Node1 and the values are Node2. Don't worry about repetitions; you will deal with them later.

Note that:

[ df3[(df3.Node1 == k) & (df3.Node2 == v)].index for k, v in odd_matching_dupes ]

gives you a list of indices for all rows included in odd_matching_dupes. Each comparison needs its own parentheses and the two are combined with &, because Python's and cannot be applied to whole columns.

Then:

  • delete repeated values from this list,
  • print the rows whose indices are in that (deduplicated) list.
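The steps above might look roughly like this. It is a sketch under assumptions: the tiny df3 is made-up sample data, and odd_matching_dupes is assumed to be an iterable of (Node1, Node2) tuples containing both directions of a pair as well as an outright repeat.

```python
import pandas as pd

# Small stand-in for df3 (assumed sample data).
df3 = pd.DataFrame({
    "Node1": ["rep1", "rep2", "rep10", "rep12", "rep2"],
    "Node2": ["rep2", "rep4", "rep12", "rep13", "rep1"],
    "Trail": ["rep_1", "rep_1", "rep_1", "rep_1", "rep_2"],
})

# Assumed shape of the matching: (Node1, Node2) tuples, with repeats.
odd_matching_dupes = [
    ("rep1", "rep2"),
    ("rep2", "rep1"),
    ("rep12", "rep13"),
    ("rep1", "rep2"),
]

# Step 1: collect the row labels that match each pair.
labels = []
for k, v in odd_matching_dupes:
    labels.extend(df3[(df3.Node1 == k) & (df3.Node2 == v)].index)

# Step 2: drop repeated labels while keeping first-seen order.
unique_labels = list(dict.fromkeys(labels))

# Step 3: print the selected rows.
selected = df3.loc[unique_labels]
print(selected)
```

Using `dict.fromkeys` is one way to deduplicate while preserving order; converting to a `set` would also work if the order of the printed rows does not matter.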
