Reading a CSV file from stdin in Python and modifying it

Question

I need to read a CSV file from stdin and output only the rows whose value in a given column equals a given value. The column number and the value are on the first two lines of the input. My input looks like this:

 2
 Kashiwa
 Name,Campus,LabName
 Shinichi MORISHITA,Kashiwa,Laboratory of Omics
 Kenta Naai,Shirogane,Laboratory of Functional Analysis in Silico
 Kiyoshi ASAI,Kashiwa,Laboratory of Genome Informatics
 Yukihide Tomari,Yayoi,Laboratory of RNA Function

My output should look like this:

 Name,Campus,LabName
 Shinichi MORISHITA,Kashiwa,Laboratory of Omics
 Kiyoshi ASAI,Kashiwa,Laboratory of Genome Informatics

I need to select the people whose value in column 2 is Kashiwa, and the first two lines of stdin must not appear in stdout.

So far I have only tried reading stdin with the csv module, but I get each row as a list of strings (as expected from the csv documentation). Can I change this?

 #!/usr/bin/env python3

 import sys
 import csv

 data = sys.stdin.readlines()

 for line in csv.reader(data):
     print(line)

Output:

 ['2']
 ['Kashiwa']
 ['Name', 'Campus', 'LabName']
 ['Shinichi MORISHITA', 'Kashiwa', 'Laboratory of Omics']
 ['Kenta Naai', 'Shirogane', 'Laboratory of Functional Analysis in Silico']
 ['Kiyoshi ASAI', 'Kashiwa', 'Laboratory of Genome Informatics']
 ['Yukihide Tomari', 'Yayoi', 'Laboratory of RNA Function']

Can someone give me some advice on reading stdin as CSV and manipulating the data afterwards (outputting only the needed columns, swapping columns, etc.)?

Answer 1

This is one approach. For example:

import csv
import sys

reader = csv.reader(sys.stdin)  # read CSV rows directly from stdin
next(reader)  # skip the first line (column number)
next(reader)  # skip the second line (value to match)
print(next(reader))  # print the header
for row in reader:
    if row[1] == 'Kashiwa':  # keep rows whose second column is 'Kashiwa'
        print(row)

Output:

['Name', 'Campus', 'LabName']
['Shinichi MORISHITA', 'Kashiwa', 'Laboratory of Omics']
['Kiyoshi ASAI', 'Kashiwa', 'Laboratory of Genome Informatics']
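
The output above is a series of Python list representations, while the question asks for plain CSV lines. One possible way to get that, sketched here under the assumption that the first two input lines always hold a 1-based column number and the value to match, is to write the rows back with `csv.writer`. The function name `filter_rows` and the in-memory sample are illustrative, not part of the original answer.

```python
import csv
import io
import sys


def filter_rows(source, target):
    """Copy the header and the matching rows from source to target as CSV."""
    column = int(source.readline())      # line 1: 1-based column number
    wanted = source.readline().strip()   # line 2: value to match
    reader = csv.reader(source)
    writer = csv.writer(target, lineterminator="\n")
    writer.writerow(next(reader))        # header row
    for row in reader:
        if row[column - 1] == wanted:
            writer.writerow(row)


# Demo on the sample input from the question (held in memory here).
sample = io.StringIO(
    "2\n"
    "Kashiwa\n"
    "Name,Campus,LabName\n"
    "Shinichi MORISHITA,Kashiwa,Laboratory of Omics\n"
    "Kenta Naai,Shirogane,Laboratory of Functional Analysis in Silico\n"
    "Kiyoshi ASAI,Kashiwa,Laboratory of Genome Informatics\n"
    "Yukihide Tomari,Yayoi,Laboratory of RNA Function\n"
)
filter_rows(sample, sys.stdout)
```

Calling `filter_rows(sys.stdin, sys.stdout)` instead should apply the same filter to real standard input.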

Answer 2

Use pandas to read and manage your data in a DataFrame:

import pandas as pd
# File location
infile = r'path/file'
# Load the file, skipping the first two rows
df = pd.read_csv(infile, skiprows=2)
# Keep only the rows whose Campus column is Kashiwa
df = df[df['Campus'] == 'Kashiwa']

You can perform all kinds of edits, for example sorting your DataFrame:

df.sort_values(by='your column')

Check the pandas documentation for all the possibilities.
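
Since the question reads from stdin and not from a file path, here is a rough sketch of the same idea with a file-like object. It assumes pandas is installed and that the first two lines hold a 1-based column number and the value to match. The helper name `filter_frame` and the in-memory sample are made up for illustration.

```python
import io

import pandas as pd


def filter_frame(stream):
    """Return the rows whose chosen column equals the requested value."""
    column = int(stream.readline())     # line 1: 1-based column number
    value = stream.readline().strip()   # line 2: value to match
    df = pd.read_csv(stream)            # the rest is ordinary CSV with a header
    return df[df.iloc[:, column - 1] == value]


# Demo on part of the sample input from the question (held in memory here).
sample = io.StringIO(
    "2\n"
    "Kashiwa\n"
    "Name,Campus,LabName\n"
    "Shinichi MORISHITA,Kashiwa,Laboratory of Omics\n"
    "Kenta Naai,Shirogane,Laboratory of Functional Analysis in Silico\n"
    "Kiyoshi ASAI,Kashiwa,Laboratory of Genome Informatics\n"
)
print(filter_frame(sample).to_csv(index=False), end="")
```

Passing `sys.stdin` in place of the `StringIO` object should work the same way. Note that the comparison is against a string, so a numeric column would need the value converted first.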

Answer 3

 #!/usr/bin/env python3
 import sys
 import csv

 data = sys.stdin.readlines()  # read all of stdin
 column_to_be_matched = int(data.pop(0))  # 1-based column number to match
 word_to_be_matched = data.pop(0).strip()  # value to match, without the trailing newline
 rows = csv.reader(data)  # parse the remaining lines as CSV
 print(",".join(next(rows)))  # print the column names
 for line in rows:
     if line[column_to_be_matched - 1] == word_to_be_matched:  # keep matching rows
         print(",".join(line))  # print the row
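
The question also asks whether rows can be something other than lists of strings, and how to output only some columns or swap them. One option from the standard library, sketched here as an assumption about what the asker wants, is `csv.DictReader` together with `csv.DictWriter`. The function name `select_columns` and the chosen column order are illustrative.

```python
import csv
import io
import sys


def select_columns(source, target, columns):
    """Write only the named columns, in the given order, for every row."""
    reader = csv.DictReader(source)  # each row becomes a dict keyed by header
    writer = csv.DictWriter(target, fieldnames=columns,
                            extrasaction="ignore",  # drop columns not listed
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Demo: keep LabName and Name only, with their order swapped.
sample = io.StringIO(
    "Name,Campus,LabName\n"
    "Shinichi MORISHITA,Kashiwa,Laboratory of Omics\n"
    "Kiyoshi ASAI,Kashiwa,Laboratory of Genome Informatics\n"
)
select_columns(sample, sys.stdout, ["LabName", "Name"])
```

With the question's input, the two leading lines (column number and value) would have to be consumed before handing the stream to `csv.DictReader`, so that the header row is the first line it sees.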

3 answers

Answer 1
1 2019-05-21 08:30:49

Answer 2
1 2019-05-21 08:39:48

Answer 3
0 2019-05-21 08:29:28
