
Counting the number of times a certain position is shown in a list

I have a list of "clients", and I need to count how many times a given "element" shows up at a particular position across the lines. Here is a short snippet of the text file inside the .zip FakeCostomers:

1,female,Melissa,J,Palmer,4 Lynch Street,Milwaukee,WI,53213,US,Melissa.J.Palmer@gmail.com,920-959-8247,9/29/1972,Visa,281.84,5

2,male,Edwin,M,Corder,4302 Pick Street,RIDGEWAY,CO,81432,US,Edwin.M.Corder@outlook.com,970-626-1897,2/28/1953,Visa,277.58,16

3,female,Laura,A,Olvera,365 Tori Lane,Salt Lake City,UT,84116,US,Laura.A.Olvera@yahoo.com,801-599-5964,4/11/1963,MasterCard,560.63,24

4,male,Wayne,D,Adams,3643 Nash Street,Chicago,IL,60605,US,Wayne.D.Adams@yahoo.com,312-948-6927,7/16/1957,Visa,320.11,3

5,female,Mari,R,Smith,3024 Atha Drive,Palmdale,CA,93550,US,Mari.R.Smith@gmail.com,661-574-4919,7/30/1973,MasterCard,798.58,28

6,male,Craig,H,Salazar,3929 Goosetown Drive,Hendersonville,NC,28792,US,Craig.H.Salazar@gmail.com,828-697-6697,1/15/1959,Visa,183.35,29

7,male,Henry,S,Clark,205 Charla Lane,Mesquite,TX,75150,US,Henry.S.Clark@gmail.com,972-686-5507,8/28/1962,Visa,650.58,27

8,male,Jerry,L,Littleton,1652 My Drive,Elmsford,NY,10523,US,Jerry.L.Littleton@gmail.com,347-219-4091,9/5/1975,MasterCard,525.73,8

9,female,Georgia,V,Allen,1226 Jefferson Street,Norfolk,VA,23510,US,Georgia.V.Allen@yahoo.com,757-774-4490,5/17/1952,Visa,910.39,6

10,male,Ted,A,Harding,2143 Lake Floyd Circle,HOCKESSIN,DE,19707,US,Ted.A.Harding@gmail.com,302-239-3674,7/12/1958,MasterCard,307.51,25

11,male,Jose,J,Houston,2639 Olive Street,Shelby,OH,44875,US,Jose.J.Houston@gmail.com,419-342-5793,4/23/1943,Visa,447.97,27

For example, I might want to find out how many females there are in this list.

So far I have tried:

def getColumnDistribution(filename,columnNum):

    file = open(filename,"r")
    listoflists = []

    for line in file:

        stripped_line = line.strip()

        line_list = stripped_line.split()

        listoflists.append(line_list)

        NUMBER = line_list.count(line_list[columnNum])

It keeps coming up with "list index out of range". Does anyone know how I can fix it or use a better method?

Python gives you a lot of tools that make these kinds of tasks painless. For example, you can hand the CSV parsing to the csv module and the counting to collections.Counter. Then it's just a couple of lines:

import csv
from collections import Counter

with open(path, 'r') as f:
    reader = csv.reader(f)
    headers = next(reader)    # pop off the header row

    genderCounts = Counter(row[1] for row in reader)

print(genderCounts['female'])
# 15167

print(genderCounts['male'])
# 14833
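
The "list index out of range" error in the question most likely comes from calling `split()` with no argument: that splits on whitespace rather than commas, so each line yields only a few pieces, and blank lines yield none. Below is a minimal sketch of the question's function rewritten with the same csv and Counter approach. It assumes the file has no header row, as in the snippet shown in the question; the name `get_column_distribution` and the in-memory sample are illustrative only.

```python
import csv
import io
from collections import Counter

def get_column_distribution(lines, column_num):
    """Count how often each value appears in the given 0-based column."""
    reader = csv.reader(lines)
    # csv.reader yields an empty list for a blank line, so skip those rows
    return Counter(row[column_num] for row in reader if row)

first_line = ("1,female,Melissa,J,Palmer,4 Lynch Street,Milwaukee,WI,53213,US,"
              "Melissa.J.Palmer@gmail.com,920-959-8247,9/29/1972,Visa,281.84,5")
second_line = ("2,male,Edwin,M,Corder,4302 Pick Street,RIDGEWAY,CO,81432,US,"
               "Edwin.M.Corder@outlook.com,970-626-1897,2/28/1953,Visa,277.58,16")

# Whitespace splitting breaks the line at the spaces inside the address only
pieces = first_line.split()
print(len(pieces))  # 3, so any column index above 2 is out of range

# Two rows from the question's sample, separated by a blank line
sample = io.StringIO(first_line + "\n\n" + second_line + "\n")
counts = get_column_distribution(sample, 1)
print(counts["female"])  # 1
```

With a real file, the same function can be given the open file object in place of the `io.StringIO` sample.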

If you use the DictReader from csv, you can index by column name, which makes the code more readable:

with open(path, 'r') as f:
    reader = csv.DictReader(f)

    genderCounts = Counter(row['Gender'] for row in reader)
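
DictReader takes its keys from the first row of the file, while the snippet shown in the question does not appear to have a header row. In that case the column names can be passed in explicitly. The sketch below assumes hypothetical names for the first two columns and uses shortened copies of the sample rows; it is an illustration, not the actual layout of the file.

```python
import csv
import io
from collections import Counter

# Hypothetical column names: only the first two columns are named here.
# Any extra fields in a row are collected in a list under the restkey.
FIELDNAMES = ["Number", "Gender"]

# Shortened rows based on the question's sample, with no header line
sample = io.StringIO(
    "1,female,Melissa,J,Palmer\n"
    "2,male,Edwin,M,Corder\n"
    "3,female,Laura,A,Olvera\n"
)

reader = csv.DictReader(sample, fieldnames=FIELDNAMES, restkey="rest")
gender_counts = Counter(row["Gender"] for row in reader)

print(gender_counts["female"])  # 2
print(gender_counts["male"])    # 1
```

Because `fieldnames` is given, the first line is treated as data rather than as a header, so no row is lost from the count.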

Of course, if you are doing a lot of work on data like this, pandas will make your life substantially easier:

import pandas as pd

df = pd.read_csv(path)
df['Gender'].value_counts()

# female    15167
# male      14833
# Name: Gender, dtype: int64

This will work for you. Tested on Python 3.6.

import csv

def openFile(file_name:str)->tuple:
    with open(file_name,'r') as csv_file:
        csv_reader = csv.reader(csv_file)
        return tuple(csv_reader)  

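def getColumnDistribution(csv_data:tuple,name_to_count:str)->int:
    # collect the indices of the rows in which any field equals name_to_count
    # (the match is not restricted to one column)
    matching_rows = [idx for idx,row in enumerate(csv_data) if name_to_count in row]
    print("number of occurrences:",len(matching_rows))
    return len(matching_rows)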

#driver code
csv_data = openFile(your_csv_file_name)
getColumnDistribution(csv_data,'female')
