
Create a matrix from a .txt file with strings

I need to create a matrix from a .txt file. The problem I'm having is that the .txt file has a LOT of information (maybe not that much, but I'm new to programming, so I have trouble dealing with it). The file has:

  • Client number
  • Name
  • Last name
  • Age
  • Single or married (S or M)
  • Sex (M or F)
  • Favourite animal

like this:

1 James Gordon 35 M M Dog
...
75 Julius Harrison 48 S M Cat

I managed to read the file and create a list for each person, but I also need to calculate things like the average age, the breakdown by sex, and so on. I don't know how to separate the individual elements so I can do the math. Here's the code so far:

infile=open("db.txt","r")
list=infile.read()

matrix=[]

raw = []
with open('db.txt','r') as f:
    for line in f:
        raw.append(line.split())

You are using a list of lists (raw) to hold the data, so you can use list indexes to access each field.

For example, the average age:

# Every field is read as a string, so convert the age to int before summing
ages = [int(r[3]) for r in raw]
average_age = float(sum(ages)) / len(ages)
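
As a minimal sketch that runs without db.txt on disk, the snippet below parses a few sample lines and averages the age column. The `sample` string is an assumption for illustration: its first two rows come from the question and the third row is made up.

```python
# Sample data: the first two rows are from the question, the third is made up.
sample = """1 James Gordon 35 M M Dog
75 Julius Harrison 48 S M Cat
76 Ann Lee 25 S F Dog"""

# Split each line on whitespace, the same way the file-reading loop does.
raw = [line.split() for line in sample.splitlines()]

# Column 3 holds the age as a string, so convert it before doing arithmetic.
ages = [int(r[3]) for r in raw]
average_age = sum(ages) / len(ages)
print(average_age)
```

Note that this relies on every field being a single word. A two-word first name, for instance, would shift the remaining columns by one.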

Since you've created a 2D list, you can access any element with a row index and a column index. For example, to calculate the average age, sum the fourth element (index 3) of each person's sublist and divide by the number of people. The following page offers several ways to do this:

Calculate mean across dimension in a 2D array
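
One possible way to work column by column, sketched here with only the standard library, is to transpose the 2D list with `zip(*raw)`. The `raw` list below is hard-coded from the two rows shown in the question, and the names `columns` and `mean_age` are chosen for illustration.

```python
# Two rows taken from the question, already split into fields.
raw = [
    ["1", "James", "Gordon", "35", "M", "M", "Dog"],
    ["75", "Julius", "Harrison", "48", "S", "M", "Cat"],
]

# zip(*raw) turns the list of rows into a list of columns.
columns = list(zip(*raw))

# Column 3 is the age; convert each string to int before averaging.
ages = [int(a) for a in columns[3]]
mean_age = sum(ages) / len(ages)
print(mean_age)
```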

Depending on what you're interested in doing with your matrix, it may be a good idea to put everything in a numpy array. You can slice along either axis more easily, and numpy operations are generally faster than looping over lists.

import numpy as np

# read file and store the file lines in `raw` as you have done above, then
matrix = np.array(raw, dtype=object)
matrix[:,3] = matrix[:,3].astype(int)

average = np.mean(matrix[:,3])

If you want to count how many males you have, you can count how many Ms you have in the gender column.

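# np.where returns a tuple of index arrays, so len(np.where(...)) would always be 1;
# count the True values of the comparison instead
male_no = np.count_nonzero(matrix[:,5] == 'M')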

However, a better way to count items, especially for a column such as favourite animal that may have more than one or two possible values, is to use Counter from the collections module.

from collections import Counter

gender_count = Counter(matrix[:,5])
for key in gender_count.keys():
    print(key, gender_count[key])
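
The same approach should carry over to the favourite-animal column. The sketch below uses a plain list of rows instead of a numpy array; the third row is made up for illustration, and `animal_count` is a hypothetical name.

```python
from collections import Counter

raw = [
    ["1", "James", "Gordon", "35", "M", "M", "Dog"],
    ["75", "Julius", "Harrison", "48", "S", "M", "Cat"],
    ["76", "Ann", "Lee", "25", "S", "F", "Dog"],  # made-up row
]

# Column 6 is the favourite animal; Counter tallies each distinct value.
animal_count = Counter(r[6] for r in raw)

# most_common() yields (value, count) pairs, most frequent first.
for animal, n in animal_count.most_common():
    print(animal, n, "{:.0%}".format(n / len(raw)))
```

Dividing each count by `len(raw)` gives the share of clients per animal, which may be closer to what the question means by averaging a non-numeric column.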
