
Python 3 converting datafile to dictionary with multiple values per key and displaying it

I've looked around for an answer but couldn't find one. I have a data file containing these keys/values:

oolong:8580.0:7201.25:8900.0 
earl grey:10225.25:9025.0:9505.0
green:6700.1:5012.45:6011.0
mint:9285.15:8276.1:8705.0
jasmine:7901.25:4267.0:7056.5

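Each line has the form: tea_name:store1_Sales:store2_Sales:store3_Sales

I need to be able to display the following output (teas sorted by name, a per-tea total in the last column, and per-store totals in the last row):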

>>> earl grey 10225.25 9025.00 9505.00 28755.25
green 6700.10 5012.45 6011.00 17723.55
jasmine 7901.25 4267.00 7056.50 19224.75
mint 9285.15 8276.10 8705.00 26266.25
oolong 8580.00 7201.25 8900.00 24681.25
       42691.75 33781.80 40177.50

I understand that I can load the file as a list using:

with open('tea.txt') as f:
   teas = f.read().splitlines()

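I don't know how to convert the list into a dictionary with multiple values per key. Any help is appreciated.

Edit: I now know how to take the list and convert it into a dictionary. Thanks, everyone!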

The simplest way to get the dictionary:

with open('1.txt') as f:
    data = {}
    for row in f:
        row = row.strip().split(':')
        data[row[0]] = row[1:]
    for key, value in data.items():
        print('%s %s %s' % (key, ' '.join(value), sum([float(v) for v in value])))

You can do this easily with pandas:

import pandas as pd
from io import StringIO

# makes it easy to read globs of text like the data you posted above
data = StringIO('''oolong:8580.0:7201.25:8900.0 
earl grey:10225.25:9025.0:9505.0
green:6700.1:5012.45:6011.0
mint:9285.15:8276.1:8705.0
jasmine:7901.25:4267.0:7056.5''')

df = pd.read_csv(data, sep = ':', header = None)

# returns a list of column names from the string you have above
df.columns = "tea_name:store1_Sales:store2_Sales:store3_Sales".split(':')

# add up the sales for stores 1, 2, and 3 for each type of tea to get total sales for a given tea
df['total_sales'] = df[['store1_Sales', 'store2_Sales', 'store3_Sales']].sum(axis = 1)

The result looks like this:

>>> df
    tea_name  store1_Sales  store2_Sales  store3_Sales  total_sales
0     oolong       8580.00       7201.25        8900.0     24681.25
1  earl grey      10225.25       9025.00        9505.0     28755.25
2      green       6700.10       5012.45        6011.0     17723.55
3       mint       9285.15       8276.10        8705.0     26266.25
4    jasmine       7901.25       4267.00        7056.5     19224.75

Edit: To get a dict from this pandas.DataFrame object, just do the following:

>>> df.to_dict()
{'store1_Sales': {0: 8580.0, 1: 10225.25, 2: 6700.1000000000004, 3: 9285.1499999999996, 4: 7901.25}, 'tea_name': {0: 'oolong', 1: 'earl grey', 2: 'green', 3: 'mint', 4: 'jasmine'}, 'total_sales': {0: 24681.25, 1: 28755.25, 2: 17723.549999999999, 3: 26266.25, 4: 19224.75}, 'store3_Sales': {0: 8900.0, 1: 9505.0, 2: 6011.0, 3: 8705.0, 4: 7056.5}, 'store2_Sales': {0: 7201.25, 1: 9025.0, 2: 5012.4499999999998, 3: 8276.1000000000004, 4: 4267.0}}
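Note that the default to_dict() output is keyed by column name, not by tea. If a dict keyed by tea name is what you are after, one possible approach is to index the frame by tea_name, transpose it, and ask for the 'list' orientation. This is only a sketch and not part of the original answer: the name by_tea is made up for illustration, and just two rows of the data are used to keep it short.

```python
import pandas as pd
from io import StringIO

# Two rows of the sample data, enough to show the shape of the result.
data = StringIO("oolong:8580.0:7201.25:8900.0\n"
                "earl grey:10225.25:9025.0:9505.0\n")

columns = ["tea_name", "store1_Sales", "store2_Sales", "store3_Sales"]
df = pd.read_csv(data, sep=":", header=None, names=columns)
df["total_sales"] = df[columns[1:]].sum(axis=1)

# Index by tea name, transpose so each tea becomes a column,
# then export each column as a plain list of floats.
by_tea = df.set_index("tea_name").T.to_dict("list")
print(by_tea["oolong"])
```

Each value is a list holding the three store figures followed by the total for that tea.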

Edit 2: Leaving pandas aside, you can do what you need in plain Python like this:

teas_dict = {}
for row in teas:
    row_list = row.split(':')
    tea = row_list[0] # tea name is always the first element in a row
    sales = row_list[1:] # remaining elements in row_list are sales data
    teas_dict[tea] = sales

Equivalently, using a dict comprehension:

>>> teas_dict = {row.split(':')[0]: row.split(':')[1:] for row in teas}
>>> teas_dict
{'earl grey': ['10225.25', '9025.0', '9505.0'], 'green': ['6700.1', '5012.45', '6011.0'], 'oolong': ['8580.0', '7201.25', '8900.0 '], 'mint': ['9285.15', '8276.1', '8705.0'], 'jasmine': ['7901.25', '4267.0', '7056.5']}
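The comprehension above splits every row twice and keeps the values as strings (note the trailing space in '8900.0 ' for oolong). A possible variant, offered only as a sketch, strips each row, splits it once, and converts the figures to float as it goes. The name teas_numeric is made up here so that it does not clash with teas_dict, and the teas list simply stands in for the lines read from the file.

```python
# Stand-in for the lines read from tea.txt (same data as in the question).
teas = [
    "oolong:8580.0:7201.25:8900.0 ",
    "earl grey:10225.25:9025.0:9505.0",
    "green:6700.1:5012.45:6011.0",
    "mint:9285.15:8276.1:8705.0",
    "jasmine:7901.25:4267.0:7056.5",
]

# Split each row once; the first field is the key, the rest become floats.
teas_numeric = {
    name: [float(v) for v in values]
    for name, *values in (row.strip().split(":") for row in teas)
}
print(teas_numeric["oolong"])
```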

Finally, to get the total sales for each tea:

for tea in teas_dict:
    total_sales = sum(map(float, teas_dict[tea]))
    teas_dict[tea].append(total_sales)

Result:

>>> teas_dict
{'earl grey': ['10225.25', '9025.0', '9505.0', 28755.25], 'green': ['6700.1', '5012.45', '6011.0', 17723.55], 'oolong': ['8580.0', '7201.25', '8900.0 ', 24681.25], 'mint': ['9285.15', '8276.1', '8705.0', 26266.25], 'jasmine': ['7901.25', '4267.0', '7056.5', 19224.75]}
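None of the snippets so far print the table requested in the question, with teas sorted by name, two decimal places, a per-tea total, and a final row of per-store totals. The following is a minimal sketch of one way to do that, assuming the file lines are already in a list. The helper name build_report and the seven-space indent of the totals row are assumptions made for illustration.

```python
# Stand-in for the lines read from tea.txt (same data as in the question).
teas = [
    "oolong:8580.0:7201.25:8900.0 ",
    "earl grey:10225.25:9025.0:9505.0",
    "green:6700.1:5012.45:6011.0",
    "mint:9285.15:8276.1:8705.0",
    "jasmine:7901.25:4267.0:7056.5",
]


def build_report(rows):
    """Return the report as a list of text lines (hypothetical helper)."""
    sales = {}
    for row in rows:
        name, *values = row.strip().split(":")
        sales[name] = [float(v) for v in values]

    lines = []
    for name in sorted(sales):
        values = sales[name]
        cells = " ".join(f"{v:.2f}" for v in values)
        lines.append(f"{name} {cells} {sum(values):.2f}")

    # zip(*...) regroups the per-tea lists into per-store columns.
    column_totals = [sum(col) for col in zip(*sales.values())]
    lines.append(" " * 7 + " ".join(f"{t:.2f}" for t in column_totals))
    return lines


report = build_report(teas)
print("\n".join(report))
```

The columns are separated by single spaces, as in the question. For aligned columns you could give each field a width in the format spec, for example f"{v:>10.2f}".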

There are many different ways to approach this problem. I'll show you one way to read in the teas.

teas = {}
with open('tea.txt') as f:
    # step through the file, line by line,
    # so that you don't read in a huge file all at once
    for line in f:
        # strip the trailing newline, then split the line by your delimiter ':'
        t = line.strip().split(':')
        # create your dictionary with a key, value pair
        teas[t[0]] = t[1:]

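If you need numeric values in each list, you can map them to the appropriate data type. Here are two examples of how to change the last line above to get a sequence or array of numbers.

  • You can do this with simple built-in functions (in Python 3, map() returns a lazy iterator, so wrap it in tuple() or list()):

     teas[t[0]] = tuple(map(float, t[1:]))
  • Or you can use a numpy array:

     import numpy
     # ....
     teas[t[0]] = numpy.array(t[1:], dtype=float)

The final dictionary looks like this: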

{'earl grey': (10225.25, 9025.0, 9505.0),
 'green': (6700.1, 5012.45, 6011.0),
 'jasmine': (7901.25, 4267.0, 7056.5),
 'mint': (9285.15, 8276.1, 8705.0),
 'oolong': (8580.0, 7201.25, 8900.0)}
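Since the file is really a colon-delimited table, the standard library's csv module can do the splitting as well. This is a swapped-in technique rather than what the answer above uses, shown only as a sketch: the data is read from an in-memory StringIO (the name sample is made up) so the example runs without tea.txt on disk.

```python
import csv
from io import StringIO

# In-memory stand-in for open('tea.txt'); same data as in the question.
sample = StringIO(
    "oolong:8580.0:7201.25:8900.0 \n"
    "earl grey:10225.25:9025.0:9505.0\n"
    "green:6700.1:5012.45:6011.0\n"
    "mint:9285.15:8276.1:8705.0\n"
    "jasmine:7901.25:4267.0:7056.5\n"
)

teas = {}
for row in csv.reader(sample, delimiter=":"):
    if not row:  # csv.reader yields an empty list for blank lines
        continue
    name, *values = row
    # float() ignores surrounding whitespace, so '8900.0 ' converts cleanly.
    teas[name] = tuple(float(v) for v in values)

print(teas["oolong"])
```

With a real file, csv.reader would take the open file object instead, and the csv documentation recommends opening it with newline=''.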
