Modifying a column of a 2D list while iterating in Python

Question

I am trying to write a function that converts all the non-numerical columns in a data set to numerical form.

The data set is a list of lists.

Here is my code:

def handle_non_numerical_data(data):
    def convert_to_numbers(data, index):
        items = []
        column = [line[0] for line in data]
        for item in column:
            if item not in items:
                items.append(item)
        [line[0] = items.index(line[0]) for line in data]
        return new_data

    for value in data[0]:
        if isinstance(value, str):
            convert_to_numbers(data, data[0].index(value))

Apparently [line[0] = items.index(line[0]) for line in data] is not valid syntax, and I can't figure out how to modify the first column of data while iterating over it.

I can't use numpy because the data will not be in numerical form until after this function is run.

How do I do this, and why is it so complicated? I feel like this should be way simpler than it is...

In other words, I want to turn this:

[[M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15],
[M,0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7],
[F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9]]

into this:

[[0,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15],
[0,0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7],
[1,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9]]

Note that the first column was changed from strings to numbers.

Answer 1

Solution

data = [['M',0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15],
        ['M',0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7],
        ['F',0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9]]

values = {'M': 0, 'F': 1}

new_data = [[values.get(val, val) for val in line] for line in data]
new_data

Output:

[[0, 0.455, 0.365, 0.095, 0.514, 0.2245, 0.101, 0.15, 15],
 [0, 0.35, 0.265, 0.09, 0.2255, 0.0995, 0.0485, 0.07, 7],
 [1, 0.53, 0.42, 0.135, 0.677, 0.2565, 0.1415, 0.21, 9]]

Explanation

You can take advantage of Python dictionaries and their get method.

These are the values for the strings:

values = {'M': 0, 'F': 1}

You can also add more strings, such as 'I', with a corresponding value.

If the string is a key in values, you will get its value from the dict:

>>> values.get('M', 'M')
0

Otherwise, you will get the original value:

>>> values.get(10, 10)
10
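
The same lookup can also be limited to a single column, so that cells in other columns are never passed through the dict. What follows is only a minimal sketch: the helper name encode_column and the shortened sample rows are made up for illustration and are not part of the answer above.

```python
def encode_column(rows, col, values):
    """Return new rows in which only rows[i][col] is looked up in `values`."""
    return [
        # Rebuild each row: cells before col, the translated cell, cells after col.
        row[:col] + [values.get(row[col], row[col])] + row[col + 1:]
        for row in rows
    ]

data = [['M', 0.455, 15],
        ['M', 0.35, 7],
        ['F', 0.53, 9]]

new_data = encode_column(data, 0, {'M': 0, 'F': 1})
print(new_data)  # [[0, 0.455, 15], [0, 0.35, 7], [1, 0.53, 9]]
```

Like the comprehension in the solution, this builds a new list and leaves data itself unchanged.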

Answer 2

Rather than indexing (I'm not sure how that was supposed to work in your example), you can create a dictionary that maps letters to numbers. Something like this should work.

raw_data = [['M',0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15],
            ['M',0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7],
            ['F',0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9]]

def handle_non_numerical_data(data):
    mapping = {'M': 0, 'F': 1, 'I': 2}

    # Iterate over the argument, not the global raw_data
    for item in data:
        if isinstance(item[0], str):
            item[0] = mapping.get(item[0], -1) # Returns -1 if letter not found
    return data

run = handle_non_numerical_data(raw_data)
print(run)
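
A plain for loop works here because assignment is a statement in Python and so cannot appear inside a list comprehension, which is why the attempt in the question is a syntax error. To see the -1 fallback in action, here is a small sketch with a letter that is missing from the mapping; the function name encode_first_column, the default parameter, and the sample rows are assumptions made for this illustration.

```python
def encode_first_column(rows, mapping, default=-1):
    """Replace string values in column 0 in place; unknown strings become `default`."""
    for row in rows:
        if isinstance(row[0], str):
            row[0] = mapping.get(row[0], default)
    return rows

sample = [['M', 0.455], ['X', 0.35], ['I', 0.53]]
result = encode_first_column(sample, {'M': 0, 'F': 1, 'I': 2})
print(result)  # [[0, 0.455], [-1, 0.35], [2, 0.53]]
```

Note that the rows are modified in place, so result and sample are the same list.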

Answer 3

This answer uses a dict to store the coding from str to int. The dict can be preloaded, and it can also be inspected after the data has been replaced.

# MODIFIES DATA IN PLACE
data = [['M',0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15],
        ['M',0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7],
        ['F',0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9]]

coding_dict = {} # can also preload this {'M': 0, 'F':1}
for row in data:
    if row[0] not in coding_dict:
        coding_dict[row[0]] = len(coding_dict)
    row[0] = coding_dict[row[0]]
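
The question asks for every non-numerical column to be converted, not only the first. One way to extend the loop above is to keep a separate coding dict per column. This is only a sketch under that assumption: the per-column codings structure and the shortened sample rows are hypothetical, and the function name is borrowed from the question.

```python
def handle_non_numerical_data(data):
    """Encode every string cell in place and return {column index: coding dict}."""
    codings = {}
    for row in data:
        for col, value in enumerate(row):
            if isinstance(value, str):
                # One coding dict per column, created the first time it is needed.
                coding = codings.setdefault(col, {})
                if value not in coding:
                    coding[value] = len(coding)
                row[col] = coding[value]
    return codings

data = [['M', 0.455, 'a'],
        ['M', 0.35, 'b'],
        ['F', 0.53, 'a']]

codings = handle_non_numerical_data(data)
print(data)     # [[0, 0.455, 0], [0, 0.35, 1], [1, 0.53, 0]]
print(codings)  # {0: {'M': 0, 'F': 1}, 2: {'a': 0, 'b': 1}}
```

Returning the codings makes it possible to map the numbers back to the original strings later.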
