How to perform this splitting process in Python?

Question

I'm trying to label data in a table so that the position index repeats in each row, but each column uses a different Enum class.

What I've done so far is build this representation with a single Enum class.

A solution that handles each column separately as a list would also work. But what would be the best way to solve this?

import pandas as pd
from enum import Enum


df = pd.DataFrame({'first': ['product and other', 'product2 and other', 'price'], 'second':['product and prices', 'price2', 'product3 and price']})
df

class Tipos(Enum):
    B = 1
    I = 2
    L = 3

for index, row in df.iterrows():
    sentencas = row.values
    for sentenca in sentencas:
        for pos, palavra in enumerate(sentenca.split()):
            print(f"{palavra} {Tipos(pos+1).name}")

Results:

                first              second
0   product and other  product and prices
1  product2 and other              price2
2               price  product3 and price

product B
and I
other L
product B
and I
prices L
product2 B
and I
other L
price2 B
price B
product3 B
and I
price L

Desired Results:

        Word       Ent
0    product   B_first
1        and   I_first
2      other   L_first
3    product  B_second
4        and  I_second
5     prices  L_second
6   product2   B_first
7        and   I_first
8      other   L_first
9     price2  B_second
10     price   B_first
11  product3  B_second
12       and  I_second
13     price  L_second

# In this case the sequence for words from the first column is B_first, I_first, L_first, ...; when the column changes, it becomes B_second, I_second, L_second, ...

Answer 1

Instead of an Enum, you can use a dict mapping. You can also avoid loops if you flatten your dataframe:

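# Position-to-label dict (0-based, to match cumcount); replaces the Enum class
Tipos = {0: 'B', 1: 'I', 2: 'L'}
# One row per word, ordered by original row, then by column
out = df.unstack().str.split().explode().sort_index(level=1).to_frame('Word')
# Number the words inside each (column, row) cell, map to a label, append the column name
out['Ent'] = out.groupby(level=[0, 1]).cumcount().map(Tipos) \
                 + '_' + out.index.get_level_values(0)
out = out.reset_index(drop=True)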

Output:

>>> out
        Word       Ent
0    product   B_first
1        and   I_first
2      other   L_first
3    product  B_second
4        and  I_second
5     prices  L_second
6   product2   B_first
7        and   I_first
8      other   L_first
9     price2  B_second
10     price   B_first
11  product3  B_second
12       and  I_second
13     price  L_second
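
If you would rather keep the Enum and an explicit loop, as in the question, something along these lines should give the same table. This is only a sketch: `label_words` is a hypothetical helper name, and it assumes no cell holds more than three words, since `Tipos` only defines three members.

```python
from enum import Enum

import pandas as pd


class Tipos(Enum):
    B = 1
    I = 2
    L = 3


def label_words(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one output row per word, tagged '<Tipos name>_<column>'."""
    records = []
    for _, row in df.iterrows():
        # row.items() yields (column name, cell text) pairs in column order
        for column, sentence in row.items():
            for pos, word in enumerate(sentence.split(), start=1):
                # Tipos(pos) raises ValueError if a cell has more than three words
                records.append({"Word": word, "Ent": f"{Tipos(pos).name}_{column}"})
    return pd.DataFrame(records, columns=["Word", "Ent"])


df = pd.DataFrame({
    "first": ["product and other", "product2 and other", "price"],
    "second": ["product and prices", "price2", "product3 and price"],
})
out = label_words(df)
```

The loop is slower than the flattened version on large frames, but the column name is available directly inside the loop, so no index manipulation is needed.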

Solution 1
2 2021-12-30 13:57:33
