
PYTHON: Split Existing Column into Multiple without Affecting other columns

I just started learning Python. I searched for an answer to my problem but had no luck.

I have an Excel file with multiple columns.

For example, this is what I have in the Excel file.

Current Data Set

I would like to change the file to look like the example below. I used "Text to Columns" in Excel to do this (highlighted in yellow), but couldn't figure out how to do it in Python without affecting the other columns.

Desired outcome

I would greatly appreciate your help!

Best, Tae

This should go something like below:

# Current pandas needs n as a keyword; expand=True returns the two pieces as columns.
data[['a', 'col2']] = data['Information'].str.split('-', n=1, expand=True)
data[['b', 'col3']] = data['col2'].str.split('-', n=1, expand=True)
data[['c', 'col4']] = data['col3'].str.split('-', n=1, expand=True)
data[['d', 'e']] = data['col4'].str.split('-', n=1, expand=True)

This may not be the most efficient way, but it works. It splits the Information column into 5 separate columns.

Updated answer, based on the updated data in the question:

import pandas as pd

data = pd.read_excel("/path/to/file/Example for Pygo.xlsx")
# Peel off one '-'-separated piece at a time; colN holds the remainder.
data[['a', 'col2']] = data['Information'].str.split('-', n=1, expand=True)
data[['b', 'col3']] = data['col2'].str.split('-', n=1, expand=True)
data[['c', 'col4']] = data['col3'].str.split('-', n=1, expand=True)
data[['d', 'e']] = data['col4'].str.split('-', n=1, expand=True)
# Drop the original column and the intermediate helper columns.
data = data.drop(['Information', 'col2', 'col3', 'col4'], axis=1)
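The same peel-one-piece-at-a-time idea can be tried without the Excel file. The following is only a sketch: the sample rows and the `Visits` column are made up for illustration, and it assumes at least one row contains all five parts, so that every split produces two columns.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Excel sheet.
data = pd.DataFrame({
    "Information": [
        "us-EXAMPLE-article1-scrolldown-findoutnow",
        "us-EXAMPLE-article1-scrollright",
    ],
    "Visits": [3, 1],  # made-up column that must stay untouched
})

rest = data["Information"]
for name in ["a", "b", "c", "d"]:
    # n=1 splits only at the first '-', leaving the remainder in one piece.
    parts = rest.str.split("-", n=1, expand=True)
    data[name] = parts[0]
    rest = parts[1]
data["e"] = rest  # whatever is left after four splits

data = data.drop(columns=["Information"])
print(data)
```

Rows with fewer than five parts end up with missing values in the trailing columns, and `Visits` is never modified.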

Check out the string.split() method. You can pass the separator to split on as an argument, in this case string.split('-').

array[index]=array[index].split('-')
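As a minimal sketch of that one-liner in plain Python (the list name `rows` and its sample values are made up for illustration):

```python
# Hypothetical sample values in the same shape as the Information column.
rows = [
    "us-EXAMPLE-article1-scrolldown-findoutnow",
    "us-EXAMPLE-article1-scrollright",
]

# Replace each string with the list of its '-'-separated parts.
for index in range(len(rows)):
    rows[index] = rows[index].split("-")

print(rows[1])  # ['us', 'EXAMPLE', 'article1', 'scrollright']
```

Note that this yields lists of different lengths, so some extra handling is still needed to turn them into columns.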

One easy way is to process the dataset with a pandas DataFrame.

  1. Read the xls file into a DataFrame; you may find the details here: xls into dataframe
  2. Now use merge, lambda and split.

Please find examples below.

Example - 2 lines only

import pandas as pd

df = pd.read_excel(open('/Users/xxx/Downloads/ExampleforPygo.xlsx','rb'), sheet_name=0)
df = df.merge(df.apply(lambda row: pd.Series(row['Information'].split('-')), axis=1), left_index=True, right_index=True)

print(df)

Example with a separate function.

    import pandas as pd

    def splitInfomation(information):
        ret = {}
        splits = information.split('-')
        for idx, split in enumerate(splits):
            ret['split' + str(idx)] = split
        return pd.Series(ret)

    df = pd.read_excel(open('/Users/xxxx/Downloads/ExampleforPygo.xlsx','rb'), sheet_name=0)

    df = df.merge(df.apply(lambda row: splitInfomation(row['Information']), axis=1), left_index=True, right_index=True)

    print(df)
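One thing to watch with the row-wise approach is that an empty cell arrives as NaN, which has no `.split` method. Below is a hedged sketch of the same apply-and-merge pattern with a guard for that case; the helper name `split_information`, the sample rows and the `Visits` column are illustrative assumptions, not part of the original file.

```python
import pandas as pd

def split_information(value):
    # Missing cells (NaN/None) are not strings, so return an empty Series.
    if not isinstance(value, str):
        return pd.Series(dtype=object)
    parts = value.split("-")
    return pd.Series({"split" + str(i): part for i, part in enumerate(parts)})

# Hypothetical sample data with one missing Information cell.
df = pd.DataFrame({
    "Information": ["us-EXAMPLE-article1-scrolldown", None, "us-EXAMPLE-homepage"],
    "Visits": [2, 0, 5],
})

# Build the split columns row by row, then attach them by index.
extra = df.apply(lambda row: split_information(row["Information"]), axis=1)
merged = df.merge(extra, left_index=True, right_index=True)
print(merged)
```

The original columns are carried over unchanged, and rows with fewer parts (or no value at all) get missing values in the new columns.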

I updated the answer based on the example file you provided. In your case the data file is xlsx, so you can do it as shown below. The str.split method alone gets the job done. I also used fillna('') so that empty cells become empty strings; rows with fewer parts show None in the extra columns.

When using expand=True, the split elements expand out into separate columns.

>>> import pandas as pd
>>> pd.set_option('display.max_rows',   None)
>>> pd.set_option('display.max_columns',None)
>>> pd.set_option('display.width',      None)


>>> data_xls = pd.read_excel("Example_data.xlsx", index_col=None).fillna('')
>>> data_xls['Information'].str.split('-', expand=True).head(30)
     0        1                    2            3                     4
0   us  EXAMPLE             article1   scrolldown            findoutnow
1   us  EXAMPLE             article1  scrollright                  None
2   us  EXAMPLE             article1   findoutnow                  None
3   us  EXAMPLE   payablesmanagement   findoutnow                  None
4   us  EXAMPLE  strategicpurchasing  scrollright                  None
5   us  EXAMPLE             article1    learnmore         profitmargins
6   us  EXAMPLE   payablesmanagement  scrollright                  None
7   us  EXAMPLE             article2  scrollright                  None
8   us  EXAMPLE  controlandvisibilty   findoutnow                  None
9   us  EXAMPLE             article1   scrollleft                  None
10  us  EXAMPLE             homepage     amexlogo                  None
11  us  EXAMPLE        profitmargins   findoutnow                  None
12  us  EXAMPLE             article3   findoutnow                  None
13  us  EXAMPLE             article1    learnmore    payablesmanagement
14  us  EXAMPLE             article2   scrollleft                  None
15  us  EXAMPLE             article3  scrollright                  None
16  us  EXAMPLE             homepage     readmore    payablesmanagement
17  us  EXAMPLE             article1         None                  None
18  us  EXAMPLE             homepage      homenav            findoutnow
19  us  EXAMPLE  controlandvisibilty  scrollright                  None
20  us  EXAMPLE             homepage      homenav    payablesmanagement
21  us  EXAMPLE             homepage       scroll            findoutnow
22  us  EXAMPLE             article3   scrollleft                  None
23  us  EXAMPLE             article1    learnmore   strategicpurchasing
24  us  EXAMPLE             article1    learnmore  controlandvisibility
25  us  EXAMPLE             article1   scrolldown            findoutnow
26  us  EXAMPLE             article1  scrollright                  None
27  us  EXAMPLE             article1   findoutnow                  None
28  us  EXAMPLE   payablesmanagement   findoutnow                  None
29  us  EXAMPLE  strategicpurchasing  scrollright                  None

Borrowed from @Jon: to get the whole dataset, with both your original columns and the new ones included:

>>> data_xls.join(data_xls['Information'].str.split('-', expand=True).add_prefix('newCol_')).head()

        Date                                 Information  EXAMPLE_LinkedIn_SponsoredContent_Visits  EXAMPLE_LinkedIn_inMail_Visits  EXAMPLE_DBM_Native_Visits  EXAMPLE_SGCPB_Native_Visits  EXAMPLE_SGCBDC_Email_Visits  EXAMPLE_SGCPB_Email_Visit  \
0 2018-08-20   us-EXAMPLE-article1-scrolldown-findoutnow                                         0                               0                          0                            0                            0                          0
1 2018-08-20             us-EXAMPLE-article1-scrollright                                         0                               0                          0                            0                            0                          0
2 2018-08-20              us-EXAMPLE-article1-findoutnow                                         1                               0                          1                            0                            0                          0
3 2018-08-20    us-EXAMPLE-payablesmanagement-findoutnow                                         0                               0                          0                            0                            0                          0
4 2018-08-20  us-EXAMPLE-strategicpurchasing-scrollright                                         0                               0                          0                            0                            0                          0

   EXAMPLE_SGCBDC_Native_Visits  EXAMPLE_ConstructionDive_Email_Visit  EXAMPLE_ConstructionDive_PromotedStory_Visit  EXAMPLE_SGCPB_PromotedStory_Visit  EXAMPLE_SGCBDC_PromotedStory_Visit  EXAMPLE_ConstructionDive_Native_Visits newCol_0 newCol_1  \
0                             0                                     0                                             0                                  0                                   0                                       0       us  EXAMPLE
1                             0                                     0                                             0                                  0                                   0                                       0       us  EXAMPLE
2                             0                                     0                                             0                                  0                                   0                                       0       us  EXAMPLE
3                             0                                     0                                             0                                  0                                   0                                       0       us  EXAMPLE
4                             0                                     0                                             0                                  0                                   0                                       0       us  EXAMPLE

              newCol_2     newCol_3    newCol_4
0             article1   scrolldown  findoutnow
1             article1  scrollright        None
2             article1   findoutnow        None
3   payablesmanagement   findoutnow        None
4  strategicpurchasing  scrollright        None
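join appends the new columns at the end of the frame. If they should instead sit right next to Information, closer to what Excel's "Text to Columns" produces, one possible sketch is to slice the frame around that column and concatenate. The sample frame and the `Visits` column here are made up for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for data_xls.
data_xls = pd.DataFrame({
    "Date": ["2018-08-20", "2018-08-20"],
    "Information": [
        "us-EXAMPLE-article1-scrolldown-findoutnow",
        "us-EXAMPLE-article1-scrollright",
    ],
    "Visits": [0, 1],
})

parts = data_xls["Information"].str.split("-", expand=True).add_prefix("newCol_")

# Position of Information, so the new columns can be placed right after it.
pos = data_xls.columns.get_loc("Information")
result = pd.concat(
    [data_xls.iloc[:, :pos + 1], parts, data_xls.iloc[:, pos + 1:]],
    axis=1,
)
print(result.columns.tolist())
```

All other columns keep their values and relative order; only the position of the new columns changes.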
