
How to convert a multi-line string to a DataFrame

My sample string looks like this:

>>> x3 = '\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n      DST3: 10.1.3.1\n    \n    \n      DST: 11.1.1.1\n      DST2: 11.1.2.1\n      DST3: 11.1.3.1\n    \n    \n'
>>> print(x3)

  DST: 10.1.1.1
  DST2: 10.1.2.1
  DST3: 10.1.3.1


  DST: 11.1.1.1
  DST2: 11.1.2.1
  DST3: 11.1.3.1

I want to convert it into a DataFrame with DST, DST2, and DST3 as columns.

You can do it like this:

import pandas as pd

# get (key, value) pairs from the string, skipping blank lines
items = (line.strip().split(': ') for line in x3.splitlines() if line.strip())

# build data
d = {}
for key, value in items:
    d.setdefault(key, []).append(value)

# convert it to a DataFrame
result = pd.DataFrame(d)

print(result)

Output

        DST      DST2      DST3
0  10.1.1.1  10.1.2.1  10.1.3.1
1  11.1.1.1  11.1.2.1  11.1.3.1

The line:

items = (line.strip().split(': ') for line in x3.splitlines() if line.strip())

is a generator expression. For the purposes of this question, you can treat it as equivalent (but not identical) to the following for loop:

items = []
for line in x3.splitlines():
    if line.strip():
        items.append(line.strip().split(': '))
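The "equivalent but not identical" caveat can be illustrated with a small sketch. It uses a shortened version of the sample string (an assumption made for brevity) and shows that a generator expression is evaluated lazily and can be consumed only once, whereas the list built by the loop can be reused:

```python
# Shortened sample string (assumption: two keys are enough to show the idea)
x3 = '\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n    \n'

# Generator expression: nothing is computed until it is iterated
items = (line.strip().split(': ') for line in x3.splitlines() if line.strip())

first_pass = list(items)   # consumes the generator
second_pass = list(items)  # the generator is now exhausted

print(first_pass)   # [['DST', '10.1.1.1'], ['DST2', '10.1.2.1']]
print(second_pass)  # []
```

For the answer above this makes no practical difference, because the pairs are iterated exactly once when the dictionary is built.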

Also, splitlines, strip, and split are methods of str.
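As a rough illustration of what each of these three methods contributes, here is a minimal sketch on a one-record string (the variable names are chosen for this example only):

```python
text = '\n      DST: 10.1.1.1\n    \n'

# splitlines() breaks the string at line boundaries and drops the newlines
lines = text.splitlines()
print(lines)      # ['', '      DST: 10.1.1.1', '    ']

# strip() removes leading and trailing whitespace
stripped = lines[1].strip()
print(stripped)   # DST: 10.1.1.1

# split(': ') cuts the line into a [key, value] pair at the separator
pair = stripped.split(': ')
print(pair)       # ['DST', '10.1.1.1']
```

Whitespace-only lines become empty strings after strip(), which is why the `if line.strip()` filter discards them.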

import pandas as pd

if __name__ == '__main__':

    x3 = "\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n      DST3: 10.1.3.1\n    \n    \n      DST: 11.1.1.1\n      DST2: 11.1.2.1\n      DST3: 11.1.3.1\n    \n    \n"
    # remove all spaces
    x3_no_space = x3.replace(" ", "")
    # replace newlines with "&" so the string can be split on one character
    x3_no_new_line = x3_no_space.replace("\n", "&")
    # split on "&"
    x3_split = x3_no_new_line.split("&")

    # list of non-empty "key:value" entries
    DST_data = []
    # dictionary used to build the DataFrame
    DST_TABLE = dict()

    # loop over the split entries
    for DST in x3_split:
        # skip empty entries; keep the rest in DST_data
        if DST != '':
            DST_data.append(DST)
            # split the entry on ":"
            DST_split = DST.split(":")
            # register the column name with an empty list of values
            DST_TABLE[DST_split[0]] = []

    # fill the columns from the collected entries
    for COL_DATA in DST_data:
        # split the entry on ":"
        DATA = COL_DATA.split(":")
        # loop over the column names in the dictionary
        for COLS in DST_TABLE:
            # when the entry's key matches the column name, append its value
            if DATA[0] == COLS:
                DST_TABLE[COLS].append(DATA[1])

    # print the intermediate dictionary
    print("Python dictionary")
    print(DST_TABLE)

    # convert the dictionary to a DataFrame using pandas
    dataframe = pd.DataFrame.from_dict(DST_TABLE)
    print("DATA FRAME")
    print(dataframe)
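Both answers collect values column by column, so they rely on every block containing the same keys; if one block lacked a key, the column lists would have unequal lengths and the DataFrame could not be built from them. A possible alternative, sketched here under the assumption that blank lines separate the records (as they do in the sample string), builds one dictionary per record instead. The names `records` and `current` are introduced only for this illustration:

```python
import pandas as pd

x3 = '\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n      DST3: 10.1.3.1\n    \n    \n      DST: 11.1.1.1\n      DST2: 11.1.2.1\n      DST3: 11.1.3.1\n    \n    \n'

records = []   # one dict per block of lines
current = {}   # the record currently being filled

for line in x3.splitlines():
    line = line.strip()
    if not line:
        # assumption: a blank line ends the current record, if there is one
        if current:
            records.append(current)
            current = {}
        continue
    # split only on the first ': ' so the value may itself contain ': '
    key, value = line.split(': ', 1)
    current[key] = value

# keep the last record if the string does not end with a blank line
if current:
    records.append(current)

result = pd.DataFrame(records)
print(result)
```

With a list of dictionaries, pandas fills any key missing from a record with NaN rather than raising an error.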
