How to convert a multi-line string to a DataFrame

Question

My sample string looks like this:

>>> x3 = '\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n      DST3: 10.1.3.1\n    \n    \n      DST: 11.1.1.1\n      DST2: 11.1.2.1\n      DST3: 11.1.3.1\n    \n    \n'
>>> print(x3)

  DST: 10.1.1.1
  DST2: 10.1.2.1
  DST3: 10.1.3.1


  DST: 11.1.1.1
  DST2: 11.1.2.1
  DST3: 11.1.3.1

I want to convert it into a DataFrame with DST, DST2, and DST3 as columns.

Answer 1

You can do it like this:

import pandas as pd

# get key, value pairs from string
items = (line.strip().split(': ') for line in x3.splitlines() if line.strip())

# build data
d = {}
for key, value in items:
    d.setdefault(key, []).append(value)

# convert it to a DataFrame
result = pd.DataFrame(d)

print(result)

Output

        DST      DST2      DST3
0  10.1.1.1  10.1.2.1  10.1.3.1
1  11.1.1.1  11.1.2.1  11.1.3.1
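The dictionary-of-lists approach above relies on every block containing the same keys. As a hedged variant, the sketch below builds one record per blank-line-separated block, so a block that lacks a key produces NaN rather than columns of unequal length. The helper name `parse_blocks` and the sample with a missing `DST3` are illustrative assumptions, not part of the original answer; it also assumes every non-empty line contains `': '`.

```python
import pandas as pd


def parse_blocks(text):
    """Return one dict per blank-line-separated block of 'KEY: value' lines."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current block, if one is open.
            if current:
                records.append(current)
                current = {}
            continue
        # maxsplit=1 keeps values that themselves contain ': ' intact.
        key, value = line.split(': ', 1)
        current[key] = value
    if current:
        records.append(current)
    return records


# Hypothetical sample: the second block has no DST3 entry.
sample = '\n  DST: 10.1.1.1\n  DST2: 10.1.2.1\n  DST3: 10.1.3.1\n\n\n  DST: 11.1.1.1\n  DST2: 11.1.2.1\n\n'
records = parse_blocks(sample)
df = pd.DataFrame(records)
print(df)
```

Passing a list of dicts to `pd.DataFrame` aligns values by key, which is why the missing entry shows up as NaN in the second row.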

The line:

items = (line.strip().split(': ') for line in x3.splitlines() if line.strip())

is a generator expression. For the purposes of this question, you can think of it as equivalent (but not identical) to the following for loop:

items = []
for line in x3.splitlines():
    if line.strip():
        items.append(line.strip().split(': '))
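To illustrate "equivalent but not identical": both forms yield the same key/value pairs, but a generator is lazy and can be iterated only once, whereas a list is built immediately and can be reused. A minimal sketch with a shortened sample string (`text` and the other variable names are illustrative):

```python
text = '\n  DST: 10.1.1.1\n  DST2: 10.1.2.1\n'

# Generator expression: nothing is computed until it is iterated.
gen_items = (line.strip().split(': ') for line in text.splitlines() if line.strip())

# List built with an explicit loop: computed immediately and stored.
list_items = []
for line in text.splitlines():
    if line.strip():
        list_items.append(line.strip().split(': '))

first_pass = list(gen_items)   # consumes the generator
second_pass = list(gen_items)  # the generator is now exhausted

print(first_pass == list_items)  # True
print(second_pass)               # []
```

In the answer this difference does not matter, because the pairs are iterated exactly once.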

Also, splitlines, strip, and split are methods of Python's str type.
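As a quick illustration of what each of these three methods does on its own, here is a small sketch on a shortened sample (the variable names are illustrative):

```python
text = '\n  DST: 10.1.1.1\n  DST2: 10.1.2.1\n'

lines = text.splitlines()    # break the string at line boundaries
stripped = lines[1].strip()  # drop leading and trailing whitespace
pair = stripped.split(': ')  # split into [key, value] at ': '

print(lines)     # ['', '  DST: 10.1.1.1', '  DST2: 10.1.2.1']
print(stripped)  # DST: 10.1.1.1
print(pair)      # ['DST', '10.1.1.1']
```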

Answer 2

import pandas as pd

if __name__ == '__main__':

    x3 = "\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n      DST3: 10.1.3.1\n    \n    \n      DST: 11.1.1.1\n      DST2: 11.1.2.1\n      DST3: 11.1.3.1\n    \n    \n"
    # remove all spaces
    x3_no_space = x3.replace(" ", "")
    # replace newlines with '&' so the string can be split on one delimiter
    x3_no_new_line = x3_no_space.replace("\n", "&")
    # split on '&'
    x3_split = x3_no_new_line.split("&")

    # list of the non-empty "KEY:value" entries
    DST_data = []
    # dictionary used to build the DataFrame
    DST_TABLE = dict()

    # loop over the split entries
    for DST in x3_split:
        # skip empty entries; keep the rest in DST_data
        if DST != '':
            DST_data.append(DST)
            # split the entry on ':'
            DST_split = DST.split(":")
            # register the column name in the dictionary with an empty list
            DST_TABLE[DST_split[0]] = []

    # second pass: fill each column
    for COL_DATA in DST_data:
        # split the entry on ':'
        DATA = COL_DATA.split(":")
        # loop over the column names in the dictionary
        for COLS in DST_TABLE:
            # append the value to the column whose name matches the key
            if DATA[0] == COLS:
                DST_TABLE[COLS].append(DATA[1])

    # print the intermediate dictionary
    print("Python dictionary")
    print(DST_TABLE)

    # convert dictionary to dataframe using pandas
    dataframe = pd.DataFrame.from_dict(DST_TABLE)
    print("DATA FRAME")
    print(dataframe)
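The two passes above can also be condensed. The following is a sketch of an alternative, not the answerer's code: it assumes keys consist of word characters and values contain no whitespace, and it uses `re.findall` and `collections.defaultdict` from the standard library to build the same kind of column dictionary. The resulting `table` could then be passed to `pd.DataFrame.from_dict` as in the answer.

```python
import re
from collections import defaultdict

x3 = "\n      DST: 10.1.1.1\n      DST2: 10.1.2.1\n      DST3: 10.1.3.1\n    \n    \n      DST: 11.1.1.1\n      DST2: 11.1.2.1\n      DST3: 11.1.3.1\n    \n    \n"

# Each match is a (key, value) tuple such as ('DST', '10.1.1.1').
# [ \t]* skips spaces after the colon without crossing a line break.
pairs = re.findall(r'(\w+):[ \t]*(\S+)', x3)

# Group the values by key; a missing key starts as an empty list.
table = defaultdict(list)
for key, value in pairs:
    table[key].append(value)

table = dict(table)
print(table)
```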

2 answers

Answer 1
4 accepted 2019-11-20 19:46:33

Answer 2
2 2019-11-20 20:14:13
