
Convert a txt file to a data frame in Python

I have a text file containing some data, and I need to split it into a data frame. Here is my text file:

    2012/02/03 18:55:54 SampleClass1 verb detail for id 19471668
    verb detail for id 185289
    verb detail for id 185289
    verb detail for id 1852849
    2012/03/03 18:55:54 SampleClass8 detail for id 2181536
    2012/04/03 18:55:54 SampleClass1 verb detail for id 1765383670
    2012/05/03 18:55:54 SampleClass9 verb detail for id 1666944491
    2012/06/03 18:55:54 SampleClass8 detail for id 799914029 verb detail for id 185229

I want to split out the date and the time separately, along with the remaining string, and then convert the result to a data frame.

My expected output:

date       time     desc
2012/02/03 18:55:54 SampleClass9 verb detail for id 1947166588
                    verb  detail for id 185289
                    verb detail for id 185289
                    verb detail for id 1852849

2012/03/03 18:55:54 SampleClass8 detail for id 218851536
                    verb detail for id 1852829
                    verb detail for id 185289
                    verb detail for id 1852849
2012/04/03 18:55:54 SampleClass1 verb detail for id 1765383670
                    verb detail for id 1852829
                    verb detail for id 1852829
                    verb detail for id 1852849
2012/05/03 18:55:54 SampleClass9 verb detail for id 1666944491
                    verb detail for id 1852829
                     verb detail for id 1852829
                     verb detail for id 18528429
2012/06/03 18:55:54 SampleClass8 detail for id 799914029 verb detail for id 1852844029
                    verb detail for id 1852829
                    verb detail for id 1852829
                    verb detail for id 18528429

Based on your input data, the code below does the job.

import csv
import pandas as pd

file = "/path/to/file/"
# Open the text file
with open(file, "r", newline="") as fp:
    # Read the text file, splitting each line on single spaces
    reader = csv.reader(fp, delimiter=" ")
    rows = []
    # Loop through the rows
    for row in reader:
        # Skip empty rows
        if not row:
            continue
        # If the row starts with a digit, it begins with a date and a time:
        # keep those two columns and join everything after them into desc.
        # row[0][:1] avoids an IndexError when the line starts with a space.
        elif row[0][:1].isdigit():
            rows.append(row[:2] + [' '.join(row[2:])])
        # Otherwise leave date and time empty and join the whole row into desc
        else:
            rows.append(['', ''] + [' '.join(row)])

# Build the data frame once, after all rows have been collected
df = pd.DataFrame(rows, columns=['date', 'time', 'desc'])
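
As an alternative sketch, the same split can be done with a regular expression instead of `csv.reader`, assuming every timestamped line starts with a date in `YYYY/MM/DD` form followed by an `HH:MM:SS` time. This sidesteps the quoting rules of the csv module and tolerates leading or trailing whitespace. The names `TIMESTAMPED` and `parse_lines` are illustrative only, and the sample lines are copied from the question.

```python
import re

# Assumed line shape: "YYYY/MM/DD HH:MM:SS <description>"
TIMESTAMPED = re.compile(r"^(\d{4}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s*(.*)$")


def parse_lines(lines):
    """Turn raw text lines into [date, time, desc] rows."""
    rows = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        match = TIMESTAMPED.match(text)
        if match:
            # Timestamped line: date, time, and the rest as the description
            rows.append(list(match.groups()))
        else:
            # Continuation line: no timestamp, so date and time stay empty
            rows.append(["", "", text])
    return rows


sample = [
    "2012/02/03 18:55:54 SampleClass1 verb detail for id 19471668",
    "verb detail for id 185289",
    "",
    "2012/03/03 18:55:54 SampleClass8 detail for id 2181536",
]
rows = parse_lines(sample)
```

The resulting `rows` list has the same shape as in the code above, so it can be passed to `pd.DataFrame(rows, columns=['date', 'time', 'desc'])` in the same way. To read from a file, pass the open file object to `parse_lines`, since iterating over it yields one line at a time.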

