How to load in txt file as data in Python?
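As others have mentioned, the shape you are looking for is a bit unclear. As a general starting point that gets the data into a very flexible format, you can read the text file into Python and convert it into a pandas DataFrame. I'm sure there are other, more compact ways to do this, but to lay out the steps clearly, we can start with: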
import pandas as pd
import re
file = 'filepath'  # this is the file path to the saved text file
# read every line; the with-block closes the file once we are done
with open(file, 'r') as music:
    lines = music.readlines()
# split the lines by comma
lines = [line.split(',') for line in lines]
# capturing the column line
columns = lines[9]
# capturing the actual content of the data, and dismissing the header info
content = lines[10:]
musicdf = pd.DataFrame(content)
# assign the column names to our dataframe
musicdf.columns = columns
# preview the dataframe
musicdf.head(10)
# the final column had formatting issues, so wanted to provide code to get rid of the "\n" in both the column title and the column values
def cleaner(txt):
    txt = re.sub(r'[\n]+', '', txt)
    return txt
# rename the column of issue
musicdf = musicdf.rename(columns = {'var_timbre12\n' : 'var_timbre12'})
# applying the column cleaning function above to the column of interest
musicdf['var_timbre12'] = musicdf['var_timbre12'].apply(cleaner)
# checking the top and bottom of dataframe for column var_timbre12
musicdf['var_timbre12'].head(10)
musicdf['var_timbre12'].tail(10)
The result looks like this:
   %genre                track_id            artist_name
0  classic pop and rock  TRFCOOU128F427AEC0  Blue Oyster Cult
1  classic pop and rock  TRNJTPB128F427AE9F  Blue Oyster Cult
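For a more compact route to the same kind of dataframe, pandas can do the splitting and header skipping itself. The following is only a sketch: it assumes, as the code above does, that the first 9 lines are header info and the 10th line holds the column names. The sample text, the helper name load_music and the numeric values are made up for illustration.

```python
import io
import pandas as pd

def load_music(source, header_lines=9):
    """Read comma-separated text whose first `header_lines` lines are
    free-form header info, followed by one line of column names."""
    # skiprows drops the header info; the next line becomes the column names.
    return pd.read_csv(source, skiprows=header_lines)

# Made-up stand-in for the real file: 9 header lines, a column line, two rows.
sample = "".join(f"% header line {i}\n" for i in range(9))
sample += "%genre,track_id,artist_name,var_timbre12\n"
sample += "classic pop and rock,TRFCOOU128F427AEC0,Blue Oyster Cult,1.5\n"
sample += "classic pop and rock,TRNJTPB128F427AE9F,Blue Oyster Cult,2.5\n"

# With a real file, pass its path instead of the StringIO object.
musicdf = load_music(io.StringIO(sample))
print(musicdf.head())
```

With read_csv the trailing "\n" never ends up in the last column, and numeric columns are parsed as numbers rather than left as strings.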
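With the data in this format, you can now use the pandas groupby function to perform all sorts of grouping tasks, look up particular genres and their associated attributes, and so on.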
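As a rough illustration of that kind of grouping, here is a small sketch on made-up rows shaped like the dataframe above. The "folk" row, its track id and the var_timbre12 values are invented. Note that a dataframe built by splitting lines holds every value as a string, so a column has to be converted to numbers before it can be averaged.

```python
import pandas as pd

# Made-up rows; every value is a string, as it would be after str.split.
musicdf = pd.DataFrame({
    '%genre': ['classic pop and rock', 'classic pop and rock', 'folk'],
    'track_id': ['TRFCOOU128F427AEC0', 'TRNJTPB128F427AE9F', 'TRXXXXX000000000000'],
    'artist_name': ['Blue Oyster Cult', 'Blue Oyster Cult', 'Hypothetical Artist'],
    'var_timbre12': ['1.5', '2.5', '4.0'],
})

# Convert the text column to numbers before aggregating.
musicdf['var_timbre12'] = pd.to_numeric(musicdf['var_timbre12'])

# Number of tracks per genre.
tracks_per_genre = musicdf.groupby('%genre').size()

# Mean of one attribute per genre.
mean_timbre = musicdf.groupby('%genre')['var_timbre12'].mean()

# All rows belonging to a single genre.
folk = musicdf[musicdf['%genre'] == 'folk']

print(tracks_per_genre)
print(mean_timbre)
```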
Hope this helps!