Convert .dat into .csv in Python
I want to convert a data set from a .dat file into a .csv file. The data format looks like this:
Each row begins with the sentiment score followed by the text associated with that rating.
I want the sentiment value (-1 or 1) in one column and the review text corresponding to that sentiment value in a second column.
WHAT I TRIED SO FAR
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import csv

# read flash.dat to a list of lists
datContent = [i.strip().split() for i in open("train.dat").readlines()]

# write it as a new CSV file
with open("train.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(datContent)

def your_func(row):
    return row['Sentiments'] / row['Review']

columns_to_keep = ['Sentiments', 'Review']
dataframe = pd.read_csv("train.csv", usecols=columns_to_keep)
dataframe['new_column'] = dataframe.apply(your_func, axis=1)
print dataframe
Sample screenshot of the resulting train.csv: it has a comma after every word in the review.
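A comma after every word is consistent with how str.split() behaves when called with no arguments: it splits on every run of whitespace, so each word of the review becomes its own CSV field. A minimal sketch of one way around this, assuming each line is a label, some whitespace, then the review text, is to split only once. The helper name split_label_and_text and the sample lines are illustrative, not taken from the original data, and the sketch targets Python 3.

```python
import csv
import io


def split_label_and_text(line):
    """Split a line into [label, text] at the first run of whitespace only."""
    parts = line.strip().split(None, 1)  # maxsplit=1 keeps the review intact
    label = parts[0]
    text = parts[1] if len(parts) > 1 else ""
    return [label, text]


# Illustrative lines in the assumed "<label> <review text>" layout.
sample_lines = [
    "-1 ieafxf rjzy xfxk ymi wuy\n",
    "+1 lqqm ceegjnbjpxnidygr\n",
]

rows = [split_label_and_text(line) for line in sample_lines if line.strip()]

# Write to an in-memory buffer here; a real file would be opened with
# open("train.csv", "w", newline="") in Python 3.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Sentiments", "Review"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

With a header row written this way, the later pd.read_csv call with usecols=['Sentiments', 'Review'] would have column names to match against.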
If all your rows follow that consistent format, you can use pd.read_fwf. This is a little safer than using read_csv, in the event that your second column also contains the delimiter you are attempting to split on.
df = pd.read_fwf('data.txt', header=None,
                 widths=[2, int(1e5)], names=['label', 'text'])
print(df)
   label                      text
0     -1  ieafxf rjzy xfxk ymi wuy
1      1     lqqm ceegjnbjpxnidygr
2     -1  zss awoj anxb rfw kgbvnl
data.txt
-1 ieafxf rjzy xfxk ymi wuy
+1 lqqm ceegjnbjpxnidygr
-1 zss awoj anxb rfw kgbvnl
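To illustrate the point about delimiters inside the second column, here is a small sketch that feeds read_fwf a review containing commas. The sample text is made up for illustration, and the two-character label width is the same assumption the call above makes.

```python
import io

import pandas as pd

# Made-up sample: the first review contains commas, which would confuse
# a comma-based split but not a fixed-width read.
sample = io.StringIO(
    "-1 slow, noisy, and overpriced\n"
    "+1 lqqm ceegjnbjpxnidygr\n"
)

# The first 2 characters are the label; everything after is the text.
df = pd.read_fwf(sample, header=None,
                 widths=[2, int(1e5)], names=["label", "text"])
print(df)
```

Because the split is by character position rather than by a separator, the commas stay inside the text column.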
As mentioned in the comments, read_csv would be appropriate here.
df = pd.read_csv('train_csv.csv', sep='\t', names=['Sentiments', 'Review'])
   Sentiments   Review
0          -1  alskjdf
1           1    asdfa
2           1     afsd
3          -1      sdf
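The call above only reads the data. To finish the conversion, the frame still has to be written back out with commas. A rough sketch of the full round trip follows, assuming the input really is tab-separated (which is what sep='\t' implies). The in-memory sample stands in for the real file, and its contents are illustrative.

```python
import io

import pandas as pd

# Illustrative tab-separated input standing in for the real .dat file.
tab_separated = io.StringIO(
    "-1\talskjdf, with a comma\n"
    "1\tasdfa\n"
)

# Passing names= explicitly means the first line is treated as data.
df = pd.read_csv(tab_separated, sep="\t", names=["Sentiments", "Review"])

# to_csv with no path returns the CSV as a string; pass a filename such as
# "train.csv" instead to write to disk. index=False drops the row numbers.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Fields that contain a comma are quoted on output, so the review still reads back as a single column.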