In Python, how can I construct a loop that lets me read a tab-delimited txt file and store each 1000 rows as its own dataframe?
Below is a sample of my data. There is a ten-row header, then exactly 1000 rows of data, and this repeats for 30 cycles (these are trials for a lab experiment). I have 8 of these files with the same format, and I would like to extract each batch so I can process it further. How do I make a loop that creates a new dataframe each time to store the new rows?
Channels 1
Samples 1000
Date 2020/02/12
Time 10:11:36.6395705499426038443
Y_Unit_Label Volts
X_Dimension Time
X0 0.0000000000000000E+0
Delta_X 0.001000
***End_of_Header***
X_Value Voltage Comment
0.000000 4.930675 4.96V\0A69.0 cm\0A6.9 degrees
0.001000 4.934949
0.002000 4.931990
0.003000 4.923443
I'm trying to do something like the code below, but I can't figure out how to get pandas to create a new dataframe for each iteration.
collection=['Rawdata01.txt','Rawdata02.txt','Rawdata03.txt']
result = pd.DataFrame()
for i in collection:
    j=0
    mydf = pd.read_csv(i,sep='\t',header=(0),index_col=False)
    for row in mydf.iterrows():
        result = csv[1000*j + 10*(j+1):1000*(j+1) + 10*(j+1)] # how to get it to make newdataframes
        print(result.head())
        j=j+1
I've gotten pretty close, but I'm stuck on how to get either separate dataframes for each batch or one big one. At this point, either will work. Any help on this matter would be greatly appreciated.
After this line:
mydf = pd.read_csv(i,sep='\t',header=(0),index_col=False)
you already have the tab-delimited data in a single dataframe. To break it into chunks of 1000 rows each, you can try this:
sub_frames = [mydf.iloc[startrow:startrow+1000] for startrow in range(0, len(mydf), 1000)]
Then you have a list of dataframes, each with 1000 rows (except possibly the last one). The iloc indexer of a dataframe selects rows by integer position, which is how each chunk is extracted from the bigger dataframe.
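One caveat: the sample shows a header block before every trial, so fixed 1000-row slices only line up if those header lines are not in the frame. Below is a minimal sketch of a header-aware split. It assumes every trial is exactly 9 metadata lines, one column-name line, and then the data rows, with no blank lines in between; the sample suggests this layout, but it is not confirmed. The names `split_trials` and `make_trial` are hypothetical helpers, and the demo runs on small synthetic data rather than the real files.

```python
import io

import pandas as pd


def split_trials(lines, meta_lines=9, rows_per_trial=1000):
    """Return one DataFrame per trial from a list of raw text lines.

    Assumed layout per trial (not confirmed by the source): `meta_lines`
    metadata lines, one column-name line, then `rows_per_trial` data rows.
    """
    block = meta_lines + 1 + rows_per_trial
    frames = []
    for start in range(0, len(lines), block):
        # Drop the metadata lines; keep the column-name line and the data rows.
        chunk = lines[start + meta_lines:start + block]
        if len(chunk) < 2:  # need column names plus at least one data row
            continue
        frames.append(pd.read_csv(io.StringIO("\n".join(chunk)), sep="\t"))
    return frames


def make_trial(offset, n_rows):
    """Build synthetic lines that mimic one trial (illustration only)."""
    meta = [f"Meta{k}\tvalue" for k in range(9)]
    header = ["X_Value\tVoltage\tComment"]
    rows = [
        f"{i * 0.001:.6f}\t{offset + i:.6f}\t{'note' if i == 0 else ''}"
        for i in range(n_rows)
    ]
    return meta + header + rows


# Three synthetic trials of 4 rows each stand in for a real file.
lines = []
for t in range(3):
    lines += make_trial(t * 10, 4)

frames = split_trials(lines, rows_per_trial=4)

# If one big frame is preferred, stack the trials with a trial-level index.
combined = pd.concat(frames, keys=range(len(frames)), names=["trial", "row"])
print(combined.head())
```

For a real file, the lines could be loaded with `open(path).read().splitlines()` and passed to `split_trials` with the default `rows_per_trial=1000`; looping over the file names then gives one list of frames per file.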