Pandas is not reading the entire .CSV file

Question

I have been testing some issues I've had with Pandas. My end goal here is to add data to a .csv. While figuring out ways to change a .csv, I settled on this method:

import pandas
data = pandas.read_csv('path/to/my/script/test.csv')

data.iat[1,1] = 'DataHere'

data.to_csv('path/to/my/script/test.csv', index=False, header=False)

This code mostly worked. DataHere goes to the second row and second column, which is correct (because [0,0] is the first row and column; note that the order is not the usual x,y coordinates but row, column, so more like y,x).

test.csv before running the code (6x6):

yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes

test.csv after running the code (6x5):

yes,yes,yes,yes,yes,yes
yes,DataHere,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes

It gets rid of the lowermost row for some reason! So I experimented with the parameters of pandas.read_csv('path/to/my/script/test.csv') to fix this problem, and got this:

data = pandas.read_csv('path/to/my/script/test.csv', nrows=6, skip_blank_lines=False)

I added nrows=6 to make it read 6 rows, although I do intend to make it higher in the future. I added skip_blank_lines=False because I want to be able to add data to blank cells.

When I ran this new code (after restoring the csv to its previous 6x6 state), it didn't help. It still erases the 6th row.

import pandas
data = pandas.read_csv('path/to/my/script/test.csv', nrows=6, skip_blank_lines=False)

data.iat[1,1] = 'DataHere'

data.to_csv('path/to/my/script/test.csv', index=False, header=False)

I also tried data.iat[6,3] = 'DataHere' instead of data.iat[1,1] = 'DataHere', which returned this error:

IndexError: index 6 is out of bounds for axis 0 with size 5

This shows that not only is it erasing the last row, but it also cannot add data to a blank cell. To make sure that this line was at fault: data = pandas.read_csv('path/to/my/script/test.csv', nrows=6, skip_blank_lines=False), I put print(data) on the line immediately after it and got this output (plus the previously stated errors). There should be a 5th row of 'yes' there. So my two problems are:

Deletion of a row.
Not being able to add data to a blank cell.

Answer 1

pandas.read_csv('path/to/my/script/test.csv') uses the first row as a header row by default. Your test.csv does not have a header row, so the first data row in test.csv is most likely being read as the header. That gives you 5 data rows and not the 6 you expect.

This is probably what is happening:

import io
import pandas as pd

# Simulate the header-less test.csv in memory
sim_csv = io.StringIO(
'''yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes
yes,yes,yes,yes,yes,yes'''
)

data = pd.read_csv(sim_csv)
print(data)

   yes yes.1 yes.2 yes.3 yes.4 yes.5
0  yes   yes   yes   yes   yes   yes
1  yes   yes   yes   yes   yes   yes
2  yes   yes   yes   yes   yes   yes
3  yes   yes   yes   yes   yes   yes
4  yes   yes   yes   yes   yes   yes

Then, when you write out the CSV with to_csv(header=None), the column names are not written, so you lose that first row of data.
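To see the row disappear without touching a real file, here is a minimal sketch that assumes an in-memory copy of the 6x6 test.csv (the names raw, out, lines_in and lines_out are illustrative, not from the question). It reads with the default header handling and writes with header=False, as in the question.

```python
import io

import pandas as pd

# Six header-less lines standing in for test.csv (assumed content).
raw = "\n".join(["yes,yes,yes,yes,yes,yes"] * 6) + "\n"

# Default header handling: the first line becomes the column names.
data = pd.read_csv(io.StringIO(raw))

# header=False means the column names (the old first line) are not written.
out = io.StringIO()
data.to_csv(out, index=False, header=False)

lines_in = raw.splitlines()
lines_out = out.getvalue().splitlines()
print(len(lines_in), "lines read,", len(lines_out), "lines written")
print(list(data.columns))
```

The line count drops by one because the first line was stored as column names and never written back.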

To get around this, you could do:

pandas.read_csv('path/to/my/script/test.csv', header=None)

or you could do this:

data.to_csv('path/to/my/script/test.csv', index=False)

Just make sure that you're consistent with header=None: either pass header=None to both read_csv and to_csv, or leave it out of both. Don't set header=None on only one of them.
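As a rough illustration of the consistent version, the sketch below reads and writes with no header on either side. It assumes the same 6x6 content held in memory instead of the real file path, and the names raw, out and result are made up for the example.

```python
import io

import pandas as pd

# Six header-less lines standing in for test.csv (assumed content).
raw = "\n".join(["yes,yes,yes,yes,yes,yes"] * 6) + "\n"

# header=None: every line is data; columns are labelled 0..5.
data = pd.read_csv(io.StringIO(raw), header=None)
data.iat[1, 1] = "DataHere"  # second row, second column

# Write back without header or index, mirroring how the data was read.
out = io.StringIO()
data.to_csv(out, index=False, header=False)
result = out.getvalue()

print(data.shape)
print(result)
```

With the real file, the two io.StringIO objects would be replaced by the path passed to read_csv and to_csv.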

Adding a row (cell to a new row)

You can add a row (a cell in a new row) by using the index label. For example, if you had:

   yes yes.1 yes.2 yes.3 yes.4 yes.5
0  yes   yes   yes   yes   yes   yes
1  yes   yes   yes   yes   yes   yes
2  yes   yes   yes   yes   yes   yes
3  yes   yes   yes   yes   yes   yes
4  yes   yes   yes   yes   yes   yes

Then you could do the following (note that this is .at and not .iat):

df.at[5,'yes'] = 'yes'

Which will give you:

   yes yes.1 yes.2 yes.3 yes.4 yes.5
0  yes   yes   yes   yes   yes   yes
1  yes   yes   yes   yes   yes   yes
2  yes   yes   yes   yes   yes   yes
3  yes   yes   yes   yes   yes   yes
4  yes   yes   yes   yes   yes   yes
5  yes   NaN   NaN   NaN   NaN   NaN
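The difference between .iat and .at can be sketched as follows. This example assumes a 5-row frame read with header=None, so the column labels are the integers 0..5, not 'yes', 'yes.1', and so on as in the frame above; raw and iat_failed are illustrative names.

```python
import io

import pandas as pd

# Five header-less lines (assumed content), read so columns are labelled 0..5.
raw = "\n".join(["yes,yes,yes,yes,yes,yes"] * 5) + "\n"
df = pd.read_csv(io.StringIO(raw), header=None)

# .iat is purely positional and cannot grow the frame.
iat_failed = False
try:
    df.iat[5, 0] = "yes"
except IndexError as exc:
    iat_failed = True
    print("iat failed:", exc)

# .at is label-based; assigning to a row label that does not exist adds the row.
df.at[5, 0] = "yes"
print(df)
```

The cells of the new row that were not assigned are filled with missing values, matching the NaN entries shown above.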

Answer 1: accepted, 2022-06-17 19:05:02