Python: how do I open a CSV file for reading that was modified in the same program?

I have a .csv file with a header row.

I am trying to delete the header row and then open the same file for reading.

But the first line read is still the header line. How do I delete the header line and start reading from the first line of data?

Code snippet:

# Sort the cleaned file on r2
df = pd.read_csv(cleaned_file + ".csv", names=['r2','r5','r7','r12','r15','r70','r83'])
sorted_df = df.sort_values(by=["r2"], ascending=True)
sorted_df.to_csv(cleaned_file_sorted_on_ts + '.csv', index=False)

# Remove the header line from the cleaned_file_sorted_on_ts file
cmd = "tail -n +2 " + cleaned_file_sorted_on_ts + ".csv" + " > tmp.csv && mv tmp.csv " + cleaned_file_sorted_on_ts + ".csv"
print(cmd)
proc = Popen(cmd, shell=True, stdout=PIPE)

with open(cleaned_file_sorted_on_ts + ".csv","r") as infile:
    first_line = infile.readline().strip('\n')
    print("First line in cleaned file = {}".format(first_line))

The output I am getting is:

tail -n +2 /ghostcache/Run.multi.rollout/h2_lines_cleaned_sorted.csv > tmp.csv && mv tmp.csv /ghostcache/Run.multi.rollout/h2_lines_cleaned_sorted.csv
First line in cleaned file = r2,r5,r7,r12,r15,r70,r83
Traceback (most recent call last):
  File "process_r83.py", line 51, in <module>
    first_ts = int(float(first_line.split(',')[0]))
ValueError: could not convert string to float: 'r2'

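You should reload the file into a pandas DataFrame after removing the header line with the shell command, and then read the first row of the DataFrame instead of the first line of the file. Make sure the shell command has actually finished before you reload the file: `Popen` starts the command and returns immediately, so the file can be opened while the header is still in it. Can you try this out?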

# Sort the cleaned file on r2
df = pd.read_csv(cleaned_file + ".csv", names=['r2','r5','r7','r12','r15','r70','r83'])
sorted_df = df.sort_values(by=["r2"], ascending=True)
sorted_df.to_csv(cleaned_file_sorted_on_ts + '.csv', index=False)

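# Remove the header line from the cleaned_file_sorted_on_ts file
cmd = "tail -n +2 " + cleaned_file_sorted_on_ts + ".csv" + " > tmp.csv && mv tmp.csv " + cleaned_file_sorted_on_ts + ".csv"
print(cmd)
proc = Popen(cmd, shell=True, stdout=PIPE)
# Popen returns immediately; block until the shell command has finished,
# otherwise the file may still contain the header when it is read
proc.communicate()

# Re-load the file into a DataFrame; the header row is gone now, so
# pass header=None and supply the column names explicitly
df = pd.read_csv(cleaned_file_sorted_on_ts + ".csv", header=None,
                 names=['r2','r5','r7','r12','r15','r70','r83'])

# Get the first row of the DataFrame
first_line = df.iloc[0]
print("First line in cleaned file = {}".format(first_line))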
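
If you would rather keep reading the file directly, the important point is that the header-removal step must be complete before the file is opened. The following is a minimal sketch of that idea, not the original poster's code: `strip_header_in_place` and the `.tmp` suffix are hypothetical names chosen for illustration, and it assumes a POSIX system where `tail` is available. It uses `subprocess.run`, which waits for the command to finish, and `check=True`, which raises an error if the command fails.

```python
import os
import subprocess
import tempfile


def strip_header_in_place(csv_path):
    """Remove the first line of csv_path, returning only once it is done.

    Hypothetical helper for illustration; assumes `tail` is on the PATH.
    """
    tmp_path = csv_path + ".tmp"  # assumed temp file name next to the original
    with open(tmp_path, "w") as out:
        # subprocess.run blocks until tail exits; check=True raises
        # CalledProcessError on a non-zero exit status.
        subprocess.run(["tail", "-n", "+2", csv_path], stdout=out, check=True)
    # Replace the original file with the header-less copy.
    os.replace(tmp_path, csv_path)


# Small demonstration with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "sorted.csv")
    with open(path, "w") as f:
        f.write("r2,r5,r7\n1.0,a,b\n2.0,c,d\n")

    strip_header_in_place(path)

    with open(path, "r") as infile:
        first_line = infile.readline().strip("\n")
    print("First line in cleaned file = {}".format(first_line))
```

Passing the command as a list instead of a shell string also avoids quoting problems if the path contains spaces. If the only goal is a file without a header, it may be simpler to skip the shell step entirely and write the sorted frame with `sorted_df.to_csv(path, index=False, header=False)`, since `header=False` tells pandas not to write the column names.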
