How to edit an Excel file using a DataFrame and save it back as an Excel file?

I have this Excel file. I have also put a screenshot of the file below. [Excel screenshot]

I want to edit the data in the pitch-class column according to these three criteria:

  1. removing the ' ' marks around the text.
  2. removing the 0 values.
  3. removing the [] marks.

So, for example, from this text:

['0', 'E3', 'F3', 'F#3 / Gb3', 'G3', 'G#3 / Ab3', 'A3', 'A#3 / Bb3', 'B3', 'C4', 'C#4 / Db4', 'D4']

I want to make it look like this:

[E3, F3, F#3 / Gb3, G3, G#3 / Ab3, A3, A#3 / Bb3, B3, C4, C#4 / Db4, D4]

Of course, I could do this by hand, one entry at a time, but I have about 20 similar files to edit, so that is not practical, and I think I need help from Python.

My idea for doing this in Python is to load the Excel file into a DataFrame, edit the data row by row (maybe using the .remove() and .join() methods), and put the result back into the original Excel file, or maybe generate a new one containing the edited pitch-class column.

But I don't really know how to code it. So far, what I've tried to do is this:

  1. read the Excel file into Python.
  2. read the pitch-class column in that Excel file.
  3. load it into a DataFrame.

Below is my current code.

import pandas as pd 

file = '014_twinkle_twinkle 300 0.0001 dataframe.xlsx' # file attached above

df = pd.read_excel(file, index_col=None, usecols="C") # read only pitch-class column

# printing data
for row in df.iterrows():
    print(df['pitch-class'].astype(str))

My question is: how can I edit the pitch-class data row by row and put the result back into the original Excel file or a new one? I have difficulty accessing the df['pitch-class'] data because I can't get the string value. Is there any way to achieve this in Python?

In general you do not want to iterate over every row in a pandas DataFrame; it is very slow. There are many ways (which you can learn through practice over time) to apply functions over a column, a row, or the whole DataFrame in pandas. In this example:

Convert the column to string type, then remove the '0' entries and the ' characters by replacing them with an empty string:

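import pandas as pd

df = pd.read_excel("014_twinkle_twinkle 300 0.0001 dataframe.xlsx")
# Drop the '0' entries first, then strip the remaining quote characters.
df["pitch-class"] = df["pitch-class"].astype(str).str.replace("'0', ", "", regex=False)
df["pitch-class"] = df["pitch-class"].str.replace("'", "", regex=False)
df.to_excel("results.xlsx", index=False)  # index=False leaves the row index out of the file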
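
Note that replacing "'0', " only catches a 0 that is followed by a comma, so a 0 at the end of a list would survive. If that matters, one alternative is to parse each cell as a Python list and rebuild the string. The following is only a sketch under a few assumptions: the helper names `clean_pitch_class` and `clean_workbooks` and the `cleaned` output folder are made up for illustration, every file is assumed to have a pitch-class column, and the brackets are kept to match the example output in the question (drop them from the return value if you do not want them).

```python
import ast
from pathlib import Path


def clean_pitch_class(cell):
    """Turn "['0', 'E3', 'F3']" into "[E3, F3]" (hypothetical helper)."""
    try:
        notes = ast.literal_eval(str(cell))
    except (ValueError, SyntaxError):
        return cell  # leave cells that are not list literals untouched
    if not isinstance(notes, list):
        return cell
    # Keep everything except the "0" placeholders, wherever they appear.
    kept = [str(note) for note in notes if str(note) != "0"]
    return "[" + ", ".join(kept) + "]"


def clean_workbooks(folder, column="pitch-class"):
    """Write a cleaned copy of every .xlsx file in `folder` (assumed layout)."""
    # Imported here so the helper above can be used without pandas installed;
    # reading and writing .xlsx also needs an Excel engine such as openpyxl.
    import pandas as pd

    out_dir = Path(folder) / "cleaned"
    out_dir.mkdir(exist_ok=True)
    for path in sorted(Path(folder).glob("*.xlsx")):
        df = pd.read_excel(path)
        df[column] = df[column].apply(clean_pitch_class)
        df.to_excel(out_dir / path.name, index=False)
```

Calling `clean_workbooks(".")` would then process all of the roughly 20 files in the current folder in one go and leave the originals untouched.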
