Save data frame as csv/text file in pandas without line numbering
I created a data frame from a text file using pandas.
df = pd.read_table('inputfile.txt',names=['Line'])
When I display df, I get:
Line
0 17/08/31 13:24:48 INFO spark.SparkContext: Run...
1 17/08/31 13:24:49 INFO spark.SecurityManager: ...
2 17/08/31 13:24:49 INFO spark.SecurityManager: ...
3 17/08/31 13:24:49 INFO spark.SecurityManager: ...
4 17/08/31 13:24:49 INFO util.Utils: Successfull...
5 17/08/31 13:24:49 INFO slf4j.Slf4jLogger: Slf4...
6 17/08/31 13:24:49 INFO Remoting: Starting remo...
7 17/08/31 13:24:50 INFO Remoting: Remoting star...
8 17/08/31 13:24:50 INFO Remoting: Remoting now ...
9 17/08/31 13:24:50 INFO util.Utils: Successfull...
Now I want to save this data frame as a CSV file:
df.to_csv('outputfile')
The result I get looks like this:
0,17/08/31 13:24:48 INFO spark.SparkContext: Running Spark version 1.6.0
1,17/08/31 13:24:49 INFO spark.SecurityManager: Changing view acls to: user1
2,17/08/31 13:24:49 INFO spark.SecurityManager: Changing modify acls to: user1
3,17/08/31 13:24:49 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user1);
4,17/08/31 13:24:49 INFO util.Utils: Successfully started service 'sparkDriver' on port 17101.
5,17/08/31 13:24:49 INFO slf4j.Slf4jLogger: Slf4jLogger started
6,17/08/31 13:24:49 INFO Remoting: Starting remoting
7,17/08/31 13:24:50 INFO Remoting: Remoting started; listening on addresses :
8,17/08/31 13:24:50 INFO Remoting: Remoting now listens on addresses:
9,17/08/31 13:24:50 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 100033.
I want my output to be:
17/08/31 13:24:48 INFO spark.SparkContext: Running Spark version 1.6.0
17/08/31 13:24:49 INFO spark.SecurityManager: Changing view acls to: user1
17/08/31 13:24:49 INFO spark.SecurityManager: Changing modify acls to: user1
17/08/31 13:24:49 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user1);
17/08/31 13:24:49 INFO util.Utils: Successfully started service 'sparkDriver' on port 17101.
17/08/31 13:24:49 INFO slf4j.Slf4jLogger: Slf4jLogger started
17/08/31 13:24:49 INFO Remoting: Starting remoting
17/08/31 13:24:50 INFO Remoting: Remoting started; listening on addresses :
17/08/31 13:24:50 INFO Remoting: Remoting now listens on addresses:
17/08/31 13:24:50 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 100033.
I tried the following approaches, but I still get the same result instead of the output I want.
np.savetxt(r'np.txt', df.Line, fmt='%d')
df.to_csv(sep=' ', index=False, header=False)
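Given the specifics of this case, James's answer may be the correct one. However, the standard behavior of pandas is to write the row numbers (the index) as a leading column with no header. To drop that column, set the index= parameter to None: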
df.to_csv("outfile.csv", index=None)
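To make the effect of the index argument concrete, here is a minimal sketch. It assumes a small stand-in frame built by hand from two of the lines in the question, rather than one read from inputfile.txt. Calling to_csv without a path returns the CSV text as a string, which makes the two variants easy to compare. It uses index=False, the boolean form given in the documentation, and also passes header=False so that the column name Line is not written as the first row.

```python
import pandas as pd

# Stand-in for the frame in the question (assumption: built by hand
# instead of being read from inputfile.txt).
df = pd.DataFrame({"Line": [
    "17/08/31 13:24:48 INFO spark.SparkContext: Running Spark version 1.6.0",
    "17/08/31 13:24:49 INFO Remoting: Starting remoting",
]})

# Default call: the row index is written as the first column.
with_index = df.to_csv()

# index=False drops the row numbers; header=False drops the "Line" header row.
without_index = df.to_csv(index=False, header=False)

print(with_index)
print(without_index)
```

To write to disk instead, pass a path as the first argument, for example df.to_csv('outputfile', index=False, header=False).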
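It looks like the numbers may be part of the strings in the Line column. You can replace the leading digits and spaces with an empty string, then write the result to a file without the index using: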
df.Line.str.replace(r'^\d+ +', '', regex=True).to_csv('outputfile.csv', index=False, header=False)
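This approach only matters if the numbers really are stored inside the strings. A small sketch of what the pattern does, using a hypothetical series named raw with made-up values: a leading run of digits followed by one or more spaces is removed, while a line that starts directly with the timestamp is left alone because its digits are followed by a slash, not a space.

```python
import pandas as pd

# Hypothetical data: two lines carry a leading number, one does not.
raw = pd.Series(
    [
        "0 17/08/31 13:24:48 INFO a",
        "12  17/08/31 13:24:49 INFO b",
        "17/08/31 13:24:50 INFO c",
    ],
    name="Line",
)

# regex=True is passed explicitly so the pattern is treated as a regular
# expression rather than a literal string.
cleaned = raw.str.replace(r"^\d+ +", "", regex=True)

print(cleaned.tolist())
```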
Christian is almost right. Take a look at the documentation for to_csv.
According to the docs:
index : boolean, default True, Write row names (index)
I strongly recommend the assistant tool Kite for help with simple things like this.
df.to_csv('outfile.csv', index=False)
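One caveat that the answers above do not raise: to_csv applies CSV quoting, so a line that contains a comma or a double quote is wrapped in quotes in the output. If the goal is a plain text file with one log line per row, a sketch like the following writes the strings directly instead. The file name outputfile.txt, the temporary directory, and the second sample row are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Sample rows (assumption): the second one contains a comma and double quotes.
df = pd.DataFrame({"Line": [
    "17/08/31 13:24:49 INFO Remoting: Starting remoting",
    'INFO example: a, b and "c"',
]})

# Write each string as-is, one per line, with no CSV quoting involved.
out_path = os.path.join(tempfile.mkdtemp(), "outputfile.txt")
with open(out_path, "w", encoding="utf-8") as fh:
    for line in df["Line"]:
        fh.write(line + "\n")

# Read the file back to confirm the lines are unchanged.
with open(out_path, encoding="utf-8") as fh:
    written = fh.read().splitlines()

# For comparison: to_csv quotes the row that contains a comma.
csv_text = df.to_csv(index=False, header=False)

print(written)
print(csv_text)
```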