
How do I parse a csv with pandas that has a comma delimiter and space?

I currently have the following data.csv, which uses a comma delimiter:

name,day
Chicken Sandwich,Wednesday
Pesto Pasta,Thursday
Lettuce, Tomato & Onion Sandwich,Friday
Lettuce, Tomato & Onion Pita,Friday
Soup,Saturday

The parser script is:

import pandas as pd


df = pd.read_csv('data.csv', delimiter=',', error_bad_lines=False, index_col=False)
print(df.head(5))

The output is:

Skipping line 4: expected 2 fields, saw 3
Skipping line 5: expected 2 fields, saw 3

               name        day
0  Chicken Sandwich  Wednesday
1       Pesto Pasta   Thursday
2              Soup   Saturday

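How should I handle a name like Lettuce, Tomato & Onion Sandwich? The fields should be split on commas, but a field may itself contain a comma followed by a space. The desired output is: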

                               name        day
0                  Chicken Sandwich  Wednesday
1                       Pesto Pasta   Thursday
2  Lettuce, Tomato & Onion Sandwich     Friday
3      Lettuce, Tomato & Onion Pita     Friday
4                              Soup   Saturday

This might help.

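import pandas as pd

p = "PATH_TO.csv"
# Split on a comma only when it is directly followed by a non-space character
# (or on a colon). A regex separator requires the Python parsing engine.
df = pd.read_csv(p, sep=r',(?=\S)|:', engine='python')
# print(df.head(5))
print("-----")
print(df["name"])
print("-----")
print(df["day"])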

Output:

-----
0                    Chicken Sandwich
1                         Pesto Pasta
2    Lettuce, Tomato & Onion Sandwich
3        Lettuce, Tomato & Onion Pita
4                                Soup
Name: name, dtype: object
-----
0    Wednesday
1     Thursday
2       Friday
3       Friday
4     Saturday
Name: day, dtype: object
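
To see why the lookahead does the job, here is a minimal sketch that applies the same pattern with the standard re module, without pandas. The names FIELD_SEP and sample_lines are made up for this illustration, and it assumes that a comma inside an item is always followed by a space while a separating comma never is.

```python
import re

# Split on a comma only when the next character is not whitespace.
# The lookahead (?=\S) checks the next character without consuming it.
FIELD_SEP = re.compile(r',(?=\S)')

# A few lines taken from the data.csv in the question.
sample_lines = [
    "name,day",
    "Chicken Sandwich,Wednesday",
    "Lettuce, Tomato & Onion Sandwich,Friday",
    "Soup,Saturday",
]

rows = [FIELD_SEP.split(line) for line in sample_lines]
for row in rows:
    print(row)
```

Every line ends up with exactly two fields, because the comma after "Lettuce" is followed by a space and is therefore not treated as a separator.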

Another alternative that might suit other circumstances. OK, it's ugly.

import pandas as pd
from io import StringIO

for_pd = StringIO()
with open('theirry.csv') as infile:
    for line in infile:
        # Protect ", " (a comma inside an item), turn the remaining commas
        # into "|", then restore the protected ", ".
        line = line.rstrip().replace(', ', '|||').replace(',', '```').replace('|||', ', ').replace('```', '|')
        print(line, file=for_pd)
for_pd.seek(0)  # rewind so pandas reads from the start

df = pd.read_csv(for_pd, sep='|')

print(df)

Result:

                               name        day
0                  Chicken Sandwich  Wednesday
1                       Pesto Pasta   Thursday
2  Lettuce, Tomato & Onion Sandwich     Friday
3      Lettuce, Tomato & Onion Pita     Friday
4                              Soup   Saturday
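
The chain of replace calls is easier to follow on a single line. The sketch below is a simplified variant of the same idea, not the code above verbatim: it uses one placeholder instead of two, and the names PLACEHOLDER and to_pipe_delimited are invented for the example. It assumes neither the placeholder character nor "|" ever appears in the data.

```python
# Hypothetical placeholder; assumed never to occur in the CSV text.
PLACEHOLDER = "\x00"


def to_pipe_delimited(line):
    """Turn field-separating commas into '|' while keeping ', ' inside items."""
    # Step 1: hide every ", " so it survives the next step.
    protected = line.rstrip().replace(", ", PLACEHOLDER)
    # Step 2: any comma still present separates two fields.
    piped = protected.replace(",", "|")
    # Step 3: put the hidden ", " back.
    return piped.replace(PLACEHOLDER, ", ")


print(to_pipe_delimited("Lettuce, Tomato & Onion Pita,Friday\n"))
print(to_pipe_delimited("Soup,Saturday\n"))
```

The rewritten lines can then be handed to pd.read_csv with sep='|', as in the answer above.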

