Python Pandas CSV Management
Hi, quick query on Pandas via Jupyter IPython. I have written the code below, and I am working through some other bits of automation for a friend's business.
If I wanted to split the first column into two using "-" as a delimiter, just like you can in Excel, how would I do this in Pandas via IPython?
So a description of, say, "Red Bull-225825" would become "Red Bull", and a new column called "XYZ" would be created to the left of Description with 225825 as the value. Null values should stay null.
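import pandas as pd
# Read the CSV first, skipping the first two rows of the file
df = pd.read_csv("3.csv", skiprows=range(0, 2))
# Use the second row of the loaded frame as the column labels
df.columns = df.iloc[1]
# Keep three columns, drop rows with missing values, and write the result
df[['Description','Total Qty','Total Sales']].dropna().to_csv("new1.csv",index=False)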
Thanks
Here's my take:
import pandas as pd
from io import StringIO
TESTDATA = StringIO("""Description,TotalQty,TotalSales
ACME, 11, 1
Evil Corp, 10, 2
Google-Alphabet, 100, 0""")
df = pd.read_csv(TESTDATA, sep=",")
def splitfun(row):
    # Split on the first '-' only, so extra dashes do not break the unpacking
    if '-' in row['Description']:
        val1, val2 = row['Description'].split('-', 1)
        return pd.Series({'Description': val1, 'AfterDash': val2})
    else:
        return pd.Series({'Description': row['Description'], 'AfterDash': None})
df[['Description','AfterDash']]=df.apply(splitfun, axis=1)
print(df)
  Description  TotalQty  TotalSales AfterDash
0        ACME        11           1      None
1   Evil Corp        10           2      None
2      Google       100           0  Alphabet
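For what it's worth, the same split can probably be done without a row-wise apply. The following is only a sketch, assuming the column is called Description and that only the first "-" should count as the delimiter; the sample rows and the AfterDash name are made up for illustration.

```python
import pandas as pd

# Illustrative data (not from the original CSV); the last row has a missing description
df = pd.DataFrame({
    "Description": ["ACME", "Evil Corp", "Red Bull-225825", None],
    "TotalQty": [11, 10, 100, 5],
})

# Split on the first "-" only; expand=True returns one column per piece
parts = df["Description"].str.split("-", n=1, expand=True)
# Guarantee that both columns exist even if no row contains a dash
parts = parts.reindex(columns=[0, 1])

df["Description"] = parts[0]
df["AfterDash"] = parts[1]  # missing wherever there was no dash
print(df)
```

Rows without a dash, and rows where the description itself is missing, end up with a missing value in AfterDash rather than raising an error.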
datadict = {'Desc': ['Sale', 'Red Bull-968313', 'Lotto', 'ABC-11123'],
'Total Qty': [1,2,3,4],
'Total Sale': [5,6,7,8]
}
import pandas as pd
df = pd.DataFrame.from_dict(datadict)
print (df)
#               Desc  Total Qty  Total Sale
#0              Sale          1           5
#1   Red Bull-968313          2           6
#2             Lotto          3           7
#3         ABC-11123          4           8
df['Desc Number'] = df['Desc'].str.split('-')
df['Desc'] = [i[0] for i in df['Desc Number']]
df['Desc Number'] = [i[1] if len(i)>1 else None for i in df['Desc Number']]
df = df[['Desc Number', 'Desc', 'Total Qty', 'Total Sale']]
print (df)
#  Desc Number      Desc  Total Qty  Total Sale
#0        None      Sale          1           5
#1      968313  Red Bull          2           6
#2        None     Lotto          3           7
#3       11123       ABC          4           8
This answer accounts for the None/null values you require.
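To tie this back to the question, here is a rough sketch of placing the new column to the left of Description and writing the result out as CSV. It assumes the headers are Description, Total Qty and Total Sales as in the question; the inline CSV text stands in for the real file, and "XYZ" is the placeholder column name the asker used.

```python
import pandas as pd
from io import StringIO

# Stand-in for the real CSV file (contents are illustrative only)
csv_text = StringIO(
    "Description,Total Qty,Total Sales\n"
    "Red Bull-225825,2,6\n"
    "Lotto,3,7\n"
)
df = pd.read_csv(csv_text)

# Split on the first "-" and make sure both pieces exist as columns
parts = df["Description"].str.split("-", n=1, expand=True).reindex(columns=[0, 1])
df["Description"] = parts[0]

# Insert the number column immediately to the left of Description
df.insert(df.columns.get_loc("Description"), "XYZ", parts[1])

# to_csv returns a string when no path is given; pass a filename to write a file
out = df.to_csv(index=False)
print(out)
```

DataFrame.insert modifies the frame in place, so there is no need to rebuild the column list by hand afterwards.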