Python - data frame columns rename with capital letter after the '.'
I have some columns that follow the pattern 'abc.def', and I'm trying to change them to 'abcDef' with a function. I can do it with
df.rename(columns={'abc.def': 'abcDef'}, inplace=True)
but I'm looking for a more generic approach that can be applied to different data frames. I got it working for a simple string, but I don't know how to apply it to the column names. I also tried putting the column names in a list and appending the function to the list, but that didn't work either.
My df is:
import pandas as pd
import re

data = {'end.date': ['01/10/2020 15:23', '01/10/2020 16:31', '01/10/2020 16:20', '01/10/2020 11:00'],
        'start.date': ['01/10/2020 13:38', '01/10/2020 14:49', '01/10/2020 14:30', '01/10/2020 14:30']
        }
# the labels in `columns` must match the dict keys exactly, otherwise the column is filled with NaN
df = pd.DataFrame(data, columns=['end.date', 'start.date'])

# below is my go at the text.
text = 'abs.d'
splitFilter = re.compile(r'([.!?]\s*)')
splitColumnName = splitFilter.split(text)
print(splitColumnName)
final = ''.join([i.capitalize() for i in splitColumnName])
final = final.replace('.', '')
print(final)
I think you want something like this?
import pandas as pd
import re

data = {'end.date': ['01/10/2020 15:23', '01/10/2020 16:31', '01/10/2020 16:20', '01/10/2020 11:00'],
        'start.date': ['01/10/2020 13:38', '01/10/2020 14:49', '01/10/2020 14:30', '01/10/2020 14:30']
        }
# the labels in `columns` must match the dict keys exactly, otherwise the column is filled with NaN
df = pd.DataFrame(data, columns=['end.date', 'start.date'])

def formatColumn(column):
    # split on '.', '!' or '?' and keep the separators in the list
    splitFilter = re.compile(r'([.!?]\s*)')
    splitColumnName = splitFilter.split(column)
    # capitalize every piece, then drop the dots
    final = ''.join([i.capitalize() for i in splitColumnName])
    final = final.replace('.', '')
    # lower-case the first character again to get camelCase
    return final[0].lower() + final[1:]

# rename() returns a new DataFrame, so assign the result back
df = df.rename(columns=dict(zip(df.columns, [formatColumn(c) for c in df.columns])))
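As a side note, `DataFrame.rename` also accepts a function for `columns`, so building the dict with `zip` is optional. A minimal sketch, assuming the same `formatColumn` logic as above; the one-row sample frame is made up for illustration:

```python
import re

import pandas as pd

SPLIT_FILTER = re.compile(r'([.!?]\s*)')


def formatColumn(column):
    # Split on '.', '!' or '?', keeping the separators as list items.
    parts = SPLIT_FILTER.split(column)
    # Capitalize every piece, then drop the dots.
    final = ''.join(part.capitalize() for part in parts).replace('.', '')
    # Lower-case the first character to get camelCase.
    return final[:1].lower() + final[1:]


# Hypothetical sample data, only for this illustration.
df = pd.DataFrame({'end.date': ['01/10/2020 15:23'],
                   'start.date': ['01/10/2020 13:38']})

# rename() calls formatColumn once per column label and returns a new frame.
renamed = df.rename(columns=formatColumn)
print(list(renamed.columns))
```

The original `df` keeps its dotted names here; only `renamed` carries the new labels.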
I used the answers from @Arne and @LeMorse and put together what I needed. Thanks again!
import pandas as pd
import re

data = {'end.date': ['01/10/2020 15:23', '01/10/2020 16:31', '01/10/2020 16:20', '01/10/2020 11:00'],
        'start.date': ['01/10/2020 13:38', '01/10/2020 14:49', '01/10/2020 14:30', '01/10/2020 14:30']
        }
# the labels in `columns` must match the dict keys exactly, otherwise the column is filled with NaN
df = pd.DataFrame(data, columns=['end.date', 'start.date'])

def formatColumn(column):
    # split on '.', '!' or '?' and keep the separators in the list
    splitFilter = re.compile(r'([.!?]\s*)')
    splitColumnName = splitFilter.split(column)
    # capitalize every piece, then drop the dots
    final = ''.join([i.capitalize() for i in splitColumnName])
    final = final.replace('.', '')
    # lower-case the first character again to get camelCase
    return final[0].lower() + final[1:]

df.columns = [formatColumn(col) for col in df.columns]
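An alternative sketch, not taken from the answers in this thread: a single `re.sub` call that upper-cases whatever word character follows a dot and leaves the rest of the name untouched. The helper name `dot_to_camel` and the sample names are hypothetical.

```python
import re


def dot_to_camel(name):
    # Replace each '.' plus the word character after it with that
    # character upper-cased, e.g. '.d' -> 'D'.
    return re.sub(r'\.(\w)', lambda m: m.group(1).upper(), name)


columns = ['end.date', 'start.date', 'already_fine']
print([dot_to_camel(c) for c in columns])
```

On a data frame this would presumably be used the same way as the list comprehension above, i.e. `df.columns = [dot_to_camel(c) for c in df.columns]`.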
You could put the code that transforms an individual string into a function and then apply that function to every column name, e.g. with a list comprehension:
import re

def camelCase(text):
    # split on '.', '!' or '?' and keep the separators in the list
    splitFilter = re.compile(r'([.!?]\s*)')
    splitColumnName = splitFilter.split(text)
    # capitalize every piece, then drop the dots
    final = ''.join([i.capitalize() for i in splitColumnName])
    final = final.replace('.', '')
    return final

df.columns = [camelCase(col) for col in df.columns]
Note that your code currently capitalizes the first letter too.
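To make that concrete, here is a small sketch of what `str.capitalize` does to each piece: it upper-cases the first character and lower-cases everything after it, so capitals already inside a piece are lost as well. The sample name 'startTime.utc' is made up.

```python
import re

split_filter = re.compile(r'([.!?]\s*)')


def camelCase(text):
    # Same logic as the function above.
    parts = split_filter.split(text)
    return ''.join(p.capitalize() for p in parts).replace('.', '')


print(camelCase('abc.def'))        # the first letter is capitalized too
print(camelCase('startTime.utc'))  # the inner capital 'T' is lower-cased
```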
def splitAndRenameColumns(df, splitSignal):
    # get all the columns in a list
    columnNameList = df.columns.values.tolist()
    # create a map to rename columns
    # mapping is old.columnname : newColumnname
    newColNames = {}
    # loop over all column names
    for clm in columnNameList:
        # split the column name on the provided split signal, i.e. the dot in this case
        tempStore = clm.split(splitSignal)
        # store the first word before the dot in a temporary string
        newString = tempStore[0]
        # loop over all other string values we got after splitting
        for index in range(1, len(tempStore)):
            # upper-case the first character of each piece and concatenate all the strings
            # ([:1] instead of [0] so an empty piece, e.g. from 'a..b', does not raise IndexError)
            newString += tempStore[index][:1].upper() + tempStore[index][1:]
        # create the mapping
        # i.e. {'end.Date.gate': 'endDateGate', 'start.date.bate': 'startDateBate'}
        newColNames[clm] = newString
    return newColNames

df = df.rename(columns=splitAndRenameColumns(df, "."))
print(df)
It's much like the other answers, but it is more generic in terms of the split signal, and the comments explain each step. Let me know if you need more comments on the code.
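To illustrate the point about the split signal, here is a compact sketch of the same mapping idea that works on a plain list of names, so it can be tried without a data frame. The helper name `build_rename_map` and the sample names are hypothetical; the resulting dict is the kind of mapping that `df.rename(columns=...)` accepts.

```python
def build_rename_map(names, split_signal):
    # Map each old name to its camelCase form.
    mapping = {}
    for name in names:
        first, *rest = name.split(split_signal)
        # p[:1] is safe for empty pieces such as the one in 'a..b'.
        mapping[name] = first + ''.join(p[:1].upper() + p[1:] for p in rest)
    return mapping


print(build_rename_map(['end.Date.gate', 'start.date'], '.'))
print(build_rename_map(['end_date', 'start_date'], '_'))
```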