
Change a pandas CSV column's value style in Python

I have a dataset like the one below:
[screenshot of the dataset]

My problem is in the annotation column. I want to change the style of the lists shown above into something like this:

  • ['flight-sth sth sth sth'] ['flight-sth sth sth']

I mean that the values in the annotation column come in several styles, and I just want to change them into mine :)
Example:

['flight_search.price_range'] ==> ['flight-search price range']  
['flight_search.stops'] ==> ['flight-search stop']  
['flight_search.date.depart_origin'] ==> ['flight-search date depart origin']  

After this conversion, the new values should replace the old annotation column exactly :)


A sample of the annotation column:

anotation
['flight_search.destination1']  
['flight_search.origin']  
['flight_search.destination1']  
['flight_search.type']  
['flight_search.type']  
['flight_search.airline']  
['flight_search.stops']  
['flight_search.stops']  
['flight_search.price_range']  
['flight_search.price_range']  
['flight1_detail.from.time']  
['flight_search.date.depart_origin']  
annotation = [['flight_search.destination1'],
              ['flight_search.origin'],
              ['flight_search.destination1'],
              ['flight_search.type'],
              ['flight_search.type'],
              ['flight_search.airline'],
              ['flight_search.stops'],
              ['flight_search.stops'],
              ['flight_search.price_range'],
              ['flight_search.price_range'],
              ['flight1_detail.from.time'],
              ['flight_search.date.depart_origin']]
empty = []
for i in annotation:
    empty.append([i[0].replace("_", "-").replace(".", " ")])

Output

[['flight-search destination1'],
 ['flight-search origin'],
 ['flight-search destination1'],
 ['flight-search type'],
 ['flight-search type'],
 ['flight-search airline'],
 ['flight-search stops'],
 ['flight-search stops'],
 ['flight-search price-range'],
 ['flight-search price-range'],
 ['flight1-detail from time'],
 ['flight-search date depart-origin']]

DataFrames

# for a DataFrame: assign the result back so it replaces the old column
df["annotation"] = df["annotation"].apply(lambda x: [x[0].replace("_", "-").replace(".", " ")])
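One caveat, stated as an assumption about the asker's data: if the column was loaded with `pd.read_csv`, each cell is usually the text `"['flight_search.stops']"` rather than a real list, so `x[0]` would be the character `[`. A minimal sketch that parses each cell with `ast.literal_eval` first; the in-memory CSV and the helper name `restyle_cell` are made up for illustration:

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file.
csv_text = '''annotation
"['flight_search.origin']"
"['flight_search.price_range']"
"['flight1_detail.from.time']"
'''
df = pd.read_csv(io.StringIO(csv_text))

def restyle_cell(cell):
    # Turn the text "['a_b.c']" into the list ['a_b.c'] (skipped if already a list).
    items = ast.literal_eval(cell) if isinstance(cell, str) else cell
    # First underscore becomes a hyphen; dots and remaining underscores become spaces.
    return [s.replace("_", "-", 1).replace(".", " ").replace("_", " ") for s in items]

df["annotation"] = df["annotation"].apply(restyle_cell)
print(df["annotation"].tolist())
```

If the cells are already Python lists, the `isinstance` check simply skips the parsing step.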

I believe this should do the trick, if there are no typos in it.

The Python string replace() method could be an option. But I see you want the first underscore to become a hyphen and the second one to become a space. I think that problem could be solved by digging into regular expressions in Python. To keep it simple, I've made this so far:

mystring = 'flight_search.price_range'
mystring = mystring.replace("_", "-")
mystring = mystring.replace(".", " ")

See https://www.w3schools.com/python/ref_string_replace.asp

Edited code:

mystring = 'flight_search.price_range'
mystring = mystring.replace("_", "-",1)
mystring = mystring.replace(".", " ")
mystring = mystring.replace("_", " ")
print(mystring)

Result of the edited code: flight-search price range
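For comparison, here is a rough sketch of the regular-expression route mentioned above. The function name `restyle` is made up for illustration, and it assumes the hyphen belongs right after the first segment (such as `flight` or `flight1`):

```python
import re

def restyle(label):
    # Hyphenate only the underscore that ends the first segment: "flight_" -> "flight-".
    label = re.sub(r"^([^_.]+)_", r"\1-", label)
    # Every remaining dot or underscore becomes a space.
    return re.sub(r"[._]", " ", label)

for s in ["flight_search.price_range", "flight1_detail.from.time"]:
    print(restyle(s))
```

Unlike `replace("_", "-", 1)`, the anchored pattern leaves a string alone when its first underscore comes after a dot, which may or may not be what you want.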

The first thing to think about is which changes you need to make to the strings in the annotation column. With the df.replace() function you can apply simple changes to all the columns.
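As a small sketch of the simple case, assuming the column holds plain strings rather than lists: `df.replace(..., regex=True)` rewrites substrings in every column, and `Series.str.replace` with `n=1` handles the single underscore that should become a hyphen. The two-row frame is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"annotation": ["flight_search.stops",
                                  "flight_search.date.depart_origin"]})

# regex=True makes replace() match substrings instead of whole cell values.
df = df.replace(r"\.", " ", regex=True)

# Only the first underscore becomes a hyphen (n=1), the rest become spaces.
col = df["annotation"].str.replace("_", "-", n=1, regex=False)
df["annotation"] = col.str.replace("_", " ", regex=False)

print(df["annotation"].tolist())
```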

If you need more control, however, you would want to use the df.apply() function. With it you can specify exactly what to do with each string in the column using a custom function.

For example, you could start with the approach below and then change the custom function to get your desired results:

import pandas as pd

annotation = ['flight_search.destination1',  
'flight_search.origin',
'flight_search.destination1',
'flight_search.type' ,
'flight_search.type'  ,
'flight_search.airline',  
'flight_search.stops'  ,
'flight_search.stops'  ,
'flight_search.price_range' ,
'flight_search.price_range' ,
'flight1_detail.from.time' ,
'flight_search.date.depart_origin']

df = pd.DataFrame({"annotation":annotation})

def custom_func(string):
    # replace the underscore after the initial word with a hyphen
    string = string.replace("flight_", "flight-")
    string = string.replace("flight1_", "flight1-") # is this a typo?

    # replace the other punctuation marks with a space
    for punctuation in ['_', '.']:
        string = string.replace(punctuation, " ")

    # return the formatted string
    return string

# apply the custom function to the annotation column
df["annotation"] = df["annotation"].apply(custom_func)
