
How can I split strings with different delimiters in a DataFrame?

My dataset has 4 columns, and some rows have multiple entries in a single cell, separated by ";" in one column and by "," in the others. How can I split them into separate rows?

I tried pandas' str.split and the stack method, but that only works for a single delimiter. I want to do it for an entire DataFrame whose columns use different delimiters.

I also tried this, but it didn't work either:

df.set_index(['Year','Source title','Volume','Issue','Pagestart','Page end','Cited by','Abstract']).apply(lambda x: x.str.split(',')).stack().apply(pd.Series).stack()

I want to split the data in each row into separate rows. Here is an example from my CSV file:

Name     id    city 
a,b,c   1;2;3  x,y,z
d       4       w

I want to convert it into:

Name     id    city 
a        1       x
b        2       y
c        3       z
d        4       w

You can split on multiple delimiters with a regex:

df = df.apply(lambda x: x.str.split('[,;]').explode())

  Name id city
0    a  1    x
0    b  2    y
0    c  3    z
1    d  4    w
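For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's example and assumes every cell is a string (as it would be after reading such a CSV with dtype=str); the name `result` is just for illustration.

```python
import pandas as pd

# Sample frame reconstructed from the question's example (all cells are strings).
df = pd.DataFrame({
    "Name": ["a,b,c", "d"],
    "id": ["1;2;3", "4"],
    "city": ["x,y,z", "w"],
})

# Split every column on either ',' or ';'. A pattern longer than one
# character is treated as a regular expression by Series.str.split.
# explode() then turns each list element into its own row.
result = df.apply(lambda col: col.str.split("[,;]").explode())
print(result)
```

Note that the exploded rows keep the index label of the row they came from, which is why 0 repeats in the output above.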

Let's assume that most of your columns can be split on commas. For everything else, you can manually make an entry in a dictionary.

You can then split each column on its own delimiter and explode:

delim = {'id': ';'}
df.apply(lambda x: x.str.split(delim.get(x.name, ',')).explode())

  Name id city
0    a  1    x
0    b  2    y
0    c  3    z
1    d  4    w

If you want a fresh sequential index instead of the repeated labels, chain reset_index(drop=True):

(df.apply(lambda x: x.str.split(delim.get(x.name, ',')).explode())
   .reset_index(drop=True))

  Name id city
0    a  1    x
1    b  2    y
2    c  3    z
3    d  4    w
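If you need this in more than one place, the per-column approach could be wrapped in a small helper. This is only a sketch: `split_rows`, its parameters, and the sample frame are hypothetical names made up for illustration, and it assumes all columns hold strings.

```python
import pandas as pd

def split_rows(frame, delimiters, default=","):
    """Split each string column on its own delimiter and expand to rows.

    `delimiters` maps column name -> delimiter; columns not listed
    fall back to `default`. (Hypothetical helper, not a pandas API.)
    """
    exploded = frame.apply(
        lambda col: col.str.split(delimiters.get(col.name, default)).explode()
    )
    # Replace the repeated source-row labels with a clean 0..n-1 index.
    return exploded.reset_index(drop=True)

df = pd.DataFrame({
    "Name": ["a,b,c", "d"],
    "id": ["1;2;3", "4"],
    "city": ["x,y,z", "w"],
})

out = split_rows(df, {"id": ";"})
print(out)
```

Keeping the delimiters in a dictionary means only the exceptions have to be listed, and a single-character delimiter is matched literally rather than as a regex.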

This assumes that, within any given row, every column splits into the same number of pieces.
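If you are not sure your data satisfies that, one way to check before exploding is to count the pieces per cell and flag rows where the counts disagree. This is a rough sketch on made-up data; the second row is deliberately inconsistent, and `counts`, `bad` and `mismatched` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["a,b,c", "d,e"],
    "id": ["1;2;3", "4"],      # second row: two names but only one id
    "city": ["x,y,z", "w,v"],
})
delim = {"id": ";"}

# Number of pieces each cell would split into, per column.
counts = df.apply(
    lambda col: col.str.split(delim.get(col.name, ",")).str.len()
)

# A row is problematic when its columns disagree on the number of pieces.
bad = counts.nunique(axis=1) > 1
mismatched = df[bad]
print(mismatched)
```

Rows flagged this way would need to be fixed or handled separately, since the exploded columns would no longer line up for them.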

(Works for pandas >= 0.25, the release that introduced explode.)

