简体   繁体   中英

Python pandas: Split string into multiple columns and extract data for column from split parameter

I am quite new to Python and Pandas and have some URL paths in a column and I want to split it into individual columns.

Each parameter of the string is separated by a semi-colon.

I understand there are a number of other answers on how to split data into multiple columns by a delimiter however in my example I want to dynamically create the columns and extract the value to go in each column from the parameter itself.

The column that each parameter should be put in is within the parameter itself and the data is after the equals sign. I want to put the data after the equals sign into the column that is before the equals sign.

For example:

cat=be_thnky;u1=men
cat=be_thnky;u1=custom

Should become

cat      u1
be_thnky men
be_thnky custom

For added complexity, not all parameters exist in each URL, where the parameter doesn't exist I would like the column to contain NaN.

Some example URL path strings I am working with are:

;src=4457426;type=be_salec;cat=be_thnky;qty=1;cost=60.00;ord=50608803;gtm=G64;gcldc=*;gclaw=*;gac=UA-32723457-1:*;u1=men;u2=schoenen;u3=none;u5=VA38G1NRI;u6=80;u7=0;u8=1;u9=EUR;u10=be;u11=Suede Old Skool Shoes;u12=checkout;u13=8;u14=VNIWTYI926IW7;u15=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=%2B03C782RqELOiuY1L2ELV7hFeTRMquZ9Eyr1lJqmoSQhClENiUJ6feRNwwAA1ZYd4V7tkAuIwyiIrClp7QaqfLeC%2B%2FPTLl7wSF%2FCyrVWqgiSJRgAS%2BWbXohu0DG8xsdPnSXp%2F%2F4MDb%2FkbPwh%2FT5EpiEWMkGur%2Fx%2FABR7Cvs4jh345776IITNx%2FTRZZXu4zeAco5P%2FvxyqDbmwvLKpPKljf3TpU0wOCmjCDWR5r3uR3ELErPFboWuV5H24FOIy7e%2B2b6m4YhCCDuzceKa5Qllkiwc4YI6AL9rIK1T2jExde343vk%2B4FZtK6XgOMtxbwv6pBIUMX%2Bn3kbb7soGQ%2FjnEwxzxMX5P%2FdMZzts6NkskMSICB955QKsZqPLepiS%2BWY5u5%2Bs9CPjquK%2FlsXmHTi26wq1cLqeiPdyolnE2AxaswLDhQcQbvDengszkSu8U8lTDhqaAxLExYF%2BMstZtKamD14AnMElNAbjZNcTEByzYlXOi1q2FpYg0kCyoaBBBtkRInSDBZtjxNWgd9bl98qs5R2ZqCiHmtOPrfcM53V77Acxcb5wl%2FkpdKEbTGuAijHpHgxpi55kIEcEmkJjvPnW7RwxUXPiVZbFjh34PlGJ10FaGvqPwsijBpR1TXrKWV3t3Z4r03yViU6txghbNtODiQ%3D%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd;~oref=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=%2B03C782RqELOiuY1L2ELV7hFeTRMquZ9Eyr1lJqmoSQhClENiUJ6feRNwwAA1ZYd4V7tkAuIwyiIrClp7QaqfLeC%2B%2FPTLl7wSF%2FCyrVWqgiSJRgAS%2BWbXohu0DG8xsdPnSXp%2F%2F4MDb%2FkbPwh%2FT5EpiEWMkGur%2Fx%2FABR7Cvs4jh345776IITNx%2FTRZZXu4zeAco5P%2FvxyqDbmwvLKpPKljf3TpU0wOCmjCDWR5r3uR3ELErPFboWuV5H24FOIy7e%2B2b6m4YhCCDuzceKa5Qllkiwc4YI6AL9rIK1T2jExde343vk%2B4FZtK6XgOMtxbwv6pBIUMX%2Bn3kbb7soGQ%2FjnEwxzxMX5P%2FdMZzts6NkskMSICB955QKsZqPLepiS%2BWY5u5%2Bs9CPjquK%2FlsXmHTi26wq1cLqeiPdyolnE2AxaswLDhQcQbvDengszkSu8U8lTDhqaAxLExYF%2BMstZtKamD14AnMElNAbjZNcTEByzYlXOi1q2FpYg0kCyoaBBBtkRInSDBZtjxNWgd9bl98qs5R2ZqCiHmtOPrfcM53V77Acxcb5wl%2FkpdKEbTGuAijHpHgxpi55kIEcEmkJjvPnW7RwxUXPiVZbFjh34PlGJ10FaGvqPwsijBpR1TXrKWV3t3Z4r03yViU6txghbNtODiQ%3D%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd

and

;src=4457426;type=be_salec;cat=be_thnky;qty=1;cost=79.17;ord=50619855;gtm=G64;gac=UA-32723457-1:*;u1=custom;u2=undefined;u3=none;u5=AQNNOQ;u6=95;u7=0;u8=1;u9=EUR;u10=be;u11=Men Era Shoes;u12=checkout;u13=;u14=;u15=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=aaHqAAtJa9bzV4lSFEuMWqdyG11jxs2yT0UY242hWRQyCn%2Ff7AHBrF%2ByFm6GF%2BZiumn%2B6cjIaHASWHpiwsBKSa5k5fMJoyz3ex%2B8FTyDOp3WwLgA9U3ibS6gLNMEl68UQ8K7bVk%2FP1%2BC2ckY17vriakRKvUpobXypW0AvXHgHGmaleDoIOlM6dVIX1pSHBPbeKDG4JVoXbUOltTgLUcnYbojIiIGx6m%2FYlHnYjWU%2BaYQpCK%2BRBeFd%2FKyekIN9y9wQlZHHKb7pFar8c3S24tuHj%2FeDGe1jwJ0S7%2BBnUb5WloJ1SSf0LjDyFSZAWBSzhidLIRM2OWyTXJeCBdBFNSw%2BwICm6uWHKPClJD%2FRIzO4D%2F3HQyS4sOeynLgyIR6JHsCv3FH%2B%2BrINsPE0Y3eI51mpm7UEmmcLmNKiONm11LwTD1U%2FZKgnLe50naDdiYj9%2BCt7TUkNuDiOYq1jaC2yOSKcz%2BGdF2i4bgEttXJlK84ZUeCUhfvGbQNebesaoRLrGgU7FkuOhut3LQm7Lqu5lpKYSt5cV8gkGP5%2Fm%2BOa%2FzKbRNmbcwACXuZ1hBJW0alkcX%2F3hfpPiSg9UrT1uZKRwfQUpx6fHzagiSWtcWXJDYO2SfWtlfoS%2B7W%2FIvIoD1FtMbCeVC6oAvltLOnIojrW3VYh1OrFUIlXcl0XMXzCPfRz%2B2v28tFOmsucTRbixJ9WyW3WqN2h3YMHZJQoSFbpUDSN7VQkFJmC1NgHzX09u7X1AUIcwP1TmLqO034RnK6ZSfmS38NuYhWCAmPUIyopyEmxqE3M%2FzqEWjId6S1DTmaJSzo09Rx2UtLnZXMOLKXifzoN8eQy3yQvFeNsKxh3IkJxb6uifVXDBpyelQibch9gDg%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd;~oref=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=aaHqAAtJa9bzV4lSFEuMWqdyG11jxs2yT0UY242hWRQyCn%2Ff7AHBrF%2ByFm6GF%2BZiumn%2B6cjIaHASWHpiwsBKSa5k5fMJoyz3ex%2B8FTyDOp3WwLgA9U3ibS6gLNMEl68UQ8K7bVk%2FP1%2BC2ckY17vriakRKvUpobXypW0AvXHgHGmaleDoIOlM6dVIX1pSHBPbeKDG4JVoXbUOltTgLUcnYbojIiIGx6m%2FYlHnYjWU%2BaYQpCK%2BRBeFd%2FKyekIN9y9wQlZHHKb7pFar8c3S24tuHj%2FeDGe1jwJ0S7%2BBnUb5WloJ1SSf0LjDyFSZAWBSzhidLIRM2OWyTXJeCBdBFNSw%2BwICm6uWHKPClJD%2FRIzO4D%2F3HQyS4sOeynLgyIR6JHsCv3FH%2B%2BrINsPE0Y3eI51mpm7UEmmcLmNKiONm11LwTD1U%2FZKgnLe50naDdiYj9%2BCt7TUkNuDiOYq1jaC2yOSKcz%2BGdF2i4bgEttXJlK84ZUeCUhfvGbQNebesaoRLrGgU7FkuOhut3LQm7Lqu5lpKYSt5cV8gkGP5%2Fm%2BOa%2FzKbRNmbcwACXuZ1hBJW0alkcX%2F3hfpPiSg9UrT1uZKRwfQUpx6fHzagiSWtcWXJDYO2SfWtlfoS%2B7W%2FIvIoD1FtMbCeVC6oAvltLOnIojrW3VYh1OrFUIlXcl0XMXzCPfRz%2B2v28tFOmsucTRbixJ9WyW3WqN2h3YMHZJQoSFbpUDSN7VQkFJmC1NgHzX09u7X1AUIcwP1TmLqO034RnK6ZSfmS38NuYhWCAmPUIyopyEmxqE3M%2FzqEWjId6S1DTmaJSzo09Rx2UtLnZXMOLKXifzoN8eQy3yQvFeNsKxh3IkJxb6uifVXDBpyelQibch9gDg%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd

Here's one solution using a dictionary comprehension followed by pd.concat :

str1 = ';src=4457426;type=be_salec;cat=be_thnky;qty=1;cost=60.00;ord=50608803;gtm=G64;gcldc=*;gclaw=*;gac=UA-32723457-1:*;u1=men;u2=schoenen;u3=none;u5=VA38G1NRI;u6=80;u7=0;u8=1;u9=EUR;u10=be;u11=Suede Old Skool Shoes;u12=checkout;u13=8;u14=VNIWTYI926IW7;u15=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=%2B03C782RqELOiuY1L2ELV7hFeTRMquZ9Eyr1lJqmoSQhClENiUJ6feRNwwAA1ZYd4V7tkAuIwyiIrClp7QaqfLeC%2B%2FPTLl7wSF%2FCyrVWqgiSJRgAS%2BWbXohu0DG8xsdPnSXp%2F%2F4MDb%2FkbPwh%2FT5EpiEWMkGur%2Fx%2FABR7Cvs4jh345776IITNx%2FTRZZXu4zeAco5P%2FvxyqDbmwvLKpPKljf3TpU0wOCmjCDWR5r3uR3ELErPFboWuV5H24FOIy7e%2B2b6m4YhCCDuzceKa5Qllkiwc4YI6AL9rIK1T2jExde343vk%2B4FZtK6XgOMtxbwv6pBIUMX%2Bn3kbb7soGQ%2FjnEwxzxMX5P%2FdMZzts6NkskMSICB955QKsZqPLepiS%2BWY5u5%2Bs9CPjquK%2FlsXmHTi26wq1cLqeiPdyolnE2AxaswLDhQcQbvDengszkSu8U8lTDhqaAxLExYF%2BMstZtKamD14AnMElNAbjZNcTEByzYlXOi1q2FpYg0kCyoaBBBtkRInSDBZtjxNWgd9bl98qs5R2ZqCiHmtOPrfcM53V77Acxcb5wl%2FkpdKEbTGuAijHpHgxpi55kIEcEmkJjvPnW7RwxUXPiVZbFjh34PlGJ10FaGvqPwsijBpR1TXrKWV3t3Z4r03yViU6txghbNtODiQ%3D%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd;~oref=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=%2B03C782RqELOiuY1L2ELV7hFeTRMquZ9Eyr1lJqmoSQhClENiUJ6feRNwwAA1ZYd4V7tkAuIwyiIrClp7QaqfLeC%2B%2FPTLl7wSF%2FCyrVWqgiSJRgAS%2BWbXohu0DG8xsdPnSXp%2F%2F4MDb%2FkbPwh%2FT5EpiEWMkGur%2Fx%2FABR7Cvs4jh345776IITNx%2FTRZZXu4zeAco5P%2FvxyqDbmwvLKpPKljf3TpU0wOCmjCDWR5r3uR3ELErPFboWuV5H24FOIy7e%2B2b6m4YhCCDuzceKa5Qllkiwc4YI6AL9rIK1T2jExde343vk%2B4FZtK6XgOMtxbwv6pBIUMX%2Bn3kbb7soGQ%2FjnEwxzxMX5P%2FdMZzts6NkskMSICB955QKsZqPLepiS%2BWY5u5%2Bs9CPjquK%2FlsXmHTi26wq1cLqeiPdyolnE2AxaswLDhQcQbvDengszkSu8U8lTDhqaAxLExYF%2BMstZtKamD14AnMElNAbjZNcTEByzYlXOi1q2FpYg0kCyoaBBBtkRInSDBZtjxNWgd9bl98qs5R2ZqCiHmtOPrfcM53V77Acxcb5wl%2FkpdKEbTGuAijHpHgxpi55kIEcEmkJjvPnW7RwxUXPiVZbFjh34PlGJ10FaGvqPwsijBpR1TXrKWV3t3Z4r03yViU6txghbNtODiQ%3D%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd'
str2 = ';src=4457426;type=be_salec;cat=be_thnky;qty=1;cost=79.17;ord=50619855;gtm=G64;gac=UA-32723457-1:*;u1=custom;u2=undefined;u3=none;u5=AQNNOQ;u6=95;u7=0;u8=1;u9=EUR;u10=be;u11=Men Era Shoes;u12=checkout;u13=;u14=;u15=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=aaHqAAtJa9bzV4lSFEuMWqdyG11jxs2yT0UY242hWRQyCn%2Ff7AHBrF%2ByFm6GF%2BZiumn%2B6cjIaHASWHpiwsBKSa5k5fMJoyz3ex%2B8FTyDOp3WwLgA9U3ibS6gLNMEl68UQ8K7bVk%2FP1%2BC2ckY17vriakRKvUpobXypW0AvXHgHGmaleDoIOlM6dVIX1pSHBPbeKDG4JVoXbUOltTgLUcnYbojIiIGx6m%2FYlHnYjWU%2BaYQpCK%2BRBeFd%2FKyekIN9y9wQlZHHKb7pFar8c3S24tuHj%2FeDGe1jwJ0S7%2BBnUb5WloJ1SSf0LjDyFSZAWBSzhidLIRM2OWyTXJeCBdBFNSw%2BwICm6uWHKPClJD%2FRIzO4D%2F3HQyS4sOeynLgyIR6JHsCv3FH%2B%2BrINsPE0Y3eI51mpm7UEmmcLmNKiONm11LwTD1U%2FZKgnLe50naDdiYj9%2BCt7TUkNuDiOYq1jaC2yOSKcz%2BGdF2i4bgEttXJlK84ZUeCUhfvGbQNebesaoRLrGgU7FkuOhut3LQm7Lqu5lpKYSt5cV8gkGP5%2Fm%2BOa%2FzKbRNmbcwACXuZ1hBJW0alkcX%2F3hfpPiSg9UrT1uZKRwfQUpx6fHzagiSWtcWXJDYO2SfWtlfoS%2B7W%2FIvIoD1FtMbCeVC6oAvltLOnIojrW3VYh1OrFUIlXcl0XMXzCPfRz%2B2v28tFOmsucTRbixJ9WyW3WqN2h3YMHZJQoSFbpUDSN7VQkFJmC1NgHzX09u7X1AUIcwP1TmLqO034RnK6ZSfmS38NuYhWCAmPUIyopyEmxqE3M%2FzqEWjId6S1DTmaJSzo09Rx2UtLnZXMOLKXifzoN8eQy3yQvFeNsKxh3IkJxb6uifVXDBpyelQibch9gDg%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd;~oref=https://www.vans.be/webapp/wcs/stores/servlet/OrderOKView?langId=-27&catalogId=11260&storeId=10167&krypto=aaHqAAtJa9bzV4lSFEuMWqdyG11jxs2yT0UY242hWRQyCn%2Ff7AHBrF%2ByFm6GF%2BZiumn%2B6cjIaHASWHpiwsBKSa5k5fMJoyz3ex%2B8FTyDOp3WwLgA9U3ibS6gLNMEl68UQ8K7bVk%2FP1%2BC2ckY17vriakRKvUpobXypW0AvXHgHGmaleDoIOlM6dVIX1pSHBPbeKDG4JVoXbUOltTgLUcnYbojIiIGx6m%2FYlHnYjWU%2BaYQpCK%2BRBeFd%2FKyekIN9y9wQlZHHKb7pFar8c3S24tuHj%2FeDGe1jwJ0S7%2BBnUb5WloJ1SSf0LjDyFSZAWBSzhidLIRM2OWyTXJeCBdBFNSw%2BwICm6uWHKPClJD%2FRIzO4D%2F3HQyS4sOeynLgyIR6JHsCv3FH%2B%2BrINsPE0Y3eI51mpm7UEmmcLmNKiONm11LwTD1U%2FZKgnLe50naDdiYj9%2BCt7TUkNuDiOYq1jaC2yOSKcz%2BGdF2i4bgEttXJlK84ZUeCUhfvGbQNebesaoRLrGgU7FkuOhut3LQm7Lqu5lpKYSt5cV8gkGP5%2Fm%2BOa%2FzKbRNmbcwACXuZ1hBJW0alkcX%2F3hfpPiSg9UrT1uZKRwfQUpx6fHzagiSWtcWXJDYO2SfWtlfoS%2B7W%2FIvIoD1FtMbCeVC6oAvltLOnIojrW3VYh1OrFUIlXcl0XMXzCPfRz%2B2v28tFOmsucTRbixJ9WyW3WqN2h3YMHZJQoSFbpUDSN7VQkFJmC1NgHzX09u7X1AUIcwP1TmLqO034RnK6ZSfmS38NuYhWCAmPUIyopyEmxqE3M%2FzqEWjId6S1DTmaJSzo09Rx2UtLnZXMOLKXifzoN8eQy3yQvFeNsKxh3IkJxb6uifVXDBpyelQibch9gDg%3D&ddkey=https%3AVFCWorldpayPunchoutCallbackCmd'

def converter(x):
    return dict(i.split('=', 1) for i in str1.split(';') if '=' in i)

res = pd.concat([pd.DataFrame.from_dict(converter(i), orient='index').T \
                 for i in (str1, str2)])

Result:

print(res)

       src      type       cat qty   cost       ord  gtm gcldc gclaw  \
0  4457426  be_salec  be_thnky   1  60.00  50608803  G64     *     *   
0  4457426  be_salec  be_thnky   1  60.00  50608803  G64     *     *   

                                               ~oref  
0  https://www.vans.be/webapp/wcs/stores/servlet/...  
0  https://www.vans.be/webapp/wcs/stores/servlet/...  

[2 rows x 25 columns]

You can do:

def gen_col(u):
    for i in u:
        d = {}
        val = filter(None, i.split(";"))
        for j in val:
            v = j.split("=")
            d[v[0]] = v[1]
        yield d

your9000list = list(yourOtherDFWithURLS['URLCOL'].values)
df = pd.DataFrame([r for r in gen_col(your9000list])
print(df)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM