I have a column with multiple product names, like this:
Contract
0 O.U20
1 O.Z20
2 O.H21
3 O.M21
4 O.U21
5 O.Z21
6 O.H22
7 O.M22
8 S3.U20
9 S3.Z20
10 S6.M26
11 S6.U26
12 S6.Z26
13 S6.H27
14 S9.U26
15 S9.Z26
16 F3.U26
17 F3.Z26
18 F3.H27
19 F6.H26
20 F6.M26
21 F6.U26
22 F9.U20
What I want to do is assign a section name based on the contract name, like this:
Contract Sections
0 O.U20 O1
1 O.Z20 O1
2 O.H21 O1
3 O.M21 O1
4 O.U21 O2
5 O.Z21 O2
6 O.H22 O2
7 O.M22 O2
8 S3.U20 S3
9 S3.Z20 S3
10 S6.M26 S6
11 S6.U26 S6
12 S6.Z26 S6
13 S6.H27 S6
14 S9.U26 S9
15 S9.Z26 S9
16 F3.U26 F3
17 F3.Z26 F3
18 F3.H27 F3
19 F6.H26 F6
20 F6.M26 F6
21 F6.U26 F6
22 F9.U20 F9
For the S and F series I can get the desired result with this code (please let me know if there is a better way to do it):
df.loc[df['Contract'].str.contains('S3'),'Sections'] = 'S3'
df.loc[df['Contract'].str.contains('S6'),'Sections'] = 'S6'
df.loc[df['Contract'].str.contains('S9'),'Sections'] = 'S9'
df.loc[df['Contract'].str.contains('F3'),'Sections'] = 'F3'
df.loc[df['Contract'].str.contains('F6'),'Sections'] = 'F6'
df.loc[df['Contract'].str.contains('F9'),'Sections'] = 'F9'
This works because it simply matches the string and assigns the section name. Sadly, the O series does not have a number attached to it, so I have to divide it into blocks of 4, as shown above:
Contract Sections
0 O.U20 O1
1 O.Z20 O1
2 O.H21 O1
3 O.M21 O1
4 O.U21 O2
5 O.Z21 O2
6 O.H22 O2
7 O.M22 O2
I tried the following code
df.loc[df['Contract'].str.contains('O'),'Sections'] = df.index // 4+1
but it's throwing the error
ValueError: could not broadcast input array from shape (23) into shape (8)
How can I achieve this in a better and more efficient way? Please note that this is just sample data; the original dataset has many more values like this.
Change your code to
df.loc[df['Contract'].str.contains('O'),'Sections'] = 'O' +((df['Contract'].str.contains('O').cumsum().sub(1)//4) + 1).astype(str)
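The original attempt fails because `df.index // 4 + 1` has one value per row of the whole frame (23), while the mask selects only the 8 O rows, and an Index is not aligned to the selection the way a Series is. Here is a minimal sketch of the same cumsum idea on a shortened sample. The names `is_o` and `block` are illustrative, and using `str.startswith("O.")` instead of `str.contains('O')` is an assumption made here to avoid matching an "O" elsewhere in the string.

```python
import pandas as pd

# Shortened sample: eight O contracts followed by a few others
df = pd.DataFrame({"Contract": [
    "O.U20", "O.Z20", "O.H21", "O.M21", "O.U21", "O.Z21", "O.H22", "O.M22",
    "S3.U20", "S3.Z20", "F9.U20",
]})

# Boolean mask for the O series (assumption: O contracts start with "O.")
is_o = df["Contract"].str.startswith("O.")

# Running count of O rows (1, 2, 3, ...), made 0-based, split into blocks of 4,
# then shifted so the first block is numbered 1
block = (is_o.cumsum() - 1) // 4 + 1

# Assign only to the O rows; the right-hand side has the same 8 rows
df.loc[is_o, "Sections"] = "O" + block[is_o].astype(str)

print(df)
```

Because the block number comes from a running count of O rows rather than from the index, this does not depend on the O rows starting at index 0.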
To simplify
df.loc[df['Contract'].str.contains('S3'),'Sections'] = 'S3'
df.loc[df['Contract'].str.contains('S6'),'Sections'] = 'S6'
df.loc[df['Contract'].str.contains('S9'),'Sections'] = 'S9'
df.loc[df['Contract'].str.contains('F3'),'Sections'] = 'F3'
df.loc[df['Contract'].str.contains('F6'),'Sections'] = 'F6'
df.loc[df['Contract'].str.contains('F9'),'Sections'] = 'F9'
just replace them with the single line below. Run it before the O assignment above, because it also sets the O rows to plain 'O':
df['Sections'] = df['Contract'].str.split('.').str[0]
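Putting both steps together, one possible sketch looks like this. The function name `assign_sections` and the `block_size` parameter are hypothetical, introduced only for illustration. The prefix split runs first, and the O rows are then overwritten with their block labels.

```python
import pandas as pd

def assign_sections(df, block_size=4):
    """Return a copy of df with a 'Sections' column (illustrative helper)."""
    out = df.copy()

    # Everything before the first dot: 'S3', 'F6', 'O', ...
    prefix = out["Contract"].str.split(".").str[0]
    out["Sections"] = prefix

    # O rows carry no number, so number them in consecutive blocks
    is_o = prefix == "O"
    block = is_o.cumsum().sub(1).floordiv(block_size).add(1)
    out.loc[is_o, "Sections"] = "O" + block[is_o].astype(str)
    return out

# The sample data from the question
contracts = [
    "O.U20", "O.Z20", "O.H21", "O.M21", "O.U21", "O.Z21", "O.H22", "O.M22",
    "S3.U20", "S3.Z20", "S6.M26", "S6.U26", "S6.Z26", "S6.H27",
    "S9.U26", "S9.Z26", "F3.U26", "F3.Z26", "F3.H27",
    "F6.H26", "F6.M26", "F6.U26", "F9.U20",
]
result = assign_sections(pd.DataFrame({"Contract": contracts}))
print(result)
```

This avoids one `str.contains` call per prefix, so new prefixes such as a hypothetical S12 need no extra line of code.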