I have a column of unique ID numbers, called "UnitID", that is organised like this:
ABC2_DEFGH12-01_X1_Y1
The DEFGH12-01 segment hypothetically refers to the ID of the specific batch of units. I need a new column that specifies this batch, so I want to extract the "DEFGH12-01" values (that is, the value between the first and second "_", which I haven't been able to figure out how to do) into a new column called "BatchID".
I want to leave "UnitID" as is and simply add the new "BatchID" column before it.
I've tried everything but I haven't really managed to do this.
Use str.split("_").str[1], which splits each ID on "_" and takes the second piece.
Ex:
import pandas as pd

df = pd.DataFrame({"UnitID": ["ABC2_DEFGH12-01_X1_Y1"]})
df["BatchID"] = df["UnitID"].str.split("_").str[1]
print(df)
Output:
                  UnitID     BatchID
0  ABC2_DEFGH12-01_X1_Y1  DEFGH12-01
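The question also asks for "BatchID" to sit before "UnitID", whereas plain column assignment appends it at the end. A minimal sketch of one way to do that, assuming the same single-row example frame, is to pass the extracted values to DataFrame.insert at position 0:

```python
import pandas as pd

df = pd.DataFrame({"UnitID": ["ABC2_DEFGH12-01_X1_Y1"]})

# Take the second "_"-separated segment of each ID.
batch = df["UnitID"].str.split("_").str[1]

# insert() modifies df in place; position 0 puts BatchID before UnitID.
df.insert(0, "BatchID", batch)

print(df.columns.tolist())
```

"UnitID" itself is left untouched; only the column order changes.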
If you need a regex, use str.extract(r"(?<=_)(.*?)(?=_)"), which captures the first run of text that sits between two underscores.
df["BatchID"] = df["UnitID"].str.extract(r"(?<=_)(.*?)(?=_)")
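As a variation, the same segment can be captured without lookarounds by anchoring the pattern at the start of the string. This is only a sketch: the second sample ID and the pattern below are made up for illustration and are not from the question. Passing expand=False makes str.extract return a Series rather than a one-column DataFrame, and rows that do not match come back as NaN.

```python
import pandas as pd

# Hypothetical sample data; the second ID has no underscores at all.
df = pd.DataFrame({"UnitID": ["ABC2_DEFGH12-01_X1_Y1", "NOUNDERSCORE"]})

# ^[^_]*_  skips the first segment and its underscore,
# ([^_]+)  captures the second segment,
# _        requires another underscore after it.
pattern = r"^[^_]*_([^_]+)_"

# expand=False returns a Series for a single capture group.
df["BatchID"] = df["UnitID"].str.extract(pattern, expand=False)

print(df)
```

Checking the new column with df["BatchID"].isna() is a quick way to spot IDs that do not follow the expected layout.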