Extract value between first quotation marks in pandas data frame column

Question

I have a sample data frame called df, shown below (the actual df has thousands of rows), where each element of the column "Code" is a list, and each of these lists can have multiple elements:

For each row, I would like to get the first code number between the quotation marks. The output for the data frame above should therefore be:

Initially, I thought that all the codes were 4-digit numbers, so I tried this:

My_List = df['Code'].tolist()

Unique_Code =[]
for i in range(0, len(My_List)):
    k = My_List[i][2:5]
    Unique_Code.append(k)

df['Unique_Code'] = Unique_Code

but this obviously only works when the code is a 4-digit number.

Could you please help me find a more efficient and universal way to solve this problem? Many thanks.

Answer 1

# Explode the lists into one row per element, then pick the first item in each group.
g = df.explode('code').groupby('id')['code'].first().to_frame()
# Then strip the inverted commas from the code.
g['code'] = g['code'].str.strip("'")
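
For anyone who wants to try this end to end, here is a minimal sketch of the same explode-and-take-first idea. It assumes the column holds real Python lists whose string elements still carry literal quote characters; the sample values and the lowercase column names 'id' and 'code' are made up for illustration and need to match your own data frame.

```python
import pandas as pd

# Hypothetical sample data: each cell of 'code' is a real Python list
# whose string elements still carry literal single quotes.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "code": [["'4352'", "'11'"], ["'442244'"], ["'7'", "'88'", "'999'"]],
})

# One row per list element, then keep the first element seen for each id.
first = df.explode("code").groupby("id")["code"].first().to_frame()

# Remove the surrounding quote characters from each value.
first["code"] = first["code"].str.strip("'")

print(first)
```

If the cells really are lists, `df["code"].str[0]` should also pick the first element directly, without exploding and grouping.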

Answer 2

If the Code values of your data frame are strings that look just like Python lists, you can use the eval() function to convert them back into objects. It works not only for numbers; you can also use it on strings, functions, etc.

Try this:

import pandas as pd

data = {
    'ID': ["1", "2", "3", "4"],
    'Code': ['["435"]', '["442244"]', '["etetetet"]', '["345666"]'],
}

data_frame = pd.DataFrame(data, columns=["ID", "Code"])
for index, each_row in data_frame.iterrows():
    id_column = each_row["ID"]
    # Parse the string back into a list and take its first element.
    code_row = eval(each_row["Code"])[0]
    print(code_row)
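
Because eval() executes whatever expression the string contains, a more cautious variant of the loop is sketched here with ast.literal_eval from the standard library, which only parses Python literals. It assumes every cell is a string holding a non-empty list literal; the helper name first_code and the extra "17" element are made up for illustration.

```python
import ast

import pandas as pd

data_frame = pd.DataFrame({
    "ID": ["1", "2", "3", "4"],
    "Code": ['["435"]', '["442244", "17"]', '["etetetet"]', '["345666"]'],
})


def first_code(cell):
    """Parse a list literal stored as text and return its first element."""
    return ast.literal_eval(cell)[0]


# Apply the helper to every cell and store the result in a new column.
data_frame["Unique_Code"] = data_frame["Code"].apply(first_code)
print(data_frame["Unique_Code"].tolist())
```

Storing the result in a column rather than printing it row by row also avoids iterrows(), which tends to be slow on data frames with thousands of rows.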

Or in one line, taking the first element of each list:

codes = [eval(each_code)[0] for each_code in df['Code'].tolist()]
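
If the cells are plain strings, the parsing step can be skipped altogether. The sketch below uses pandas' Series.str.extract with a regular expression that captures whatever sits between the first pair of quotation marks, whatever its length. The sample data and the pattern are assumptions about how the column looks, not something taken from the question.

```python
import pandas as pd

# Hypothetical sample data: each cell is a string that looks like a list.
df = pd.DataFrame({
    "ID": ["1", "2", "3"],
    "Code": ['["435"]', "['442244', '17']", '["etetetet", "x"]'],
})

# Match the first quote character (single or double), capture everything
# up to the next quote character, and ignore the rest of the string.
pattern = r"""["']([^"']*)["']"""

# expand=False returns a Series because the pattern has one capture group.
df["Unique_Code"] = df["Code"].str.extract(pattern, expand=False)
print(df["Unique_Code"].tolist())
```

This stays vectorized and never evaluates the cell contents, but it would give wrong results if a code itself contained a quote character.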

2 answers

solution1
1 2021-01-24 20:48:50

solution2
1 ACCEPTED 2021-01-24 20:49:09
