Removing ranges of characters in pandas dataframe index

Question

I have a list of text items in a dataframe index. Some of them end in integers, and some contain extra information in parentheses, e.g. "(extra info)". The rest are just plain text. I want to remove the integers from the items that have them, and the parentheses together with their contents, while keeping the text that precedes them.

             Cost  Item Purchased  Name
Store1       22.5  Sponge          Chris
Shop         2.5   Kitty Litter    Kevyn
House (aax)  2     Spoon           Filip

I would like the output to be

           Cost Item Purchased  Name
Store      22.5 Sponge          Chris
Shop       2.5  Kitty Litter    Kevyn
House      2    Spoon           Filip

Answer 1

Set up the dataframe. It would be useful in future if you put this in the question.

import pandas as pd

df = pd.DataFrame(
    {
        "cost": [22.5, 2.5, 2],
        "item purchased": ["Sponge", "kitty litter", "spoon"],
        "name": ["Chris", "Kevyn", "Filip"],
    },
    index=["Store1", "Shop", "House (aax)"],
)


# Move the index into a regular column named "index".
df = df.reset_index()

# Split each label at "(" and keep the part before it, trimming the trailing space.
df["index"] = df["index"].str.split("(").str[0].str.strip()

# Set the cleaned column back as the index.
df = df.set_index("index")

print(df)

        cost    item purchased  name
index           
Store1  22.5    Sponge          Chris
Shop    2.5     kitty litter    Kevyn
House   2.0     spoon           Filip
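
Note that the split above only handles the parentheses, so "Store1" keeps its trailing digit. If the digits should go as well, as in the desired output in the question, one possible variation is to clean the index directly with regular expressions. This is only a sketch: the two patterns are my own assumptions about what the labels look like (a single "(...)" group with no nesting, and digits only at the very end of the label).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "cost": [22.5, 2.5, 2],
        "item purchased": ["Sponge", "kitty litter", "spoon"],
        "name": ["Chris", "Kevyn", "Filip"],
    },
    index=["Store1", "Shop", "House (aax)"],
)

# Assumed patterns: drop any "(...)" group (plus the space before it),
# then drop digits at the end of the label, then trim leftover whitespace.
df.index = (
    df.index.str.replace(r"\s*\([^)]*\)", "", regex=True)
    .str.replace(r"\d+\s*$", "", regex=True)
    .str.strip()
)

print(df.index.tolist())  # ['Store', 'Shop', 'House']
```

This avoids the reset_index / set_index round trip, because the .str accessor is also available on the index itself. A label such as "Room 101" would be reduced to "Room", so adjust the digit pattern if some trailing numbers need to be kept.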

solution1
0 2019-04-14 14:50:41