How to access a specific row in a column using Pandas

Question

Hi folks!

I'm stuck while developing a dashboard using Pandas. This is the scenario:

I'm importing and transforming a CSV file in order to get some insights about a team I am working with.

|ID      |Area Path                             | 
|--------|--------------------------------------|
| 544    | [Level 1, Level 2, Level 3]          |
| 545    | [Level 1, Level 2]                   |
| 546    | [Level 1]                            |
| 547    | [Level 1, Level 2, Level 3, Level 4] |

As you can see, the lists in the Area Path column do not have a fixed length: a list can hold 1, 2, 3, or 4 items.

My problem is accessing each row in this column to get the information I need. If the list has only one item, I must use the [0] position; if the list has 2 or more items, I must use the [1] position.

I've tried to do different things and this one below is my last attempt:

def Extract(lst):
    if dados['Area Path'].str.len() == 1:
      return [item[0] for item in dados['Area Path']]
    elif dados['Area Path'].str.len() == 2:
      return [item[-1] for item in dados['Area Path']]
    elif dados['Area Path'].str.len() == 3:
      return [item[1] for item in dados['Area Path']]
    elif dados['Area Path'].str.len() == 4:
      return [item[1] for item in dados['Area Path']]

lst = [dados['Area Path']]
indice_novo = Extract(lst)
dados['Team'] = indice_novo

The problem is that I can't iterate over the lists stored in the column. The output of .str.len() is useful, but it doesn't get me all the way there.

Can you help me to solve this problem?

Thanks, Marcelo

Answer 1

Here is a solution using map(), which applies the rule to each list in the column:

df['Area Path'].map(lambda x: x[0] if len(x) == 1 else x[1])

Output:

0    Level 2
1    Level 2
2    Level 1
3    Level 2
Name: Area Path, dtype: object
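
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the Area Path values are already Python lists (not strings that merely look like lists) and rebuilds the sample table from the question by hand; the frame is named dados, as in the question, and the result is stored in a Team column.

```python
import pandas as pd

# Sample data mirroring the table in the question.
# Assumption: each 'Area Path' cell is a real Python list.
dados = pd.DataFrame({
    "ID": [544, 545, 546, 547],
    "Area Path": [
        ["Level 1", "Level 2", "Level 3"],
        ["Level 1", "Level 2"],
        ["Level 1"],
        ["Level 1", "Level 2", "Level 3", "Level 4"],
    ],
})

# Single-item lists use position 0; longer lists use position 1.
dados["Team"] = dados["Area Path"].map(lambda x: x[0] if len(x) == 1 else x[1])

print(dados[["ID", "Team"]])
```

Note that the lambda would raise on an empty list or a missing value, so this only fits data shaped like the sample.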

Answer 2

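Based on your comment, the Area Path column contains lists. If so, you are accessing the column incorrectly: [dados['Area Path']] wraps the whole Series in a one-element list instead of giving you the individual lists. The correct way to get the lists out of the column would be: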

lst = dados['Area Path'].tolist()

This will populate the lst variable with a list of lists, which looks something like:

[['Level 1', 'Level 2', 'Level 3'], ['Level 1', 'Level 2'], ['Level 1'], ...]

Then, in your Extract() function, you can perform your filtering based on the required logic:

def Extract(list_of_lists):
    new_list = []
    for lst in list_of_lists:
        # Will fail if 'Area Path' contains None, NaN values
        if len(lst) == 1:
            new_list.append(lst[0])
        else:
            new_list.append(lst[1])
    return new_list

indice_novo = Extract(lst)
dados['Team'] = indice_novo

This answer is based on your code and may not be the most optimized way to do this.
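
If missing values are a concern, one possible alternative is to lean on the .str accessor, which also indexes into lists and yields NaN when the position does not exist or the cell is empty. The sketch below is an assumption-laden variant, not a drop-in replacement: the sample frame (including the None row) is made up for illustration, and the result for a missing row is NaN rather than an error.

```python
import pandas as pd

# Hypothetical sample data; the last row simulates a missing 'Area Path'.
dados = pd.DataFrame({
    "Area Path": [
        ["Level 1", "Level 2", "Level 3"],
        ["Level 1", "Level 2"],
        ["Level 1"],
        None,
    ],
})

# .str[1] returns NaN for lists that have no second item (and for missing cells).
second = dados["Area Path"].str[1]
# .str[0] is the fallback for single-item lists.
first = dados["Area Path"].str[0]

# Prefer position 1; fall back to position 0 where position 1 is absent.
dados["Team"] = second.fillna(first)

print(dados)
```

This avoids the explicit Python loop, at the cost of being less obvious to read than the Extract() version.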

solution1
0 ACCEPTED 2022-01-07 03:19:31

solution2
0 2022-01-07 03:34:59
