Error iterating over a pandas DataFrame created from an Excel file column

Question

I am reading an Excel file column into a pandas DataFrame. This is the code I have written for this:

import pandas as pd

df = pd.ExcelFile('address.xlsx').parse('sheet1')

x = df['Address']
print(x)

The output of the above code is:

0                         Via abc che - 66110 Chi
1                 Via vivo, 44\n65125 Paris (PR)
2                 Via vivo, 44\n65125 Pesc (PI)
3            Contrada contra\n64100 Term (PI)
4                    Via Mvico\n75025 Poli (PR)

There is only one item in each row, which is an address. What I want to do now is iterate through each row of this DataFrame, get the address, and then extract the zip code from that address. I wrote this code for it:

for index ,row in x:
    reg = re.compile('^.*(?P<zipcode>\d{5}).*$')
    match = reg.match(row[0])
    fitered_match = match.groupdict().zipcode  
    print(fitered_match)

When I execute this, I get the error ValueError: too many values to unpack (expected 2).

I am unable to understand:

Why does this error occur?
Is my logic for extracting the zip code from the address correct?

Answer 1

You can use str.extract():

df['Zip Code'] = df['Address'].str.extract(r'(\d{5})')

Yields:

                            Address Zip Code
0           Via abc che - 66110 Chi    66110
1    Via vivo, 44\n65125 Paris (PR)    65125
2     Via vivo, 44\n65125 Pesc (PI)    65125
3  Contrada contra\n64100 Term (PI)    64100
4        Via Mvico\n75025 Poli (PR)    75025
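
As a rough, self-contained check of the extract() approach, the sketch below builds the Address column in memory instead of reading address.xlsx. The sample rows, including the one without a zip code, are assumptions for illustration. Passing expand=False returns a Series, and a row with no five-digit run comes back as NaN.

```python
import pandas as pd

# Assumed sample rows; the last one has no five-digit run on purpose.
df = pd.DataFrame(
    {
        "Address": [
            "Via abc che - 66110 Chi",
            "Via vivo, 44\n65125 Paris (PR)",
            "No zip code here",
        ]
    }
)

# With a single capture group, expand=False returns a Series
# rather than a one-column DataFrame.
df["Zip Code"] = df["Address"].str.extract(r"(\d{5})", expand=False)

print(df["Zip Code"].tolist())
```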

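In your original code, you receive ValueError: too many values to unpack (expected 2) because iterating over a Series yields only its values, so for index, row in x tries to unpack each address string into two names. To get both an index and a value, iterate over enumerate(x) (positional index) or x.items() (index label).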
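
If you do want to keep an explicit loop, one possible version is sketched below. It assumes sample addresses in place of the Excel column, and the names zip_re and zipcodes are made up for this example. It uses x.items() to get (index, value) pairs, re.search instead of re.match because "." does not match the newline inside some addresses, and match.group("zipcode") because groupdict() returns a plain dict with no .zipcode attribute.

```python
import re

import pandas as pd

# Assumed sample data standing in for df['Address'] from address.xlsx.
x = pd.Series(
    [
        "Via abc che - 66110 Chi",
        "Via vivo, 44\n65125 Paris (PR)",
        "Contrada contra\n64100 Term (PI)",
    ],
    name="Address",
)

# Compile once, outside the loop; a raw string keeps \d intact.
zip_re = re.compile(r"(?P<zipcode>\d{5})")

zipcodes = []
for index, address in x.items():  # yields (index label, value) pairs
    match = zip_re.search(address)  # search scans the whole string
    # Guard against addresses that contain no five-digit run.
    zipcode = match.group("zipcode") if match else None
    zipcodes.append(zipcode)
    print(index, zipcode)
```

For anything beyond a quick check, the vectorised str.extract() call is usually the simpler choice.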

solution1
1 2018-10-24 20:22:46
