
pandas iterate over column values at once and generate range

I have a pandas DataFrame like the one below:

import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

I would like to do the following:

a) pass all the values from each column at once to a for loop

For example, I am trying the code below:

cols = ['biz','seg','PID']
for col in cols:
  for i, j in df1.col.values:
      print("D" + str(i) + ":" + "F" + str(j))
      print("Q" + str(i) + ":" + "S" + str(j))
      print("AB" + str(i) + ":" + "AD" + str(j))

But this doesn't work, and I get an error:

TypeError: cannot unpack non-iterable numpy.int64 object

I expect my output to look like this:

D18:F23
Q18:S23
AB18:AD23
D30:F34
Q30:S34
AB30:AD34
D40:F52
Q40:S52
AB40:AD52

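The mistake is in the innermost for loop.

Iterating over a column's values walks a 1-dimensional array, so each step yields a single scalar, and a scalar cannot be unpacked into i and j. Note also that df1.col looks for a column literally named col; to look a column up by the value of the loop variable, use df1[col] or getattr(df1, col).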
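
To see the difference concretely, here is a small sketch using the df1 from the question. It first tries to unpack while iterating (which fails) and then unpacks the whole two-element array in one go (which works). The flag name unpack_failed is only there for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

values = df1['biz'].values      # 1-D NumPy array holding 18 and 23

# Iterating yields one scalar per step, so "for i, j in values"
# has nothing to unpack and raises TypeError.
try:
    for i, j in values:
        pass
    unpack_failed = False
except TypeError:
    unpack_failed = True

# Unpacking the whole two-element array at once works.
i, j = values
print(unpack_failed, i, j)
```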

If your dataframe has exactly two rows, so that each column holds two values, this should suffice:

cols = ['biz','seg','PID']
for col in cols:
   i, j = getattr(df1, col).values
   print("D" + str(i) + ":" + "F" + str(j))
   print("Q" + str(i) + ":" + "S" + str(j))
   print("AB" + str(i) + ":" + "AD" + str(j))
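
The three print calls differ only in their letter pair, so they can be folded into an inner loop. This is a minimal sketch, assuming the pairs D/F, Q/S and AB/AD from the expected output; letter_pairs and make_ranges are names made up for this example, and it still assumes exactly two rows per column.

```python
import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

# Letter pairs taken from the question's expected output.
letter_pairs = [('D', 'F'), ('Q', 'S'), ('AB', 'AD')]

def make_ranges(df, cols):
    """Return one 'D18:F23'-style string per column and letter pair."""
    ranges = []
    for col in cols:
        i, j = df[col].values   # assumes exactly two rows per column
        for start, end in letter_pairs:
            ranges.append(f"{start}{i}:{end}{j}")
    return ranges

ranges = make_ranges(df1, ['biz', 'seg', 'PID'])
print("\n".join(ranges))
```

Collecting the strings in a list instead of printing them directly makes it easy to reuse them later, for example to pass them to a spreadsheet library.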

Alternatives

Pandas using loc

This is probably the simplest option. We pass the column name col to loc and use : to select every row, as in loc[:, col]:

cols = ['biz','seg','PID']
for col in cols:
    i, j = df1.loc[:, col].values
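
For a plain column lookup like this one, df1.loc[:, col] and df1[col] select the same Series. The sketch below checks that for each column and stores the unpacked pair in a dictionary; the names pairs, by_loc and by_key are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

pairs = {}
for col in ['biz', 'seg', 'PID']:
    by_loc = df1.loc[:, col]    # all rows, one column
    by_key = df1[col]           # bracket indexing on the column name
    assert by_loc.equals(by_key)
    i, j = by_loc.values
    pairs[col] = (int(i), int(j))

print(pairs)
```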

Attrgetter

We can use attrgetter from the operator module to fetch one attribute, or as many as we want, in a single call:

from operator import attrgetter

cols = ['biz','seg','PID']
cols = attrgetter(*cols)(df1)
for col in cols:
   i, j = col.values
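
One detail worth knowing about attrgetter: with several names it returns a tuple, but with a single name it returns the attribute itself rather than a 1-tuple. The sketch below shows both cases on the question's df1; the variable names many and single are made up here.

```python
from operator import attrgetter

import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

# Several names: the getter returns a tuple of Series, one per name.
many = attrgetter('biz', 'seg', 'PID')(df1)

# One name: the getter returns the Series itself, not a 1-tuple.
single = attrgetter('biz')(df1)

print(type(many).__name__, len(many))
print(type(single).__name__)
```

So a loop written for the multi-column case would need adjusting if cols held only one name. Attribute access also only works for column names that are valid Python identifiers and do not clash with existing DataFrame attributes.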

Attrgetter 2

This approach is similar to the one above, except that we collect the values from all columns at once: all_i holds the first value of every column and all_j the second, with one entry per column (zip returns them as tuples).

from operator import attrgetter

cols = ['biz','seg','PID']
cols = attrgetter(*cols)(df1)
cols = [col.values for col in cols]
all_i, all_j = zip(*cols)
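
As a rough illustration of what the two sequences contain and how they might be consumed, the sketch below pairs them back up with zip and builds only the D/F strings; the list name lines is an assumption for this example.

```python
from operator import attrgetter

import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

cols = attrgetter('biz', 'seg', 'PID')(df1)
all_i, all_j = zip(*[col.values for col in cols])

# all_i holds the first row's values, all_j the second row's.
lines = [f"D{i}:F{j}" for i, j in zip(all_i, all_j)]
print(lines)
```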

Pandas solution

This approach uses only pandas functions. It gets the column's integer position with df1.columns.get_loc(col) and then indexes with .iloc. In .iloc[a, b], we put : in place of a to select all rows and index in place of b to select just that column.

cols = ['biz','seg','PID']
for col in cols:
    index = df1.columns.get_loc(col)
    i, j = df1.iloc[:, index]
    # do the printing here

# Stop one row early so that row i+1 always exists
for i in range(len(df1) - 1):
    print('D' + str(df1.iloc[i,0]) + ':' + 'F' + str(df1.iloc[i+1,0]))
    print('Q' + str(df1.iloc[i,0]) + ':' + 'S' + str(df1.iloc[i+1,0]))
    print('AB' + str(df1.iloc[i,0]) + ':' + 'AD' + str(df1.iloc[i+1,0]))
    print('D' + str(df1.iloc[i,1]) + ':' + 'F' + str(df1.iloc[i+1,1]))
    print('Q' + str(df1.iloc[i,1]) + ':' + 'S' + str(df1.iloc[i+1,1]))
    print('AB' + str(df1.iloc[i,1]) + ':' + 'AD' + str(df1.iloc[i+1,1]))
    print('D' + str(df1.iloc[i,2]) + ':' + 'F' + str(df1.iloc[i+1,2]))
    print('Q' + str(df1.iloc[i,2]) + ':' + 'S' + str(df1.iloc[i+1,2]))
    print('AB' + str(df1.iloc[i,2]) + ':' + 'AD' + str(df1.iloc[i+1,2]))
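
The nine positional print calls follow one pattern, so they can be generated with nested loops. This is a sketch rather than a definitive rewrite: it pairs each row with the next one, stops at the second-to-last row so that row i + 1 always exists, and assumes the same three letter pairs as the expected output. The names letter_pairs and lines are made up here.

```python
import pandas as pd

df1 = pd.DataFrame({'biz': [18, 23], 'seg': [30, 34], 'PID': [40, 52]})

letter_pairs = [('D', 'F'), ('Q', 'S'), ('AB', 'AD')]

lines = []
for i in range(len(df1) - 1):            # pair row i with row i + 1
    for c in range(df1.shape[1]):        # every column, by position
        first, second = df1.iloc[i, c], df1.iloc[i + 1, c]
        for start, end in letter_pairs:
            lines.append(f"{start}{first}:{end}{second}")

print("\n".join(lines))
```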


 