
Looping through list with dataframe elements in python

I want to iterate over a list, which has dataframes as its elements.

Example: ls is my list, with the elements below (two dataframes):

                           seq  score    status
4366  CGAGGCTGCCTGTTTTCTAGTTG   5.15  negative
5837  GGACCTTTTTTACAATATAGCCA   3.48  negative
96    TTTCTAGCCTACCAAAATCGGAG  -5.27  negative
1369  CTTCCTATCTTCATTCTTCGACT   1.28  negative
1223                CAAGTTTGT   2.06  negative
5451  TGTTTCCACACCTGTCTCAGCTC   4.48  negative
1277  GTACTGTGGAATCTCGGCAGGCT   4.87  negative
5299  CATAATGAATGCCCCATCAATTG  -7.19  negative
3477                ATGGCACTG  -3.60  negative
2953  AGTAATTCTGTTGCCTGAAGATA   2.86  negative
4586                TGGGCAAGT   2.48  negative
3746                AATGAGAGG  -3.67  negative,
                         seq  score    status
1983  AGCAGATCAAACGGGTAAAGGAC  -4.81  negative
3822  CCCTGGCCCACGCACTGCAGTCA   3.32  negative
1127  GCAGAGATGCTGATCTTCACGTC  -6.77  negative
3624                TGAGTATGG   0.60  negative
4559                AAGGTTGGG   4.94  negative
4391  ATGAAGATCATCGAAATCAGTTT  -2.09  negative
4028  TCTCCGACAATGCCTATCAGTAC   1.14  negative
2694                CAGGGAACT   0.98  negative
2197  CTTCCATTGAGCTGCTCCAGCAC  -0.97  negative
2025  TGTGATCTGGCTGCACGCACTGT  -2.13  negative
5575                CCAGAAAGG  -2.45  negative
275   TCTGTTGGGTTTTCATACAGCTA   7.11  negative

When I access its elements, I get the following error: list indices must be integers, not DataFrame

I tried the following code:

cut_off = [1,2,3,4]

for i in ls:
    for co in cut_off:
        print "Negative set : " + "cut off value =", str(
            co), ", number of variants = ", str((ls[i]['score'] > co).sum())

I want to access each dataframe in the list and compare the score of each row with the cut-off value. For each cut-off, the output should be the total number of rows whose score is greater than that value.

Expected output: Negative set : cut off value = 0 , number of variants = 8

Thanks

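This should work:

cut_off = [1,2,3,4]

for df in ls:
    for co in cut_off:
        # df is the DataFrame itself, so index its column directly.
        # The comparison gives a boolean Series; sum() counts the True values.
        print("Negative set : cut off value = {} , number of variants = {}".format(
            co, (df['score'] > co).sum()))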
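To check the loop above without the full data, here is a minimal sketch that builds two small DataFrames from a handful of rows shown in the question. The names `df_a`, `df_b`, and `count_above` are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Two small frames using a few rows from the question's data.
df_a = pd.DataFrame({
    "seq": ["CGAGGCTGCCTGTTTTCTAGTTG", "GGACCTTTTTTACAATATAGCCA",
            "TTTCTAGCCTACCAAAATCGGAG", "CTTCCTATCTTCATTCTTCGACT"],
    "score": [5.15, 3.48, -5.27, 1.28],
    "status": ["negative"] * 4,
})
df_b = pd.DataFrame({
    "seq": ["AGCAGATCAAACGGGTAAAGGAC", "CCCTGGCCCACGCACTGCAGTCA", "AAGGTTGGG"],
    "score": [-4.81, 3.32, 4.94],
    "status": ["negative"] * 3,
})
ls = [df_a, df_b]
cut_off = [1, 2, 3, 4]


def count_above(df, threshold):
    """Return the number of rows whose score is strictly greater than threshold."""
    # (df["score"] > threshold) is a boolean Series; sum() counts the True values.
    return int((df["score"] > threshold).sum())


for df in ls:
    for co in cut_off:
        print("Negative set : cut off value = {} , number of variants = {}".format(
            co, count_above(df, co)))
```

Wrapping the comparison in a small function is only a convenience here; the inline expression `(df['score'] > co).sum()` does the same thing.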

It looks like you are expecting i to be an index into your list ls, when in fact it is the element itself. For example:

foo = [ "one", "two", "three" ]
for i in foo:
     print(i)

outputs

one
two
three

while

for i, elm in enumerate(foo):
     print(f"{i}: {elm}")

outputs:

0: one
1: two
2: three

So I think enumerate is what you're looking for.
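Applied to a list of DataFrames, that could look like the sketch below. The two frames are small hypothetical stand-ins for the ones in the question, and the `results` list is only there so the counts can be inspected afterwards.

```python
import pandas as pd

# Hypothetical stand-ins for the two DataFrames in the question's list.
ls = [
    pd.DataFrame({"score": [5.15, 3.48, -5.27, 1.28]}),
    pd.DataFrame({"score": [-4.81, 3.32, 4.94]}),
]
cut_off = [1, 2, 3, 4]

results = []
for i, df in enumerate(ls):
    # i is the position in the list and df is the DataFrame at that position,
    # so ls[i] and df refer to the same object.
    for co in cut_off:
        n = int((df["score"] > co).sum())
        results.append((i, co, n))
        print("DataFrame {} : cut off value = {} , number of variants = {}".format(
            i, co, n))
```

If the position is never used, plain `for df in ls:` is enough; `enumerate` is mainly useful when the output should say which DataFrame a count belongs to.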

for i in range(len(ls)):
    for co in cut_off:
        # ls[i] is the DataFrame at position i; keep the rows above the
        # cut-off and count them.
        print("Negative set : cut off value = {} , number of variants = {}".format(
            co, len(ls[i][ls[i]['score'] > co])))

I hope this helps...


 