Python Pandas DataFrame - How to Shorten the Frame

Question

I have a DataFrame with 1 million rows. My Python program searches for specific entries and moves them to the top of the frame. I then want to write only the entries of interest to a CSV file. Is there a way to shorten the frame so that it contains only those entries instead of remaining 1 million rows long? For example, say 100 entries are of interest and they are now the first 100 rows of the frame. Can I truncate the frame so that it is only 100 rows long instead of 1 million?

In response to the questions: these entries have the same width (the same columns) as the original data. The number of good entries isn't known before the program runs, but the program counts the entries of interest, so I will know how many there are (100, 125, etc.).

Answer 1

You can try boolean indexing on the DataFrame.

Assume "df" is your DataFrame, with a column "value", and you want to find all rows where the value is larger than 5. You could do the following:

indexes_of_interest = (df['value'] > 5)
short_df = df[indexes_of_interest]

The first line defines an arbitrary filter (in this case, "is the value larger than 5?"), which produces a boolean mask with one True/False entry per row. The second line selects the rows that meet this criterion and saves them to a shorter DataFrame.

The "short_df" is now a dataframe that only contains the records of interest.
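
If the rows of interest are already at the top of the frame and the program has counted them, as described in the question, slicing by position is another option. The following is only a minimal sketch under those assumptions: the sample data, the name "n_interesting", and the in-memory buffer standing in for a real file path are all made up for illustration.

```python
import io

import pandas as pd

# Hypothetical data: the rows of interest (9, 8, 7) were already moved to the top.
df = pd.DataFrame({"value": [9, 8, 7, 1, 2, 3, 0, 4]})

# Count of interesting rows, assumed to come from the program's own tally.
n_interesting = 3

# Keep only the first n rows. iloc slices by position, so it works
# regardless of what the index labels look like.
short_df = df.iloc[:n_interesting]  # df.head(n_interesting) does the same

# Write only the shortened frame to CSV. A StringIO buffer stands in for
# a file path here; index=False leaves the row index out of the output.
buffer = io.StringIO()
short_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

In a real script you would pass a file name to to_csv instead of the buffer. If the condition that identifies the interesting rows is available as a filter, the boolean indexing shown above avoids having to reorder the frame at all.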
