
Pandas: Creating new data frame from only certain columns

I have a CSV file with measurements, and I want to create a new CSV file with the hourly averages and standard deviations, but only for certain columns.

Example:

csv1:

YY-MO-DD HH-MI-SS_SSS    |     Acceleration  |        Lumx     |    Pressure
2015-12-07 20:51:06:608  |        22.7       |        32.3     |     10
2015-12-07 20:51:07:609  |        22.5       |        47.7     |     15

to csv2 (only for the pressure and acceleration):

 YY-MO-DD HH-MI-SS_SSS       | Acceleration avg  |   Pressure avg
    2015-12-07 20:00:00:000  |        22.6       |        12.5     
    2015-12-07 21:00:00:000  |        ....       |        ....    

Now I have an idea (thanks to the people on this site) of how to calculate the averages, but I'm having trouble creating a new, smaller dataframe that contains the calculations for only a few columns.

Thanks !!!

You can make a smaller dataframe like this:

csv2 = csv1[['Acceleration', 'Pressure']].copy()

You can then work with csv2 (you said you already have an idea for the average calculation). FYI, .copy() can be omitted if you are sure about view versus copy semantics.
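As a minimal sketch of what the explicit .copy() buys you, the example below builds a small stand-in for csv1 from the two sample rows in the question (the in-memory frame is an assumption; in practice it would come from pd.read_csv) and shows that editing the subset leaves the original untouched:

```python
import pandas as pd

# Hypothetical stand-in for the data loaded from the first CSV file.
csv1 = pd.DataFrame({
    "Acceleration": [22.7, 22.5],
    "Lumx": [32.3, 47.7],
    "Pressure": [10, 15],
})

# Keep only the two columns of interest, as an independent copy.
csv2 = csv1[["Acceleration", "Pressure"]].copy()

# Modifying the copy does not touch the original dataframe.
csv2.loc[0, "Pressure"] = 99
print(list(csv2.columns))        # ['Acceleration', 'Pressure']
print(csv1.loc[0, "Pressure"])   # 10
```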

csv2 = csv1.loc[:, ['Acceleration', 'Pressure']]
  • .loc[] helps keep the subsetting operation explicit and consistent.

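  • With a list of column labels, as here, .loc[] returns a copy, so changes to csv2 do not modify the original dataframe. (This is not true of every .loc[] selection; slices, for example, may return a view.)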

(For further discussion and great examples of the different view vs. copy alternatives, please see: Pandas: Knowing when an operation affects the original dataframe)

Your average method can go in place of "method_to_obtain_avg" and then you can obtain a subset as below:

csv2 = csv1.method_to_obtain_avg()[["Acceleration", "Pressure"]]
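One possible stand-in for "method_to_obtain_avg" is a time-based resample. The sketch below is only an illustration under stated assumptions: the timestamp column is named "timestamp" (the question's header is different), the sample frame is built in memory rather than read from a file, and the columns are selected before aggregating rather than after, which gives the same result for these columns.

```python
import pandas as pd

# Hypothetical sample mirroring the question's table; the column name
# "timestamp" is an assumption made for this sketch.
csv1 = pd.DataFrame({
    "timestamp": ["2015-12-07 20:51:06:608", "2015-12-07 20:51:07:609"],
    "Acceleration": [22.7, 22.5],
    "Lumx": [32.3, 47.7],
    "Pressure": [10, 15],
})

# The milliseconds are separated by a colon, so give the format explicitly.
csv1["timestamp"] = pd.to_datetime(
    csv1["timestamp"], format="%Y-%m-%d %H:%M:%S:%f"
)

# Select the columns of interest, then compute the hourly mean and
# standard deviation for each of them.
hourly = (
    csv1.set_index("timestamp")[["Acceleration", "Pressure"]]
    .resample(pd.Timedelta(hours=1))
    .agg(["mean", "std"])
)

# Flatten the two-level column index into names like "Pressure mean".
hourly.columns = [f"{col} {stat}" for col, stat in hourly.columns]
print(hourly)
```

The result has one row per hour, labelled by the start of the hour, and could then be written out with hourly.to_csv("csv2.csv") (the file name is a placeholder).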
