
How to create a Pandas function to read different size dataframes

I am trying to automate the mileage trips for my job, which involves reading a .csv file with the pandas module. The problem is that the .csv files come in different lengths because everyone has a different number of trips. Is there any way to create a function that reads exactly the number of trips regardless of the length of the .csv file? The .csv files have some extra rows below the trips that I don't want to read into the DataFrame.

      a  b  c
trip1 x  x  x
trip2 x  x  x
trip3 x  x  x

      a  b  c
trip1 x  x  x
trip2 x  x  x
trip3 x  x  x
trip4 x  x  x
      ...
trip9 x  x  x

I assume that you want to read the first n rows of the .csv file. In that case you can just do:

pd.read_csv('path_to_file.csv', nrows=10)

This reads only the first 10 rows of the csv. It is also helpful when the file contains a huge volume of data and you only need the beginning of it.
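If the number of trips changes from file to file, the value for nrows can be worked out before parsing. Here is a minimal sketch, assuming every trip row starts with the label "trip" (as in the question's example) and that the trip rows come directly after the header. The helper name read_trips and the sample data are made up for illustration:

```python
import io
import pandas as pd

def read_trips(text, prefix="trip"):
    """Parse only the trip rows of a CSV given as a string.

    Assumes the trip rows directly follow the header line and that each
    one starts with `prefix`; anything after them is ignored.
    """
    # Count the lines that look like trip rows.
    n_trips = sum(1 for line in text.splitlines() if line.startswith(prefix))
    # nrows limits parsing to the first n_trips data rows after the header.
    return pd.read_csv(io.StringIO(text), index_col=0, nrows=n_trips)

# Hypothetical sample: 3 trips followed by 2 unwanted rows.
sample = ",a,b,c\ntrip1,1,2,3\ntrip2,4,5,6\ntrip3,7,8,9\ntotal,12,15,18\nnotes,,,"

trips = read_trips(sample)
print(trips)
```

For a file on disk you would read its text first (for example with open(path).read()) and pass that in. If the trip labels in your files look different, the prefix check needs to be adapted.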

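If you want to skip the last n rows, you can do:

pd.read_csv('path_to_file.csv', skipfooter=2, engine='python')

This will always skip the last 2 rows of the csv. skipfooter is not supported by the default C parsing engine, which is why engine='python' is passed explicitly.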
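This fits the question if the extra rows at the bottom are always the same in number, whatever the number of trips. A small sketch under that assumption; the function name read_without_footer and both sample files are hypothetical:

```python
import io
import pandas as pd

# Hypothetical files: a different number of trips, but always 2 trailing rows.
short_csv = ",a,b,c\ntrip1,1,2,3\ntrip2,4,5,6\ntotal,5,7,9\nsigned,,,"
long_csv = (
    ",a,b,c\ntrip1,1,2,3\ntrip2,4,5,6\ntrip3,7,8,9\ntrip4,1,1,1\n"
    "total,13,16,19\nsigned,,,"
)

def read_without_footer(source, footer_rows=2):
    # skipfooter requires the Python parsing engine.
    return pd.read_csv(source, index_col=0, skipfooter=footer_rows, engine="python")

short_df = read_without_footer(io.StringIO(short_csv))
long_df = read_without_footer(io.StringIO(long_csv))
print(len(short_df), len(long_df))
```

The same function works with a file path instead of the StringIO objects. If the number of trailing rows is not fixed, this approach will cut off too much or too little.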

Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

If you want to print the top rows, use df.head():

import pandas as pd

def read():
    df = pd.read_csv('csv_file.csv')
    # head() returns a new DataFrame and leaves df unchanged, so print its result.
    print(df.head(10))  # change 10 to the number of rows you want to print

If you want to print the bottom rows, use df.tail():

import pandas as pd

def read():
    df = pd.read_csv('csv_file.csv')
    # tail() returns a new DataFrame and leaves df unchanged, so print its result.
    print(df.tail(10))  # change 10 to the number of rows you want to print
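Since head() and tail() return new DataFrames, they can also be used to trim a DataFrame after it has been loaded, not only to print it. A short sketch with a made-up DataFrame standing in for the loaded csv:

```python
import pandas as pd

# Hypothetical frame: 3 trips followed by 2 unwanted rows.
df = pd.DataFrame(
    {"a": [1, 4, 7, 12, 0], "b": [2, 5, 8, 15, 0]},
    index=["trip1", "trip2", "trip3", "total", "notes"],
)

first_three = df.head(3)        # the first 3 rows
without_last_two = df.head(-2)  # every row except the last 2
last_two = df.tail(2)           # the last 2 rows

print(without_last_two)
```

A negative argument to head() drops rows from the end, which helps when you know how many extra rows there are but not how many trips.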


 