How do I print the dimensions of a dataset (CSV file) using the pandas library, and also print out some lines?
I am programming in Python 3 and would like to print the dimensions of a dataset (a CSV file) loaded into a pandas DataFrame, and also do a few other things that I don't quite grasp. This is just an example, as I only need an explanation of how to do it.
Say I have 2 functions:
In func1 I have (supposedly) loaded a dataset using pandas:
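import pandas as pd

def func1(a):
    # Column names for the dataset (defined here but not yet passed to read_csv)
    namesOfColumns = ["The sepal-length", "The sepal-width", "The petal-length", "The petal-width", "class"]
    # a is the path to the CSV file, e.g. "some_file"
    some_file = pd.read_csv(a)
    return some_file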
def func2(data):
    # code for printing the dimensions of the dataset
    # code for printing the top 3 lines
    # code for printing the mean and standard deviation of the sepal-width
    # code for plotting a box plot of each attribute
Would someone explain how I can approach the steps in func2?
Code for printing the dimensions of the dataset:
data.info() # Prints descriptive info about the DataFrame (info() prints by itself and returns None)
print(data.shape) # A (rows, columns) tuple with the shape of the DataFrame
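As a minimal sketch of what data.shape returns, the snippet below builds a tiny DataFrame from an in-memory CSV string and unpacks the tuple into row and column counts. The sample values and column names are made up for illustration and are not taken from the asker's file.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real CSV file
csv_text = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""

data = pd.read_csv(io.StringIO(csv_text))

# shape is a (rows, columns) tuple; the header line is not counted as a row
n_rows, n_cols = data.shape
print(f"{n_rows} rows x {n_cols} columns")
```

With a real file, the only change would be passing the file path to pd.read_csv instead of the StringIO object.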
Code for printing the top 3 lines:
print(data.head(3))
Code for printing the mean and standard deviation of the sepal width:
print(data.describe()) # General statistics for every numeric column
# The column name below must match the column name in your DataFrame exactly
print(data['Sepal_Width'].mean(), data['Sepal_Width'].std()) # Mean & std dev of Sepal_Width only
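One detail that may matter here: pandas' std() computes the sample standard deviation (ddof=1) by default, not the population one. A small sketch with made-up values, where the column name sepal_width is an assumption and should be replaced with whatever the loaded file actually uses:

```python
import pandas as pd

# Hypothetical values; replace "sepal_width" with the actual column name
data = pd.DataFrame({"sepal_width": [3.0, 3.5, 2.5, 3.0]})

mean_width = data["sepal_width"].mean()
sample_std = data["sepal_width"].std()            # ddof=1 (sample), the pandas default
population_std = data["sepal_width"].std(ddof=0)  # ddof=0 (population)

print(mean_width, sample_std, population_std)
```

If the column name does not exist in the DataFrame, the lookup raises a KeyError, so printing data.columns first is a quick way to check the exact spelling.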
Code for plotting a box plot of each attribute:
data.boxplot() # One box per numeric column (requires matplotlib); namesOfColumns is local to func1 and is not visible here
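If you prefer to pass the columns explicitly, one possible approach is to select only the numeric ones first, since a text column such as class cannot be drawn as a box. The sketch below assumes matplotlib is installed (pandas uses it for plotting) and uses made-up data with placeholder column names.

```python
import pandas as pd

# Hypothetical sample data; column names are placeholders
data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.3, 2.7],
    "class": ["setosa", "setosa", "virginica", "virginica"],
})

# Keep only numeric columns; the text column "class" is left out
numeric_cols = data.select_dtypes(include="number").columns.tolist()

# boxplot returns a matplotlib Axes object by default
ax = data.boxplot(column=numeric_cols)
ax.set_title("Box plot of each numeric attribute")
```

In a script, calling matplotlib.pyplot.show() afterwards displays the figure, and ax.get_figure().savefig(...) writes it to a file instead.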