Iterate through DataFrame columns and determine which are numeric

I have read data from a CSV file. I'd like to write code that does the following:

(1) Iterate through each column (I imagine with a for loop).

(2) Determine whether the column contains only numbers.

(3) If the column is numeric, print certain statistics about that column and whether it is normally distributed (skewness and kurtosis both between -1 and 1).

(4) If the column is not numeric, skip it.

This is for an intro Python course, so the solution is not expected to be complex.

So far this is my code:

import pandas as pd

df = pd.read_csv('file path')

columns = list(df)

for i in columns:
    # If the column is numeric, print: column title, min, max, mean, median,
    # and "Yes column normal" or "No column not normal".
    # Otherwise, just skip it.
    pass
import numpy as np

for column in df:
    # Pass the column's dtype explicitly rather than the Series itself
    if np.issubdtype(df[column].dtype, np.number):
        print(df[column])  # print(df[column].describe()) or whatever other stats
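Putting the four steps from the question together, here is a minimal sketch rather than a definitive solution. The helper names `summarize_numeric_columns` and `print_summary`, the `limit` parameter, and the sample data are made up for illustration. It assumes that "normally distributed" means what the question says, skewness and kurtosis both between -1 and 1 (taken as inclusive here), and that pandas' `Series.kurt()`, which reports excess kurtosis (0 for a normal distribution), is the intended measure. It uses `pandas.api.types.is_numeric_dtype` instead of `np.issubdtype`, on the assumption that you may also have pandas-specific dtypes such as categoricals, and it skips boolean columns, which pandas otherwise counts as numeric.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def summarize_numeric_columns(df, limit=1.0):
    """Return {column name: stats dict} for the numeric columns of df."""
    results = {}
    for column in df:
        series = df[column]
        # Step (4): skip anything that is not numeric (booleans are skipped too)
        if not is_numeric_dtype(series) or is_bool_dtype(series):
            continue
        skew = series.skew()
        kurt = series.kurt()  # excess kurtosis: 0 for a normal distribution
        results[column] = {
            "min": series.min(),
            "max": series.max(),
            "mean": series.mean(),
            "median": series.median(),
            # Rule of thumb from the question: both values within [-limit, limit]
            "normal": bool(abs(skew) <= limit and abs(kurt) <= limit),
        }
    return results


def print_summary(results):
    """Print one block of statistics per numeric column."""
    for column, stats in results.items():
        print(column)
        for key in ("min", "max", "mean", "median"):
            print(f"  {key}: {stats[key]}")
        print("  Yes column normal" if stats["normal"] else "  No column not normal")


if __name__ == "__main__":
    # Hypothetical sample data; in practice use df = pd.read_csv('file path')
    sample = pd.DataFrame({
        "name": ["a", "b", "c", "d", "e"],
        "score": [1, 2, 3, 4, 5],
    })
    print_summary(summarize_numeric_columns(sample))
```

Keeping the statistics in a dictionary and printing them separately is just one design choice; for an intro course, putting the `print` calls directly inside the loop would work equally well.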
