
How to fix a badly formatted DataFrame

Suppose we have the following dataset, RAM price.

I read this dataset with the following commands:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
data = pd.read_csv('https://raw.githubusercontent.com/amueller/introduction_to_ml_with_python/master/data/ram_price.csv')

But when I display the first few rows with

print(data.head())

it shows the following result:

  Unnamed: 0    date        price
0           0  1957.0  411041792.0
1           1  1959.0   67947725.0
2           2  1960.0    5242880.0
3           3  1965.0    2642412.0
4           4  1970.0     734003.0

Could you please help me fix this? When I try to drop Unnamed, it says there is no Unnamed column. How can I solve this?

It looks like an index column. You can tell read_csv to use it as the index by passing its integer position, like this:

df = pd.read_csv(
    'https://raw.githubusercontent.com/amueller/introduction_to_ml_with_python/master/data/ram_price.csv',
    index_col=[0],
)

print(df.head(5))
     date        price
0  1957.0  411041792.0
1  1959.0   67947725.0
2  1960.0    5242880.0
3  1965.0    2642412.0
4  1970.0     734003.0
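
A minimal offline sketch of the same idea, in case it helps to see where the column comes from. It assumes the file's header starts with an empty column name (which pandas reports as "Unnamed: 0"); the `csv_text` sample is rebuilt from the rows printed above, not downloaded, and the names `raw`, `buffer` and `reloaded` are only illustrative.

```python
import io

import pandas as pd

# First rows of the dataset as printed in the question. The leading comma in
# the header is an assumed empty column name, which pandas labels "Unnamed: 0".
csv_text = """,date,price
0,1957.0,411041792.0
1,1959.0,67947725.0
2,1960.0,5242880.0
3,1965.0,2642412.0
4,1970.0,734003.0
"""

# Without index_col, the first column is loaded as ordinary data.
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))  # ['Unnamed: 0', 'date', 'price']

# With index_col=0, the first column becomes the index instead.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(df.columns))  # ['date', 'price']

# Writing without the index avoids creating the extra column on the next read.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
reloaded = pd.read_csv(io.StringIO(buffer.getvalue()))
print(list(reloaded.columns))  # ['date', 'price']
```

If you control how the CSV is written, passing `index=False` to `to_csv` is usually the simplest way to keep the extra column from appearing in the first place.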

You need to drop the column using its full name, 'Unnamed: 0'.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
data = pd.read_csv('https://raw.githubusercontent.com/amueller/introduction_to_ml_with_python/master/data/ram_price.csv')

print(data.columns)  # print all the columns in the dataframe
# Index(['Unnamed: 0', 'date', 'price'], dtype='object')

data = data.drop(['Unnamed: 0'], axis=1)  # axis=1 means a column is dropped, not a row
print(data.head())

#     date        price
#0  1957.0  411041792.0
#1  1959.0   67947725.0
#2  1960.0    5242880.0
#3  1965.0    2642412.0
#4  1970.0     734003.0
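
As a rough sketch of why dropping "Unnamed" fails and of one alternative, the snippet below builds a small frame by hand with the same column names and filters out every column whose name starts with "Unnamed". The frame contents are copied from the output above; `keep` and `cleaned` are illustrative names, and the prefix filter assumes no column you want to keep begins with "Unnamed".

```python
import pandas as pd

# Small frame mirroring the columns printed in the question.
data = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "date": [1957.0, 1959.0, 1960.0],
    "price": [411041792.0, 67947725.0, 5242880.0],
})

# Dropping by a partial name fails because no column is called "Unnamed".
try:
    data.drop(columns=["Unnamed"])
    partial_name_failed = False
except KeyError:
    partial_name_failed = True
print(partial_name_failed)  # True

# Keep only the columns whose name does not start with "Unnamed".
keep = ~data.columns.str.startswith("Unnamed")
cleaned = data.loc[:, keep]
print(list(cleaned.columns))  # ['date', 'price']
```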

