Why do I get Length of values (1) does not match length of index (3) when using random.sample()?
My Python code returns the following error message:
File "/Users/christianmagelssen/Desktop/Koding/analyse/moduler/resultater.py", line 64, in allokereGrupper
group1['GRUPPE'] = velger
ValueError: Length of values (1) does not match length of index (3)
I have tried many different things to solve this issue.
I know that my code worked 3 months ago, but on another dataset. Can someone help me understand what I am doing wrong here?
Here is all my code:
results.py
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import random

class Resultat:
    def lastInnOgRydd(path, LagreCsv = False):
        df = pd.read_csv(path, skiprows=2, decimal=".")
        filt = df['FINISH'] == 'DNF'
        dnf = df[filt]
        dnf = dnf.replace('DNF', 1)
        if LagreCsv == True:
            dnf.to_csv('DNF.csv')
        df.replace('DNF', np.nan, inplace=True)
        df.replace('GARBAGE GARBAGE', np.nan, inplace=True)  # There is probably a better solution for this
        df.dropna(subset=['FINISH'], inplace=True)
        df.dropna(subset=['NAME'], inplace=True)
        return df

    def endreDataType(df):
        df["FINISH"] = df["FINISH"].str.replace(',', '.').astype(float)
        df["INTER 1"] = df["INTER 1"].str.replace(',', '.').astype(float)
        df["SECTION IM4-FINISH"] = df["SECTION IM4-FINISH"].str.replace(',', '.').astype(float)
        df["COMMENT"] = df['COMMENT'].astype(int)
        df["COMMENT"] = df['COMMENT'].astype(str)
        df["COMMENT"] = df['COMMENT'].str.replace('11', 'COURSE 1')
        df["COMMENT"] = df['COMMENT'].str.replace('22', 'COURSE 2')
        df["COMMENT"] = df['COMMENT'].str.replace('33', 'COURSE 3')
        df["COMMENT"] = df['COMMENT'].str.replace('55', 'UTKJORING')
        df["COMMENT"] = df['COMMENT'].str.replace('99', 'STRAIGHT-GLIDING')
        pd.to_numeric(df['FINISH'], downcast='float', errors='raise')
        pd.to_numeric(df['INTER 1'], downcast='float', errors='raise')
        pd.to_numeric(df['SECTION IM4-FINISH'], downcast='float', errors='raise')
        return df

    def navnendringCommentTilCourse(df):
        df.rename(columns={'COMMENT': 'COURSE'}, inplace=True)
        return df

    def finnBesteRunder(df):
        grupper = df.groupby(['BIB#', 'COURSE'])
        bestruns = grupper['FINISH'].apply(lambda x: x.nsmallest(2).mean()).reset_index()
        df1 = bestruns.pivot(index='BIB#', columns='COURSE', values='FINISH').reset_index()
        df1['GJENNOMSNITT'] = df1['COURSE 1'].add(df1['COURSE 2']).add(df1['COURSE 3']).div(3)
        #df1['PRESTASJON'] = df1['MEAN'].div(df1['STRAIGHT-GLIDING']) # removing this for now, but it must be included in the proper analysis
        return df1

    def allokereGrupper(df1):
        df1 = df1.sort_values(by='GJENNOMSNITT', ascending=True)
        mask = np.arange(len(df1)) % 2
        group1 = df1.loc[mask == 0]
        group1 = group1.drop_duplicates(subset=['BIB#'])
        print(group1)
        group2 = df1.loc[mask == 1]
        group2 = group2.drop_duplicates(subset=['BIB#'])
        print(group2)
        grupper = ['RANDOM', 'BLOCKED']
        for i in group1['BIB#']:
            velger = random.sample(grupper, k=1)
            group1['GRUPPE'] = velger
main.py
from moduler import Resultat
path = "http://www.cmagelssen.no/pilot2.csv"
df = Resultat.lastInnOgRydd(path)
df = Resultat.endreDataType(df)
df = Resultat.navnendringCommentTilCourse(df)
df = Resultat.finnBesteRunder(df)
df = Resultat.allokereGrupper(df)
The problem is that velger is a list. It looks like either ['RANDOM'] or ['BLOCKED']. When you create the 'GRUPPE' column, you must assign a scalar, such as a string or an int.
If you assign an iterable such as a list, pandas assumes that it has the same length as your dataframe and fills each row with the corresponding element (the 3rd row gets the 3rd list element, for example). But your list has length one, and the dataframe group1 does not necessarily have just one row. Maybe it did in your previous dataset.
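To make the difference concrete, here is a minimal sketch that reproduces the behavior on a made-up three-row dataframe. The frame and its BIB# values are assumptions for illustration only and are not taken from the original dataset.

```python
import random
import pandas as pd

grupper = ['RANDOM', 'BLOCKED']

# Hypothetical stand-in for group1; the BIB# values are invented
group1 = pd.DataFrame({'BIB#': [11, 12, 13]})

# random.sample always returns a list, even when k=1
velger = random.sample(grupper, k=1)
print(type(velger), velger)

# A scalar is broadcast to every row, so this works
group1['GRUPPE'] = velger[0]

# A one-element list does not match the three rows, so pandas raises ValueError
try:
    group1['GRUPPE'] = velger
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

With a dataframe that happens to have exactly one row, the list assignment would succeed, which is one possible reason the same code ran on another dataset.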
It's not entirely clear to me what the goal of your code is, but if your intention is to fill every cell in the 'GRUPPE' column with the same value (either 'RANDOM' or 'BLOCKED'), then change:
group1['GRUPPE'] = velger
to
group1['GRUPPE'] = velger[0]
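If the intention is instead to give each row its own random draw, something like the following sketch might be closer. It is only an illustration under assumptions: the three-row group1 frame is made up, the column name GRUPPE_PER_RAD is hypothetical, and whether independent per-row draws fit the intended group allocation is not something the question states.

```python
import random
import pandas as pd

grupper = ['RANDOM', 'BLOCKED']

# Hypothetical stand-in for group1; the BIB# values are invented
group1 = pd.DataFrame({'BIB#': [11, 12, 13]})

# random.choice returns a single element (a string), not a list,
# so the same value is broadcast to every row without needing [0]
group1['GRUPPE'] = random.choice(grupper)

# Assumed alternative: one independent draw per row.
# random.choices samples with replacement and returns a list of length k,
# which matches the number of rows. GRUPPE_PER_RAD is a hypothetical name.
group1['GRUPPE_PER_RAD'] = random.choices(grupper, k=len(group1))

print(group1)
```

Note that neither variant needs the for loop from the question, because each assignment fills the whole column at once.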