Python - ValueError: could not broadcast input array from shape (5) into shape (2)

I have written some code that takes in my dataframe, which consists of two columns: one is a string and the other is an idea count. The code tries several delimiters and cross-references the number of resulting pieces with the count to check that it is using the correct one. The result I am looking for is a new column called "Ideas" that contains the list of broken-out ideas. My code is below:

import re
import pandas as pd

def getIdeas(row):
    s = str(row.iloc[0])   # the raw text
    ic = row.iloc[1]       # the expected number of ideas
    # Try each candidate delimiter in turn
    my_dels = [";;", ";", ",", "\\", "//"]

    for d in my_dels:
        ideas = s.split(d)
        if len(ideas) == ic:
            return ideas
    # Try to break on numbers "N)"
    ideas = re.split(r'[0-9]\)', s)
    if len(ideas) == ic:
        return ideas
    # No delimiter produced the expected count
    return []

#  k = getIdeas(str_contents3, idea_count3)

xl = pd.ExcelFile("data/Total Dataset.xlsx")
df = xl.parse("Sheet3")

df1 = df.iloc[:,1:3] 

df1 = df1.loc[df1.iloc[:,1] != 0]
df1["Ideas"] = df1.apply(getIdeas, axis=1)

When I run this, I get the following error:

ValueError: could not broadcast input array from shape (5) into shape (2)

Could someone tell me how to fix this?

You have two options with apply and axis=1: either return a single value, or return a list whose length matches the number of columns. If the length matches the number of columns, the list is broadcast across the entire row. If you return a single value, apply returns a pandas Series. Here getIdeas returned a list of 5 ideas for a row with only 2 columns, which is why the broadcast fails.

One workaround is not to use apply:

result = []
for idx, row in df1.iterrows():
    result.append(getIdeas(row))
df1['Ideas'] = result
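
For anyone who wants to try this without the Excel file, here is a minimal self-contained sketch of the same idea. The sample data, the column names "text" and "count", and the helper name get_ideas are assumptions made up for illustration; the splitting logic mirrors the question's getIdeas. It builds the lists outside of apply and wraps them in a Series aligned on df1's index, so pandas stores one list per row instead of trying to broadcast it.

```python
import re
import pandas as pd

def get_ideas(text, idea_count):
    """Split text on the first delimiter that yields idea_count pieces."""
    text = str(text)
    for delimiter in [";;", ";", ",", "\\", "//"]:
        ideas = text.split(delimiter)
        if len(ideas) == idea_count:
            return ideas
    # Fall back to splitting on numbered markers such as "1)"
    ideas = re.split(r"[0-9]\)", text)
    if len(ideas) == idea_count:
        return ideas
    return []

# Hypothetical sample data standing in for the Excel sheet
df1 = pd.DataFrame({
    "text": ["a;;b;;c", "x,y", "p;q;r;s;t", "no match"],
    "count": [3, 2, 5, 4],
})

# Build one list per row, then attach it as an object column
result = [get_ideas(t, c) for t, c in zip(df1["text"], df1["count"])]
df1["Ideas"] = pd.Series(result, index=df1.index)
```

Passing index=df1.index keeps the new column aligned even when df1 was filtered and its index is no longer contiguous, as in the question.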
