Error after reinstalling sklearn

Question

I get the following error after updating sklearn to a newer version, and I don't know why.

    Traceback (most recent call last):
      File "/Users/X/Courses/Project/SupportVectorMachine/main.py", line 95, in <module>
        y, x = dmatrices(formula, data=finalDataFrame, return_type='matrix')
      File "/Library/Python/2.7/site-packages/patsy/highlevel.py", line 297, in dmatrices
        NA_action, return_type)
      File "/Library/Python/2.7/site-packages/patsy/highlevel.py", line 156, in _do_highlevel_design
        return_type=return_type)
      File "/Library/Python/2.7/site-packages/patsy/build.py", line 947, in build_design_matrices
        value, is_NA = evaluator.eval(data, NA_action)
      File "/Library/Python/2.7/site-packages/patsy/build.py", line 85, in eval
        return result, NA_action.is_numerical_NA(result)
      File "/Library/Python/2.7/site-packages/patsy/missing.py", line 135, in is_numerical_NA
        mask |= np.isnan(arr)
    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule 'safe'

Here is the corresponding code. I have reinstalled everything from NumPy to SciPy, patsy, etc., but nothing works.

    import numpy
    import pandas
    from patsy import dmatrices

    # Merge the two dataframes (tweets and users) on UserID
    finalDataFrame = pandas.merge(twitterDataFrame.reset_index(), twitterUserDataFrame.reset_index(), on=['UserID'], how='inner')
    finalDataFrame = finalDataFrame.drop_duplicates()
    finalDataFrame['FrequencyOfTweets'] = numpy.all(numpy.isfinite(finalDataFrame['FrequencyOfTweets']))

    # Model formula: ~ separates the response from the predictors, and C() marks a column as categorical.
    formula = 'Classifier ~ InReplyToStatusID + InReplyToUserID + RetweetCount + FavouriteCount + Hashtags + UserMentionID + URL + MediaURL + C(MediaType) + UserMentionID + C(PossiblySensitive) + C(Language) + TweetLength + Location + Description + UserAccountURL + Protected + FollowersCount + FriendsCount + ListedCount + UserAccountCreatedAt + FavouritesCount + GeoEnabled + StatusesCount + ProfileBackgroundImageURL + ProfileUseBackgroundImage + DefaultProfile + FrequencyOfTweets'

    # Build regression-friendly matrices: y holds the class labels, x holds the features,
    # with categorical variables expanded into separate columns.
    y, x = dmatrices(formula, data=finalDataFrame, return_type='matrix')

    # Select the features we would like to analyze
    X = numpy.asarray(x)

Answer 1

I've found that this error sometimes crops up when np.isnan is called on an array that contains strings or other non-float values. Try casting your arrays with arr.astype(float) before passing them to dmatrices.
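As a rough illustration of the point above, the sketch below reproduces the TypeError on a small object-dtype array and shows that the cast avoids it. The array contents are made up for the example; they are not taken from the asker's data.

```python
import numpy as np

# Hypothetical stand-in for a column whose numbers are stored as strings.
arr = np.array(["1.5", "2.0", "3.25"], dtype=object)

try:
    np.isnan(arr)  # isnan has no implementation for object dtype
    raised = False
except TypeError:
    raised = True

# Casting to float first gives isnan a dtype it supports.
as_float = arr.astype(float)
mask = np.isnan(as_float)

print(raised, as_float.dtype, mask)
```

Note that astype(float) only works when every element can be parsed as a number; free-text columns would need to be dropped or encoded some other way.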

Also, your FrequencyOfTweets column is being set to all False or all True, because np.all returns a scalar.
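A minimal sketch of the scalar issue, using made-up values. It assumes the intent was to keep only rows with a finite FrequencyOfTweets, which the question does not state explicitly.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column name comes from the question.
df = pd.DataFrame({"FrequencyOfTweets": [0.5, np.inf, 2.0]})

# np.all collapses the whole column into a single boolean, so assigning
# it back to the column would overwrite every row with that one value.
collapsed = np.all(np.isfinite(df["FrequencyOfTweets"]))

# An element-wise mask keeps one boolean per row and can be used as a filter.
finite_mask = np.isfinite(df["FrequencyOfTweets"])
cleaned = df[finite_mask]

print(collapsed)
print(cleaned)
```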

Answer 2

After a lot of digging through the code, I found the problem: the formula I was passing asked patsy to use all of the features below, and the 'UserAccountCreatedAt' column was of type datetime64[ns]. I have removed that column from the formula for now and the error is gone. However, I would still like to know how best to convert it to numeric data so that it can actually be passed through. Categorical columns are handled by wrapping them in C(), as seen below, whereas patsy appears to treat the datetime column as numeric.

    formula = 'Classifier ~ UserAccountCreatedAt + InReplyToStatusID + InReplyToUserID + RetweetCount + FavouriteCount + Hashtags + UserMentionID + URL + MediaURL + C(MediaType) + UserMentionID + C(PossiblySensitive) + C(Language) + TweetLength + Location + Description + UserAccountURL + Protected + FollowersCount + FriendsCount + ListedCount + FavouritesCount + GeoEnabled + StatusesCount + ProfileBackgroundImageURL + ProfileUseBackgroundImage + DefaultProfile + FrequencyOfTweets'
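One possible way to make the datetime column numeric is to express it as days elapsed since a fixed reference date. This is only a sketch: the sample dates, the choice of the Unix epoch as reference, and the new column name AccountCreatedDays are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; only the column name comes from the question.
df = pd.DataFrame({
    "UserAccountCreatedAt": pd.to_datetime(
        ["2010-01-01", "2012-06-15", "2014-07-25"]
    ),
})

# Subtracting a reference timestamp yields timedeltas, which convert
# cleanly to a float number of days.
epoch = pd.Timestamp("1970-01-01")
elapsed = df["UserAccountCreatedAt"] - epoch
df["AccountCreatedDays"] = elapsed.dt.total_seconds() / 86400.0

print(df.dtypes)
print(df)
```

The formula could then reference AccountCreatedDays in place of UserAccountCreatedAt. Whether days since the epoch, account age at tweet time, or some other encoding is most useful depends on the model.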
