
How do I use SVC in scikit-learn if a feature is a text summary?

Original post: 2014-10-26 16:43:43 | Tags: python, scikit-learn

Suppose a dataset has 6 features. If some of the features are non-numeric, I can convert them with a label encoder or other methods. But if one of the features is a large body of text (a paragraph), what preprocessing techniques should I use so that I can train an SVC or KNN classifier (and not Naive Bayes)?
Thanks.

1 Answer

You can use CountVectorizer or TfidfVectorizer, which are the standard scikit-learn tools for text feature extraction. Each turns a document into a numeric vector (raw token counts or TF-IDF weights) that a classifier such as SVC or KNN can consume. The scikit-learn documentation describes both under text feature extraction, and it also includes a comprehensive tutorial on working with text data.
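As a rough illustration of how this could fit the question's setup (one text feature alongside numeric and categorical ones), here is a minimal sketch. The toy data, the column names `summary`, `price`, and `brand`, and the helper `make_preprocess` are all made up for the example, and using ColumnTransformer to combine the features is one possible approach, not something the answer above prescribes.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Hypothetical toy data: one free-text column plus a numeric and a categorical column.
df = pd.DataFrame({
    "summary": [
        "great phone with a long battery life",
        "battery died after one week",
        "excellent screen and fast delivery",
        "screen cracked and support was slow",
        "fast phone, great value",
        "slow, broken, and overpriced",
    ],
    "price": [300.0, 250.0, 400.0, 150.0, 350.0, 200.0],
    "brand": ["a", "b", "a", "b", "a", "b"],
})
y = [1, 0, 1, 0, 1, 0]


def make_preprocess():
    """Build a fresh transformer that vectorizes each column type separately."""
    return ColumnTransformer([
        # The text column is named with a plain string (not a list) so that
        # TfidfVectorizer receives a 1-D sequence of documents.
        ("text", TfidfVectorizer(), "summary"),
        ("num", StandardScaler(), ["price"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["brand"]),
    ])


# SVC on the combined feature matrix.
svc_model = Pipeline([("prep", make_preprocess()), ("clf", SVC(kernel="linear"))])
svc_model.fit(df, y)

# The same preprocessing also feeds a KNN classifier.
knn_model = Pipeline([
    ("prep", make_preprocess()),
    ("clf", KNeighborsClassifier(n_neighbors=3)),
])
knn_model.fit(df, y)

svc_predictions = svc_model.predict(df)
knn_predictions = knn_model.predict(df)
```

Swapping TfidfVectorizer for CountVectorizer in the "text" step would give raw token counts instead of TF-IDF weights. Keeping the vectorizer inside the pipeline means the vocabulary is learned only from the data passed to `fit`, which matters once a separate test set or cross-validation is involved.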


