In Python, how do you convert a CSV file of two columns into bigrams?

I'd like to turn the given CSV file into bigrams:

demo.csv:

words                                        class
hi my name is Jeff.                          brown
Wow, I am awesome.                           red
I am a professional.                         red
Will you marry me?                           red
How are you today?                           brown
Today, I woke up with a smile on my face.    red
My day today has been amazing.               brown

First, make sure to read your data and select the words column. You can use pandas.read_csv for that. Since I don't have your .csv file, I have recreated the data like this:

import pandas as pd
df = pd.DataFrame(
    ["hi my name is Jeff.",
     "Wow, I am awesome.",
     "I am a professional.",
     "Will you marry me?",
     "How are you today?",
     "Today, I woke up with a smile on my face.",
     "My day today has been amazing."], columns=['words'])

which looks like this:

                                       words
0                        hi my name is Jeff.
1                         Wow, I am awesome.
2                       I am a professional.
3                         Will you marry me?
4                         How are you today?
5  Today, I woke up with a smile on my face.
6             My day today has been amazing.
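If the file itself is available, loading it directly might look like the sketch below. The tab separator is an assumption (the real delimiter of demo.csv is not shown), and io.StringIO only stands in for the file so the snippet runs on its own:

```python
import io
import pandas as pd

# Stand-in for demo.csv; the tab separator is an assumption about the real file.
raw = io.StringIO(
    "words\tclass\n"
    "hi my name is Jeff.\tbrown\n"
    "Wow, I am awesome.\tred\n"
    "How are you today?\tbrown\n"
)

# With the real file this would be pd.read_csv("demo.csv", sep="\t").
df = pd.read_csv(raw, sep="\t")
words = df["words"]  # select the column that holds the sentences
print(words.tolist())
```

If the file turns out to be comma-separated, the sentences that contain commas would need to be quoted in the file for read_csv to keep them in one field.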

A library you can use to create bigrams is nltk. In this example I create a function that returns the bigrams as a list.

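import nltk

# word_tokenize needs NLTK's tokenizer data; if it is missing, run
# nltk.download("punkt") once ("punkt_tab" on newer NLTK releases).
def bigrams(words):
    # Tokenize the sentence, then pair each token with its successor.
    return list(nltk.bigrams(nltk.word_tokenize(words)))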
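To see what a bigram function does under the hood, here is a minimal sketch without nltk. The name simple_bigrams is hypothetical, and splitting on whitespace is a simplification: unlike nltk.word_tokenize, it leaves punctuation attached to the neighbouring word.

```python
def simple_bigrams(sentence):
    """Pair each whitespace-separated token with the one that follows it."""
    tokens = sentence.split()
    # zip stops at the shorter input, so the last token has no partner.
    return list(zip(tokens, tokens[1:]))

print(simple_bigrams("Will you marry me?"))
# [('Will', 'you'), ('you', 'marry'), ('marry', 'me?')]
```

A sentence with fewer than two tokens simply yields an empty list.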

Then apply this function to the DataFrame and assign the result to a new column called bigrams, like this:

df["bigrams"] = df.words.apply(bigrams)

The new column now looks like this:

0    [(hi, my), (my, name), (name, is), (is, Jeff),...
1    [(Wow, ,), (,, I), (I, am), (am, awesome), (aw...
2    [(I, am), (am, a), (a, professional), (profess...
3    [(Will, you), (you, marry), (marry, me), (me, ?)]
4    [(How, are), (are, you), (you, today), (today,...
5    [(Today, ,), (,, I), (I, woke), (woke, up), (u...
6    [(My, day), (day, today), (today, has), (has, ...
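Since the original file also has a class column, one possible follow-up is to put each bigram on its own row next to its class label. This is only a sketch: it uses a hypothetical whitespace-based helper instead of nltk so it runs without extra downloads, and a two-row sample rather than the full data.

```python
import pandas as pd

def simple_bigrams(sentence):
    # Hypothetical helper: whitespace tokens paired with their successor.
    tokens = sentence.split()
    return list(zip(tokens, tokens[1:]))

df = pd.DataFrame({
    "words": ["Will you marry me?", "How are you today?"],
    "class": ["red", "brown"],
})
df["bigrams"] = df["words"].apply(simple_bigrams)

# explode turns each list element into its own row and repeats the class value.
long_df = df.explode("bigrams")[["bigrams", "class"]].reset_index(drop=True)
print(long_df)
```

From this long form, something like long_df.groupby("class") could then be used to count bigrams per class.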

I hope this helps. Feel free to ask any questions or tell me if you want to change something :)
