Convert rows of CSV file to a list of tuples?

I have a .CSV file with two columns, one for the tweet and the other for its sentiment value, formatted like so (but for thousands of tweets):

I like stackoverflow,Positive
Thanks for your answers,Positive
I hate sugar,Negative
I do not like that movie,Negative
stackoverflow is a question and answer site,Neutral
Python is oop high-level programming language,Neutral

I would like to get output like this:

negfeats = [('I do not like that movie','Negative'),('I hate sugar','Negative')]
posfeats = [('I like stackoverflow','Positive'),('Thanks for your answers','Positive')]
neufeats = [('stackoverflow is a question and answer site','Neutral'),('Python is oop high-level programming language','Neutral')]

I tried the code below, but some characters are missing from the tuples. Also, how can I keep x, y, and z as integers rather than floats?

import csv
neg = ['Negative']
pos = ['Positive']
neu = ['Neutral']
neg_counter=0
pos_counter=0
neu_counter=0
negfeats = []
posfeats = []
neufeats = []
with open('ff_tweets.csv', 'Ur') as f:
    for k in f:
        if any(word in k for word in neg):
            negfeats = list(tuple(rec) for rec in csv.reader(f, delimiter=','))
            neg_counter+=1
        elif any(word in k for word in pos):
            posfeats = list(tuple(rec) for rec in csv.reader(f, delimiter=','))
            pos_counter+=1
        else:
            neufeats = list(tuple(rec) for rec in csv.reader(f, delimiter=','))
            neu_counter+=1
x = neg_counter * 3/4
y = pos_counter * 3/4
z = neu_counter * 3/4
print negfeats 
print posfeats 
print neufeats 
print x
print y
print z

This should work:

import csv

neg = 'Negative'
pos = 'Positive'
neu = 'Neutral'
negfeats = []
posfeats = []
neufeats = []

# newline='' lets the csv module handle line endings itself (Python 3)
with open('ff_tweets.csv', newline='') as f:
    # Read every row through a single csv.reader
    for r in csv.reader(f):
        # r[0] is the tweet text, r[1] is the sentiment label
        if r[1] == neg:
            negfeats.append((r[0], r[1]))
        elif r[1] == pos:
            posfeats.append((r[0], r[1]))
        elif r[1] == neu:
            neufeats.append((r[0], r[1]))

# Floor division (//) keeps x, y and z as integers
x = len(negfeats) * 3 // 4
y = len(posfeats) * 3 // 4
z = len(neufeats) * 3 // 4

print(negfeats)
print(posfeats)
print(neufeats)
print(x)
print(y)
print(z)
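
For context, a likely reason the original attempt dropped data is that it loops over the file object `f` and, inside that loop, hands the same `f` to `csv.reader`, so both iterators pull lines from one shared stream. The sketch below reproduces this with an in-memory file. It assumes Python 3; the shortened sample rows and the names `sample`, `consumed_by_loop` and `rows_from_reader` are made up for illustration.

```python
import csv
import io

# Hypothetical three-row sample standing in for ff_tweets.csv
sample = (
    "I like stackoverflow,Positive\n"
    "I hate sugar,Negative\n"
    "Python is oop,Neutral\n"
)

f = io.StringIO(sample)
consumed_by_loop = []
rows_from_reader = []
for line in f:
    # The for loop has already taken this line from the stream...
    consumed_by_loop.append(line.strip())
    # ...so csv.reader only sees what is left, and drains all of it
    rows_from_reader = [tuple(rec) for rec in csv.reader(f)]

print(consumed_by_loop)
print(rows_from_reader)
```

In this sketch the first row is taken by the `for` loop and never reaches the reader, and every remaining row lands in a single list regardless of its label. Reading each row exactly once through one `csv.reader`, as the answer above does, avoids both effects.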

Try this, using pandas. 'Sentiment' is the name of a column in the CSV file, so this assumes the file has a header row:

import pandas as pd

df = pd.read_csv('ff_tweets.csv')

# Select the rows for each label, then turn every row into a tuple
pos = tuple(df.loc[df['Sentiment'] == 'Positive'].apply(tuple, axis=1))
neu = tuple(df.loc[df['Sentiment'] == 'Neutral'].apply(tuple, axis=1))
neg = tuple(df.loc[df['Sentiment'] == 'Negative'].apply(tuple, axis=1))

print(pos, neg, neu)

Output:

(('I like stackoverflow', 'Positive'), ('Thanks for your answers', 'Positive')) (('I hate sugar', 'Negative'), ('I do not like that movie', 'Negative')) (('stackoverflow is a question and answer site', 'Neutral'), ('Python is oop high-level programming language', 'Neutral'))
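
If the file has no header row, as in the sample shown in the question, `read_csv` would by default treat the first tweet as the column names. The following is a minimal sketch of one way to handle that, assuming pandas is installed. The column names `Tweet` and `Sentiment` are assigned in the code rather than read from the file, and the in-memory `sample` is a made-up stand-in for `ff_tweets.csv`.

```python
import io

import pandas as pd

# Hypothetical headerless sample standing in for ff_tweets.csv
sample = (
    "I like stackoverflow,Positive\n"
    "I hate sugar,Negative\n"
    "Python is oop high-level programming language,Neutral\n"
)

# header=None: the first line is data; names= supplies the column labels
df = pd.read_csv(io.StringIO(sample), header=None, names=['Tweet', 'Sentiment'])

# Build one list of plain (tweet, sentiment) tuples per label
feats = {
    label: list(group.itertuples(index=False, name=None))
    for label, group in df.groupby('Sentiment')
}

print(feats['Negative'])
```

Grouping into a dictionary keyed by label is a design choice for this sketch: it avoids repeating the filter once per sentiment and works unchanged if more labels appear in the data.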
