
Python - format csv data for Binary Classifier

I have some csv data in the following format:

headers = [artist_list, song_list, lyrics_track, lyrics_artist, lyrics]

and this snippet:

import csv

# newline='' lets the csv module handle newlines embedded in quoted fields
with open('lyrics.tsv', newline='') as f:
    reader = csv.reader(f, delimiter="\t")
    for i, line in enumerate(reader):
        print('line[{}] = {}'.format(i, line))

which prints:

(...)

line[808] = ['Pearl Jam', 'Wishlist', 'Wishlist', 'Pearl Jam', "I wish I was a neutron bomb\nfor once I could go off\nI wish I was a sacrifice\nbut somehow still lived on\nI wish I was a sentimental\nornament you hung on\nthe Christmas tree, I wish I was\nthe star that went on top\nI wish I was the evidence\nI wish I was the grounds\nfor fifty million hands upraised and opened toward the sky\nI wish I was a sailor with\nsomeone who waited for me\nI wish I was as fortunate\nas fortunate as me\nI wish I was a messenger\nand all the news was good\nI wish I was the full moon shining\noff a Camaro's hood\nI wish I was an alien\nat home behind the sun\nI wish I was the souvenir\nyou kept your house key on\nI wish I was the pedal break\nthat you depended on\nI wish I was the verb to trust\nand never let you down\nI wish I was a radio song\nthe one that you turned up\nI wish ..."]

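Now I would like to use the data for classification, keeping only the lyrics from every row and adding a column for a binary value (always the same, 0), so that the data is transformed into: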

 lyrics                                                   type

 (...)                                                   (...)

 I wish I was a neutron bomb\nfor once I could go off..    0

How do I go about this, starting from the code above?

I think something like this might work (assuming your data is in a DataFrame named lyrics_df):

lyrics_df['type'] = 0
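The line above assumes lyrics_df already exists, so here is a rough sketch of how it might be built with pandas. A few things are assumptions on my part: the file has no header row (so the names from the question's headers list are passed in explicitly), the lyrics fields are double-quoted so their embedded newlines survive parsing, and the second row of the in-memory sample is a made-up placeholder used only to keep the example self-contained.

```python
import io

import pandas as pd

# Column names taken from the question's `headers` list.
headers = ["artist_list", "song_list", "lyrics_track", "lyrics_artist", "lyrics"]

# Hypothetical stand-in for lyrics.tsv so the sketch runs on its own;
# for the real data, pass 'lyrics.tsv' to read_csv instead.
sample = io.StringIO(
    'Pearl Jam\tWishlist\tWishlist\tPearl Jam\t'
    '"I wish I was a neutron bomb\nfor once I could go off"\n'
    'Some Artist\tSome Song\tSome Song\tSome Artist\t"placeholder lyrics"\n'
)

# Assumption: no header row in the file. If there is one, drop
# header=None and names=... so pandas reads the names from the file.
df = pd.read_csv(sample, sep="\t", header=None, names=headers)

# Keep only the lyrics column, then add the constant binary label.
lyrics_df = df[["lyrics"]].copy()
lyrics_df["type"] = 0

print(lyrics_df)
```

The .copy() is there so that adding the type column does not trigger a SettingWithCopyWarning on a slice of df. If the other class comes from a second file, the same steps with type set to 1 followed by pd.concat would give one labelled frame.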
