
Python: Loop through text CSV

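I am trying to use lexnlp to read through a CSV containing a legal case so that I can separate out the different pieces of information found in the text, such as all of the acts listed, the dates, and so on.

I have formatted everything exactly as the lexnlp website instructs. However, my CSV is not being read correctly. My professor suggested that I write a loop to iterate through the CSV so that each sentence is read. After searching for information on writing iterative loops, I still do not quite understand how to do it.

I did find for row in text.iterrows(): but I do not know what operation I should run inside it. I asked my classmates, and they seem just as lost. My code is below. Any help is appreciated.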


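import pandas as pd

url = 'https://raw.githubusercontent.com/unt-iialab/INFO5731_Spring2020/master/In_class_exercise/01-05-1%20%20Adams%20v%20Tanner.txt'
# error_bad_lines was removed in pandas 2.0; on newer versions use on_bad_lines='skip' instead
text = pd.read_csv(url, error_bad_lines=False, names=['sentence'])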


#Output appears and reads fine with this portion
#Indicates that the CSV is getting read properly
print('Number of Sentences:' , len(text['sentence']))

!pip install lexnlp


#Cannot get nlp module to read csv
import lexnlp.extract.en.acts

#This version gives back empty brackets, I believe because it is reading 'text' as a literal string.
print(lexnlp.extract.en.acts.get_act_list('text'))

#This is the format used in the number of sentences. It creates an error message.
print(lexnlp.extract.en.acts.get_act_list(text['sentence']))

#This is the format that the lexnlp site recommends. It also creates an error message.
print(lexnlp.extract.en.acts.get_act_list(text))




#The following are just different features of the lexnlp module that I am going to run. 
import lexnlp.extract.en.amounts
print(list(lexnlp.extract.en.amounts.get_amounts(text)))

import lexnlp.extract.en.citations
print(list(lexnlp.extract.en.citations.get_citations(text)))

import lexnlp.extract.en.entities.nltk_re
print(list(lexnlp.extract.en.entities.nltk_re.get_companies(text)))

import lexnlp.extract.en.conditions
print(list(lexnlp.extract.en.conditions.get_conditions(text)))

import lexnlp.extract.en.constraints
print(list(lexnlp.extract.en.constraints.get_constraints(text)))

import lexnlp.extract.en.copyright
print(list(lexnlp.extract.en.copyright.get_copyright(text)))

import lexnlp.extract.en.courts

import lexnlp.extract.en.cusip
print(lexnlp.extract.en.cusip.get_cusip(text))

import lexnlp.extract.en.dates
print(list(lexnlp.extract.en.dates.get_dates(text)))

import lexnlp.extract.en.definitions
print(list(lexnlp.extract.en.definitions.get_definitions(text)))

import lexnlp.extract.en.distances
print(list(lexnlp.extract.en.distances.get_distances(text)))

import lexnlp.extract.en.durations
print(list(lexnlp.extract.en.durations.get_durations(text)))

import lexnlp.extract.en.money
print(list(lexnlp.extract.en.money.get_money(text)))

import lexnlp.extract.en.percents
print(list(lexnlp.extract.en.percents.get_percents(text)))

import lexnlp.extract.en.pii
print(list(lexnlp.extract.en.pii.get_pii(text)))

import lexnlp.extract.en.ratios
print(list(lexnlp.extract.en.ratios.get_ratios(text)))

import lexnlp.extract.en.regulations
print(list(lexnlp.extract.en.regulations.get_regulations(text)))

import lexnlp.extract.en.trademarks
print(list(lexnlp.extract.en.trademarks.get_trademarks(text)))

import lexnlp.extract.en.urls
print(list(lexnlp.extract.en.urls.get_urls(text)))

Here is the error I receive:

<ipython-input-2-301f76c3c169> in <module>()
     19 
     20 #This is the format used in the number of sentences. It creates an error message.
---> 21 print(lexnlp.extract.en.acts.get_act_list(text['sentence']))
     22 
     23 #This is the format that the lexnlp site reccommends. It also creates an error message.

2 frames
/usr/local/lib/python3.6/dist-packages/lexnlp/extract/en/acts.py in get_acts_annotations(text)
     37 
     38 def get_acts_annotations(text: str) -> Generator[ActAnnotation, None, None]:
---> 39     for match in ACT_PARTS_RE.finditer(text):
     40         captures = match.capturesdict()
     41         act_name = ''.join(captures.get('act_name') or [])

TypeError: expected string or buffer

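Try the code below:

import csv

# Open in text mode; newline='' lets the csv module handle line endings itself
with open('file.csv', newline='') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')
    for row in csvreader:
        print(row)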

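"Each row read from the csv file is returned as a list of strings. No automatic data type conversion is performed."

Reference: the Python csv module documentation.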
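Because each row comes back as a list of strings, it has to be turned into a single string before it is handed to a function that expects str, as get_acts_annotations(text: str) in the traceback does. The following is only a minimal sketch of that loop: find_years is a hypothetical stand-in for a lexnlp extractor, and the sample text is invented so the example runs without the real file.

```python
import csv
import io
import re
from typing import List


def find_years(text: str) -> List[str]:
    """Hypothetical stand-in for an extractor that expects one plain string."""
    return re.findall(r"\b(?:1[89]|20)\d{2}\b", text)


# Invented sample data; io.StringIO plays the role of the opened CSV file.
sample = io.StringIO(
    "The case was filed in 1998.\n"
    '"The court ruled in 2003, and the appeal followed in 2005."\n'
)

years = []
for row in csv.reader(sample):
    # Each row is a list of strings, so join its fields back into one string.
    sentence = ",".join(row)
    years.extend(find_years(sentence))

print(years)
```

With the real file, the same loop body would presumably call the lexnlp function in place of find_years, one sentence at a time.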

import pandas as pd

df = pd.read_csv('csv_file.csv', index_col=None, header=0)  # header=0: first row holds the column names

The path you pass to pd.read_csv('') can be relative or absolute. It reads your data into a DataFrame.
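Once the data is in a DataFrame, the TypeError in the question can probably be avoided by passing plain strings rather than the DataFrame or a whole column. A rough sketch of two ways to do that follows; find_money is a hypothetical stand-in for a lexnlp extractor, and the rows are invented in place of the real case text.

```python
import re
from typing import List

import pandas as pd


def find_money(text: str) -> List[str]:
    """Hypothetical stand-in for an extractor that expects one plain string."""
    return re.findall(r"\$\d+(?:,\d{3})*", text)


# Invented rows standing in for the DataFrame returned by pd.read_csv.
df = pd.DataFrame(
    {"sentence": ["Damages of $5,000 were awarded.", "Costs were $250.", None]}
)

# Option 1: loop over the column and pass one sentence (a str) at a time.
per_sentence = []
for sentence in df["sentence"].dropna().astype(str):
    per_sentence.extend(find_money(sentence))

# Option 2: join every sentence into a single string and call the extractor once.
full_text = " ".join(df["sentence"].dropna().astype(str))
whole_document = find_money(full_text)

print(per_sentence)
print(whole_document)
```

Looping keeps results tied to individual sentences, while joining is simpler when only the document-level list matters.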
