I have a list of lists; how do I classify based on language?
I have three lists:
id = [1,3,4]
text = ["hello","hola","salut"]
date = ["20-12-2020","21-04-2018","15-04-2016"]
# I then combined it all into one list (in Python 3, zip returns an iterator, so wrap it in list):
new_list = list(zip(id, text, date))
# which looks like [(1,"hello","20-12-2020"),(3,"hola","21-04-2018"),(4,"salut","15-04-2016")]
I want to delete the whole tuple if its text is not in English. To do this I installed langid and am using langid.classify.
I ran a loop over only the text and it works, but I am unsure how to delete the whole entry, such as (3,"hola","21-04-2018"), since hola is not English.
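I am trying to get a new list that contains only the entries whose text is in English. I would then like to write the output list to an XML file. For that I made a sample XML file and used the date as the parent key, since several texts can share the same date.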
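Try this simple for loop:
new_list = [(1,"hello","20-12-2020"),(3,"hola","21-04-2018"),(4,"salut","15-04-2016")]
english_only = []
for x in new_list:
    # isEnglishWord is a placeholder: any check that returns True for an English word or sentence
    if isEnglishWord(x[1]):
        # collect the matches in a new list; popping from new_list while looping over it would skip items
        english_only.append(x)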
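One way to fill in the placeholder check above is sketched here. It assumes the classifier behaves like langid.classify, which returns a (language code, score) tuple such as ('en', ...). The names keep_english and fake_classify are hypothetical; fake_classify is only a stand-in so the snippet runs without langid installed.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str, str]  # (id, text, date)


def keep_english(rows: List[Row],
                 classify: Callable[[str], Tuple[str, float]]) -> List[Row]:
    """Return only the rows whose text (index 1) is classified as 'en'."""
    return [row for row in rows if classify(row[1])[0] == "en"]


# Hypothetical stand-in for langid.classify: a tiny lookup table, not a real detector.
def fake_classify(text: str) -> Tuple[str, float]:
    known = {"hello": "en", "hola": "es", "salut": "fr"}
    return known.get(text, "en"), 0.0


rows = [(1, "hello", "20-12-2020"), (3, "hola", "21-04-2018"), (4, "salut", "15-04-2016")]
english_rows = keep_english(rows, fake_classify)
print(english_rows)  # [(1, 'hello', '20-12-2020')]
```

With langid installed, keep_english(rows, langid.classify) should work the same way. Very short texts such as single words may be misclassified, so it is worth checking the results on real data.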
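I am not sure exactly how langid.classify works or what arguments it accepts, but something like this should work:
import langid

# walk the indices backwards so pop() never shifts an item that has not been checked yet
for i in range(len(new_list) - 1, -1, -1):
    if langid.classify(new_list[i][1])[0] != 'en':
        new_list.pop(i)
Here I am assuming that langid.classify takes a str and returns a (language code, score) tuple, so English text gives 'en' as the first element.
I also iterate over the index range in reverse, so removing an item never changes the position of the items still to be checked.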
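For the second part of the question, writing the filtered list to XML with the date as the parent key, here is a minimal sketch using the standard library's xml.etree.ElementTree. The function name rows_to_xml, the element names entries, date and item, and the output file name are all illustrative assumptions, since the asker's sample XML file is not shown. The second row in the sample data is made up to show two texts sharing one date.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def rows_to_xml(rows):
    """Group (id, text, date) rows by date and build an XML tree from them."""
    by_date = defaultdict(list)
    for row_id, text, date in rows:
        by_date[date].append((row_id, text))

    root = ET.Element("entries")  # element and attribute names are illustrative
    for date, items in by_date.items():
        date_el = ET.SubElement(root, "date", value=date)
        for row_id, text in items:
            item_el = ET.SubElement(date_el, "item", id=str(row_id))
            item_el.text = text
    return root


# The second row is invented sample data so that one date has two children.
english_rows = [(1, "hello", "20-12-2020"), (7, "goodbye", "20-12-2020")]
root = rows_to_xml(english_rows)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# To save to disk instead of printing:
# ET.ElementTree(root).write("output.xml", encoding="utf-8", xml_declaration=True)
```

Grouping in a dictionary first means each date appears once as a parent element, however the input rows are ordered.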