
Removing all English and other punctuation from the text file in Jupyter

I have a text file that I want to use for an NLP task, but I am processing a local language. The file contains a lot of English words and punctuation marks. I want to get rid of all the Latin characters and the punctuation in that text file. How can I do this in a Jupyter notebook? TIA

Sure, you can accomplish this with plain Python:

text = "Hello, World!!"
# put everything you wish to filter out in this list
filterList = [',', '!']

filteredList = filter(lambda c: c not in filterList, text)
print(''.join(filteredList))

This prints Hello World
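
The snippet above only drops the characters listed by hand, so the English words themselves stay in the text. A minimal sketch of one way to also remove Latin letters and every punctuation mark, using the standard-library unicodedata module, is shown here. The function name strip_latin_and_punctuation and the sample string are made up for illustration, and the approach assumes that the local language is not written in the Latin script.

```python
import unicodedata


def strip_latin_and_punctuation(text):
    """Drop Latin-script letters and Unicode punctuation; keep everything else.

    Hypothetical helper for illustration, not part of any library.
    """
    kept = []
    for ch in text:
        # General categories starting with "P" (Po, Pd, Ps, Pe, ...) are punctuation.
        if unicodedata.category(ch).startswith("P"):
            continue
        # Letters with "LATIN" in their Unicode name cover a-z, A-Z and accented forms.
        if ch.isalpha() and "LATIN" in unicodedata.name(ch, ""):
            continue
        kept.append(ch)
    # Collapse the runs of whitespace left behind by the removed words.
    return " ".join("".join(kept).split())


# Sample input is made up; substitute the contents of your own file.
print(strip_latin_and_punctuation("Hello, 你好 World!! 世界。"))  # 你好 世界
```

Digits and symbols such as $ or + are not in the punctuation categories, so they are kept; add further checks if you want them removed as well. Because the last step collapses all whitespace, including newlines, apply the function to one line at a time if the line structure of the file matters.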
