
Check if a CSV contains Chinese characters in Python, then output


I have a CSV file that contains both English and Chinese text. How can I separate the entries and save the ones that contain Chinese as "Chinese" and the ones that don't as "English"? I found some code that tells them apart, but I don't know how to save the results.


def is_chinese(string):
    for ch in string:
        if u'\u4e00' <= ch <= u'\u9fff':
            return True

    return False

ret1 = is_chinese("a中国aaa")
print(ret1)

ret2 = is_chinese("123")
print(ret2)
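
One possible way to save the result of this check is to copy the CSV and append a "Chinese" or "English" label to every row. This is only a sketch under assumptions: the function name label_rows and the file names in the example call are hypothetical, the input is assumed to be UTF-8, and a row counts as "Chinese" if any of its cells contains a character in the U+4E00 to U+9FFF range.

```python
import csv

def is_chinese(string):
    """Return True if the string has a character in U+4E00..U+9FFF."""
    return any('\u4e00' <= ch <= '\u9fff' for ch in string)

def label_rows(in_path, out_path):
    """Copy a CSV, appending "Chinese" or "English" as a last column.

    label_rows is a hypothetical helper, not part of the original post.
    """
    with open(in_path, newline='', encoding='utf-8') as src, \
         open(out_path, 'w', newline='', encoding='utf-8') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # A row is "Chinese" if any cell contains a Chinese character
            has_chinese = any(is_chinese(cell) for cell in row)
            writer.writerow(row + ['Chinese' if has_chinese else 'English'])

# Example call (file names are assumptions):
# label_rows('01.csv', 'labeled.csv')
```

Using the csv module rather than splitting on commas keeps quoted fields intact, which matters if a cell itself contains a comma.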


This code finds the lines that contain Chinese characters and saves them to a file called "detected.txt":

import re

# Matches any run of characters in the range U+4E00 to U+9FFF
chinese = re.compile(r'[\u4e00-\u9fff]+')

with open('01.csv', 'r', encoding='utf-8') as file: #Open CSV file
    # 'w' creates detected.txt if it is missing; 'r+' raises FileNotFoundError in that case
    with open('detected.txt', 'w', encoding='utf-8') as f: #Open file to write

        for line in file: #Read each line of CSV file
            if chinese.search(line): #If there is a Chinese character in the line
                f.write(line) #Write the line to the file
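
The answer above keeps only the lines with Chinese characters. To also keep the remaining lines, as the question asks, one option is to write each line to one of two files. The following is a minimal sketch, not the answerer's code: split_by_language and the output file names are hypothetical, and the input is assumed to be UTF-8.

```python
import re

# Any single character in the range U+4E00 to U+9FFF
CHINESE = re.compile(r'[\u4e00-\u9fff]')

def split_by_language(in_path, chinese_path, english_path):
    """Write lines with Chinese characters to one file and the rest to another.

    Hypothetical helper; returns how many lines went to each file.
    """
    counts = {'Chinese': 0, 'English': 0}
    with open(in_path, encoding='utf-8') as src, \
         open(chinese_path, 'w', encoding='utf-8') as zh, \
         open(english_path, 'w', encoding='utf-8') as en:
        for line in src:
            if CHINESE.search(line):
                zh.write(line)
                counts['Chinese'] += 1
            else:
                en.write(line)
                counts['English'] += 1
    return counts

# Example call (file names are assumptions):
# split_by_language('01.csv', 'Chinese.csv', 'English.csv')
```

Note that this works line by line, so a quoted CSV field that spans several lines would be split across files; for such data, reading with the csv module is the safer choice.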
    

