
Removing multiple characters (forming a word) from a string

I have a challenge that I tried to solve using split.strings, but that seems to be designed mainly for whitespace and single-character removal rather than for what I need with immutable strings. I have also tried regex, but since regular expressions are not covered in my current Python course for a few more weeks, I am a bit stuck on how they work (although I know the basics of what they are for).

So, I have a JSON file that presents machine and people data from a factory, and I need to parse the machine data separately from the people data gathered within the facility. Converting the JSON file and selecting the required data works, but one of the parameters, name, contains a mix of people and machine info that I need to separate out. An example of two branches is below:

"id": "b4994c877c9c",
    "name": "forklift_0001", # here is the machine
    "areaId": "Tracking001",
    "areaName": "Ajoneuvo",
    "color": "#FF0000",
    "coordinateSystemId": "CoordSys001",
    "coordinateSystemName": null,
    "covarianceMatrix": [

"id": "b4994c879275",
    "name": "guest_0001", # here is a person
    "areaId": "Tracking001_2D",
    "areaName": "staff1",
    "color": "#CCFF66",
    "coordinateSystemId": "CoordSys001",
    "coordinateSystemName": null,
    "covarianceMatrix": [

The code I have to convert is below:

import json

# file_list and output_nr are assumed to be defined earlier in the script
for f in file_list:
    print('Input file: ' + f) # Replace with desired operations

# f still holds the last path from the loop above
with open(f, 'r') as json_file:

    distros = json.load(json_file)
    output_file = 'Output' + str(output_nr) + '.csv'

    with open(output_file, 'w') as text_file:
        for distro in distros:
            print(distro['name'] + ',' + str(distro['positionTS']) + ',' + str(distro['position']), file=text_file)

So what I need to do with distro['name'] (is it an array?) is go through the 500k lines and remove anything that isn't a forklift, crane, machine, etc., leaving only those (and later the opposite), and this I cannot figure out.

All help sincerely appreciated.

As I understand your question, you want to give each entry a flag, 'machine' or 'person', based on the "name" tag.

Assigning such a flag (or just writing directly to the appropriate file) could be done with, for instance, something like:

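# Prefix tuples; extend them with the other names that occur in your data
machine_prefixes = ('forklift', 'crane')
person_prefixes = ('guest', 'employee')

with open(file1, 'w') as _file1, open(file2, 'w') as _file2, open(file3, 'w') as _file3:
    for distro in distros:
        # write() does not add a line break, so append one explicitly
        yourstring = distro['name'] + ',' + str(distro['positionTS']) + ',' + str(distro['position']) + '\n'

        if distro['name'].startswith(machine_prefixes):
            _file1.write(yourstring)
        elif distro['name'].startswith(person_prefixes):
            _file2.write(yourstring)
        else:
            _file3.write(yourstring)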

The three files opened together for writing will end up containing all the entries, split between machines, people, and entries that are neither.
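To try the prefix check without touching any files, here is a minimal self-contained sketch. It also touches on the "is it an array?" question: json.load gives a list of dicts here, and distro['name'] is a plain string, so str.startswith with a tuple of prefixes works on it directly. The classify helper, the prefix lists, and the sample values are assumptions made up for illustration; they are not taken from the real data.

```python
import json

# Hypothetical prefix lists; extend them to match the names in your data.
MACHINE_PREFIXES = ("forklift", "crane", "machine")
PERSON_PREFIXES = ("guest", "employee")


def classify(name):
    """Return 'machine', 'person', or 'other' based on how name starts."""
    if name.startswith(MACHINE_PREFIXES):
        return "machine"
    if name.startswith(PERSON_PREFIXES):
        return "person"
    return "other"


# Tiny stand-in for the real file; the field values are made up.
sample = json.loads("""
[
  {"name": "forklift_0001", "positionTS": 1, "position": [1.0, 2.0]},
  {"name": "guest_0001", "positionTS": 2, "position": [3.0, 4.0]},
  {"name": "unknown_0001", "positionTS": 3, "position": [5.0, 6.0]}
]
""")

# Each distro is a dict and distro["name"] is a str, not an array.
groups = {"machine": [], "person": [], "other": []}
for distro in sample:
    groups[classify(distro["name"])].append(distro)

machine_names = [d["name"] for d in groups["machine"]]
person_names = [d["name"] for d in groups["person"]]
print(machine_names)
print(person_names)
```

Keeping only the machines, or later only the people, is then just a matter of picking the matching group, so the same pass covers both directions.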

Does that solve your issue?
