
Convert Custom String to Dict

Hello, I need to convert this kind of string into the dict shown below.

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"

Desired dict:

requirements = {
    'Os': 'Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)',
    'Processor': 'Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or better',
    'Memory': '6 GB RAM',
    'Graphics': 'NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List',
}

I tried this:

string = string.split(':')

and stored each list item in the dict like this:

requirements['Os'] = string[0]
requirements['Processor'] = string[1]

But this is not the right way to do it, and it gives me a lot more errors. Is there a built-in function or module for this kind of thing?

I'd use a regex to capture just the text that you want, since the actual format of the input string won't be changing. This should give you what you want:


import re

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"

matches = re.match(r'OS: (.+)Processor: (.+)Memory: (.+)Graphics: (.+)', string)

requirements = {
    'Os': matches.group(1),
    'Processor': matches.group(2),
    'Memory': matches.group(3),
    'Graphics': matches.group(4),
}

print(requirements)

The regex is a little inflexible, though: the keys must appear in exactly this order, and re.match returns None if any of them is missing. I would advise using this only as a starting point.

See re.match
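
One way to make the regex less rigid is to build the pattern from a list of key names and let re.split cut the string at each key. This is only a sketch: the function name parse_requirements and its keys parameter are made up for illustration, and it assumes each key appears in the text as the key name followed directly by a colon. Note that the resulting dict uses the keys as they appear in the string ('OS', not 'Os').

```python
import re

string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"


def parse_requirements(text, keys=("OS", "Processor", "Memory", "Graphics")):
    # Hypothetical helper: matches any known key followed by a colon.
    # The capturing group makes re.split keep the key names in the result.
    key_pattern = re.compile(r'(' + '|'.join(map(re.escape, keys)) + r'):\s*')
    parts = key_pattern.split(text)
    # parts[0] is whatever precedes the first key; after that,
    # key names and their values alternate.
    return {key: value.strip() for key, value in zip(parts[1::2], parts[2::2])}


requirements = parse_requirements(string)
print(requirements)
```

Because the keys are found by name rather than by position, this version does not care about their order and simply skips a key that is absent from the string.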

This is an alternative, non-regex solution, though regex may in principle be more efficient and cleaner:

input_string = "OS: Windows 7 SP1, Windows 8.1, Windows 10 (64bit versions only)Processor: Intel Core i5 2400s @ 2.5 GHz, AMD FX 6120 @ 3.5 GHz or betterMemory: 6 GB RAMGraphics: NVIDIA GeForce GTX 660 with 2 GB VRAM or AMD Radeon HD 7870, with 2 GB VRAM or better - See supported List"
# Split on whitespace
input_string = input_string.split()

# Assumes the keys are exactly like listed - including uppercase letters
key_list = ["OS", "Processor", "Memory", "Graphics"]
key_ind = []

output = {}

# Collect indices corresponding to each key
for key in key_list:
    for idx, el in enumerate(input_string):
        if key in el:
            key_ind.append(idx)
            break

# Build the dictionary
for idx, key in enumerate(key_list):
    if idx + 1 >= len(key_list):
        # Last key: take everything up to the end of the input
        output[key] = ' '.join(input_string[key_ind[idx]+1:])
    else:
        # The chunk holding the next key also holds the tail of this value
        lp_idx = input_string[key_ind[idx+1]].find(key_list[idx+1])
        lp = input_string[key_ind[idx+1]][:lp_idx]
        output[key] = ' '.join(input_string[key_ind[idx]+1:key_ind[idx+1]]) + ' ' + lp

print(output)

Here the string is first split on whitespace, then the code finds the position of each chunk of text that contains one of the keys of the future dictionary. After storing the index of each key, the code builds the dictionary from those indices, with the last key being a special case.

For every key except the last, the code also extracts the text that sits directly before the next key in the same chunk. This assumes there is no space between the next key and the last part of the text you want to store for the current key, i.e. it is always (64bit versions only)Processor: and never (64bit versions only) Processor:. If you cannot make that assumption, you will need to extend this code to cover the cases with a space.
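
If the spacing before a key cannot be relied on, one possible extension is to skip the whitespace split and slice the original string between key positions instead. The sketch below is not the code from this answer: the function name split_by_keys is hypothetical, and it assumes each key occurs once in the text, immediately followed by a colon.

```python
def split_by_keys(text, keys):
    # Record where each "Key:" marker starts; keys that are absent are skipped.
    positions = []
    for key in keys:
        idx = text.find(key + ":")
        if idx != -1:
            positions.append((idx, key))
    positions.sort()

    output = {}
    for n, (idx, key) in enumerate(positions):
        # The value starts right after "Key:" and ends where the next key begins
        # (or at the end of the text for the last key).
        start = idx + len(key) + 1
        end = positions[n + 1][0] if n + 1 < len(positions) else len(text)
        output[key] = text[start:end].strip()
    return output


keys = ["OS", "Processor", "Memory"]
no_space = "OS: Windows 10 (64bit versions only)Processor: Intel Core i5Memory: 6 GB RAM"
with_space = "OS: Windows 10 (64bit versions only) Processor: Intel Core i5 Memory: 6 GB RAM"

print(split_by_keys(no_space, keys))
print(split_by_keys(with_space, keys))
```

Stripping each slice means both spacing variants produce the same dictionary.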

