
Convert every line in a string into a dictionary key

Hi, I am new to Python and am not sure whether such a basic question is appropriate for this site.

I want to convert every line in a string into a dictionary key and assign 0 as its value.

My string is:

s = '''
sarika

santha

#

akash


nice
'''

I tried the approaches from https://www.geeksforgeeks.org/ways-to-convert-string-to-dictionary/ but they did not fit my requirement.

Please help. Thanks in advance.

Edit:

I originally asked about a basic string, but the string I actually have looks like this:

s="""
san
francisco

Santha

Kumari



this one
"""

Here the result should be {'sanfrancisco': 0, 'santha kumari': 0, 'this one': 0}.

This is the challenge I am facing.

In my string, when lines are separated by a gap of more than one newline, the text on the next line should be treated as a single word and turned into a key.

You can do it as follows:

>>> s="""
... hello
... #
... world
... 
... vk
... """
>>> words = s.split("\n")
>>> words
['', 'hello', '#', 'world', '', 'vk', '']
>>> words = words[1:len(words)-1]
>>> words
['hello', '#', 'world', '', 'vk']
>>> word_dic = {}
>>> for word in words:
...     if word not in word_dic:
...             word_dic[word]=0
... 
>>> word_dic
{'': 0, 'world': 0, '#': 0, 'vk': 0, 'hello': 0}
>>> 

Please let me know if you have any questions.
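
A more compact variant of the session above, as a minimal sketch: it assumes that blank lines should not become keys (the session keeps the empty string as a key), and it uses dict.fromkeys to assign 0 to every remaining line. The names keys and word_dic are illustrative only.

```python
s = """
hello
#
world

vk
"""

# splitlines() avoids slicing off the first and last empty entries by hand
# Assumption: blank lines should not become keys
keys = [line.strip() for line in s.splitlines() if line.strip()]

# dict.fromkeys gives every key the same value and keeps first-seen order
word_dic = dict.fromkeys(keys, 0)
print(word_dic)
# {'hello': 0, '#': 0, 'world': 0, 'vk': 0}
```

Duplicate lines collapse into one key automatically, so the explicit "not in" check is not needed here.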

You could repeatedly match either a group of lines separated by two newlines, or a group of lines separated by a single newline.

^(?:\S.*(?:\n\n\S.*)+|\S.*(?:\n\S.*)*)

The pattern matches

  • ^ Start of string (or of each line when re.MULTILINE is used)
  • (?: Non capture group
    • \S.* Match a non whitespace char and the rest of the line
    • (?:\n\n\S.*)+ Repeat matching 1+ times 2 newlines, a non whitespace char and the rest of the line
    • | Or
    • \S.* Match a single non whitespace char and the rest of the line
    • (?:\n\S.*)* Optionally match a newline, a non whitespace char and the rest of the line
  • ) Close non capture group
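
To see what the pattern captures before any replacement is applied, a minimal sketch can print the raw matches for the string from the question. It relies on re.MULTILINE so that ^ anchors at the start of every line; the variable name blocks is illustrative.

```python
import re

s = """
san
francisco

Santha

Kumari



this one
"""

pattern = r"^(?:\S.*(?:\n\n\S.*)+|\S.*(?:\n\S.*)*)"

# The pattern has no capturing groups, so findall returns each whole match
blocks = re.findall(pattern, s, re.MULTILINE)
print(blocks)
# ['san\nfrancisco', 'Santha\n\nKumari', 'this one']
```

Each match still contains its internal newlines, which is why a replacement step is needed afterwards.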


For those matches, replace 2 newlines with a space and replace a single newline with an empty string.

Then from the values, create a dictionary and initialize all values with 0.

Example

import re

s="""
san
francisco

Santha

Kumari



this one
"""
pattern = r"^(?:\S.*(?:\n\n\S.*)+|\S.*(?:\n\S.*)*)"
my_dict = dict.fromkeys(
    [
        # two newlines become a space, a single newline is removed
        re.sub(
            r"(\n\n)|\n",
            lambda n: " " if n.group(1) else "",
            match.lower()
        ) for match in re.findall(pattern, s, re.MULTILINE)
    ],
    0
)
print(my_dict)

Output

{'sanfrancisco': 0, 'santha kumari': 0, 'this one': 0}
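
The nested comprehension can also be unrolled into a small helper, which may be easier to step through. This is only a sketch: the function name blocks_to_dict and its default parameter are hypothetical, and it swaps the re.sub callback for two plain str.replace calls, which gives the same result for these matches.

```python
import re

PATTERN = r"^(?:\S.*(?:\n\n\S.*)+|\S.*(?:\n\S.*)*)"


def blocks_to_dict(text, default=0):
    """Build a dict whose keys are the line groups found in text."""
    result = {}
    for block in re.findall(PATTERN, text, re.MULTILINE):
        # One blank line between two lines (two newlines) becomes a space
        key = block.lower().replace("\n\n", " ")
        # A plain line break joins the two lines directly
        key = key.replace("\n", "")
        result[key] = default
    return result


s = "\nsan\nfrancisco\n\nSantha\n\nKumari\n\n\n\nthis one\n"
print(blocks_to_dict(s))
# {'sanfrancisco': 0, 'santha kumari': 0, 'this one': 0}
```

The double newlines are replaced first so that they are not consumed as two single newlines.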

You could do it like this:

# Split the string from the question on whitespace, dropping the "#" marker
l = [word for word in s.split() if word != "#"]
dictionary = {}

# Join each consecutive pair of words into one key and assign it the value 0
# (this assumes every key is made of exactly two words)
for n in range(0, len(l) - 1, 2):
    dictionary[l[n] + l[n + 1]] = 0
print(dictionary)

Steps:

  1. Remove punctuation from the string via translate.
  2. Split the string wherever two consecutive \n characters occur.
  3. Remove the empty strings from the resulting list.
  4. Remove the remaining \n characters and use a dict comprehension to generate the required dict.

import string
s = '''
sarika
santha

#

akash




nice
'''

s = s.translate(str.maketrans('', '', string.punctuation))
word_list = s.split('\n\n')
while '' in word_list:
    word_list.remove('')
result = {word.replace('\n', ''): 0 for word in word_list}
print(result)
