
Python: map each word to the notes that contain it

I have a list of words like this:

 word_list=[{"word": "python",
    "repeted": 4},
    {"word": "awsome",
    "repeted": 3},
    {"word": "frameworks",
    "repeted": 2},
    {"word": "programing",
    "repeted": 2},
    {"word": "stackoverflow",
    "repeted": 2},
    {"word": "work",
    "repeted": 1},
    {"word": "error",
    "repeted": 1},
    {"word": "teach",
    "repeted": 1}
    ]

which comes from another list of notes:

note_list = [{"note_id":1,
"note_txt":"A curated list of awesome Python frameworks"},
{"note_id":2,
"note_txt":"what is awesome Python frameworks"},
{"note_id":3,
"note_txt":"awesome Python is good to wok with it"},
{"note_id":4,
"note_txt":"use stackoverflow to lern programing with python is awsome"},
{"note_id":5,
"note_txt":"error in programing is good to learn"},
{"note_id":6,
"note_txt":"stackoverflow is very useful to share our knoloedge"},
{"note_id":7,
"note_txt":"teach, work"},
  ]

I want to know how I can map every word to the notes it appears in:

maped_list=[{"word": "python",
        "notes_ids": [1,2,3,4]},
        {"word": "awsome",
        "notes_ids": [1,2,3]},
        {"word": "frameworks",
        "notes_ids": [1,2]},
        {"word": "programing",
        "notes_ids": [4,5]},
        {"word": "stackoverflow",
        "notes_ids": [4,6]},
        {"word": "work",
        "notes_ids": [7]},
        {"word": "error",
        "notes_ids": [5]},
        {"word": "teach",
        "notes_ids": [7]}
        ]

My work:

import re

# I started by appending all the note texts into one list
notes_test = []
for note in note_list:
    notes_test.append(note['note_txt'])
# calculate the repetition of each word
dict = {}
for sentence in notes_test:
    for word in re.split(r'\s', sentence):  # split on whitespace
        try:
            dict[word] += 1
        except KeyError:
            dict[word] = 1
word_list = []
for key in dict.keys():
    word = {}
    word['word'] = key
    word['repeted'] = dict[key]
    word_list.append(word)

My questions:

  1. How can I combine the word list and the note list to get the mapped list?
  2. What do you think of the quality of my code? Any remarks?

Thank you.

You can use a list comprehension:

mapped_list = [{"word": w_dict["word"],
                "notes_ids": [n_dict["note_id"] for n_dict in note_list
                              if w_dict["word"].lower() in n_dict["note_txt"].lower()]
                } for w_dict in word_list]

The result would be:

[{'word': 'python', 'notes_ids': [1, 2, 3, 4]},
 {'word': 'awsome', 'notes_ids': [4]},
 {'word': 'frameworks', 'notes_ids': [1, 2]},
 {'word': 'programing', 'notes_ids': [4, 5]},
 {'word': 'stackoverflow', 'notes_ids': [4, 6]},
 {'word': 'work', 'notes_ids': [1, 2, 7]},
 {'word': 'error', 'notes_ids': [5]},
 {'word': 'teach', 'notes_ids': [7]}]
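
Note that `in` here tests for a substring, which is why 'work' also matches 'frameworks' in notes 1 and 2, and why 'awsome' only matches note 4 (notes 1 to 3 spell it 'awesome'). If whole-word matching is what you want, here is a minimal sketch, assuming a word is a run of letters, digits, or underscores. The `tokens` helper and the trimmed sample data are illustrative, not part of the original post.

```python
import re

# Trimmed sample data, in the same shape as the question's lists
note_list = [
    {"note_id": 1, "note_txt": "A curated list of awesome Python frameworks"},
    {"note_id": 2, "note_txt": "what is awesome Python frameworks"},
    {"note_id": 7, "note_txt": "teach, work"},
]
word_list = [{"word": "python"}, {"word": "work"}, {"word": "teach"}]


def tokens(text):
    """Return the set of lowercase words in text (hypothetical helper)."""
    return set(re.findall(r"\w+", text.lower()))


# Tokenize every note once, then test whole-word membership
note_tokens = {n["note_id"]: tokens(n["note_txt"]) for n in note_list}

mapped_list = [
    {"word": w["word"],
     "notes_ids": [note_id for note_id, toks in note_tokens.items()
                   if w["word"].lower() in toks]}
    for w in word_list
]
print(mapped_list)
```

With this variant 'work' no longer matches the notes that only contain 'frameworks'.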

  1. Try to build maped_list while you build the dict, recording the id of the note each word appears in as you iterate.
  2. Do not use dict as a variable name. It is not a reserved word, but it is the name of Python's built-in dictionary type, as in dict(), and assigning to it shadows the built-in. Also, your input contains no whitespace other than spaces, so you can use sentence.split(). Another thing you can do is convert all words to lowercase, so that they do not differ by case.
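
The two suggestions above could be combined roughly as follows. This is only a sketch under a few assumptions: punctuation is stripped from word edges with `string.punctuation`, a word is counted once per note, and the name `word_to_notes` and the trimmed sample data are illustrative rather than taken from the question.

```python
import string
from collections import defaultdict

# Trimmed sample data, in the same shape as the question's note_list
note_list = [
    {"note_id": 4, "note_txt": "use stackoverflow to lern programing with python is awsome"},
    {"note_id": 5, "note_txt": "error in programing is good to learn"},
    {"note_id": 7, "note_txt": "teach, work"},
]

word_to_notes = defaultdict(list)  # word -> ids of the notes containing it
for note in note_list:
    # Lowercase, split on whitespace, and strip punctuation from word edges;
    # the set ensures each word is recorded once per note
    words = {w.strip(string.punctuation) for w in note["note_txt"].lower().split()}
    for word in words:
        if word:  # skip tokens that were only punctuation
            word_to_notes[word].append(note["note_id"])

maped_list = [{"word": w, "notes_ids": ids} for w, ids in word_to_notes.items()]
```

Because the notes are visited in order, each notes_ids list stays sorted by note order. The order of the entries in maped_list itself is not guaranteed, since it depends on set iteration.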
