
A machine learning model for matching patterns between two sets of strings?

I am trying to learn the HTML transformations performed by a certain service using machine learning. I have broken the problem down into a pattern matching problem. For now I am trying to learn the pattern in which tags are transformed. For example, for the same data I have the pattern "html, body, div, h1" in the original HTML and the pattern "html, body, div, div, div" in the transformed page. I have 14,000 such data points, and I want to train a model that takes the patterns from the original page as input and outputs the transformed patterns. I have looked into a few NLP models, but either I failed to understand them completely or they were not very helpful. If someone could give me any pointers, or preferably suggest a Python-based model, that would be great.

Your question is not entirely clear, but from what I was able to figure out, your input is a string pattern of HTML tags and your output is also a string pattern of HTML tags.

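You can use a bidirectional LSTM or a CRF for this kind of task. Read up on them and you'll get a clear idea. Note that both are typically used as sequence taggers that emit one label per input token, so if the output length can differ from the input length (as in your example, four tags in and five tags out), an LSTM-based encoder-decoder (seq2seq) setup is the more natural fit.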
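Whichever sequence model you pick, the tag patterns first have to be turned into integer sequences. Here is a minimal sketch of that preprocessing step, assuming the patterns are comma-separated strings like the one in the question. The helper names (`tokenize`, `build_vocab`, `encode`) and the special tokens are hypothetical choices for illustration, not part of any library.

```python
# Hypothetical preprocessing helpers; names and special tokens are assumptions.

def tokenize(pattern):
    """Split a comma-separated tag pattern into a list of tag names."""
    return [tag.strip() for tag in pattern.split(",") if tag.strip()]


def build_vocab(sequences, specials=("<pad>", "<sos>", "<eos>", "<unk>")):
    """Map every tag (plus the special tokens) to a unique integer id."""
    vocab = {token: index for index, token in enumerate(specials)}
    for sequence in sequences:
        for tag in sequence:
            vocab.setdefault(tag, len(vocab))
    return vocab


def encode(sequence, vocab, max_len):
    """Wrap a tag list in <sos>/<eos>, map it to ids, and pad to max_len.

    Sequences longer than max_len are truncated, which can drop <eos>.
    """
    ids = [vocab["<sos>"]]
    ids += [vocab.get(tag, vocab["<unk>"]) for tag in sequence]
    ids.append(vocab["<eos>"])
    ids = ids[:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# The single example pair from the question.
pairs = [("html, body, div, h1", "html, body, div, div, div")]
sources = [tokenize(src) for src, _ in pairs]
targets = [tokenize(tgt) for _, tgt in pairs]
vocab = build_vocab(sources + targets)

encoded_source = encode(sources[0], vocab, max_len=8)
encoded_target = encode(targets[0], vocab, max_len=8)
```

The resulting id lists can then be fed to an embedding layer in the framework of your choice (for example Keras or PyTorch).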

But if the same input pattern maps to multiple output patterns, most ML algorithms will have a hard time learning the mapping. You can remove those data points and you'll be good to go.
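Removing those conflicting data points could look roughly like this. It is a sketch that assumes the data is a list of `(input_pattern, output_pattern)` string pairs; the function name `drop_ambiguous` is made up for illustration.

```python
from collections import defaultdict


def drop_ambiguous(pairs):
    """Keep only pairs whose input pattern maps to exactly one output pattern."""
    outputs_by_input = defaultdict(set)
    for source, target in pairs:
        outputs_by_input[source].add(target)
    return [
        (source, target)
        for source, target in pairs
        if len(outputs_by_input[source]) == 1
    ]


# Toy data: the second and third rows share an input but disagree on the output.
data = [
    ("html, body, div, h1", "html, body, div, div, div"),
    ("html, body, p", "html, body, div"),
    ("html, body, p", "html, body, span"),
    ("html, body, div, h1", "html, body, div, div, div"),
]
clean = drop_ambiguous(data)
```

Exact duplicates (same input and same output) are kept, since they do not contradict each other. It is worth checking how many of the 14,000 points this removes before committing to it.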

