
Regular expression to Split a string

I have a string like this:

ATTRIBUTE ISC_FLOW OF XXX1234 : ENTITY IS FLOW_VERIFY(IDCODE) & INITIALIZE & (IDCODE WAIT TCK 1 32:01805043*0FFFFFFF), & FLOW_ENABLE & INITIALIZE & (ISC_ENABLE WAIT TCK 3, 20.0E-3), & FLOW_ERASE &

INITIALIZE & (ISC_ERASE WAIT TCK 3, 200.0E-3) & (ISC_DISCHARGE WAIT TCK 3, 200.0E-3), & FLOW_PRELOAD & INITIALIZE & (SAMPLE 68:0 WAIT TCK 1), &
FLOW_PROGRAM(ARRAY) & INITIALIZE & (ISC_ADDRESS_INIT WAIT TCK 1) & REPEAT 100& (ISC_PROGRAM 172:? WAIT TCK 3, 13.0E-3), & FLOW_VERIFY(ARRAY) & INITIALIZE & (ISC_ADDRESS_SHIFT 100:

$ADDR=800000000000000000000000& 0 & WAIT TCK 1) & REPEAT 100& (ISC_READ WAIT TCK 1, 1.0E-3 172:?:CRC) & (ISC_ADDRESS_SHIFT 100:$ADDR>>1 WAIT TCK 1)

I need to write a pattern that splits out each FLOW separately.

The result should look like this:

1. FLOW_VERIFY(IDCODE)                  INITIALIZE        (IDCODE        WAIT TCK 1 32:01805043*0FFFFFFF)
2. FLOW_ENABLE                          INITIALIZE        (ISC_ENABLE    WAIT TCK 3, 20.0E-3)
3. FLOW_ERASE                           INITIALIZE        (ISC_ERASE     WAIT TCK 3, 200.0E-3)        (ISC_DISCHARGE WAIT TCK 3, 200.0E-3)
4. FLOW_PRELOAD                         INITIALIZE        (SAMPLE 68:0 WAIT TCK 1)
5. FLOW_PROGRAM(ARRAY)                  INITIALIZE        (ISC_ADDRESS_INIT         WAIT TCK 1)    REPEAT 100       (ISC_PROGRAM 172:? WAIT TCK 3, 13.0E-3)
6. FLOW_VERIFY(ARRAY)                   INITIALIZE        (ISC_ADDRESS_SHIFT 100:$ADDR=800000000000000000000000        0         WAIT TCK 1)      REPEAT 100  (ISC_READ  WAIT TCK 1, 1.0E-3 172:?:CRC) (ISC_ADDRESS_SHIFT 100:$ADDR>>1 WAIT TCK 1)

I've tried many patterns, but I could not parse it properly.

Here is the last pattern I tried:

"(?<Func>[a-z0-9\\(\\)_]*)[\r\t\n ]*&[\r\t\n ]*(?<Instr>(INITIALIZE|REPEAT|TERMINATE))[\r\t\n ]*[0-9]*&(?<Action>[0-9a-z \r\t\n:*,\\(\\).\\-_\\?!$=]*)"

Please help me write a pattern that separates each FLOW value in the string above.

Since all your fields are neatly separated by &, I would suggest that you

  • split the string on & , which gives you an array, and
  • iterate through the array with a few if statements.

I would consider this solution to be more readable (and, thus, more maintainable) than a huge regular expression.
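The split-and-iterate idea above might look roughly like the following. This is only a minimal sketch: the class name FlowGrouper and the method GroupByFlow are made up for illustration, and it assumes that every flow starts with the literal text FLOW_ and that a comma before the & marks the end of a flow. Note that the sample input also contains & inside one pair of parentheses (the $ADDR=... action), so that action would come out as several tokens and would need to be rejoined.

```csharp
using System;
using System.Collections.Generic;

public static class FlowGrouper
{
    // Hypothetical helper: splits the input on '&' and collects
    // the tokens into one list per flow.
    public static List<List<string>> GroupByFlow(string input)
    {
        var flows = new List<List<string>>();
        List<string> current = null;

        foreach (string raw in input.Split('&'))
        {
            // Trim whitespace and the trailing comma that closes a flow.
            string token = raw.Trim().TrimEnd(',').Trim();
            if (token.Length == 0)
                continue;

            // Assumption: a token containing "FLOW_" opens a new flow.
            // Any header text before it (e.g. "ATTRIBUTE ... IS") is dropped.
            int flowAt = token.IndexOf("FLOW_", StringComparison.Ordinal);
            if (flowAt >= 0)
            {
                current = new List<string> { token.Substring(flowAt) };
                flows.Add(current);
            }
            else if (current != null)
            {
                // Any other token belongs to the flow currently being built.
                current.Add(token);
            }
        }
        return flows;
    }
}
```

Calling FlowGrouper.GroupByFlow(data) on the full input should give one list per FLOW, with the flow name as the first element and the instructions and actions after it.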

I tried to come up with a regular expression and I couldn't. I think this would be much simpler to do using a plain string search. Along the lines of the following:

string flows = "<your example>";
List<string> flowStrings = new List<string>();
const string flowStr = "FLOW_";
int index = flows.IndexOf(flowStr);
while (index != -1)
{
    // Find the start of the next flow; -1 means this is the last one.
    int nextIndex = flows.IndexOf(flowStr, index + 1);
    int end = nextIndex != -1 ? nextIndex : flows.Length;
    // Keep everything from this FLOW_ up to the next one (or the end of the string).
    flowStrings.Add(flows.Substring(index, end - index));
    index = nextIndex;
}

Of course, I don't have a lot of experience using regular expressions.

Try this:

(?<Func>FLOW_(?:[A-Z]+)(?:\([A-Z]+\))?)\s+&\s+(?<Inst>[A-Z]+)\s+&\s(?<Action>(?:(?:(?:\([^)]+\))|[A-Z0-9\s]+)(?:\s?&\s)?)+)
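If only the split into flows is needed, and not the named groups, a much smaller pattern may be enough. The sketch below is not the pattern above; it swaps in a zero-width lookahead, (?=FLOW_), with Regex.Split so that the string is cut just before each FLOW_ without consuming it. The names FlowSplitter and SplitFlows are hypothetical, and the code assumes that the text FLOW_ only ever appears at the start of a flow.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FlowSplitter
{
    // Hypothetical helper: returns one string per flow.
    public static List<string> SplitFlows(string input)
    {
        var flows = new List<string>();

        // (?=FLOW_) matches the empty position right before "FLOW_",
        // so the delimiter itself stays in the following piece.
        foreach (string part in Regex.Split(input, @"(?=FLOW_)"))
        {
            // Skip the header text that precedes the first flow.
            if (!part.StartsWith("FLOW_", StringComparison.Ordinal))
                continue;

            // Drop the ", &" separator and whitespace left at the end of each flow.
            flows.Add(part.TrimEnd(' ', '\t', '\r', '\n', '&', ','));
        }
        return flows;
    }
}
```

Each returned string can then be split on & to get at the individual instructions and actions.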

Since I believe this question is related to your other question (How to split a string in C#), I believe this might help you.

You can use the solution provided there to split your input data into several strings (as a starting point, before further parsing).

So, if you define your Split method like this:

private static List<string> Split(string input, IEnumerable<string> delimiters)
{
    List<string> results = new List<string>();
    List<int> indices = new List<int>();

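    // get indices of every occurrence of each delimiter
    // (FLOW_VERIFY, for example, appears twice in the input)
    foreach (string s in delimiters)
    {
        int idx = input.IndexOf(s);
        while (idx >= 0)
        {
            indices.Add(idx);
            idx = input.IndexOf(s, idx + s.Length);
        }
    }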
    indices.Sort();
    if (indices.Count > 0)
    {
        indices.Add(input.Length);
        // split the string
        for (int i = 0; i < indices.Count - 1; i++)
        {
            int idx = indices[i], nextIdx = indices[i + 1];
            results.Add(input.Substring(idx, nextIdx - idx).Trim());
        }
    }
    return results;
}

Then this will split it at all defined "FLOW" delimiters:

string data = "ATTRIBUTE ISC_FLOW ..."; // the full input string, elided here

string[] delimiters = new string[]
{
    "FLOW_VERIFY",
    "FLOW_ENABLE",
    "FLOW_ERASE",
    "FLOW_PRELOAD",
    "FLOW_PROGRAM"
};

List<string> results = Split(data, delimiters);
for (int i = 0; i < results.Count; i++)
{
    Console.WriteLine("{0}. {1}", i + 1, results[i]);
    Console.WriteLine();
}

Console.Read();

Finally, you can split each of your results at & characters to get individual tokens:

foreach (string item in results)
{
    List<string> tokens = new List<string>();

    // split at &
    foreach (string t in item.Split('&'))
    {
        // trim spaces
        string token = t.Trim();

        // ignore empty tokens
        if (token == "")
            continue;

        // add the trimmed token, not the raw one
        tokens.Add(token);
    }

    // print tokens, separated by tabs
    foreach (string t in tokens)
        Console.Write("{0}\t", t);

    Console.WriteLine();
    Console.WriteLine();
}
