Need help inserting commas after each character in specific part of string

In the program I'm working on, I need to strip the tags around certain parts of a string, and then insert a comma after each character WITHIN the tag (but not after any other characters in the string). In case this doesn't make sense, here's an example of what needs to happen:

This is a string with a < a > tag < /a > (please ignore the spaces within the tag)

(needs to become)

This is a string with a t,a,g,

Can anyone help me with this? I've managed to strip the tags using RegEx, but I can't figure out how to insert the commas only after the characters contained within the tag. If someone could help that would be great.

@Dour High Arch I'll elaborate a little bit. The code is for a text-to-speech app that won't recognize SSML tags. When the user enters a message for the text-to-speech app, they have the option of enclosing a word in a < a > tag to make the speaker say the word as an acronym. Because the acronym SSML tag won't work, I want to remove the < a > tag whenever present, and place a comma after each character contained in the tag to fake it out (e.g. < a > test< /a > becomes t,e,s,t,). The non-tagged words in the string do not need commas after them, just those enclosed in tags (see my first example if need be).

If you have figured out the regex, I would imagine it would be simple to capture the inner text of the tag. Then it's a really simple operation to insert the commas:

  var commaString = string.Join(",", capturedString.ToList());
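As a rough illustration of how the capture and the comma insertion might be combined in one pass, here is a minimal sketch. The class name AcronymExpander and the pattern are assumptions for illustration, not the asker's actual regex; it assumes literal <a> and </a> tags with no nesting. It appends a comma after every character, matching the output shown in the question, whereas string.Join only places commas between characters and would leave off the final one.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
static class AcronymExpander
{
    // Assumption: tags are written exactly as <a>...</a> and are never nested.
    // The lazy (.*?) captures the inner text of each tag pair.
    private static readonly Regex AcronymTag =
        new Regex(@"<a>(.*?)</a>", RegexOptions.Singleline);

    // Drops each tag pair and appends a comma after every captured character.
    public static string Expand(string input)
    {
        return AcronymTag.Replace(
            input,
            m => string.Concat(m.Groups[1].Value.Select(c => c + ",")));
    }
}

static class Program
{
    static void Main()
    {
        // prints: This is a string with a t,a,g,
        Console.WriteLine(AcronymExpander.Expand("This is a string with a <a>tag</a>"));
    }
}
```

Text outside the tags is left untouched because Regex.Replace only rewrites the matched spans.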

Assuming you have already extracted your target string with your RegEx, i.e. there are no tags around it:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication32
{
    class Program
    {
        static void Main(string[] args)
        {
            // setup a test string
            string stringToProcess = "Test";

            // actual solution here
            string result = String.Concat(stringToProcess.Select(c => c + ","));

            // results: T,e,s,t,
            Console.WriteLine(result);
        }
    }
}

Parsing XML is very problematic because you may have to deal with things like CDATA sections, nested elements, entities, surrogate characters, and on and on. I would use a parser generator like ANTLR.

However, if you are just starting out with C#, it is instructive to solve this using the built-in .NET string and array classes. No ANTLR, LINQ, or regular expressions needed:

using System;

class ReplaceAContentsWithCommaSeparatedChars
{
    static readonly string acroStartTag = "<a>";
    static readonly string acroEndTag = "</a>";

    static void Main(string[] args)
    {
        string s = "Alpha <a>Beta</a> Gamma <a>Delta</a>";
        while (true)
        {
            int start = s.IndexOf(acroStartTag);
            if (start < 0)
                break;

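            int end = s.IndexOf(acroEndTag, start + acroStartTag.Length);
            // If the closing tag is missing, treat the rest of the string as the contents.
            bool hasEndTag = end >= 0;
            if (!hasEndTag)
                end = s.Length;

            string contents = s.Substring(start + acroStartTag.Length, end - start - acroStartTag.Length);
            string[] chars = Array.ConvertAll<char, string>(contents.ToCharArray(), c => c.ToString());
            // Only skip past the closing tag when one was actually found;
            // otherwise Substring would be called with an index beyond the end of s.
            s = s.Substring(0, start)
                + string.Join(",", chars)
                + (hasEndTag ? s.Substring(end + acroEndTag.Length) : "");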
        }

        Console.WriteLine(s);
    }
}

Please be aware this does not deal with any of the issues I mentioned. But then, none of the other suggestions do either.
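If the message can be assumed to be well-formed XML, one possible way to get entities and CDATA sections handled without hand-written parsing is to let the built-in System.Xml.Linq classes do the work. The following is only a sketch under that assumption: the class name XmlAcronymExpander and the wrapping <root> element are hypothetical, elements other than <a> are simply dropped, and characters are still processed as UTF-16 code units, so surrogate pairs would be split.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, named here only for illustration.
static class XmlAcronymExpander
{
    // Assumption: the message is well-formed XML once wrapped in a made-up
    // <root> element; XElement.Parse throws an XmlException otherwise.
    public static string Expand(string message)
    {
        XElement root = XElement.Parse(
            "<root>" + message + "</root>",
            LoadOptions.PreserveWhitespace);

        var result = new StringBuilder();
        foreach (XNode node in root.Nodes())
        {
            if (node is XElement element && element.Name.LocalName == "a")
            {
                // element.Value is the decoded inner text (entities resolved).
                foreach (char c in element.Value)
                    result.Append(c).Append(',');
            }
            else if (node is XText text)
            {
                // XCData derives from XText, so CDATA sections are copied here too.
                result.Append(text.Value);
            }
            // Any other element is ignored in this sketch.
        }
        return result.ToString();
    }
}

static class XmlDemo
{
    static void Main()
    {
        // prints: Alpha B,e,t,a, Gamma R,&,D,
        Console.WriteLine(XmlAcronymExpander.Expand("Alpha <a>Beta</a> Gamma <a>R&amp;D</a>"));
    }
}
```

Unlike the IndexOf version above, this one appends a comma after every character, including the last, and it rejects malformed input instead of guessing where a tag ends.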
