
Find a letter at the beginning of the string followed by a space

I want to write a method that counts the letters "a" or "A". The "a" can be either at the beginning of the string followed by a space, or anywhere in the string surrounded by spaces. The result should be 2, but I am getting 5 with my code. How do I modify the code so that it detects the space before and after the "a"?

using System;

namespace Hi
{
    class Program
    {
        static void Main(string[] args)
        {
            string t1 = "A book was lost. There is a book on the table. Is that the book?";

            Console.WriteLine(t1);
            Console.WriteLine(" - Found {0} articles, should be 2.", CountArticles(t1));
            Console.ReadKey();
        }

        static int CountArticles(string text)
        {
            int count = 0;

            for (int i = 0; i < text.Length; ++i)
            {
                if (text[i] == 'a' || text[i] == 'A')
                {
                    ++count;
                }
            }

            return count;
        }
    }
}

I suggest using regular expressions in order to count all the matches; something like this:

  using System.Text.RegularExpressions;

  ... 

  string t1 = "A book was lost. There is a book on the table. Is that the book?";

  int count = Regex.Matches(t1, @"\bA\b", RegexOptions.IgnoreCase).Count;
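
If "surrounded by spaces" is meant literally, note that \b also matches next to punctuation, so the "a" in "a-b" or "a," would be counted as well. Below is a minimal sketch of a stricter variant that uses lookarounds on whitespace. The class and method names are made up for illustration, and whether punctuation should count as a delimiter is an assumption you have to decide for your own data:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, only for comparing the two patterns.
static class ArticleRegexDemo
{
    // Word-boundary version: also counts "a" next to punctuation ("a," or "a-b").
    public static int CountWordBoundary(string text) =>
        Regex.Matches(text, @"\ba\b", RegexOptions.IgnoreCase).Count;

    // Whitespace-delimited version: "a" must not be preceded or followed
    // by a non-whitespace character (start/end of string are accepted).
    public static int CountWhitespaceDelimited(string text) =>
        Regex.Matches(text, @"(?<!\S)a(?!\S)", RegexOptions.IgnoreCase).Count;
}
```

For the sentence from the question both methods return 2. For an input such as "Plan a, then a-b." the word-boundary version returns 2 and the whitespace-delimited version returns 0.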

If you insist on a for loop, you have to check the neighbouring characters for whitespace:

  static int CountArticles(string text)
  {
      int count = 0;

      for (int i = 0; i < text.Length; ++i)
      {
          if (text[i] == 'a' || text[i] == 'A')
          {
              // We have 'a' or 'A'; now check that it is delimited by whitespace
              // (or by the start/end of the string) on both sides.
              if (((i == 0) || char.IsWhiteSpace(text[i - 1])) &&
                  ((i == text.Length - 1) || char.IsWhiteSpace(text[i + 1])))
              {
                  ++count;
              }
          }
      }

      return count;
  }

Personally, I'm a huge fan of simple DFAs (deterministic finite automata, i.e. state machines). That may feel like a strange choice here, so I'll explain why. It all boils down to a few reasons:

  1. DFAs are incredibly fast; if you do parsing like I do, there's a high chance that you'll throw a lot of data at this code. Performance matters.
  2. DFAs are very easy to unit test; the only thing you need to do is make sure you test all states and transitions.
  3. Code coverage reports on DFAs are very usable. Coverage doesn't guarantee that your design is correct, but if the design is correct, the code will work. You'll definitely get a lot more information from it than from coverage on a Regex.

The main disadvantages are:

  1. That they require more work to build. (*)
  2. That you should use a piece of paper to think them out (and document them as such for other people).

Once you get the idea, it's easy to construct a DFA. Grab a piece of paper, think about the possible states of your program (draw circles) and the transitions between them (arrows between the circles). Finally, think about what should happen on each transition.

The translation to code is pretty much 1:1. Using a switch is just one implementation - there are other ways to do this. Anyways, without further interruption, here goes:

enum State
{
    SpaceEncountered,
    ArticleEncountered,
    Default
};

static int CountArticles(string text)
{
    int count = 0;
    State state = State.SpaceEncountered; // start of line behaves the same

    for (int i = 0; i < text.Length; ++i)
    {
        switch (state)
        {
            case State.SpaceEncountered:
                if (text[i] == 'a' || text[i] == 'A')
                {
                    state = State.ArticleEncountered;
                }
                else if (!char.IsWhiteSpace(text[i]))
                {
                    state = State.Default;
                }
                break;

            case State.ArticleEncountered:
                if (char.IsWhiteSpace(text[i]))
                {
                    ++count;
                    state = State.SpaceEncountered;
                }
                else
                {
                    state = State.Default;
                }
                break;
            case State.Default: // inside a word that is not a standalone article
                if (char.IsWhiteSpace(text[i]))
                {
                    state = State.SpaceEncountered;
                }
                break;
        }
    }

    // if we're in state ArticleEncountered, the next is EOF and we should count one extra
    if (state == State.ArticleEncountered)
    {
        ++count;
    }
    return count;
}

static void Main(string[] args)
{
    Console.WriteLine(CountArticles("A book was lost. There is a book on the table. Is that the book?"));
    Console.ReadLine();
}
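
As an illustration of the "other ways" mentioned before the code, here is a sketch of a table-driven version of the same DFA. The class name, the Classify helper and the numeric encoding of states and input classes are assumptions made for this sketch; the transitions themselves are copied from the switch above:

```csharp
using System;

// Hypothetical table-driven variant of the switch-based DFA.
static class TableDrivenDfa
{
    // States:        0 = SpaceEncountered (also the start state),
    //                1 = ArticleEncountered, 2 = Default.
    // Input classes: 0 = whitespace, 1 = 'a' or 'A', 2 = any other character.
    private static readonly int[,] Next =
    {
        // whitespace, a/A, other
        { 0, 1, 2 }, // SpaceEncountered
        { 0, 2, 2 }, // ArticleEncountered
        { 0, 2, 2 }, // Default
    };

    private static int Classify(char c) =>
        char.IsWhiteSpace(c) ? 0 : (c == 'a' || c == 'A') ? 1 : 2;

    public static int CountArticles(string text)
    {
        int count = 0;
        int state = 0;

        foreach (char c in text)
        {
            int inputClass = Classify(c);

            // The only transition with an action: a lone 'a' followed by whitespace.
            if (state == 1 && inputClass == 0)
            {
                ++count;
            }

            state = Next[state, inputClass];
        }

        // A lone 'a' at the very end of the text counts as well.
        if (state == 1)
        {
            ++count;
        }

        return count;
    }
}
```

Calling TableDrivenDfa.CountArticles with the sentence from the question returns 2. The table keeps all transitions in one place, which makes it easy to compare against the drawing on paper; the switch is probably easier to read when transitions carry many different actions.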

(*) Now, I can see people thinking: that's a lot of code for such an easy problem. Yeah, that's very true, which is why there are ways to generate DFAs. The most usual ways to do this are to construct a lexer or a regex. For this toy problem that's a bit much, but perhaps your real problem is a bit larger...

Use String.Split like this (Count is the LINQ extension method, so using System.Linq; is required):

int count = text.Split(' ').Count(c => c == "a" || c == "A");
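
Splitting on ' ' only treats the space character as a separator, so an "a" next to a tab or a line break would be missed. A minimal sketch that splits on any whitespace and compares case-insensitively, assuming that is the behaviour you want (the class name is made up for illustration):

```csharp
using System;
using System.Linq;

// Hypothetical helper class for the Split-based approach.
static class SplitDemo
{
    public static int CountArticles(string text)
    {
        // A null separator array makes Split use whitespace as the delimiter;
        // RemoveEmptyEntries drops the empty tokens produced by repeated whitespace.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        return words.Count(w => string.Equals(w, "a", StringComparison.OrdinalIgnoreCase));
    }
}
```

For the sentence from the question this returns 2, and for "A book\twas a\nlost" it also returns 2, whereas splitting on ' ' alone would find only the first "A".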

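You may also use the TextInfo class to convert the string to title case, so that every "a" at the beginning of the string or after a space becomes "A":

A Book Was Lost. There Is A Book On The Table. Is That The Book?

Now you can use your CountArticles function to count the 'A' characters. Note that title-casing also capitalizes words that merely begin with "a" (for example, "apple" becomes "Apple"), so this only gives the right count when no other word in the text starts with "a".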

using System;
using System.Globalization;

namespace Hi
{
    class Program
    {
        static void Main(string[] args)
        {
            string t1 = "A book was lost. There is a book on the table. Is that the book?";

            Console.WriteLine(t1);
            Console.WriteLine(" - Found {0} articles, should be 2.", CountArticles(t1));
            Console.ReadKey();
        }

        static int CountArticles(string text)
        {
            int count = 0;

            // Convert the string to title case, so that an 'a' at the beginning
            // of the string or after a space becomes 'A'.
            TextInfo textInfo = new CultureInfo("en-US", false).TextInfo;
            text = textInfo.ToTitleCase(text);

            for (int i = 0; i < text.Length; ++i)
            {
                if (text[i] == 'A')
                {
                    ++count;
                }
            }

            return count;
        }
    }
}


 