
Using Java RegEx how can I return only exact matches?

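I have a string containing a large amount of text, and I want to search it for pattern matches. For each match I find, I want to extract it from the input string and store it in a List or String[] for further sorting.

To do this, I am trying to use a Java regular expression to search for the pattern I want and then print those matches to the console. But I am clearly doing something wrong with my RegEx, because it returns not only my matches but also everything from the start of the input string up to the final match.

I am desperately trying to find a way to return ONLY my RegEx matches and nothing else! Can anyone offer a working solution? I would be grateful, as I am quite frustrated right now!

For a quick look at what I am doing, see this saved RegEx: Search for 30 Minute Wait Times

Otherwise, here is my code along with the entire input string I want to sort, as a good example of the data I am working with: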

final String regex = "(name=\"(.*?) 30 Minute Wait,)";
    final String input1 = "name=\"Barnstormer, Fantasyland, 05 Minute Wait,name=\"Big Thunder Mountain Railroad, Frontierland, 05 Minute Wait,name=\"Celebrity Spotlight, Echo Lake, 05 Minute Wait,name=\"DINOSAUR, DinoLand U.S.A., 05 Minute Wait,name=\"Expedition Everest - Legend of the Forbidden Mountain, Asia, 05 Minute Wait,name=\"Gran Fiesta Tour StarringThree Caballeros, World Showcase, 05 Minute Wait,name=\"Great Movie Ride, Hollywood Boulevard, 05 Minute Wait,name=\"Mad Tea Party, Fantasyland, 05 Minute Wait,name=\"Meet Chewbacca at Star Wars Launch Bay, Animation Courtyard, 05 Minute Wait,name=\"Seas with Nemo & Friends, Future World, 05 Minute Wait,name=\"Star Wars Launch Bay Theater, Animation Courtyard, 05 Minute Wait,name=\"TriceraTop Spin, DinoLand U.S.A., 05 Minute Wait,name=\"Buzz Lightyear's Space Ranger Spin, Tomorrowland, 10 Minute Wait,name=\"Dumbo the Flying Elephant, Fantasyland, 10 Minute Wait,name=\"Encounter Kylo Ren at Star Wars Launch Bay, Animation Courtyard, 10 Minute Wait,name=\"it's a small world, Fantasyland, 10 Minute Wait,name=\"Kilimanjaro Safaris, Africa, 10 Minute Wait,name=\"Magic Carpets of Aladdin, Adventureland, 10 Minute Wait,name=\"Many Adventures of Winnie the Pooh, Fantasyland, 10 Minute Wait,name=\"Mickey and Minnie Starring in Red Carpet Dreams, Commissary Lane, 10 Minute Wait,name=\"Mickey's PhilharMagic, Fantasyland, 10 Minute Wait,name=\"Muppet*Vision 3D, Muppet Courtyard, 10 Minute Wait,name=\"Pirates of the Caribbean, Adventureland, 10 Minute Wait,name=\"Primeval Whirl, DinoLand U.S.A., 10 Minute Wait,name=\"Soarin', Future World, 10 Minute Wait,name=\"Spaceship Earth, Future World, 10 Minute Wait,name=\"Star Tours –Adventures Continue, Echo Lake, 10 Minute Wait,name=\"Toy Story Mania!, Pixar Place, 10 Minute Wait,name=\"Twilight Zone Tower of Terror™, Sunset Boulevard, 10 Minute Wait,name=\"Under the Sea ~ Journey ofLittle Mermaid, Fantasyland, 10 Minute Wait,name=\"Jungle Cruise, Adventureland, 15 Minute 
Wait,name=\"Mission: SPACE, Future World, 15 Minute Wait,name=\"Rock 'n' Roller Coaster Starring Aerosmith, Sunset Boulevard, 15 Minute Wait,name=\"Splash Mountain, Frontierland, 15 Minute Wait,name=\"Astro Orbiter, Tomorrowland, 20 Minute Wait,name=\"Meet Disney Pals at the Epcot Character Spot, Future World, 20 Minute Wait,name=\"Monsters, Inc. Laugh Floor, Tomorrowland, 20 Minute Wait,name=\"Meet Rapunzel and Tiana at Princess Fairytale Hall, Fantasyland, 25 Minute Wait,name=\"Space Mountain, Tomorrowland, 25 Minute Wait,name=\"Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,name=\"Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,name=\"Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,name=\"Test Track, Future World, 30 Minute Wait,name=\"Meet Anna and Elsa at Royal Sommerhus, World Showcase, 40 Minute Wait,name=\"Tomorrowland Speedway, Tomorrowland, 40 Minute Wait,name=\"Frozen Ever After, World Showcase, 45 Minute Wait,name=\"Meet Mickey Mouse at Town Square Theater, Main Street, U.S.A., 55 Minute Wait,name=\"Meet Ariel at Her Grotto, Fantasyland, 65 Minute Wait,name=\"Seven Dwarfs Mine Train, Fantasyland, 80 Minute Wait,name=\"Haunted Mansion, Liberty Square, Temporarily Closed,name=\"Kali River Rapids, Asia, Temporarily Closed\n";

    final Pattern pattern = Pattern.compile(regex);
    final Matcher matcher = pattern.matcher(input1);

    // Create a List<String> to store the wait-time matches we find
    List<String> waitTimesSorted = new ArrayList<String>();

    // Loop while the matcher keeps finding wait times in the input string
    while (matcher.find()) {
        // group() with no argument returns the entire match (group 0)
        waitTimesSorted.add(matcher.group());
    }

    // Print our matches to the console
    System.out.println(waitTimesSorted);

The output does find my 30 Minute Wait entries, but the first match also contains everything in the input string that comes before them!

[name="Barnstormer, Fantasyland, 05 Minute Wait,name="Big Thunder Mountain Railroad, Frontierland, 05 Minute Wait,name="Celebrity Spotlight, Echo Lake, 05 Minute Wait,name="DINOSAUR, DinoLand U.S.A., 05 Minute Wait,name="Expedition Everest - Legend of the Forbidden Mountain, Asia, 05 Minute Wait,name="Gran Fiesta Tour StarringThree Caballeros, World Showcase, 05 Minute Wait,name="Great Movie Ride, Hollywood Boulevard, 05 Minute Wait,name="Mad Tea Party, Fantasyland, 05 Minute Wait,name="Meet Chewbacca at Star Wars Launch Bay, Animation Courtyard, 05 Minute Wait,name="Seas with Nemo & Friends, Future World, 05 Minute Wait,name="Star Wars Launch Bay Theater, Animation Courtyard, 05 Minute Wait,name="TriceraTop Spin, DinoLand U.S.A., 05 Minute Wait,name="Buzz Lightyear's Space Ranger Spin, Tomorrowland, 10 Minute Wait,name="Dumbo the Flying Elephant, Fantasyland, 10 Minute Wait,name="Encounter Kylo Ren at Star Wars Launch Bay, Animation Courtyard, 10 Minute Wait,name="it's a small world, Fantasyland, 10 Minute Wait,name="Kilimanjaro Safaris, Africa, 10 Minute Wait,name="Magic Carpets of Aladdin, Adventureland, 10 Minute Wait,name="Many Adventures of Winnie the Pooh, Fantasyland, 10 Minute Wait,name="Mickey and Minnie Starring in Red Carpet Dreams, Commissary Lane, 10 Minute Wait,name="Mickey's PhilharMagic, Fantasyland, 10 Minute Wait,name="Muppet*Vision 3D, Muppet Courtyard, 10 Minute Wait,name="Pirates of the Caribbean, Adventureland, 10 Minute Wait,name="Primeval Whirl, DinoLand U.S.A., 10 Minute Wait,name="Soarin', Future World, 10 Minute Wait,name="Spaceship Earth, Future World, 10 Minute Wait,name="Star Tours –Adventures Continue, Echo Lake, 10 Minute Wait,name="Toy Story Mania!, Pixar Place, 10 Minute Wait,name="Twilight Zone Tower of Terror™, Sunset Boulevard, 10 Minute Wait,name="Under the Sea ~ Journey ofLittle Mermaid, Fantasyland, 10 Minute Wait,name="Jungle Cruise, Adventureland, 15 Minute Wait,name="Mission: SPACE, Future World, 15 Minute Wait,name="Rock 
'n' Roller Coaster Starring Aerosmith, Sunset Boulevard, 15 Minute Wait,name="Splash Mountain, Frontierland, 15 Minute Wait,name="Astro Orbiter, Tomorrowland, 20 Minute Wait,name="Meet Disney Pals at the Epcot Character Spot, Future World, 20 Minute Wait,name="Monsters, Inc. Laugh Floor, Tomorrowland, 20 Minute Wait,name="Meet Rapunzel and Tiana at Princess Fairytale Hall, Fantasyland, 25 Minute Wait,name="Space Mountain, Tomorrowland, 25 Minute Wait,name="Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,, name="Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,, name="Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,, name="Peter Pan's Flight, Fantasyland, 30 Minute Wait,, name="Test Track, Future World, 30 Minute Wait,]

What I want returned is something like this:

[name="Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,, name="Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,, name="Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,, name="Peter Pan's Flight, Fantasyland, 30 Minute Wait,, name="Test Track, Future World, 30 Minute Wait,]

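Is there any way to get back only what I am looking for?

I do need exact matches on the wait time (I am only using 30 minutes as an example here), because I want to group the entries by wait time (5 Minute Wait, 10 Minute Wait, 15 Minute Wait, etc.) and then sort them so that each group is in alphabetical order. So I am not looking for a generic number in the RegEx; I am being very specific about the wait time. In practice I will need to generate the RegEx from a list of expected wait times, but that is a separate matter and not part of this question.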

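Your problem is that .*? will also run across any other name=" in the input, which makes it match too much.

To prevent that, simply exclude the double quote from the characters the group is allowed to match.

Also, you do not need to capture the whole matched expression. That is done anyway as capturing group 0.

So the regex name="([^"]*?) 30 Minute Wait, will do it.
As a Java string, it should be "name=\"([^\"]*?) 30 Minute Wait,"

See regex101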
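As a minimal sketch of the suggestion above, the following program applies the negated character class to a shortened input. The class name NegatedClassDemo, the helper extract, and the four-entry sample string are illustrative assumptions, not part of the original post; only the regex itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegatedClassDemo {

    // Hypothetical helper: returns the text captured by group 1 for every match.
    static List<String> extract(String input) {
        // [^"]*? cannot cross a double quote, so a match can never
        // start in one name="..." entry and end in a later one.
        Pattern pattern = Pattern.compile("name=\"([^\"]*?) 30 Minute Wait,");
        Matcher matcher = pattern.matcher(input);
        List<String> results = new ArrayList<>();
        while (matcher.find()) {
            // group(1) is the attraction and area without the name=" prefix
            results.add(matcher.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        // Shortened sample in the same format as the question's input
        String input = "name=\"Barnstormer, Fantasyland, 05 Minute Wait,"
                + "name=\"Test Track, Future World, 30 Minute Wait,"
                + "name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,"
                + "name=\"Frozen Ever After, World Showcase, 45 Minute Wait,";
        System.out.println(extract(input));
    }
}
```

Calling matcher.group() instead of matcher.group(1) would return the full match including the name=" prefix and the wait-time suffix, which is what the question's expected output shows.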

Inspired by Andreas

My working regex, in a shorter version: name=[^=]*30 Minute Wait,

See https://regex101.com/r/ijG8Xr/2
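To compare this shorter pattern with the one from the question, here is a small sketch that runs both against the same three-entry sample. The class name ExcludeEqualsDemo, the helper findAll, and the sample entries are made up for illustration. Note that this pattern relies on the names containing no = character, and because it has no space before 30 it would also accept a value such as 130 Minute Wait.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExcludeEqualsDemo {

    // Hypothetical helper: collects the entire match (group 0) for each hit.
    static List<String> findAll(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> results = new ArrayList<>();
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String input = "name=\"A, X, 05 Minute Wait,"
                + "name=\"B, Y, 30 Minute Wait,"
                + "name=\"C, Z, 40 Minute Wait,";

        // Pattern from the question: .*? may run across later name=" markers,
        // so the single match starts at the very first entry.
        System.out.println(findAll("(name=\"(.*?) 30 Minute Wait,)", input));

        // Shorter pattern: [^=]* stops at the '=' of the next name=,
        // so a match cannot span more than one entry.
        System.out.println(findAll("name=[^=]*30 Minute Wait,", input));
    }
}
```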

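The problem with your regex is that you placed a capturing group around the whole expression, and calling group() without an index returns the entire match, including the .* part, which means everything that comes before the wait time.

If you change the regex to "name=\"(.*?) (30 Minute Wait)," and call matcher.group(2), it will return "30 Minute Wait"

See the Javadoc of the group(int) method: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#group-int-

Oh, and you may want to replace "30" in the regex with "\\d+" so that it finds any number, not just 30.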
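Combining numbered groups with \\d+ gives one way to reach the grouping the question describes (by wait time, then alphabetically). The sketch below is only one possible approach under stated assumptions: the class name WaitTimeGrouping, the helper groupByWait, and the sample input are hypothetical, and the pattern is a variation that uses [^"] instead of . and moves the comma before the number outside group 1.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WaitTimeGrouping {

    // Hypothetical helper: maps each wait time in minutes to its entries.
    static Map<Integer, SortedSet<String>> groupByWait(String input) {
        // Group 1: attraction and area; group 2: the number of minutes.
        // Entries without a numeric wait (e.g. "Temporarily Closed") do not match.
        Pattern pattern = Pattern.compile("name=\"([^\"]*?), (\\d+) Minute Wait,");
        Matcher matcher = pattern.matcher(input);

        // TreeMap keeps wait times in ascending numeric order; TreeSet keeps
        // names in natural String order (case-sensitive, uppercase first).
        Map<Integer, SortedSet<String>> byWait = new TreeMap<>();
        while (matcher.find()) {
            int minutes = Integer.parseInt(matcher.group(2)); // "05" parses as 5
            byWait.computeIfAbsent(minutes, k -> new TreeSet<>())
                  .add(matcher.group(1));
        }
        return byWait;
    }

    public static void main(String[] args) {
        String input = "name=\"Test Track, Future World, 30 Minute Wait,"
                + "name=\"Barnstormer, Fantasyland, 05 Minute Wait,"
                + "name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,"
                + "name=\"Haunted Mansion, Liberty Square, Temporarily Closed,";
        groupByWait(input).forEach((minutes, names) ->
                System.out.println(minutes + " Minute Wait: " + names));
    }
}
```

A single pass with \\d+ avoids building a separate regex for each expected wait time. If case-insensitive ordering is wanted, new TreeSet<>(String.CASE_INSENSITIVE_ORDER) could be used in place of the plain TreeSet.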

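You may want to skip the characters in the match that you do not need, otherwise it will match from the start of the string. Try something like .*?(name=\"(.*?) 30 Minute Wait,)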

Thanks to Andreas, this is the correct RegEx, which gets whatever pattern I am searching for and nothing else:

final String regex = "([^\"]*?) 30 Minute Wait,";
final String input1 = "name=\"Barnstormer, Fantasyland, 05 Minute Wait,name=\"Big Thunder Mountain Railroad, Frontierland, 05 Minute Wait,name=\"Celebrity Spotlight, Echo Lake, 05 Minute Wait,name=\"DINOSAUR, DinoLand U.S.A., 05 Minute Wait,name=\"Expedition Everest - Legend of the Forbidden Mountain, Asia, 05 Minute Wait,name=\"Gran Fiesta Tour StarringThree Caballeros, World Showcase, 05 Minute Wait,name=\"Great Movie Ride, Hollywood Boulevard, 05 Minute Wait,name=\"Mad Tea Party, Fantasyland, 05 Minute Wait,name=\"Meet Chewbacca at Star Wars Launch Bay, Animation Courtyard, 05 Minute Wait,name=\"Seas with Nemo & Friends, Future World, 05 Minute Wait,name=\"Star Wars Launch Bay Theater, Animation Courtyard, 05 Minute Wait,name=\"TriceraTop Spin, DinoLand U.S.A., 05 Minute Wait,name=\"Buzz Lightyear's Space Ranger Spin, Tomorrowland, 10 Minute Wait,name=\"Dumbo the Flying Elephant, Fantasyland, 10 Minute Wait,name=\"Encounter Kylo Ren at Star Wars Launch Bay, Animation Courtyard, 10 Minute Wait,name=\"it's a small world, Fantasyland, 10 Minute Wait,name=\"Kilimanjaro Safaris, Africa, 10 Minute Wait,name=\"Magic Carpets of Aladdin, Adventureland, 10 Minute Wait,name=\"Many Adventures of Winnie the Pooh, Fantasyland, 10 Minute Wait,name=\"Mickey and Minnie Starring in Red Carpet Dreams, Commissary Lane, 10 Minute Wait,name=\"Mickey's PhilharMagic, Fantasyland, 10 Minute Wait,name=\"Muppet*Vision 3D, Muppet Courtyard, 10 Minute Wait,name=\"Pirates of the Caribbean, Adventureland, 10 Minute Wait,name=\"Primeval Whirl, DinoLand U.S.A., 10 Minute Wait,name=\"Soarin', Future World, 10 Minute Wait,name=\"Spaceship Earth, Future World, 10 Minute Wait,name=\"Star Tours –Adventures Continue, Echo Lake, 10 Minute Wait,name=\"Toy Story Mania!, Pixar Place, 10 Minute Wait,name=\"Twilight Zone Tower of Terror™, Sunset Boulevard, 10 Minute Wait,name=\"Under the Sea ~ Journey ofLittle Mermaid, Fantasyland, 10 Minute Wait,name=\"Jungle Cruise, Adventureland, 15 Minute 
Wait,name=\"Mission: SPACE, Future World, 15 Minute Wait,name=\"Rock 'n' Roller Coaster Starring Aerosmith, Sunset Boulevard, 15 Minute Wait,name=\"Splash Mountain, Frontierland, 15 Minute Wait,name=\"Astro Orbiter, Tomorrowland, 20 Minute Wait,name=\"Meet Disney Pals at the Epcot Character Spot, Future World, 20 Minute Wait,name=\"Monsters, Inc. Laugh Floor, Tomorrowland, 20 Minute Wait,name=\"Meet Rapunzel and Tiana at Princess Fairytale Hall, Fantasyland, 25 Minute Wait,name=\"Space Mountain, Tomorrowland, 25 Minute Wait,name=\"Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,name=\"Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,name=\"Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,name=\"Peter Pan's Flight, Fantasyland, 30 Minute Wait,name=\"Test Track, Future World, 30 Minute Wait,name=\"Meet Anna and Elsa at Royal Sommerhus, World Showcase, 40 Minute Wait,name=\"Tomorrowland Speedway, Tomorrowland, 40 Minute Wait,name=\"Frozen Ever After, World Showcase, 45 Minute Wait,name=\"Meet Mickey Mouse at Town Square Theater, Main Street, U.S.A., 55 Minute Wait,name=\"Meet Ariel at Her Grotto, Fantasyland, 65 Minute Wait,name=\"Seven Dwarfs Mine Train, Fantasyland, 80 Minute Wait,name=\"Haunted Mansion, Liberty Square, Temporarily Closed,name=\"Kali River Rapids, Asia, Temporarily Closed\n";

        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(input1);

        // Create a List<String> to store the wait-time matches we find
        List<String> waitTimesSorted = new ArrayList<String>();

        // Loop while the matcher keeps finding wait times in the input string
        while (matcher.find()) {
            // group() with no argument returns the entire match (group 0)
            waitTimesSorted.add(matcher.group());
        }

        // Print our matches to the console
        System.out.println(waitTimesSorted);

This returns the following result, which is exactly what I wanted!

[Enchanted Tales with Belle, Fantasyland, 30 Minute Wait,, Meet Cinderella and Elena at Princess Fairytale Hall, Fantasyland, 30 Minute Wait,, Meet Tinker Bell at Town Square Theater, Main Street, U.S.A., 30 Minute Wait,, Peter Pan's Flight, Fantasyland, 30 Minute Wait,, Test Track, Future World, 30 Minute Wait,]

Thanks again, everyone!
