How to extract a link from Markdown

Question

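I'm trying to parse an input that may be either a plain hyperlink or a hyperlink in Markdown. I can easily check whether it is a plain hyperlink with ^https?://.+$ and regexp.Match, but Markdown links are a whole different rabbit hole for me.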

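I came across this regex, ^\[([\w\s\d]+)\]\((https?:\/\/[\w\d./?=#]+)\)$, and tried to modify it to match only the Markdown link. Because it captured the last parenthesis for some reason, I've been looking at matching just the second capture group, the link, with things like SubexpNames, FindStringIndex, FindSubmatch, and Split, but none of them seem to capture what I'm looking for (sometimes they return the whole string anyway), or very likely I'm doing it wrong.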

Here is what I'm looking for:

Input - [https://imgur.com/abc](https://imgur.com/bcd)
Should output the link - https://imgur.com/bcd

Here is my code so far: https://play.golang.org/p/OiJE3TvvVb6

Answer 1

You can use regexp.FindStringSubmatch to get the value captured by a regex that validates a single Markdown link:

package main

import (
    "fmt"
    "regexp"
)

func main() {
    markdownRegex := regexp.MustCompile(`^\[[^][]+]\((https?://[^()]+)\)$`)
    results := markdownRegex.FindStringSubmatch("[https://imgur.com/abc](https://imgur.com/bcd)")
    fmt.Printf("%q", results[1])
}

See the Go demo online.
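The question's input may be either a plain URL or a Markdown link, and indexing results[1] panics when nothing matches. A minimal sketch that covers both cases is shown here. The helper name extractURL is hypothetical; the two patterns are the ones from the question and from the answer above.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// plainURL is the plain-hyperlink check from the question.
	plainURL = regexp.MustCompile(`^https?://.+$`)
	// markdownLink is the anchored pattern from the answer.
	markdownLink = regexp.MustCompile(`^\[[^][]+]\((https?://[^()]+)\)$`)
)

// extractURL is a hypothetical helper: it returns the URL carried by the
// input, and false when the input is neither a plain URL nor a Markdown link.
func extractURL(input string) (string, bool) {
	if plainURL.MatchString(input) {
		return input, true
	}
	// FindStringSubmatch returns nil when there is no match, so check
	// before indexing into the result.
	if m := markdownLink.FindStringSubmatch(input); m != nil {
		return m[1], true
	}
	return "", false
}

func main() {
	inputs := []string{
		"https://imgur.com/abc",
		"[https://imgur.com/abc](https://imgur.com/bcd)",
		"not a link",
	}
	for _, in := range inputs {
		url, ok := extractURL(in)
		fmt.Printf("%q -> %q (matched: %v)\n", in, url, ok)
	}
}
```

Returning a boolean alongside the URL lets the caller distinguish "no link found" from an empty result without relying on a panic.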

You may also consider using regexp.FindAllStringSubmatch to find all occurrences of the links you need:

package main

import (
    "fmt"
    "regexp"
)

func main() {
    markdownRegex := regexp.MustCompile(`\[[^][]+]\((https?://[^()]+)\)`)
    results := markdownRegex.FindAllStringSubmatch("[https://imgur.com/abc](https://imgur.com/bcd) and [https://imgur.com/xyy](https://imgur.com/xyz)", -1)
    for _, match := range results {
        fmt.Printf("%q\n", match[1])
    }
}

See the Go demo.
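Since the question mentions SubexpNames, here is a sketch of the same unanchored pattern with named groups, which avoids hard-coding the group number. The group names text and url, the link type, and the findLinks helper are assumptions made for illustration. Regexp.SubexpIndex requires Go 1.15 or later.

```go
package main

import (
	"fmt"
	"regexp"
)

// linkPattern is the unanchored pattern from the answer, with the link text
// and the URL captured in named groups (the names are illustrative).
var linkPattern = regexp.MustCompile(`\[(?P<text>[^][]+)]\((?P<url>https?://[^()]+)\)`)

// link is a hypothetical type holding both parts of a Markdown link.
type link struct {
	Text string
	URL  string
}

// findLinks returns every Markdown link found in s, in order of appearance.
func findLinks(s string) []link {
	textIdx := linkPattern.SubexpIndex("text")
	urlIdx := linkPattern.SubexpIndex("url")

	var links []link
	for _, m := range linkPattern.FindAllStringSubmatch(s, -1) {
		links = append(links, link{Text: m[textIdx], URL: m[urlIdx]})
	}
	return links
}

func main() {
	input := "[https://imgur.com/abc](https://imgur.com/bcd) and [https://imgur.com/xyy](https://imgur.com/xyz)"
	for _, l := range findLinks(input) {
		fmt.Printf("text=%q url=%q\n", l.Text, l.URL)
	}
}
```

Looking groups up by name keeps the extraction code correct if another capture group is later added to the pattern.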

Pattern details:

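\[ - a [ character
[^][]+ - one or more characters other than [ and ]
]\( - the ]( substring
(https?://[^()]+) - Group 1: http, then an optional s, then the :// substring, then one or more characters other than ( and )
\) - a ) character.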

See the online regex demo.
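The breakdown above also implies some limits, which the following sketch probes with a few sample inputs. The inputs and the firstURL helper are assumptions, not part of the original answer. Because Group 1 excludes ( and ), a URL that itself contains parentheses should not match at all, and because it allows spaces, a Markdown link title inside the parentheses ends up in the captured value.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern is the unanchored regex from the answer.
var pattern = regexp.MustCompile(`\[[^][]+]\((https?://[^()]+)\)`)

// firstURL is a hypothetical helper that returns the first captured URL,
// or an empty string when the pattern does not match.
func firstURL(s string) string {
	m := pattern.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	samples := []string{
		// Plain Markdown link: Group 1 is the URL.
		"[image](https://imgur.com/bcd)",
		// URL containing parentheses: [^()]+ stops at the inner "(".
		"[wiki](https://en.wikipedia.org/wiki/Go_(game))",
		// Link with a title: the title is not excluded by [^()]+.
		`[image](https://imgur.com/bcd "a title")`,
		// Empty link text: [^][]+ needs at least one character.
		"[](https://imgur.com/bcd)",
	}
	for _, s := range samples {
		fmt.Printf("%q -> %q\n", s, firstURL(s))
	}
}
```

If inputs like these matter, a Markdown parser is likely a better fit than a regular expression.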
