I am trying to do pattern matching with lists, but for some reason I get an unexpected match when I do the following:
> (define code '(h1 ((id an-id-here)) Some text here))
> (define code-match-expr '(pre ([class brush: python]) ...))
> (match code
[code-match-expr #t]
[_ #f])
#t
Question: Why does code match code-match-expr?
I tried this in the Racket REPL, because I actually want to solve another practical problem: using Pollen's pygments wrapping functions to highlight code, which will be output as HTML later on. For this purpose I wrote the following code, where the problem occurs:
(define (read-post-from-file path)
  (Post-from-content (replace-code-xexprs (parse-markdown path))))
(define (replace-code-xexprs list-of-xexprs)
  ;; define known languages
  (define KNOWN-LANGUAGE-SYMBOLS
    (list 'python
          'racket
          'html
          'css
          'javascript
          'erlang
          'rust))
  ;; check if it matches for a single language's match expression
  ;; if it matches any language, return that language's name as a symbol
  (define (get-matching-language an-xexpr)
    (define (matches-lang-match-expr? an-xexpr lang-symbol)
      (display "XEXPR:") (displayln an-xexpr)
      (match an-xexpr
        [`(pre ([class brush: ,lang-symbol]) (code () ,more ...)) lang-symbol]
        [`(pre ([class brush: ,lang-symbol]) ,more ...) lang-symbol]
        [_ #f]))
    (ormap (lambda (lang-symbol)
             ;; (display "trying to match ")
             ;; (display an-xexpr)
             ;; (display " against ")
             ;; (displayln lang-symbol)
             (matches-lang-match-expr? an-xexpr lang-symbol))
           KNOWN-LANGUAGE-SYMBOLS))
  ;; replace code in an xexpr with highlightable code
  ;; TODO: What happens if the code is in a lower level of the xexpr?
  (define (replace-code-in-single-xexpr an-xexpr)
    (let ([matching-language (get-matching-language an-xexpr)])
      (cond [matching-language (code-highlight an-xexpr matching-language)]
            [else an-xexpr])))
  ;; apply the check to all xexpr
  (map replace-code-in-single-xexpr list-of-xexprs))
(define (code-highlight language code)
  (highlight language code))
In this example I am parsing a markdown file which has the following content:
# Code Demo
```python
def hello():
print("Hello World!")
```
And I get the following xexprs:
1. (h1 ((id code-demo)) Code Demo)
2. (pre ((class brush: python)) (code () def hello():
print("Hello World!")))
However, none of those match for some reason.
match is syntax and does not evaluate the pattern. Since code-match-expr is a symbol, the pattern binds the whole value (the result of evaluating code) to a new variable named code-match-expr and, because the pattern matched, evaluates the body of that clause. The result will always be #t.
Notice that the second pattern, the symbol _, is the same kind of pattern. It also matches the whole value, but _ is special in that it does not get bound the way code-match-expr does.
It's important to note that the variable code-match-expr you defined is never used: because match binds a new variable with the same name, your original binding is shadowed in the body of that match clause.
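To make the shadowing concrete, here is a minimal sketch. The function name always-matches is made up for illustration; the point is only that a bare identifier in pattern position matches any value and binds it, regardless of any outer definition with the same name.

```racket
#lang racket

;; An outer definition; match never consults it.
(define code-match-expr '(pre ([class brush: python])))

;; always-matches is a hypothetical name used only for this illustration.
;; The identifier pattern matches any value and binds it to a fresh variable
;; that shadows the outer code-match-expr inside the clause body.
(define (always-matches v)
  (match v
    [code-match-expr (list 'bound-to code-match-expr)]
    [_ 'unreachable]))

(always-matches 42)         ; => '(bound-to 42)
(always-matches '(h1 "hi")) ; => '(bound-to (h1 "hi"))
```

The second clause is never reached, which is why the original REPL session always printed #t.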
Code that works as you intended might look like:
(define (test code)
  (match code
    [`(pre ([class brush: python]) ,more ...) #t]
    [_ #f]))
(test '(h1 ((id an-id-here)) Some text here))
; ==> #f
(test '(pre ((class brush: python))))
; ==> #t
(test '(pre ((class brush: python)) a b c))
; ==> #t
As you can see, the pattern ,more ... means zero or more elements, and the kind of brackets used is ignored, since in Racket [] is the same as () and {}.
EDIT
You still got it a little backwards. In this code:
(define (matches-lang-match-expr? an-xexpr lang-symbol)
  (display "XEXPR:") (displayln an-xexpr)
  (match an-xexpr
    [`(pre ([class brush: ,lang-symbol]) (code () ,more ...)) lang-symbol]
    [`(pre ([class brush: ,lang-symbol]) ,more ...) lang-symbol]
    [_ #f]))
When a pattern is matched, lang-symbol, being unquoted, matches whatever single element sits at that position and is bound to it as a new variable in that clause. It has nothing to do with the function argument of the same name, because match does not use existing variables, it creates them. You then return that new variable. Thus:
(matches-lang-match-expr? '(pre ([class brush: jiffy]) bla bla bla) 'ignored-argument)
; ==> jiffy
Here is something that does what you want:
(define (get-matching-language an-xexpr)
  (define (get-language an-xexpr)
    (match an-xexpr
      [`(pre ([class brush: ,lang-symbol]) (code () ,more ...)) lang-symbol]
      [`(pre ([class brush: ,lang-symbol]) ,more ...) lang-symbol]
      [_ #f]))
  (let* ((matched-lang-symbol (get-language an-xexpr))
         (in-known-languages (memq matched-lang-symbol KNOWN-LANGUAGE-SYMBOLS)))
    (and in-known-languages (car in-known-languages))))
Again: match repurposes quasiquote for something completely different from creating list structure. It uses it to match literals and to capture the unquoted symbols as variables.
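If the intent really was to compare against the value of an existing variable, as in the original matches-lang-match-expr?, match's == pattern does that. The following is a sketch under that assumption; the name matches-lang? is made up here, and it still assumes the attribute value is written as bare symbols like in the question.

```racket
#lang racket

;; matches-lang? is a hypothetical name for this illustration.
;; Returns lang-symbol when an-xexpr is a pre element tagged with exactly
;; that language, otherwise #f.
(define (matches-lang? an-xexpr lang-symbol)
  (match an-xexpr
    ;; ,(== lang-symbol) compares the element with the argument's value
    ;; using equal?, instead of binding a new variable named lang-symbol.
    [`(pre ([class brush: ,(== lang-symbol)]) ,_ ...) lang-symbol]
    [_ #f]))

(matches-lang? '(pre ([class brush: python]) "x") 'python) ; => 'python
(matches-lang? '(pre ([class brush: jiffy]) "x") 'python)  ; => #f
```

With this version, calling it through ormap over a list of known languages would behave the way the question expected.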
Make sure you're clear about what it is you are matching. In Racket x-expressions, attribute names are symbols but the values are strings. So the expression you're matching would be something like (pre ([class "brush: js"]) ___), not (pre ([class brush: js]) ___).
To match that string and extract the part after "brush: ", you could use a pregexp match pattern. Here is a snippet that Frog uses to extract the language to give to Pygments:
(for/list ([x xs])
  (match x
    [(or `(pre ([class ,brush]) (code () ,(? string? texts) ...))
         `(pre ([class ,brush]) ,(? string? texts) ...))
     (match brush
       [(pregexp "\\s*brush:\\s*(.+?)\\s*$" (list _ lang))
        `(div ([class ,(str "brush: " lang)])
              ,@(pygmentize (apply string-append texts) lang
                            #:python-executable python-executable
                            #:line-numbers? line-numbers?
                            #:css-class css-class))]
       [_ `(pre ,@texts)])]
    [x x])))
(Here pygmentize is a function defined elsewhere in the Frog source code; it's a wrapper around running Pygments as a separate process and piping text to and from it. But you could substitute another way of using Pygments or any other syntax highlighter. That's not relevant to your question about match. I mention it just so that it doesn't become a distraction and another embedded question. :))
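For experimenting with just the matching part, here is a stripped-down sketch that leaves the highlighter out entirely. The names brush->language and xexpr-language are hypothetical and not part of Frog, and the regexp is anchored with ^ here, which the Frog snippet does not do.

```racket
#lang racket

;; brush->language and xexpr-language are hypothetical helpers for this sketch.

;; Extract the language name from a class attribute such as "brush: python".
;; Returns the language as a string, or #f when there is no brush prefix.
(define (brush->language class-value)
  (match class-value
    [(pregexp "^\\s*brush:\\s*(.+?)\\s*$" (list _ lang)) lang]
    [_ #f]))

;; Find the language of a pre element whose attribute value is a string,
;; as it is in real x-expressions. Returns #f for anything else.
(define (xexpr-language x)
  (match x
    [`(pre ([class ,(? string? brush)]) ,_ ...) (brush->language brush)]
    [_ #f]))

(xexpr-language '(pre ([class "brush: python"]) (code () "print(1)"))) ; => "python"
(xexpr-language '(h1 ([id "code-demo"]) "Code Demo"))                  ; => #f
```

The result is a string, so checking it against a list of known languages would need string comparison (for example member rather than memq), or a conversion with string->symbol first.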