So I want to do a search and replace in a Markdown document exported from a word processor. Basically, I want to get rid of the numbered references in favor of simpler in-text links that are easier to update, change, and add, while staying kramdown compatible.
I'm stuck on this JS, which matches correctly but doesn't produce the result I expect.
Here is the markdown:
// content is defined somewhere, let's put it in a "content" variable
const content = `What is Lorem Ipsum?
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and this<sup>[[Pubmed]](1)</sup> book. This<sup>[[Microsoft](3)</sup> not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using something<sup>[[Wikipedia]](2)</sup>.
[1]: https://pubmed.com
[2]: https://wikipedia.org
[3]: https://microsoft.com
`
Then running
// define our function to extract it
const extractCitationsFromMarkdown = content => {
// Array of regexes - the first pulls the in-content links (between sup tags), but only for these 3 types: Pubmed|Microsoft|Wikipedia
// second one for matching the references at the bottom
const regexes = [
/\<sup\>\[\[(Pubmed|Microsoft|Wikipedia)\]\]\((\d+)\)<\/sup\>/mg,
/\[(\d+)\]: ([^\s]+)/mg
]
// Extract the matches from the text
const matches = regexes
.map(re => Array.from(content.matchAll(re)))
.map(groups => groups.map(g => g.slice(1)))
// format the results
return matches.at(0)
.map(([reference, referenceNumber ]) =>
([
`[${reference}]`,
matches.at(1).find(group => group.includes(referenceNumber)).at(1),
]).join(': ')
).toString(/\n/)
}
Calling it:
extractCitationsFromMarkdown(content)
Expected MD result:
What is Lorem Ipsum? Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and this<sup>[[Pubmed]](https://pubmed.com)</sup> book. This<sup>[[Microsoft](https://microsoft.com)</sup> not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using something<sup>[[Wikipedia]](https://wikipedia.org)</sup>.
Expected rendered result:
What is Lorem Ipsum? Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and this [Pubmed] book. This [microsoft.com] not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using something [Wikipedia] .
Any help would be much appreciated, been stuck on this for > 1 day.
Thank you
Your function returns a string with matches, but does not perform any replacement.
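To illustrate the difference, here is a minimal sketch with made-up sample data (`text`, `urls`, and `citation` are assumptions for this example, not names from the question). `matchAll` only reports matches, while `replace` with a callback returns a new string with each match rewritten:

```javascript
// Hypothetical sample data, not taken from the question.
const text = "see this<sup>[[Pubmed]](1)</sup> source";
const urls = { 1: "https://pubmed.com" };
const citation = /<sup>\[\[(\w+)\]\]\((\d+)\)<\/sup>/g;

// matchAll only reports what was found; `text` itself is left untouched.
const found = Array.from(text.matchAll(citation), m => m.slice(1));

// replace with a callback builds a new string with each match rewritten.
// If a number has no known URL, the number is kept as-is.
const replaced = text.replace(
  citation,
  (whole, label, num) => `<sup>[[${label}]](${urls[num] ?? num})</sup>`
);

console.log(found);
console.log(replaced);
```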
Note that you have a missing closing square bracket in the input; here:
<sup>[[Microsoft](https://microsoft.com)</sup>
                 ^
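Unbalanced citations like this one are easy to miss by eye. As a rough sketch, a small check can list every `<sup>` citation that does not have the exact shape `<sup>[[Label]](number)</sup>` before any replacement is attempted. The helper name `findMalformedCitations` and the `sample` string are assumptions for this example:

```javascript
// Hypothetical helper: lists <sup>...</sup> citations that do not have the
// exact shape <sup>[[Label]](number)</sup>.
function findMalformedCitations(text) {
  const anySup = /<sup>.*?<\/sup>/g;
  const wellFormed = /^<sup>\[\[[^\[\]]+\]\]\(\d+\)<\/sup>$/;
  return (text.match(anySup) ?? []).filter(tag => !wellFormed.test(tag));
}

const sample =
  "this<sup>[[Pubmed]](1)</sup> book. This<sup>[[Microsoft](3)</sup> not only";

console.log(findMalformedCitations(sample));
```

Run against the input from the question, this should flag only the Microsoft citation.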
I would propose a function that performs the search and replace in three steps: collect the reference numbers that are used in the text, extract the matching URLs from the footer, and insert them inline:
function makeCitationsInline(content) {
    const regex = /(\<sup\>\[\[.*?\]\]\()(\d+)(\)<\/sup\>)/g;
    // Collect the references that are used in the text
    const legend = Object.fromEntries(
        Array.from(content.matchAll(regex), m => [m[2], m[2]])
    );
    // Extract those references from the footer
    return content.replace(/\[(\d+)\]: ([^\s]+)\s*/g, (m, i, url) =>
        (legend[i] &&= url) ? "" : m
    // Insert them inline
    ).replace(regex, (_, pre, i, post) => pre + legend[i] + post);
}

const content = `What is Lorem Ipsum?
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and this<sup>[[Pubmed]](1)</sup> book. This<sup>[[Microsoft]](3)</sup> not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using something<sup>[[Wikipedia]](2)</sup>.
[1]: https://pubmed.com
[2]: https://wikipedia.com
[3]: https://microsoft.com
`;

console.log(makeCitationsInline(content));
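The `&&=` operator used above was added in ES2021, so it may not be available in older runtimes. As an alternative sketch (not the code from this answer), the same three steps can be written with a `Set` and a `Map`. The name `inlineCitations` is an assumption for this example, and unlike the function above it only removes footer entries at the start of a line and leaves citations without a footer entry untouched:

```javascript
// Hypothetical alternative that avoids the &&= operator.
function inlineCitations(text) {
  const citation = /(<sup>\[\[.*?\]\]\()(\d+)(\)<\/sup>)/g;
  // A footer entry: "[n]: url" at the start of a line, plus its line break.
  const footer = /^\[(\d+)\]: (\S+)[^\S\n]*\n?/gm;

  // 1. Collect the reference numbers that are cited in the body.
  const used = new Set(Array.from(text.matchAll(citation), m => m[2]));

  // 2. Read the URLs for those numbers from the footer.
  const urls = new Map();
  for (const [, num, url] of text.matchAll(footer)) {
    if (used.has(num)) urls.set(num, url);
  }

  // 3. Drop the footer lines that were used, then rewrite the citations.
  //    Unused footer lines and citations without a footer entry stay as-is.
  return text
    .replace(footer, (line, num) => (urls.has(num) ? "" : line))
    .replace(citation, (whole, pre, num, post) =>
      urls.has(num) ? pre + urls.get(num) + post : whole);
}

const example =
  "a<sup>[[Pubmed]](1)</sup> b<sup>[[Wikipedia]](2)</sup>\n" +
  "[1]: https://pubmed.com\n" +
  "[2]: https://wikipedia.org\n" +
  "[9]: https://unused.example\n";

console.log(inlineCitations(example));
```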