How do I get the first substring after a specific substring in a string?

Question

I have multiple text files to process, and I want to get the version number from the 'banana' package section. Here is one example:

Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

Package: friday
SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312

What I know:

Package: [name] is always the first line in a section.
Not all sections have a Package: [name] line.
The Package: banana section always has a Version: line.
The position of the Version: line varies (it can be the second, fifth, or last line of the section).
The position of the Package: banana section varies. It can be at the start, middle, or end of the document.
The Version: [number] value is always different.

I want to find the Version number in the banana package section, so 94.3223.2 in the example. I do not want to find it with hardcoded line-by-line loops, but with a cleaner solution.

I have tried something like this, but unfortunately it doesn't work for every scenario:

firstOperation = textFile.split('Package: banana').pop();
secondOperation = firstOperation.split('\n');
finalString = secondOperation[1].split('Version: ').pop();
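One likely reason the attempt above fails: secondOperation[1] is always the line directly after Package: banana, so the result is only correct when Version: happens to be that line. The sketch below wraps the same three steps in a function (the name attempt and the two short inputs are made up for illustration) to show both outcomes.

```javascript
// Same three steps as the attempt above, wrapped in a hypothetical function.
function attempt(textFile) {
  const firstOperation = textFile.split("Package: banana").pop();
  const secondOperation = firstOperation.split("\n");
  // Index 1 is the line right after "Package: banana", whatever it contains.
  return secondOperation[1].split("Version: ").pop();
}

// Made-up inputs: Version: directly after Package:, and one line later.
const versionSecond = "Package: banana\nVersion: 94.3223.2\nSize: 23232\n";
const versionThird = "Package: banana\nArchitecture: xsl64\nVersion: 94.3223.2\n";

console.log(attempt(versionSecond)); // 94.3223.2
console.log(attempt(versionThird));  // Architecture: xsl64
```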

My logic would be:

Find the Package: banana line.
Find the first occurrence of 'Version:' after the Package: banana line, then extract the version number from that line.
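The two steps above can be sketched with a single regular expression, which avoids an explicit line-by-line loop. This is only one possible approach, not something from the original post: the function name versionAfterPackage is made up, and it assumes that sections are separated by blank lines and that field names start at the beginning of a line.

```javascript
// Hypothetical helper: returns the version string, or null when the package
// or its Version: line is missing.
function versionAfterPackage(text, pkgName) {
  // Escape regex metacharacters so names such as "c++" are matched literally.
  const name = pkgName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `^Package: ${name}\\r?\\n` + // the exact "Package: <name>" line
      `(?:[^\\r\\n]+\\r?\\n)*?` + // skip non-empty lines; a blank line ends the section
      `Version: ([^\\r\\n]+)`, // capture the first Version: value
    "m" // multiline: ^ matches at the start of every line
  );
  const match = pattern.exec(text);
  return match ? match[1].trim() : null;
}

// Shortened version of the example input.
const sample = [
  "Package: apple",
  "Size: 2312312312",
  "",
  "Package: banana",
  "Architecture: xsl64",
  "Version: 94.3223.2",
  "",
  "Package: orange",
  "Version: 14.3223.2",
].join("\n");

console.log(versionAfterPackage(sample, "banana")); // 94.3223.2
console.log(versionAfterPackage(sample, "apple"));  // null
```

Because the skipped lines must be non-empty, the match cannot cross a blank line into the next section, so a package without a Version: line yields null instead of a neighbour's version.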

This data processing will run in a Node.js endpoint.

Answer 1

To make this slightly more extensible, you can convert each section to an object:

function process(input) {
  let data = input.split("\n\n"); // split into sections on double newlines
  data = data.map(i => i.split("\n")); // split each section into lines
  data = data.map(i => i.reduce((obj, cur) => {
    const [key, val] = cur.split(": "); // get the key and value
    obj[key.toLowerCase()] = val; // lowercase the key to make it a nice object
    return obj;
  }, {}));
  return data;
}

const input = `Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

Package: friday
SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312`;

const data = process(input);
// "package" is a reserved word in strict mode, so alias it to pkg
const { version } = data.find(({ package: pkg }) => pkg === "banana"); // query data
console.log("Banana version:", version);
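If the real files contain values with a colon-space inside them (for example "Description: note: x", which split(": ") would truncate to "note"), Windows line endings, or indented continuation lines like the one under Description:, a slightly more defensive parser may help. The following is a sketch under those assumptions, not part of the original answer; parseSections is a made-up name.

```javascript
// Hypothetical variant of process(): one object per blank-line-separated section.
function parseSections(input) {
  return input
    .split(/(?:\r?\n){2,}/) // two or more consecutive newlines end a section
    .filter((block) => block.trim() !== "")
    .map((block) => {
      const section = {};
      let lastKey = null;
      for (const line of block.split(/\r?\n/)) {
        if (/^\s/.test(line) && lastKey !== null) {
          // Indented line: treat it as a continuation of the previous field.
          section[lastKey] += "\n" + line.trim();
          continue;
        }
        const sep = line.indexOf(":"); // split on the first colon only
        if (sep === -1) continue; // ignore lines without a key
        lastKey = line.slice(0, sep).trim().toLowerCase();
        section[lastKey] = line.slice(sep + 1).trim();
      }
      return section;
    });
}

const sections = parseSections(
  "Package: banana\nHomepage: https://example.org\nVersion: 94.3223.2\n"
);
// Optional chaining keeps a missing package from throwing.
console.log(sections.find((s) => s.package === "banana")?.version); // 94.3223.2
```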

Answer 2

These kinds of text extraction are always pretty fragile, so let me know if this works for your real inputs. Anyway, if we split by empty lines (which are really just double line breaks, \n\n), and then split each "paragraph" by \n, we get chunks of lines we can work with.

Then we can find the chunk that contains the banana package, and inside that chunk, find the line that contains the version.

Finally, we slice off the Version: prefix to get the version text.

const text = `\
Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312
`;

const chunks = text.split("\n\n").map((p) => p.split("\n"));

const version = chunks
  .find((info) => info.some((line) => line === "Package: banana"))
  .find((line) => line.startsWith("Version: "))
  .slice("Version: ".length);

console.log(version);
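Note that the chain above throws a TypeError if no chunk contains Package: banana, because the second find is then called on undefined. A guarded variant might look like the sketch below; the function name getVersion is made up, and optional chaining requires Node.js 14 or later.

```javascript
// Hypothetical wrapper around the same chain; returns undefined instead of
// throwing when the package or its Version: line is missing.
function getVersion(text, pkgName) {
  const chunks = text.split("\n\n").map((p) => p.split("\n"));
  const versionLine = chunks
    .find((info) => info.includes(`Package: ${pkgName}`))
    ?.find((line) => line.startsWith("Version: "));
  return versionLine?.slice("Version: ".length);
}

const shortText = "Package: apple\nSize: 1\n\nPackage: banana\nVersion: 94.3223.2\n";
console.log(getVersion(shortText, "banana")); // 94.3223.2
console.log(getVersion(shortText, "kiwi"));   // undefined
```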

Answer 1
2 Accepted 2022-11-15 20:32:25

Answer 2
1 2022-11-15 20:00:36
