
How do I get the first substring after a specific substring in a string?

I have multiple text files that I want to process to get the version number from the 'banana' package section. Here is one example:

Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

Package: friday
SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312

What I know:

  • When present, Package: [name] is always the first line of its section.
  • Not all sections have a Package: [name] line.
  • The Package: banana section always has a Version: line.
  • The position of the Version: line varies (it can be the second, fifth, or last line of the section).
  • The position of the Package: banana section varies. It can be at the start, middle, or end of the document.
  • The Version: [number] value is always different.

I want to find the version number in the banana package section, so 94.3223.2 in the example. I do not want to find it with hardcoded line-by-line loops; I would prefer a cleaner solution.

I have tried something like this, but unfortunately it doesn't work for every scenario:

firstOperation = textFile.split('Package: banana').pop();
secondOperation = firstOperation.split('\n');
finalString = secondOperation[1].split('Version: ').pop();

My logic would be:

  1. Find the Package: banana line.
  2. Find the first occurrence of 'Version:' after the Package: banana line, then extract the version number from that line.

This data processing will run in a Node.js endpoint.

To make this slightly more extensible, you can convert each section to an object:

function process(input) {
  let data = input.split("\n\n"); // split into sections on blank lines
  data = data.map(i => i.split("\n")); // split each section into lines
  data = data.map(i => i.reduce((obj, cur) => {
    const [key, val] = cur.split(": "); // get the key and value
    obj[key.toLowerCase()] = val; // lowercase the key to make it a nice object
    return obj;
  }, {}));
  return data;
}

const input = `Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

Package: friday
SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312`;

const data = process(input);

// `package` is a reserved word in strict mode, so rename it while destructuring
const { version } = data.find(({ package: name }) => name === "banana"); // query data

console.log("Banana version:", version);
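One caveat with the object approach: the indented continuation line under Description: has no ": " separator, so it ends up as a stray key with an undefined value. Below is a minimal sketch of a variant that folds such lines into the previous field. The name parseSections is hypothetical, and the sketch assumes that a line starting with whitespace continues the previous field, which the sample suggests but the question does not confirm.

```javascript
// Hypothetical helper: parse blank-line-separated sections into plain objects.
// Assumption: a line that starts with whitespace continues the previous field.
function parseSections(input) {
  return input
    .split(/\r?\n\s*\r?\n/) // sections are separated by one or more blank lines
    .map((section) => {
      const record = {};
      let lastKey = null;
      for (const line of section.split(/\r?\n/)) {
        if (/^\s/.test(line) && lastKey !== null) {
          // Continuation line: append it to the previous field's value.
          record[lastKey] += "\n" + line.trim();
          continue;
        }
        const sep = line.indexOf(":");
        if (sep === -1) continue; // skip lines that are not "Key: value"
        lastKey = line.slice(0, sep).trim().toLowerCase();
        record[lastKey] = line.slice(sep + 1).trim();
      }
      return record;
    });
}

const sample = [
  "Package: banana",
  "Version: 94.3223.2",
  "",
  "Package: orange",
  "Description: Something descrip",
  " more description to orange",
].join("\n");

const sections = parseSections(sample);
const banana = sections.find((s) => s.package === "banana");
console.log(banana ? banana.version : undefined); // prints 94.3223.2
```

Splitting on the first colon only also keeps values that themselves contain ": " intact.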

This kind of text extraction is always pretty fragile, so let me know if this works for your real inputs. Anyway, if we split on empty lines (which are really just double line breaks, \n\n), and then split each "paragraph" on \n, we get chunks of lines we can work with.

Then we can find the chunk that contains the banana package, and inside that chunk, find the line that contains the version.

Finally, we slice off the Version: prefix to get the version text.

const text = `\
Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312
`;

const chunks = text.split("\n\n").map((p) => p.split("\n"));

const version = chunks
  .find((info) => info.some((line) => line === "Package: banana"))
  .find((line) => line.startsWith("Version: "))
  .slice("Version: ".length);

console.log(version);
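For comparison, the two-step logic from the question (find the Package: banana line, then take the first Version: line after it) can also be sketched as a single regular expression. This is only an illustration, not code from either answer: the name findVersion is hypothetical, and the sketch assumes that sections are separated by truly empty lines, so the search never crosses into the next section.

```javascript
// Hypothetical helper: return the first "Version:" value inside the named
// package's section, or null if the section or the version line is missing.
// Assumption: sections are separated by completely empty lines.
function findVersion(text, packageName) {
  // Escape regex metacharacters in the package name.
  const escaped = packageName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `^Package: ${escaped}[ \\t]*\\r?\\n` + // step 1: the Package: line
      `(?:.+\\r?\\n)*?` +                  // skip non-empty lines, as few as possible
      `Version:[ \\t]*(.+)$`,              // step 2: first Version: line in the section
    "m"
  );
  const match = text.match(pattern);
  return match ? match[1].trim() : null;
}

const doc = [
  "Package: apple",
  "Settings: scim",
  "Size: 2312312312",
  "",
  "Package: banana",
  "Architecture: xsl64",
  "Version: 94.3223.2",
  "Size: 23232",
  "",
  "Package: orange",
  "Version: 14.3223.2",
].join("\n");

console.log(findVersion(doc, "banana")); // prints 94.3223.2
console.log(findVersion(doc, "apple"));  // prints null
```

Because the skipped lines must be non-empty, a package without a Version: line (apple here) yields null instead of picking up the version of a later section.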

