
How do I get the first substring after a specific substring in a string?

I have multiple text files to process, and from each one I want to get the version number in the 'banana' package section. Here is one example:

Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

Package: friday
SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312

What I know:

  • Package: [name] is always first line in a section.
  • Not all sections have a Package: [name] line.
  • Package: banana section always has a Version: line.
  • The position of the Version: line within the section varies (it can be the second, fifth, or last line, and so on).
  • The position of the Package: banana section varies. It can be at the start, middle, or end of the document.
  • Version: [number] is always different

I want to find the Version number in banana package section, so 94.3223.2 from the example. I do not want to find it by hardcoded loops line by line, but do it with a nice solution.

I have tried something like this, but unfortunately it doesn't work for every scenario:

firstOperation = textFile.split('Package: banana').pop();
secondOperation = firstOperation.split('\n');
finalString = secondOperation[1].split('Version: ').pop();

My logic would be:

  1. Find Package: banana line
  2. Find the first occurrence of 'Version:' after the Package: banana line, then extract the version number from that line.

This data processing will run in a Node.js endpoint.

To make this slightly more extensible, you can convert it to an object:

function process(input) {
  let data = input.split("\n\n"); // split by double new line
  data = data.map(i => i.split("\n")); // split each section into lines
  data = data.map(i => i.reduce((obj, cur) => {
    const [key, val] = cur.split(": "); // get the key and value
    obj[key.toLowerCase()] = val; // lowercase the key to make it a nice object
    return obj;
  }, {}));
  return data;
}

const input = `Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

Package: friday
SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312`;

const data = process(input);
// query data ("package" is a reserved word in strict mode, so avoid destructuring it)
const { version } = data.find((section) => section.package === "banana");
console.log("Banana version:", version);
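One caveat with the object approach: the sample contains a continuation line (" more description to orange"), and a value could itself contain ": ", which a plain split(": ") would cut short. Below is a minimal sketch of a more tolerant parser. It assumes that continuation lines start with whitespace (as in the sample) and that sections are separated by blank lines. The name parseSections is hypothetical, not part of the answer above.

```javascript
// Hypothetical helper: parse "Key: value" sections separated by blank lines.
function parseSections(input) {
  return input
    .split(/(?:\r?\n){2,}/) // blank line(s) between sections, LF or CRLF
    .map((block) => {
      const section = {};
      let lastKey = null;
      for (const line of block.split(/\r?\n/)) {
        if (/^\s/.test(line) && lastKey !== null) {
          // Assumed convention: an indented line continues the previous value.
          section[lastKey] += "\n" + line.trim();
          continue;
        }
        const sep = line.indexOf(":");
        if (sep === -1) continue; // not a "Key: value" line, skip it
        lastKey = line.slice(0, sep).trim().toLowerCase();
        // Keep everything after the first colon, so values may contain ": ".
        section[lastKey] = line.slice(sep + 1).trim();
      }
      return section;
    });
}

const sample = `Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange`;

const sections = parseSections(sample);
const banana = sections.find((section) => section.package === "banana");
console.log("Banana version:", banana.version); // Banana version: 94.3223.2
```

Because every section becomes a plain object, querying another field or another package needs no extra parsing code.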

This kind of text extraction is always pretty fragile, so let me know whether this works for your real inputs. Anyway, if we split the text on empty lines (which are really just double line breaks, \n\n ) and then split each "paragraph" on \n , we get chunks of lines we can work with.

Then we can just find the chunk that has the banana package, and then inside that chunk, we find the line that contains the version.

Finally, we slice off Version: to get the version text.

const text = `\
Package: apple
Settings: scim
Architecture: amd32
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232

Package: orange
Architecture: bbl64
Version: 14.3223.2
Description: Something descrip
 more description to orange

SHA215: d3d223d3f2ddf2323d3
Person: XCXCS
Size: 2312312312
`;

const chunks = text.split("\n\n").map((p) => p.split("\n"));

const version = chunks
  .find((info) => info.some((line) => line === "Package: banana"))
  .find((line) => line.startsWith("Version: "))
  .slice("Version: ".length);

console.log(version);
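Note that the chained .find(...) calls above throw a TypeError if no chunk contains Package: banana, or if that chunk has no Version: line. Below is a minimal sketch of the same approach wrapped in a reusable function that returns null in those cases. It uses optional chaining, so it assumes Node.js 14 or later. The name getPackageVersion is hypothetical.

```javascript
// Hypothetical helper: version of the named package, or null if not found.
function getPackageVersion(text, packageName) {
  const prefix = "Version: ";
  const versionLine = text
    .split(/(?:\r?\n){2,}/) // sections are separated by blank lines
    .map((section) => section.split(/\r?\n/))
    .find((lines) => lines.includes(`Package: ${packageName}`))
    ?.find((line) => line.startsWith(prefix)); // undefined if no such section
  return versionLine ? versionLine.slice(prefix.length).trim() : null;
}

const sampleText = `Package: apple
Size: 2312312312

Package: banana
Architecture: xsl64
Version: 94.3223.2
Size: 23232`;

console.log(getPackageVersion(sampleText, "banana")); // 94.3223.2
console.log(getPackageVersion(sampleText, "cherry")); // null
```

Returning null lets the endpoint decide how to report a missing package instead of crashing on unexpected input.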


 