Not pushing correct regex match to JSON using javascript

Question

Overview:

I'm using regex to parse a text document and create a JSON document. The document is parsed from console logs.

What seems to happen is that the condition (regex_1_match && regex_2_match) is not working as expected. It seems to match regex_1, then look for anything that fulfills regex_2, and save both in the same array.

const fs = require('fs');
const filename = fs.readFileSync('test.txt').toString();
var regex_1 = /"Course([0-9.])"/g;
var regex_2 = /"(Name)"/g;
var regex_3 = /"(No Name)"/g;
var regex_1_match = filename.match(regex_1);
var regex_2_match = filename.match(regex_2);
var regex_3_match = filename.match(regex_3);

let testJSON = [];

//for each line item
for  (let index = 0; index < filename.length; index++) {
  if(regex_1_match && regex_2_match) {
    testJSON.push({
      Course: regex_1[index],
      Name: regex_2[index]
    });
  }
}
fs.writeFileSync("parsed_test_doc",JSON.stringify(testJSON));

test.txt:

------------ Course1 ------------
------------ foo ------------
------------ Name ------------
------------ Course2 ------------
------------ foo ------------
------------ No Name ------------
------------ Course3 ------------
------------ Name ------------
------------ foo ------------
------------ Course4 ------------
------------ No Name ------------
------------ Course5 ------------
------------ foo ------------
------------ Name ------------

Output:

[{
  "Course": "Course1",
  "Name": "Name"
}, {
  "Course": "Course2",
  "Name": "Name"
}, {
  "Course": "Course3",
  "Name": "Name"
}, {
  "Course": "Course4"
}, {
  "Course": "Course5"
}]

Expected Output:

[{
  "Course": "Course1",
  "Name": "Name"
}, {
  "Course": "Course2"
}, {
  "Course": "Course3",
  "Name": "Name"
}, {
  "Course": "Course4"
}, {
  "Course": "Course5",
  "Name": "Name"
}]

Answer 1

A few notes about the example code

In the current code, the condition regex_1_match && regex_2_match is true whenever both calls to match returned an array (match returns either an array or null). It is evaluated against the whole file rather than per course, which can give you unexpected results.
Note that the variable filename contains the content of the whole file, so index in this loop runs over every character position in that content, not over lines or courses.
Indexing into the regex itself with regex_1[index] does not work this way; perhaps you meant to index into the match result, such as regex_1_match[index].
regex_3_match is never used.
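
To make the first two notes concrete, here is a minimal sketch. The inline sample text and the simplified patterns are assumptions for illustration, not the asker's exact file or regexes.

```javascript
// Hypothetical sample standing in for part of test.txt.
const text = [
  '------------ Course1 ------------',
  '------------ Name ------------',
  '------------ Course2 ------------',
  '------------ No Name ------------',
].join('\n');

// With the g flag, match() returns every match in the whole text, or null.
const courses = text.match(/Course\d+/g);
const names = text.match(/\bName\b/g);

console.log(courses); // [ 'Course1', 'Course2' ]
console.log(names);   // [ 'Name', 'Name' ] (the second one comes from "No Name")

// Both results are non-null arrays, so this condition is true on every pass
// of a loop. It says nothing about whether a particular course has a name.
const bothTruthy = Boolean(courses && names);

// text.length counts characters, not lines or courses.
const charCount = text.length;
const lineCount = text.split('\n').length;
console.log({ bothTruthy, charCount, lineCount });
```

Because the two arrays are built independently, pairing courses[i] with names[i] would attach a Name to Course2 even though that course is marked No Name.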

What you might do is use a single pattern with 2 capture groups, where group 1 captures the course and the optional group 2 captures the first occurrence of Name.

\b(Course\d+)(?:(?!Course\d)[^])*?(?:No Name|(Name)|(?=Course\d))

The pattern matches:

\b A word boundary
(Course\d+) Capture Course and 1+ digits in group 1
(?: Non-capture group
- (?!Course\d)[^] Match any character, including a newline, if Course and a digit are not directly to the right
)*? Close the non-capture group and repeat it zero or more times, non-greedily
(?: Non-capture group for the alternatives
- No Name|(Name)|(?=Course\d) Match No Name, or capture Name in group 2, or assert that Course and a digit are directly to the right, so the match can end when no Name is present
) Close the non-capture group
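
As a rough illustration of how the three alternatives behave, the sketch below runs the pattern on a few one-line inputs. The inputs and the helper name groups are made up for this example.

```javascript
const regex = /\b(Course\d+)(?:(?!Course\d)[^])*?(?:No Name|(Name)|(?=Course\d))/g;

// Hypothetical helper: lists [group 1, group 2] for every match in the text.
const groups = (text) => Array.from(text.matchAll(regex), (m) => [m[1], m[2]]);

// Name is captured in group 2.
const withName = groups('Course1 foo Name');

// No Name is matched, so group 2 stays undefined.
const withNoName = groups('Course2 foo No Name');

// The lookahead ends the Course3 match just before Course4.
const nextCourse = groups('Course3 foo Course4 Name');

// None of the three alternatives can match, so there is no match at all.
const trailing = groups('Course5 foo');

console.log(withName, withNoName, nextCourse, trailing);
```

The last case suggests that a final course followed by neither a Name line, a No Name line, nor another course would be skipped. Whether that matters depends on the input.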


const fs = require('fs');
const filename = fs.readFileSync('test.txt').toString();
const regex = /\b(Course\d+)(?:(?!Course\d)[^])*?(?:No Name|(Name)|(?=Course\d))/g;
// Build one object per match; add Name only when group 2 took part in the match
const result = Array.from(filename.matchAll(regex), m => {
    let res = {"Course": m[1]};
    if (undefined !== m[2]) {
        res["Name"] = m[2];
    }
    return res;
});
console.log(result);

Output

[
  { Course: 'Course1', Name: 'Name' },
  { Course: 'Course2' },
  { Course: 'Course3', Name: 'Name' },
  { Course: 'Course4' },
  { Course: 'Course5', Name: 'Name' }
]

const s = `------------ Course1 ------------
------------ foo ------------
------------ Name ------------
------------ Course2 ------------
------------ foo ------------
------------ No Name ------------
------------ Course3 ------------
------------ Name ------------
------------ foo ------------
------------ Course4 ------------
------------ No Name ------------
------------ Course5 ------------
------------ foo ------------
------------ Name ------------`;
const regex = /\b(Course\d+)(?:(?!Course\d)[^])*?(?:No Name|(Name)|(?=Course\d))/g;
const result = Array.from(s.matchAll(regex), m => {
    let res = { "Course": m[1] };
    if (undefined !== m[2]) {
        res["Name"] = m[2];
    }
    return res;
});
console.log(result);
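
The question writes its result to a file, while the code above only logs it. A minimal sketch of that last step follows. The helper name parseCourses, the inline sample and the temporary output path are assumptions for illustration; in the question the text comes from test.txt and the output file is parsed_test_doc.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Same pattern as in the answer above.
const regex = /\b(Course\d+)(?:(?!Course\d)[^])*?(?:No Name|(Name)|(?=Course\d))/g;

// Hypothetical helper: turns the log text into an array of plain objects.
function parseCourses(text) {
  return Array.from(text.matchAll(regex), (m) => {
    const entry = { Course: m[1] };
    // Group 2 is undefined unless a plain "Name" ended the match.
    if (m[2] !== undefined) entry.Name = m[2];
    return entry;
  });
}

// Inline copy of the sample from the question, so the sketch runs on its own.
const sample = [
  'Course1', 'foo', 'Name',
  'Course2', 'foo', 'No Name',
  'Course3', 'Name', 'foo',
  'Course4', 'No Name',
  'Course5', 'foo', 'Name',
].map((label) => `------------ ${label} ------------`).join('\n');

const parsed = parseCourses(sample);

// Assumed output location; replace with the real target path as needed.
const outFile = path.join(os.tmpdir(), 'parsed_test_doc.json');
fs.writeFileSync(outFile, JSON.stringify(parsed, null, 2));
console.log(`Wrote ${parsed.length} entries to ${outFile}`);
```

Passing 2 as the third argument to JSON.stringify pretty-prints the file; omit it to get the compact form used in the question.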

solution1
0 ACCPTED 2021-08-19 23:24:42