Recognize if character is wrapped inside quotes or single quotes

Question

I'm trying to fix a problem in beautify-js with colon characters.

It adds a space after every colon, which breaks the output when I write selectors such as:

a:hover 
a::before
::selection
etc

They become:

a: hover 
a: : before
: : selection
etc

So I've added this function, which finds the end of the CSS row starting from the colon that is being analyzed.

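function getrow() {
  // Search from the current position, not from the start of the source.
  var semi = source_text.indexOf(';', pos);
  var brace = source_text.indexOf('{', pos);

  // Treat a missing delimiter as "end of input".
  if (semi === -1) { semi = source_text.length; }
  if (brace === -1) { brace = source_text.length; }

  // Return the text from pos up to and including the nearest delimiter.
  return source_text.substring(pos, Math.min(semi, brace) + 1);
}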

With this, I can just do:

if(getrow().indexOf("{") !== -1){
  output.push(ch);
} else {
  output.push(ch, " ");
}

This runs whenever a colon is being analyzed.
If the row ends with an opening brace, the colon belongs to a selector and doesn't need a space after it.
If not, the row ends with a semicolon, so the colon belongs to a declaration and needs a space after it.

I think this patch works well. The only problem is a case like this:

a:not("[data-test='some;content']") {

Here getrow() finds the semicolon inside the quoted string before it reaches the brace, so it concludes that the colon needs a space after it.

I know it's very much an edge case, but I'd like to fix it.
I think I should check whether the semicolon is wrapped in double or single quotes and, if so, ignore it and keep looking for the next semicolon or brace.

How can I do that?

Answer 1

You can probably hack this in with a regex or some other method, but any of them is still likely to hit unexpected edge cases (there are many different ways to write strings).

This is a look-ahead problem that the beautifier project has been struggling with. The parsers used in that project are great at look ahead/behind, and that is what you really want to do here. Instead of getting the row as text, you want to walk forward along the tokens looking for a ; or { token. This removes the edge case in question, because a string token (with ; in it) is not a ; token.

Depending on the code, you might be able to save the current state, call the tokenizer to walk forward until it encounters one of those tokens, and then pop back to your saved state.
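
As a rough illustration of the walk-forward idea, here is a minimal sketch that scans characters rather than real tokens. It does not use the beautifier's own tokenizer: nextDelimiter and colonNeedsSpace are hypothetical helpers, and the sketch assumes that strings use backslash escapes and that CSS comments do not need handling.

```javascript
// Hypothetical helper (not part of the beautifier): scan forward from `start`
// and return the first ';', '{' or '}' that is not inside a quoted string,
// or null if there is none.
function nextDelimiter(text, start) {
  var quote = null; // the quote character we are currently inside, if any
  for (var i = start; i < text.length; i++) {
    var c = text.charAt(i);
    if (quote !== null) {
      if (c === "\\") {
        i++; // skip the escaped character, e.g. \" or \'
      } else if (c === quote) {
        quote = null; // closing quote
      }
    } else if (c === '"' || c === "'") {
      quote = c; // opening quote
    } else if (c === ";" || c === "{" || c === "}") {
      return c;
    }
  }
  return null;
}

// Hypothetical helper: a colon is treated as part of a selector (no space)
// only when the next delimiter outside of strings is '{'.
function colonNeedsSpace(text, colonPos) {
  return nextDelimiter(text, colonPos + 1) !== "{";
}
```

With the selector from the question, colonNeedsSpace('a:not("[data-test=\'some;content\']") {', 1) returns false, because the semicolon inside the string is skipped. Walking real tokens, as described above, would also cover comments and other cases that this character scan ignores.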

Answer 2

Looks like I've fixed every bug with the code below:

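// Work only on the text from the colon onward, with quoted strings removed,
// so that offsets are not shifted by strings that appear before pos.
// The trailing ";" guarantees that a semicolon is always found.
var text_after_pos = source_text.substr(pos - 1).replace(new RegExp("('|\").*('|\")", "gm"), "") + ";",
    // indexOf returns -1 when nothing is found; substr then yields "" (length 0).
    semicolon = text_after_pos.substr(0, text_after_pos.indexOf(';')).length,
    closed_brace = text_after_pos.substr(0, text_after_pos.indexOf('}')).length,
    open_brace = text_after_pos.substr(0, text_after_pos.indexOf('{')).length,
    // The smaller of the two offsets: the nearest ";" or "}".
    test1 = (semicolon > closed_brace) ? closed_brace : semicolon;

//console.log(text_after_pos);
if (test1 > open_brace && open_brace !== 0) {
    // "{" comes first: the colon is part of a selector, so no space.
    output.push(ch);
} else {
    output.push(ch, " ");
}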

First I remove the text between quotes, then I run the rest of my tests to decide whether the colon needs a space after it or not.

I've tested it with this weird CSS example:

a {
    color: purple
}
::selected {
    color: purple
}
a {
    color: purple
}
a:hover {
    color: purple
}
a {
    color: purple
}
a:not("foobar\";{}omg") {
    content: 'example\';{} text';
    content: "example\";}{ text"
}
a {
    color: purple
}

I can't imagine worse code to try to format.
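
One caveat about the regex used in the code above: the greedy .* removes everything from the first quote on a line to the last one, and it does not pair an opening quote with a closing quote of the same kind. A stricter pattern might look like the sketch below. QUOTED_STRING and stripStrings are hypothetical names, and the sketch assumes that strings use backslash escapes and do not span lines.

```javascript
// Matches one complete double- or single-quoted string, honoring backslash
// escapes such as \" and \'. Assumption: strings do not span lines.
var QUOTED_STRING = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g;

// Hypothetical helper: remove every quoted string from the given text.
function stripStrings(text) {
  return text.replace(QUOTED_STRING, "");
}
```

For example, stripStrings('a[x="1"]:not("y;") {') returns 'a[x=]:not() {', so the text between the two strings survives and only the quoted parts are dropped.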


solution1
0 2013-11-19 22:20:38

solution2
0 2013-11-20 21:32:47
