PHP counter adds an extra increment

I have a simple program that takes text from a form and matches every word against two lexicons: one contains a list of positive words and the other a list of negative words. Each positive match increments $posMatchCount, and each negative match increments $negMatchCount. The two counts are then compared: if there are more positive matches the program returns "Positive", and if there are more negative matches it returns "Negative". It returns "Neutral" if the two counts are equal, or if there are no positive or negative matches at all. Here is the complete code:

<?php
include("positive_lexicon.php");
include("negative_lexicon.php");
?>
<html>
<head>
    <title>Output</title>
</head>
<body>

<h1>Output</h1>  
<hr>
<?php

$preprocessedDoc2 = "i love this phone but hate the battery i adore the screen size";

/////////////////////////////////////////////////////////////////////////////////match doc text with POSITIVE sentiment lexicon

$matchedPosWords = NULL;//contains matched words
$posMatchCount = 0;//count of POS matches

$array1 = explode(' ', $preprocessedDoc2);
foreach($array1 as $word){

    if(preg_match("/\s{$word}\s/", $positiveLexicon)){
        $matchedPosWords = $matchedPosWords . $word . " - ";
        $posMatchCount++;
        $posMatch = true; //for subjectivity check
    }
    else{
        $posMatch= false; //for subjectivity check
    }
}

   echo "Matched POSITIVE words: <br><br>";
   echo "<div style=\"background-color:#66FF66\">";
   echo $matchedPosWords . " (Total: {$posMatchCount})";
   echo "</div>";
   echo "<br><br>";

/////////////////////////////////////////////////////////////////////////////////match doc text with NEGATIVE sentiment lexicon   

$matchedNegWords = NULL;//contains matched words
$negMatchCount = 0;//count of NEG matches

$array2 = explode(' ', $preprocessedDoc2);
foreach($array2 as $word2){

    if(preg_match("/\s{$word2}\s/", $negativeLexicon)){
        $matchedNegWords = $matchedNegWords . $word2 . " - ";
        $negMatchCount++;
        $negMatch = true; //for subjectivity check
    }
    else{
        $negMatch = false; //for subjectivity check
    }
}

   echo "Matched NEGATIVE words: <br><br>";
   echo "<div style=\"background-color:#FF5050\">";
   echo $matchedNegWords . " (Total: {$negMatchCount})";
   echo "</div>";
   echo "<br><br>";

/////////////////////////////////////////////////////////////////////////////////comparison between POSITIVE and NEGATIVE words

echo "analyzing document's sentiment ...<br><br>";

function checkPolarity($posWords, $negWords, $posMatch1, $negMatch1){//function to check polarity of doc


    if((($posMatch1==false) && ($negMatch1==false))||($posWords==$negWords)){
        return "<strong>NEUTRAL</strong>"; //if there are no POS or NEG matches, or matches are equal, return NEUTRAL

    }

    if($posWords > $negWords){
        return "<strong>POSITIVE</strong>"; //if count of POS matches is greater than count of NEG matches, return POSITIVE

    }

    else{
        return "<strong>NEGATIVE</strong>"; //if count of NEG matches is greater than count of POS matches, return NEGATIVE

    }



}

$polarity = checkPolarity($posMatchCount, $negMatchCount, $posMatch, $negMatch); //call function to check polarity   

echo "Polarity of the document is: " . $polarity; //display overall polarity
echo "<br><br>";

$polarity = "";



?>

</body>
</html>

However, it sometimes returns "Neutral" even though there are more positive words than negative words, and sometimes it counts an extra match. For example, the input string "i love this phone but hate the battery i adore the screen size " produces the following:

Matched POSITIVE words:

love - adore - - (Total: 3)


Matched NEGATIVE words:

hate - - (Total: 2)

Even though there are only two positive matches and one negative match, it reports a count of 3 for positive matches and 2 for negative matches. I cannot find the problem myself, but I expect it will be spotted right away on SO, so I will try my luck.

In my opinion the code does not look wrong, but look at the output you posted:

Matched POSITIVE words:

love - adore - - (Total: 3)


Matched NEGATIVE words:

hate - - (Total: 2)

The last entry in both the positive and the negative matches is empty (just a space), which I think is wrong.
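An empty last entry is what you would expect if an empty word reaches the loop. Here is a minimal sketch of how that can happen, assuming the input ends with a space (as the example string in the question does) and that the lexicon string has two whitespace characters in a row somewhere, for example a line break followed by a space. The lexicon content below is made up for illustration, since the real files are not shown.

```php
<?php
// Hypothetical stand-in for $positiveLexicon; the real file is not shown in the question.
// Note the "\n " (two whitespace characters in a row) between entries.
$positiveLexicon = " love\n adore\n good ";

// A trailing space makes explode() return an empty string as the last element.
$words = explode(' ', "adore the screen size ");
$lastWord = end($words);
var_dump($lastWord === ""); // bool(true)

// With an empty $word the pattern collapses to /\s\s/, which matches
// any two adjacent whitespace characters in the lexicon.
$emptyWordMatches = preg_match("/\s{$lastWord}\s/", $positiveLexicon);
var_dump($emptyWordMatches); // int(1)
```

If the real lexicons are laid out like this, the empty word counts as a match in both loops, which would explain the extra " - " and the totals of 3 and 2.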

If you like, change the loop to the following to debug and check:

echo "Foreach for Positive words started <br/>";
foreach($array1 as $word){

    // check for an empty word first, so the pattern is never evaluated as /\s\s/
    if(trim($word) != "" && preg_match("/\s{$word}\s/", $positiveLexicon)){
        echo $word."= <br/>"; // no empty word should be printed here
        $matchedPosWords = $matchedPosWords." - ". $word; // separator goes before the word, so there is no trailing dash
        $posMatchCount++;
        $posMatch = true; //for subjectivity check
    }
    else{
        $posMatch= false; //for subjectivity check
    }
}
echo "Foreach for Positive words Ended <br/>";
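Beyond debugging, two changes would likely address both symptoms described in the question. Dropping empty tokens removes the extra increment. Deriving the "no matches" case from the counts avoids relying on `$posMatch` and `$negMatch`, which the question's loops overwrite on every iteration: a final non-matching word such as "size" leaves both flags false and yields NEUTRAL regardless of the counts. The sketch below assumes the lexicons are whitespace-separated strings. It swaps the per-word regex for an exact lookup, which is a different technique from the original and also avoids problems with regex metacharacters in the input. The helper name `findMatches` and the sample lexicons are hypothetical.

```php
<?php
// Hypothetical lexicons; the real ones come from positive_lexicon.php / negative_lexicon.php.
$positiveLexicon = " love\n adore\n good ";
$negativeLexicon = " hate\n awful\n bad ";

/**
 * Return the words of $text that occur as whole entries in $lexicon.
 * (Illustrative helper, not part of the original post.)
 */
function findMatches(string $text, string $lexicon): array
{
    // Split on runs of whitespace and drop empty tokens.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Index the lexicon entries for exact lookup.
    $entries = array_flip(preg_split('/\s+/', $lexicon, -1, PREG_SPLIT_NO_EMPTY));

    $matched = [];
    foreach ($words as $word) {
        if (isset($entries[$word])) {
            $matched[] = $word;
        }
    }
    return $matched;
}

function checkPolarity(int $posCount, int $negCount): string
{
    // Equal counts (including 0 and 0) are neutral, so no separate flags are needed.
    if ($posCount === $negCount) {
        return "NEUTRAL";
    }
    return $posCount > $negCount ? "POSITIVE" : "NEGATIVE";
}

$doc = "i love this phone but hate the battery i adore the screen size ";
$pos = findMatches($doc, $positiveLexicon);
$neg = findMatches($doc, $negativeLexicon);

echo implode(" - ", $pos) . " (Total: " . count($pos) . ")\n"; // love - adore (Total: 2)
echo implode(" - ", $neg) . " (Total: " . count($neg) . ")\n"; // hate (Total: 1)
echo checkPolarity(count($pos), count($neg)) . "\n";           // POSITIVE
```

Using `implode()` on the list of matches also puts the separators only between words, so there is no leading or trailing dash to clean up.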
