Newest questions tagged [bleu]

Early stopping based on BLEU in FairSeq

My goal is to use BLEU as the early-stopping metric while training a translation model in FairSeq. Following the documentation, I am adding the following ...

Split several sentences in pandas dataframe

I have a pandas dataframe with a column that looks like this. sentences ['This is text.', 'This is another t ...

Which BLEU smoothing function is commonly used for Image Captioning evaluation?

I'm studying and running some experiments in the Image Captioning field, and one thing I can't fully figure out is when I have to evalua ...

Bleu_score in NLTK library

I am new to using the nltk library. I want to find the two most similar strings. In doing so, I used the 'bleu_score' as follows: The output is lik ...

What is the difference between the BLEU score and the average sentence BLEU score?

I'm having a hard time finding the BLEU score for my seq-to-seq model for the task of question generation. My questions are the following: if I use ...

What are the differences between BLEU score and METEOR?

I am trying to understand how machine translation evaluation scores work. I understand what the BLEU score is trying to achieve. ...

I compare two identical sentences with BLEU NLTK and don't get 1.0. Why?

I’m trying to use the BLEU score from NLTK to evaluate machine translation quality. I wanted to check this code with two identical sentences ...

Calculating BLEU and Rouge score as fast as possible

I have around 200 candidate sentences, and for each candidate I want to measure the BLEU score by comparing each sentence with thousands of reference ...

How to calculate BLEU Score without the brevity penalty

How can I ignore the brevity penalty in the BLEU score calculated here? ...

How to match hypothesis with references using BLEU?

In the following, sys contains happy, which is an exact match for the second reference, so why is the BLEU score still zero? It prints If BLEU is ...

Why does SacreBLEU return a zero BLEU score for short sentences?

Why does SacreBLEU need sentences to end with a dot? If I remove the dots, the value is zero. This returns the following: When I remove the ending dots ...

What's the difference between NLTK's BLEU score and SacreBLEU?

I'm curious if anyone is familiar with the difference between using NLTK's BLEU score calculation and the SacreBLEU library. In particular, I'm using ...

Average of BLEU scores on two subsets of data is not the same as overall score

For evaluating a sequence generation model, I'm using BLEU-1 through BLEU-4. I split the test set into two sets and calculated the scores on each set separate ...
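A short illustration of why this can happen: corpus-level BLEU is normally built from n-gram counts pooled over all sentences, so the score on the full set is not, in general, the mean of the scores on its parts. The sketch below is a simplification, not BLEU itself: it uses clipped unigram precision only, with no brevity penalty and no higher-order n-grams, and the sentences and function names are made up for the example.

```python
from collections import Counter


def clipped_unigram_counts(hypothesis, reference):
    """Return (matched, total) unigram counts for one tokenized sentence pair."""
    hyp_counts = Counter(hypothesis)
    ref_counts = Counter(reference)
    # Each hypothesis token is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[token]) for token, count in hyp_counts.items())
    return matched, len(hypothesis)


def pooled_unigram_precision(pairs):
    """Corpus-style precision: sum the counts over all pairs, then divide once."""
    matched = total = 0
    for hypothesis, reference in pairs:
        m, t = clipped_unigram_counts(hypothesis, reference)
        matched += m
        total += t
    return matched / total


# Two hypothetical subsets of a test set: (hypothesis, reference) token lists.
subset_a = [("the cat sat".split(), "the cat sat".split())]
subset_b = [("a dog ran over the hill today".split(),
             "the dog walked up a hill".split())]

score_a = pooled_unigram_precision(subset_a)                # 3 of 3 tokens match
score_b = pooled_unigram_precision(subset_b)                # 4 of 7 tokens match
score_all = pooled_unigram_precision(subset_a + subset_b)   # 7 of 10 tokens match
mean_of_subsets = (score_a + score_b) / 2

print(score_all, mean_of_subsets)
```

The longer subset contributes more tokens to the pooled counts, so it pulls the overall score towards its own value, while a plain average gives both subsets equal weight.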

BLEU - Error N-gram overlaps of lower order

I ran the code below. This is the error: The hypothesis contains 0 counts of 3-gram overlaps. Therefore the BLEU score evaluates to 0, independen ...
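The message quoted here follows from how BLEU combines its n-gram precisions: they are multiplied together as a geometric mean, so a single order with no overlap drives the whole score to zero unless smoothing is applied. The sketch below is a plain-Python illustration of that effect, not the code of any particular library; the sentences and helper names are invented for the example, and the brevity penalty is left out because both sentences have the same length.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against a single reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total


def unsmoothed_geometric_mean(precisions):
    """Geometric mean with uniform weights; zero if any precision is zero."""
    if min(precisions) == 0:
        return 0.0
    return math.exp(sum(math.log(p) for p in precisions) / len(precisions))


hypothesis = "the cat is here".split()
reference = "the cat was there".split()

# Precisions for 1-grams up to 4-grams.
precisions = [modified_precision(hypothesis, reference, n) for n in range(1, 5)]
score = unsmoothed_geometric_mean(precisions)

print(precisions, score)
```

Here the unigram and bigram precisions are nonzero, but no trigram or 4-gram of the hypothesis appears in the reference, so the combined score is zero regardless of the lower-order matches.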

GCP Zero BLEU score

Is it normal that GCP does not compute a BLEU score for a trained Italian translation model? ...

Cannot import name 'evaluate' from 'bleu'

I'm trying to do this: from bleu import evaluate. But I get the following error: ImportError: cannot import name 'evaluate' from 'bleu' (/opt/co ...

Should the BLEU score for subword NMT be calculated on the subwords or should they be joined first?

This wasn't too clear in the papers I've read. When a model is trained on a bilingual corpus that was split into subwords e.g. via Byte-Pair Encoding, ...

NLTK sentence_bleu method 7 gives scores above 1

When using the NLTK sentence_bleu function in combination with SmoothingFunction method 7, the max score is 1.1167470964180197. This while the BLEU sc ...

Is it okay to compare Test BLEU scores between NMT models while using a slightly modified standard test set?

I am using tst2013.en, found here, as my test set to get the Test BLEU score to compare with previous models. However, I have to filter out some s ...

Bleu Score in Model Evaluation Metric

In many seq2seq implementations, I have seen that they use the accuracy metric when compiling the model and the BLEU score only for predictions. Why don't they use Bl ...