| Title: | Metrics for Assessing the Quality of Generated Text |
|---|---|
| Description: | Implementation of the BLEU-Score in 'C++' to evaluate the quality of generated text. The BLEU-Score, introduced by Papineni et al. (2002) <doi:10.3115/1073083.1073135>, is a metric for evaluating the quality of generated text. It is based on the n-gram overlap between the generated text and reference texts. Additionally, the package provides some smoothing methods as described in Chen and Cherry (2014) <doi:10.3115/v1/W14-3346>. |
| Authors: | Philipp Koch [aut, cre] |
| Maintainer: | Philipp Koch <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.1.2 |
| Built: | 2026-05-16 08:17:35 UTC |
| Source: | https://github.com/lazerlambda/sacrebleu |
'bleu_corpus' applies tokenization based on the 'tok' library and computes the BLEU score for a corpus. An already initialized tokenizer can be provided using the 'tokenizer' argument, or a valid Hugging Face identifier (string) can be passed. If only the identifier is given, the tokenizer is newly initialized on every call.
bleu_corpus(references, candidates, tokenizer = "bert-base-cased", n = 4, weights = NULL, smoothing = NULL, epsilon = 0.1, k = 1)
| Argument | Description |
|---|---|
| references | A list of lists of reference sentences, one inner list per candidate. The example passes character strings, e.g. 'list(list("Perfect outcome!", "Excellent!"), list("Not sufficient.", "Horrible."))'. |
| candidates | A list of candidate sentences, e.g. 'list("This is good", "This is not good")'. |
| tokenizer | Either an already initialized 'tok' tokenizer object or a Hugging Face identifier (default is 'bert-base-cased'). |
| n | Maximum n-gram order for the BLEU score (default is 4). |
| weights | Weights for the n-gram orders (default is 1/n for each entry). |
| smoothing | Smoothing method for the BLEU score (default is 'standard'; 'floor' and 'add-k' are also available). |
| epsilon | Epsilon value for epsilon ('floor') smoothing (default is 0.1). |
| k | K value for add-k smoothing (default is 1). |
The BLEU score for the candidate corpus.
cand_corpus <- list("This is good", "This is not good")
ref_corpus <- list(list("Perfect outcome!", "Excellent!"), list("Not sufficient.", "Horrible."))
tok <- tok::tokenizer$from_pretrained("bert-base-uncased")
bleu_corpus_standard <- bleu_corpus(ref_corpus, cand_corpus, tok)
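To make the string-to-integer step concrete, the following base R sketch maps the example sentences to integer vectors. It is only a stand-in: `simple_encode` is a hypothetical helper that splits on whitespace, whereas the 'tok' tokenizer used by the package produces subword IDs from a pretrained vocabulary.

```r
# Stand-in for a real subword tokenizer: split on whitespace and map each
# distinct lower-cased word to an integer ID. This is NOT what 'tok' does;
# it only illustrates the string -> integer-vector step.
simple_encode <- function(sentences, vocab) {
  lapply(sentences, function(s) {
    words <- strsplit(tolower(s), "[[:space:]]+")[[1]]
    match(words, vocab)
  })
}

cand_corpus <- list("This is good", "This is not good")
ref_corpus <- list(
  list("Perfect outcome!", "Excellent!"),
  list("Not sufficient.", "Horrible.")
)

# Build one shared vocabulary from every sentence in the example.
all_text <- c(unlist(cand_corpus), unlist(ref_corpus))
vocab <- unique(unlist(strsplit(tolower(all_text), "[[:space:]]+")))

# One integer vector per candidate, one list of integer vectors per candidate
# for the references.
cand_ids <- simple_encode(cand_corpus, vocab)
ref_ids <- lapply(ref_corpus, simple_encode, vocab = vocab)
```

The resulting `cand_ids` and `ref_ids` have the nested shape that the integer-based functions document for their 'candidates' and 'references' arguments.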
'bleu_corpus_ids' computes the BLEU score for a corpus of candidate sentences and their respective reference sentences. The sentences must be tokenized beforehand so that they are represented as integer vectors. Akin to 'sacrebleu' ('Python'), the function supports different smoothing methods: epsilon-smoothing and add-k-smoothing. Epsilon-smoothing is equivalent to 'floor' smoothing in the 'sacrebleu' implementation. The smoothing techniques are described in Chen and Cherry (2014) (https://aclanthology.org/W14-3346/).
bleu_corpus_ids(references, candidates, n = 4, weights = NULL, smoothing = NULL, epsilon = 0.1, k = 1)
| Argument | Description |
|---|---|
| references | A list of lists of reference sentences, one inner list per candidate ('list(list(c(1,2,...)), list(c(3,5,...)))'). |
| candidates | A list of candidate sentences ('list(c(1,2,...), c(3,5,...))'). |
| n | Maximum n-gram order for the BLEU score (default is 4). |
| weights | Weights for the n-gram orders (default is 1/n for each entry). |
| smoothing | Smoothing method for the BLEU score (default is 'standard'; 'floor' and 'add-k' are also available). |
| epsilon | Epsilon value for epsilon ('floor') smoothing (default is 0.1). |
| k | K value for add-k smoothing (default is 1). |
The BLEU score for the candidate corpus.
cand_corpus <- list(c(1,2,3), c(1,2))
ref_corpus <- list(list(c(1,2,3), c(2,3,4)), list(c(1,2,6), c(781, 21, 9), c(7, 3)))
bleu_corpus_ids_standard <- bleu_corpus_ids(ref_corpus, cand_corpus)
bleu_corpus_ids_floor <- bleu_corpus_ids(ref_corpus, cand_corpus, smoothing="floor", epsilon=0.01)
bleu_corpus_ids_add_k <- bleu_corpus_ids(ref_corpus, cand_corpus, smoothing="add-k", k=1)
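Corpus-level BLEU as defined by Papineni et al. (2002) pools clipped n-gram matches over all sentence pairs before dividing, rather than averaging per-sentence scores. The base R sketch below illustrates that pooling on the example data. The helpers `ngram_keys` and `clipped_counts` are hypothetical names introduced for illustration; the package's 'C++' implementation is not shown here and may be organized differently.

```r
# Collapse every n-gram of order n into a string key such as "1 2".
ngram_keys <- function(ids, n) {
  if (length(ids) < n) return(character(0))
  starts <- seq_len(length(ids) - n + 1)
  vapply(starts, function(i) paste(ids[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Clipped match count and total count for one candidate and its references.
clipped_counts <- function(candidate, references, n) {
  cand_tab <- table(ngram_keys(candidate, n))
  if (length(cand_tab) == 0) return(c(match = 0, total = 0))
  matched <- 0
  for (key in names(cand_tab)) {
    # Highest number of times this n-gram appears in any single reference.
    ref_max <- max(vapply(references,
                          function(r) sum(ngram_keys(r, n) == key),
                          integer(1)))
    matched <- matched + min(cand_tab[[key]], ref_max)
  }
  c(match = matched, total = sum(cand_tab))
}

cand_corpus <- list(c(1, 2, 3), c(1, 2))
ref_corpus <- list(
  list(c(1, 2, 3), c(2, 3, 4)),
  list(c(1, 2, 6), c(781, 21, 9), c(7, 3))
)

# Pool the bigram counts over all sentence pairs, then divide once.
pooled_bigram <- Reduce(`+`, Map(clipped_counts, cand_corpus, ref_corpus, n = 2))
bigram_precision <- pooled_bigram[["match"]] / pooled_bigram[["total"]]

# Clipping: a candidate that repeats token 1 three times gets credit only
# once, because the reference contains that token only once.
clipped_example <- clipped_counts(c(1, 1, 1), list(c(1, 2)), n = 1)
```

In this example every candidate bigram also occurs in one of its references, so the pooled bigram precision is 3/3.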
'bleu_sentence' applies tokenization based on the 'tok' library and computes the BLEU score for a single candidate sentence. An already initialized tokenizer can be provided using the 'tokenizer' argument, or a valid Hugging Face identifier (string) can be passed. If only the identifier is given, the tokenizer is newly initialized on every call.
bleu_sentence(references, candidate, tokenizer = "bert-base-cased", n = 4, weights = NULL, smoothing = NULL, epsilon = 0.1, k = 1)
| Argument | Description |
|---|---|
| references | A list of reference sentences. |
| candidate | A candidate sentence. |
| tokenizer | Either an already initialized 'tok' tokenizer object or a Hugging Face identifier (default is 'bert-base-cased'). |
| n | Maximum n-gram order for the BLEU score (default is 4). |
| weights | Weights for the n-gram orders (default is 1/n for each entry). |
| smoothing | Smoothing method for the BLEU score (default is 'standard'; 'floor' and 'add-k' are also available). |
| epsilon | Epsilon value for epsilon ('floor') smoothing (default is 0.1). |
| k | K value for add-k smoothing (default is 1). |
The BLEU score for the candidate sentence.
cand <- "Hello World!"
ref <- list("Hello everyone.", "Hello Planet", "Hello World")
tok <- tok::tokenizer$from_pretrained("bert-base-uncased")
bleu_standard <- bleu_sentence(ref, cand, tok)
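The 'weights' argument enters BLEU through a weighted geometric mean of the n-gram precisions, which is then scaled by a brevity penalty (Papineni et al., 2002). The following base R sketch shows that combination step. `combine_precisions` is a hypothetical helper, and choosing the reference length for the penalty (commonly the reference closest in length to the candidate) is left to the caller as an assumption.

```r
# Combine per-order precisions into a BLEU-style score.
# precisions: numeric vector p_1..p_n; cand_len / ref_len: token counts.
combine_precisions <- function(precisions, cand_len, ref_len, weights = NULL) {
  n <- length(precisions)
  # Uniform weights of 1/n, mirroring the documented default.
  if (is.null(weights)) weights <- rep(1 / n, n)
  stopifnot(length(weights) == n)
  # Any zero precision makes the geometric mean zero.
  if (any(precisions <= 0)) return(0)
  # Brevity penalty: 1 for candidates longer than the reference,
  # exp(1 - r/c) otherwise.
  brevity_penalty <- if (cand_len > ref_len) 1 else exp(1 - ref_len / cand_len)
  brevity_penalty * exp(sum(weights * log(precisions)))
}

# Equal lengths: no penalty, plain geometric mean of the two precisions.
score_equal <- combine_precisions(c(0.75, 0.5), cand_len = 4, ref_len = 4)

# A candidate half as long as the reference is penalized by exp(-1).
score_short <- combine_precisions(c(1, 1), cand_len = 2, ref_len = 4)
```

Because the unsmoothed score collapses to zero as soon as one order has no match, short sentences are usually scored with one of the smoothing methods.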
'bleu_sentence_ids' computes the BLEU score for a single candidate sentence and a list of reference sentences. The sentences must be tokenized beforehand so that they are represented as integer vectors. Akin to 'sacrebleu' ('Python'), the function supports different smoothing methods: epsilon-smoothing and add-k-smoothing. Epsilon-smoothing is equivalent to 'floor' smoothing in the 'sacrebleu' implementation. The smoothing techniques are described in Chen and Cherry (2014) (https://aclanthology.org/W14-3346/).
bleu_sentence_ids(references, candidate, n = 4, weights = NULL, smoothing = NULL, epsilon = 0.1, k = 1)
| Argument | Description |
|---|---|
| references | A list of reference sentences. |
| candidate | A candidate sentence. |
| n | Maximum n-gram order for the BLEU score (default is 4). |
| weights | Weights for the n-gram orders (default is 1/n for each entry). |
| smoothing | Smoothing method for the BLEU score (default is 'standard'; 'floor' and 'add-k' are also available). |
| epsilon | Epsilon value for epsilon ('floor') smoothing (default is 0.1). |
| k | K value for add-k smoothing (default is 1). |
The BLEU score for the candidate sentence.
ref_corpus <- list(c(1,2,3,4))
cand_corpus <- c(1,2,3,5)
bleu_standard <- bleu_sentence_ids(ref_corpus, cand_corpus)
bleu_floor <- bleu_sentence_ids(ref_corpus, cand_corpus, smoothing="floor", epsilon=0.01)
bleu_add_k <- bleu_sentence_ids(ref_corpus, cand_corpus, smoothing="add-k", k=1)
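To see why smoothing matters for this example, the sketch below applies the three options to the n-gram counts of candidate 'c(1,2,3,5)' against reference 'c(1,2,3,4)'. The formulas follow one common reading of Chen and Cherry (2014) and of the 'sacrebleu' conventions (floor replaces a zero match count with epsilon; add-k adds k to the match and total counts for orders above 1). `smooth_precisions` is a hypothetical helper, and the package's exact formulas may differ.

```r
# n-gram statistics for candidate c(1,2,3,5) against reference c(1,2,3,4):
# 3 of 4 unigrams, 2 of 3 bigrams, 1 of 2 trigrams and 0 of 1 four-grams match.
matches <- c(3, 2, 1, 0)
totals <- c(4, 3, 2, 1)

smooth_precisions <- function(matches, totals, smoothing = "standard",
                              epsilon = 0.1, k = 1) {
  switch(smoothing,
    # No smoothing: plain ratio, which can be exactly zero.
    standard = matches / totals,
    # Floor: replace a zero match count with epsilon (assumed formulation).
    floor = ifelse(matches == 0, epsilon, matches) / totals,
    # Add-k: add k to numerator and denominator for orders above 1
    # (assumed formulation).
    "add-k" = {
      add <- ifelse(seq_along(matches) > 1, k, 0)
      (matches + add) / (totals + add)
    },
    stop("unknown smoothing method: ", smoothing)
  )
}

p_standard <- smooth_precisions(matches, totals)
p_floor <- smooth_precisions(matches, totals, "floor", epsilon = 0.01)
p_add_k <- smooth_precisions(matches, totals, "add-k", k = 1)
```

Without smoothing the four-gram precision is zero, so the geometric mean over all four orders is zero as well; both smoothing variants keep every precision strictly positive.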
Validate Arguments
validate_arguments(weights, smoothing, n)
| Argument | Description |
|---|---|
| weights | Weight vector for the 'bleu_corpus_ids' and 'bleu_sentence_ids' functions. |
| smoothing | Smoothing method for the 'bleu_corpus_ids' and 'bleu_sentence_ids' functions. |
| n | N-gram order for the 'bleu_corpus_ids' and 'bleu_sentence_ids' functions. |
A list with the validated arguments (weights and smoothing)
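As an illustration of what such a validation step might involve, here is a minimal base R sketch. It is not the package's code: `validate_arguments_sketch` is a hypothetical function, and the specific checks (filling 'NULL' with the documented defaults, requiring one weight per n-gram order, accepting only the three documented smoothing names) are assumptions.

```r
# Hypothetical validator: fills in documented defaults and rejects
# inconsistent input. The checks themselves are assumptions.
validate_arguments_sketch <- function(weights, smoothing, n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 1)
  # Documented default: a weight of 1/n for each n-gram order.
  if (is.null(weights)) weights <- rep(1 / n, n)
  if (!is.numeric(weights) || length(weights) != n) {
    stop("'weights' must be a numeric vector of length n")
  }
  # Documented default: 'standard' (no smoothing).
  if (is.null(smoothing)) smoothing <- "standard"
  if (!smoothing %in% c("standard", "floor", "add-k")) {
    stop("'smoothing' must be one of 'standard', 'floor', 'add-k'")
  }
  list(weights = weights, smoothing = smoothing)
}

validated <- validate_arguments_sketch(weights = NULL, smoothing = NULL, n = 4)
```

Returning the filled-in values in a list matches the documented return shape and lets the calling function use them without repeating the default logic.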
Validate References
validate_references(references, target)
| Argument | Description |
|---|---|
| references | A list of reference sentences. |
| target | A vector of target lengths. |
A boolean value indicating if the references are valid.