Type: | Package |
Title: | Metrics for Assessing the Quality of Generated Text |
Version: | 0.2.0 |
Date: | 2025-01-19 |
Description: | Implementation of the BLEU score in 'C++' to evaluate the quality of generated text. The BLEU score, introduced by Papineni et al. (2002) <doi:10.3115/1073083.1073135>, measures the n-gram overlap between generated text and reference texts. Additionally, the package provides several smoothing methods as described in Chen and Cherry (2014) <doi:10.3115/v1/W14-3346>. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
SystemRequirements: | libclang/llvm-config |
Depends: | R (≥ 4.2.0) |
Imports: | checkmate, Rcpp (≥ 1.0.12) |
LinkingTo: | Rcpp |
URL: | https://github.com/LazerLambda/sacRebleu |
BugReports: | https://github.com/LazerLambda/sacRebleu/issues |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), vctrs, withr |
Config/testthat/edition: | 3 |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
Encoding: | UTF-8 |
Language: | en-US |
Config/rextendr/version: | 0.3.1 |
NeedsCompilation: | yes |
Packaged: | 2025-01-21 18:15:29 UTC; philko |
Author: | Philipp Koch [aut, cre] |
Maintainer: | Philipp Koch <PhillKoch@protonmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-01-22 08:10:02 UTC |
sacRebleu: An R package for calculating BLEU scores
Description
This package provides functions for calculating the BLEU score, a common metric for evaluating machine-translation models.
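For reference, BLEU as introduced by Papineni et al. (2002) combines modified n-gram precisions with a brevity penalty. In standard notation (not specific to this package):

\mathrm{BLEU} = \mathrm{BP} \cdot \exp\Big( \sum_{n=1}^{N} w_n \log p_n \Big), \qquad \mathrm{BP} = \min\big(1,\, e^{1 - r/c}\big)

where p_n is the modified n-gram precision of order n, w_n the corresponding weight (uniform 1/N by default, matching the 'weights' argument), c the candidate length, and r the reference length.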
Author(s)
Maintainer: Philipp Koch <PhillKoch@protonmail.com>
See Also
Useful links:
https://github.com/LazerLambda/sacRebleu
Report bugs at https://github.com/LazerLambda/sacRebleu/issues
Computes BLEU score (Papineni et al., 2002).
Description
'bleu_corpus_ids' computes the BLEU score for a corpus and its respective reference sentences. The sentences must be tokenized beforehand so that they are represented as integer vectors. Akin to 'sacrebleu' ('Python'), the function allows the application of different smoothing methods; epsilon- and add-k-smoothing are available. Epsilon-smoothing is equivalent to 'floor' smoothing in the 'sacrebleu' implementation. The different smoothing techniques are described in Chen and Cherry, 2014 (https://aclanthology.org/W14-3346/).
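For intuition, the two smoothing rules can be sketched in plain R. This is an illustrative reimplementation of the rules from Chen and Cherry (2014), not the package's internal C++ code; the helper name 'smooth_precision' is hypothetical.

# Illustrative sketch of the smoothing rules from Chen and Cherry (2014).
# 'matches' and 'total' are the matched and total n-gram counts for one order.
smooth_precision <- function(matches, total, method = "standard",
                             epsilon = 0.1, k = 1) {
  if (method == "floor") {
    # Epsilon-smoothing ('floor'): a zero numerator is replaced by epsilon.
    return(if (matches == 0) epsilon / total else matches / total)
  }
  if (method == "add-k") {
    # Add-k-smoothing: add k to numerator and denominator
    # (Chen and Cherry, 2014 apply this rule for orders n > 1).
    return((matches + k) / (total + k))
  }
  matches / total  # 'standard': unsmoothed modified precision
}
smooth_precision(0, 3, method = "floor")  # 0.1 / 3, instead of a hard zero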
Usage
bleu_corpus_ids(
references,
candidates,
n = 4,
weights = NULL,
smoothing = NULL,
epsilon = 0.1,
k = 1
)
Arguments
references |
A list of lists of reference sentences, one inner list of references per candidate ('list(list(c(1,2,...)), list(c(3,5,...)))'). |
candidates |
A list of candidate sentences ('list(c(1,2,...), c(3,5,...))'). |
n |
Maximum n-gram order for the BLEU score (default is set to 4). |
weights |
Weights for the n-grams (default is set to 1/n for each entry). |
smoothing |
Smoothing method for the BLEU score ('standard' by default; 'floor' and 'add-k' also available). |
epsilon |
Epsilon value for epsilon-smoothing (default is set to 0.1). |
k |
K value for add-k-smoothing (default is set to 1). |
Value
The BLEU score for the corpus.
Examples
cand_corpus <- list(c(1,2,3), c(1,2))
ref_corpus <- list(list(c(1,2,3), c(2,3,4)), list(c(1,2,6), c(781, 21, 9), c(7, 3)))
bleu_corpus_ids_standard <- bleu_corpus_ids(ref_corpus, cand_corpus)
bleu_corpus_ids_floor <- bleu_corpus_ids(ref_corpus, cand_corpus, smoothing="floor", epsilon=0.01)
bleu_corpus_ids_add_k <- bleu_corpus_ids(ref_corpus, cand_corpus, smoothing="add-k", k=1)
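Both functions expect integer IDs, so raw text has to be mapped through a vocabulary first. A minimal sketch of such a preprocessing step follows; the vocabulary construction is illustrative, and any tokenizer that yields integer IDs works just as well.

# Map words to integer IDs via a shared vocabulary (illustrative only).
cand_text <- c("the cat sat", "a dog barked")
ref_text <- c("the cat sat down", "the dog barked")
vocab <- unique(unlist(strsplit(c(cand_text, ref_text), " ")))
to_ids <- function(s) match(strsplit(s, " ")[[1]], vocab)
cands <- lapply(cand_text, to_ids)                     # list of integer vectors
refs <- lapply(ref_text, function(s) list(to_ids(s)))  # one reference list per candidate
bleu_corpus_ids(refs, cands)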
Computes BLEU score (Papineni et al., 2002).
Description
'bleu_sentence_ids' computes the BLEU score for a single candidate sentence and a list of reference sentences. The sentences must be tokenized beforehand so that they are represented as integer vectors. Akin to 'sacrebleu' ('Python'), the function allows the application of different smoothing methods; epsilon- and add-k-smoothing are available. Epsilon-smoothing is equivalent to 'floor' smoothing in the 'sacrebleu' implementation. The different smoothing techniques are described in Chen and Cherry, 2014 (https://aclanthology.org/W14-3346/).
Usage
bleu_sentence_ids(
references,
candidate,
n = 4,
weights = NULL,
smoothing = NULL,
epsilon = 0.1,
k = 1
)
Arguments
references |
A list of reference sentences. |
candidate |
A candidate sentence. |
n |
Maximum n-gram order for the BLEU score (default is set to 4). |
weights |
Weights for the n-grams (default is set to 1/n for each entry). |
smoothing |
Smoothing method for the BLEU score ('standard' by default; 'floor' and 'add-k' also available). |
epsilon |
Epsilon value for epsilon-smoothing (default is set to 0.1). |
k |
K value for add-k-smoothing (default is set to 1). |
Value
The BLEU score for the candidate sentence.
Examples
ref_corpus <- list(c(1,2,3,4))
cand_corpus <- c(1,2,3,5)
bleu_standard <- bleu_sentence_ids(ref_corpus, cand_corpus)
bleu_floor <- bleu_sentence_ids(ref_corpus, cand_corpus, smoothing="floor", epsilon=0.01)
bleu_add_k <- bleu_sentence_ids(ref_corpus, cand_corpus, smoothing="add-k", k=1)
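To see why smoothing matters here, the unsmoothed modified precisions can be computed by hand (independently of the package): the candidate c(1,2,3,5) shares three unigrams, two bigrams and one trigram with the reference c(1,2,3,4), but no 4-gram, so standard BLEU collapses to 0.

# Unigrams 3/4, bigrams 2/3, trigrams 1/2, 4-grams 0/1; the brevity
# penalty is 1 since candidate and reference have equal length.
p <- c(3/4, 2/3, 1/2, 0)
exp(mean(log(p)))  # 0: log(0) = -Inf, so the geometric mean collapses

Replacing the zero with epsilon ('floor') or adding k to both counts ('add-k') yields a small positive score instead.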
Validate Arguments
Description
Validate Arguments
Usage
validate_arguments(weights, smoothing, n)
Arguments
weights |
Weight vector for 'bleu_corpus_ids' and 'bleu_sentence_ids' functions |
smoothing |
Smoothing method for 'bleu_corpus_ids' and 'bleu_sentence_ids' functions |
n |
N-gram for 'bleu_corpus_ids' and 'bleu_sentence_ids' functions |
Value
A list with the validated arguments (weights and smoothing).
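The checks themselves are not documented; a speculative sketch consistent with the documented defaults (uniform weights of 1/n, 'standard' smoothing) is given below. It is an assumption about the behavior, not the package source; 'checkmate' is used because the package imports it.

library(checkmate)
# Speculative sketch; the actual internal implementation may differ.
validate_arguments_sketch <- function(weights, smoothing, n) {
  assert_int(n, lower = 1)
  if (is.null(weights)) weights <- rep(1 / n, n)   # documented default: 1/n each
  assert_numeric(weights, len = n, lower = 0)
  if (is.null(smoothing)) smoothing <- "standard"  # documented default
  assert_choice(smoothing, c("standard", "floor", "add-k"))
  list(weights = weights, smoothing = smoothing)
}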
Validate References
Description
Validate References
Usage
validate_references(references, target)
Arguments
references |
A list of reference sentences. |
target |
A vector of target lengths. |
Value
A boolean value indicating if the references are valid.
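Since 'target' is documented only as a vector of target lengths, the exact check is unclear; the following sketch is purely speculative and shows only the general shape of such a validator.

library(checkmate)
# Purely speculative; 'validate_references_sketch' is a hypothetical stand-in.
validate_references_sketch <- function(references, target) {
  test_list(references, types = "numeric", min.len = 1) &&
    length(references) == length(target)
}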