Version: | 0.1 |
Date: | 2025-06-02 |
Maintainer: | Stefano Cacciatore <tkcaccia@gmail.com> |
Title: | Analysis of Clinical Data |
Description: | A collection of tools to easily analyze clinical data, including functions for correlation analysis, and statistical testing. The package facilitates the integration of clinical metadata with other omics layers, enabling exploration of quantitative variables. It also includes the utility for frequency matching samples across a dataset based on patient variables. |
Depends: | R (≥ 3.5.0), stats, minerva, Matrix, clinfun, methods |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
SuggestsNote: | No suggestions |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Packaged: | 2025-06-02 17:29:15 UTC; user |
NeedsCompilation: | yes |
Repository: | CRAN |
Author: | Stefano Cacciatore
|
Date/Publication: | 2025-06-04 15:10:05 UTC |
Add analysis results to a clinical object
Description
This function appends the results of an analysis to a clinical object.
Usage
add_analysis(ma, name, x)
Arguments
ma |
A clinical object to which results will be attached. |
name |
The name (string) of the analysis. |
x |
The result object (e.g., data frame, matrix) to attach. |
Value
Returns the updated clinical object with the new analysis appended.
Author(s)
Stefano Cacciatore
Examples
data(prostate)
ma=initialization(prostate[,"Hospital"])
ma=add_analysis(ma,"Gender",prostate[,"Gender"])
ma=add_analysis(ma,"Gleason score",prostate[,"Gleason score"])
ma=add_analysis(ma,"BMI",prostate[,"BMI"])
ma=add_analysis(ma,"Age",prostate[,"Age"])
ma
Convert Character Matrix to Numeric Matrix
Description
Converts a matrix of character or factor entries to a numeric matrix, preserving row and column names.
Usage
as.data.matrix(x)
Arguments
x |
A matrix containing character or factor values that represent numeric entries. |
Details
This function is useful when a matrix has been read as character or factor and needs to be transformed into numeric format for analysis. It preserves both the row and column names of the input matrix.
Value
A numeric matrix of the same dimensions as x
, with the same row and column names.
Author(s)
Stefano Cacciatore
Examples
# Create a character matrix
ma <- matrix(c("1", "2", "3", "4"), ncol = 2)
# Convert to numeric matrix
ma_numeric <- as.data.matrix(ma)
# Print result
ma_numeric
Categorical Information
Description
Summarization of the categorical information.
Usage
categorical.test (name,x,y,total.column=FALSE,...)
Arguments
name |
the name of the feature. |
x |
the information to summarize. |
y |
the classification of the cohort. |
total.column |
option to visualize the total (by default = " |
... |
further arguments to be passed to the function. |
Value
The function returns a table with the summarized information and The p-value computated using the Fisher's test.
Author(s)
Stefano Cacciatore
See Also
correlation.test
,continuous.test
, txtsummary
Examples
data(prostate)
hosp=prostate[,"Hospital"]
gender=prostate[,"Gender"]
GS=prostate[,"Gleason score"]
BMI=prostate[,"BMI"]
age=prostate[,"Age"]
A=categorical.test("Gender",gender,hosp)
B=categorical.test("Gleason score",GS,hosp)
C=continuous.test("BMI",BMI,hosp,digits=2)
D=continuous.test("Age",age,hosp,digits=1)
rbind(A,B,C,D)
Continuous Information
Description
Summarization of the continuous information.
Usage
continuous.test (name,
x,
y,
center = c("median", "mean"),
digits = 3,
scientific = FALSE,
range = c("IQR","95%CI","range","sd"),
logchange = FALSE,
pos=2,
method=c("non-parametric","parametric"),
total.column=FALSE, ...)
Arguments
name |
the name of the feature. |
x |
the information to summarize. |
y |
the classification of the cohort. |
center |
A character string specifying the measure of central tendency to report: either |
digits |
how many significant digits are to be used. |
scientific |
either a logical specifying whether result should be encoded in scientific format. |
range |
the range to be visualized. |
logchange |
either a logical specifying whether log2 of fold change should be visualized. |
pos |
a value indicating the position of range to be visualized. 1 for column, 2 for row. |
method |
a character string indicating which test method is to be computed. "non-parametric" (default), or "parametric". |
total.column |
option to visualize the total (by default = " |
... |
further arguments to be passed to or from methods. |
Value
The function returns a table with the summarized information and the relative p-value. For non-parametric method, if the number of group is equal to two, the p-value is computed using the Wilcoxon rank-sum test, Kruskal-Wallis test otherwise. For parametric method, if the number of group is equal to two, the p-value is computed using the Student's t-Test, ANOVA one-way otherwise.
Author(s)
Stefano Cacciatore
See Also
correlation.test
, categorical.test
, txtsummary
Examples
data(prostate)
hosp=prostate[,"Hospital"]
gender=prostate[,"Gender"]
GS=prostate[,"Gleason score"]
BMI=prostate[,"BMI"]
age=prostate[,"Age"]
A=categorical.test("Gender",gender,hosp)
B=categorical.test("Gleason score",GS,hosp)
C=continuous.test("BMI",BMI,hosp,digits=2)
D=continuous.test("Age",age,hosp,digits=1)
rbind(A,B,C,D)
Continuous Information
Description
Summarization of the continuous information.
Usage
correlation.test (x,
y,
method = c("pearson", "spearman","MINE"),
name=NA,
perm=100, ...)
Arguments
x |
a numeric vector. |
y |
a numeric vector. |
method |
a character string indicating which correlation method is to be computed. "pearson" (default), "spearman", or "MINE". |
name |
the name of the feature. |
perm |
number of permutation needed to estimate the p-value with MINE correlation. |
... |
further arguments to be passed to or from methods. |
Value
The function returns a table with the summarized information.
Author(s)
Stefano Cacciatore
See Also
categorical.test
,continuous.test
, txtsummary
Examples
data(prostate)
correlation.test(prostate[,"Age"],prostate[,"BMI"],name="correlation between Age and BMI")
Frequency Matching
Description
A method to select unbalanced groupd in a cohort.
Usage
frequency_matching (data,label,times=5,seed=1234)
Arguments
data |
a data.frame of data. |
label |
a classification of the groups. |
times |
The ratio between the two groups. |
seed |
a single number for random number generation. |
Value
The function returns a list with 2 items or 4 items (if a test data set is present):
data |
the data after the frequency matching. |
label |
the label after the frequency matching. |
selection |
the rows selected for the frequency matching. |
Author(s)
Stefano Cacciatore
Examples
data(prostate)
hosp=prostate[,"Hospital"]
gender=prostate[,"Gender"]
GS=prostate[,"Gleason score"]
BMI=prostate[,"BMI"]
age=prostate[,"Age"]
A=categorical.test("Gender",gender,hosp)
B=categorical.test("Gleason score",GS,hosp)
C=continuous.test("BMI",BMI,hosp,digits=2)
D=continuous.test("Age",age,hosp,digits=1)
# Analysis without matching
rbind(A,B,C,D)
# The order is important. Right is more important than left in the vector
# So, Ethnicity will be more important than Age
var=c("Age","BMI","Gleason score")
data.categorized=prostate[,var]
# Extract the Age vector
x <- data.categorized[["Age"]]
# Compute quantiles (0%, 25%, 50%, 75%, 100%) with NA handling
breaks <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
# Apply the cut and update the Age column with labeled bins
data.categorized[["Age"]] <- cut(x, breaks = breaks, include.lowest = TRUE)
# Extract the Age vector
x <- data.categorized[["BMI"]]
# Compute quantiles (0%, 25%, 50%, 75%, 100%) with NA handling
breaks <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
# Apply the cut and update the Age column with labeled bins
data.categorized[["BMI"]] <- cut(x, breaks = breaks, include.lowest = TRUE)
times=c(1,1)
names(times)=c("Hospital A","Hospital B")
t=frequency_matching(data.categorized,prostate[,"Hospital"],times=times)
newdata=prostate[t$selection,]
hosp.new=newdata[,"Hospital"]
gender.new=newdata[,"Gender"]
GS.new=newdata[,"Gleason score"]
BMI.new=newdata[,"BMI"]
age.new=newdata[,"Age"]
A=categorical.test("Gender",gender.new,hosp.new)
B=categorical.test("Gleason score",GS.new,hosp.new)
C=continuous.test("BMI",BMI.new,hosp.new,digits=2)
D=continuous.test("Age",age.new,hosp.new,digits=1)
# Analysis with matching
rbind(A,B,C,D)
Initialize a Clinical Table Object
Description
Initializes a new object of class "clinical.table"
, setting up the grouping variable and a flag for whether to include a total column.
Usage
initialization(y, total.column = FALSE)
Arguments
y |
A numeric or factor vector representing the outcome or grouping variable. |
total.column |
Logical; whether to include a total column in the results (default is |
Value
Returns an S4 object of class "clinical.table"
.
Examples
y <- factor(c("A", "B", "A", "B"))
ma <- initialization(y)
Intersection of Multiple Vectors
Description
Returns the intersection of all vectors provided as arguments.
Usage
intersect(x, y, ...)
Arguments
x |
A vector. |
y |
A vector. |
... |
Additional vectors to include in the intersection. |
Details
This function computes the intersection of two or more vectors by identifying elements that are present in all of them. Unlike the base R intersect
, which only compares two vectors, this version can take multiple vectors as input.
Value
A character vector (or NULL
if no intersection exists) containing the elements that are common to all input vectors.
Author(s)
Stefano Cacciatore
See Also
Examples
x <- c("a", "b", "c")
y <- c("b", "c", "d")
z <- c("c", "b", "e")
intersect(x, y, z) # Returns "b" and "c"
intersect(x, y, c("f", "g")) # Returns NULL
Continuous Information
Description
Summarization of the continuous information.
Usage
multi_analysis (data,
y,
FUN=c("continuous.test","correlation.test"), ...)
Arguments
data |
the matrix containing the continuous values. Each row corresponds to a different sample. Each column corresponds to a different variable. |
y |
the classification of the cohort. |
FUN |
function to be considered. Choices are " |
... |
further arguments to be passed to or from methods. |
Value
The function returns a table with the summarized information. If the number of group is equal to two, the p-value is computed using the Wilcoxon rank-sum test, Kruskal-Wallis test otherwise.
Author(s)
Stefano Cacciatore
See Also
categorical.test
,continuous.test
,correlation.test
, txtsummary
Examples
data(prostate)
multi_analysis(prostate[,c("BMI","Age")],prostate[,"Hospital"],FUN="continuous.test")
Clinical Data of a Cohort of Prostate Cancer Patiens
Description
The data belong to a cohort of 35 patients with prostate cancer from two different hospitals.
Usage
data(prostate)
Value
The data.frame "prostate
" with the following elements: "Hospital
", "Gender
", "Gleason score
", "BMI
", and "Age
".
Examples
data(prostate)
head(prostate)
Median and Coefficient Interval
Description
Summarization of a numeric vector.
Usage
txtsummary(x,
f = c("median", "mean"),
digits = 0,
scientific = FALSE,
range = c("IQR", "95%CI", "range", "sd"))
Arguments
x |
a numeric vector. |
f |
xxx. |
digits |
how many significant digits are to be used. |
scientific |
either a logical specifying whether result should be encoded in scientific format. |
range |
the range to be visualized. |
Value
The function returns the median and the range (interquartile or 95% coefficient interval) of numeric vetor.
Author(s)
Stefano Cacciatore
See Also
categorical.test
,continuous.test
,correlation.test
, txtsummary
Examples
data(prostate)
txtsummary(prostate[,"BMI"])