Run Gene Quality Control (QC) function
run_RefGeneQC.Rd
Usage
run_RefGeneQC(
  ASE_df,
  XCI_ref,
  SNP_DETECTION_DP = 30,
  SNP_DETECTION_MAF = 0.1,
  SAMPLE_NUM_THR = 3,
  HE_allele_cell_number_THR = 50,
  QC_total_allele_THR = 10
)
Arguments
- ASE_df
A dataframe (tibble) containing single-cell allele-specific expression (scASE) data for all samples. This dataframe should have the following columns:
SNP_ID: SNP identifier
POS: Genomic position of the SNP (GRCh38)
REF: Reference allele of the SNP (A, T, G, or C)
ALT: Alternative allele of the SNP (A, T, G, or C)
cell_barcode: Cell barcode
REFcount: Allelic expression of the reference allele
ALTcount: Allelic expression of the alternative allele
OTHcount: Allelic expression of the other allele
Sample_ID: Sample ID
Gene: Gene annotated to the SNP
- XCI_ref
A dataframe (tibble) containing X chromosome inactivation status. This dataframe should have the following two columns:
Gene: Gene name
XCI_status: XCI status (escape, variable, or inactive)
- SNP_DETECTION_DP
Threshold for the total allele count (depth) of the SNP in the scASE data. SNP–Sample pairs with a total allele count of at least "SNP_DETECTION_DP" are used for the analysis. Default: 30.
- SNP_DETECTION_MAF
Threshold for the minor allele ratio of the scASE data. SNP–Sample pairs with a minor allele ratio between "SNP_DETECTION_MAF" and "1 - SNP_DETECTION_MAF" are used for the analysis. Default: 0.1.
- SAMPLE_NUM_THR
Threshold for the sample size used in the calculation of the ratio of expression from Xi. Genes evaluated in at least "SAMPLE_NUM_THR" samples are used for the calculation of the ratio of expression from Xi. Default: 3.
- HE_allele_cell_number_THR
Threshold for the number of cells expressing reference SNPs. Candidate reference SNPs expressed in at least "HE_allele_cell_number_THR" cells are used for the analysis. Default: 50.
- QC_total_allele_THR
Threshold for the total allele count (depth) of the SNP used for calculating the ratio of expression from Xi. Note that this count is calculated with cells successfully assigned to the group based on the inactivated X chromosome. This filter is applied in the final step of scLinax and differs from "SNP_DETECTION_DP". Default: 10.
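The SNP detection thresholds described above can be illustrated on a toy table. The following is a minimal sketch in base R, not the package's internal implementation: it assumes that depth is the sum of REFcount and ALTcount over all cells of a SNP–Sample pair (OTHcount is ignored here), that the allele ratio is ALT / (REF + ALT), and that both bounds are inclusive. All values are made up, and the helper names `pairs` and `kept_pairs` are hypothetical.

```r
# Toy scASE table with the documented ASE_df columns (made-up values).
ASE_df <- data.frame(
  SNP_ID       = c("snp1", "snp1", "snp2", "snp2", "snp3"),
  POS          = c(1000L, 1000L, 2000L, 2000L, 3000L),
  REF          = c("A", "A", "G", "G", "C"),
  ALT          = c("T", "T", "C", "C", "T"),
  cell_barcode = c("cellA", "cellB", "cellA", "cellB", "cellA"),
  REFcount     = c(12, 8, 30, 5, 2),
  ALTcount     = c(9, 6, 0, 1, 3),
  OTHcount     = c(0, 0, 0, 0, 0),
  Sample_ID    = c("S1", "S1", "S1", "S1", "S1"),
  Gene         = c("GENE1", "GENE1", "GENE2", "GENE2", "GENE3"),
  stringsAsFactors = FALSE
)

SNP_DETECTION_DP  <- 30   # documented default
SNP_DETECTION_MAF <- 0.1  # documented default

# Sum allele counts over cells for each SNP-Sample pair.
pairs <- aggregate(cbind(REFcount, ALTcount) ~ SNP_ID + Sample_ID,
                   data = ASE_df, FUN = sum)

# Assumption: depth = REF + ALT; ratio = ALT / (REF + ALT).
pairs$total     <- pairs$REFcount + pairs$ALTcount
pairs$alt_ratio <- pairs$ALTcount / pairs$total

# Keep pairs that pass both the depth and the allele-ratio thresholds
# (inclusive bounds are an assumption of this sketch).
keep <- pairs$total >= SNP_DETECTION_DP &
  pairs$alt_ratio >= SNP_DETECTION_MAF &
  pairs$alt_ratio <= 1 - SNP_DETECTION_MAF
kept_pairs <- pairs[keep, ]
print(kept_pairs)
```

In this toy table, snp2 has enough depth but an allele ratio below 0.1, and snp3 falls short of the depth threshold, so only snp1 is retained.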
Value
A dataframe (tibble) with the following columns:
Gene: Gene name
Mean_AR_target, SD_AR_target: Mean and standard deviation of the ratio of expression from Xi across other candidate reference genes when the SNPs on the gene were used as references
Mean_AR_reference, SD_AR_reference: Mean and standard deviation of the ratio of expression from Xi for the gene when SNPs on other candidate reference genes were used as references
Mean_Total_allele_target, SD_Total_allele_target: Mean and standard deviation of the total allele count across data points when calculating the ARs defined above for target.
Mean_Total_allele_reference, SD_Total_allele_reference: Mean and standard deviation of the total allele count across data points when calculating the ARs defined above for reference.
Sample_N_target: Number of samples used in calculating the ARs defined above for target.
Sample_N_reference: Number of samples used in calculating the ARs defined above for reference.
Count_target: Number of data points used in calculating the ARs defined above for target.
Count_reference: Number of data points used in calculating the ARs defined above for reference.
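As a rough illustration of working with the returned table, the sketch below checks that a result carries the documented columns and orders genes by Mean_AR_target for inspection. The data frame `qc_res` is a hypothetical stand-in with made-up numbers, not real output of run_RefGeneQC, and sorting by Mean_AR_target is just one possible way to look at the table, not a selection rule prescribed by the package.

```r
# Column names listed in the Value section.
expected_cols <- c(
  "Gene",
  "Mean_AR_target", "SD_AR_target",
  "Mean_AR_reference", "SD_AR_reference",
  "Mean_Total_allele_target", "SD_Total_allele_target",
  "Mean_Total_allele_reference", "SD_Total_allele_reference",
  "Sample_N_target", "Sample_N_reference",
  "Count_target", "Count_reference"
)

# Hypothetical stand-in for the return value (all numbers are made up).
qc_res <- data.frame(
  Gene                        = c("GENE1", "GENE2", "GENE3"),
  Mean_AR_target              = c(0.05, 0.20, 0.08),
  SD_AR_target                = c(0.02, 0.10, 0.03),
  Mean_AR_reference           = c(0.04, 0.25, 0.07),
  SD_AR_reference             = c(0.01, 0.12, 0.02),
  Mean_Total_allele_target    = c(120, 40, 85),
  SD_Total_allele_target      = c(30, 15, 20),
  Mean_Total_allele_reference = c(110, 35, 90),
  SD_Total_allele_reference   = c(25, 10, 22),
  Sample_N_target             = c(5, 3, 4),
  Sample_N_reference          = c(5, 3, 4),
  Count_target                = c(20, 6, 12),
  Count_reference             = c(18, 5, 11),
  stringsAsFactors = FALSE
)

# Fail early if any documented column is absent.
missing_cols <- setdiff(expected_cols, colnames(qc_res))
stopifnot(length(missing_cols) == 0)

# Order genes by the mean ratio obtained when their SNPs serve as references.
qc_sorted <- qc_res[order(qc_res$Mean_AR_target), ]
print(qc_sorted[, c("Gene", "Mean_AR_target", "SD_AR_target", "Sample_N_target")])
```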