Yi's Knowledge Basehttp://y1zhou.com/Recent content on Yi's Knowledge BaseHugo -- gohugo.ioen-us2019-{year}Mon, 14 Sep 2020 12:06:41 -0400Introductionhttp://y1zhou.com/series/time-series/time-series-introduction/Fri, 28 Aug 2020 19:05:11 -0400http://y1zhou.com/series/time-series/time-series-introduction/We introduce some basic ideas of time series analysis and stochastic processes. Of particular importance are the concepts of stationarity and the autocovariance and sample autocovariance functions.Matriceshttp://y1zhou.com/series/linear-algebra/linear-algebra-matrices/Wed, 26 Aug 2020 15:14:34 -0400http://y1zhou.com/series/linear-algebra/linear-algebra-matrices/Matrix algebra plays an important role in many areas of statistics, such as linear statistical models and multivariate analysis. In this chapter we introduce basic terminology and some basic matrix operations. We also introduce some basic types of matrices.Estimationhttp://y1zhou.com/series/linear-model/linear-models-estimation/Mon, 30 Sep 2019 13:46:57 -0400http://y1zhou.com/series/linear-model/linear-models-estimation/In this chapter we introduce the concept of linear models. We use the ordinary least squares estimator to get unbiased estimates of the unknown parameters. $R^2$ is introduced as a measure of the goodness of fit, and the different types of sum of squares in a linear model are briefly discussed.Basic Conceptshttp://y1zhou.com/series/maths-stat/1-probability/mathematical-statistics-basic-concepts/Wed, 25 Sep 2019 11:05:06 -0500http://y1zhou.com/series/maths-stat/1-probability/mathematical-statistics-basic-concepts/Introducing the concept of the probability of an event. 
Also covers set operations and the sample-point method.Basic Conceptshttp://y1zhou.com/series/nonparam-stat/1-introduction/nonparametric-methods-basic-concepts/Fri, 25 Jan 2019 22:50:34 -0500http://y1zhou.com/series/nonparam-stat/1-introduction/nonparametric-methods-basic-concepts/A brief introduction to what we’re going to discuss in later chapters.Conditional Probabilityhttp://y1zhou.com/series/maths-stat/1-probability/mathematical-statistics-conditional-probability/Thu, 26 Sep 2019 11:51:56 -0500http://y1zhou.com/series/maths-stat/1-probability/mathematical-statistics-conditional-probability/Introducing conditional probability and independence of events. Bayes' rule comes in as well.Fundamentals of Nonparametric Methodshttp://y1zhou.com/series/nonparam-stat/1-introduction/nonparametric-methods-fundamentals/Fri, 25 Jan 2019 22:50:34 -0500http://y1zhou.com/series/nonparam-stat/1-introduction/nonparametric-methods-fundamentals/Some basic tools such as the permutation test and the binomial test. We also introduce order statistics and ranks, which will come in handy in later chapters.Linear Dependence and Independencehttp://y1zhou.com/series/linear-algebra/2-linear-dep-and-indep/Mon, 31 Aug 2020 12:33:24 -0400http://y1zhou.com/series/linear-algebra/2-linear-dep-and-indep/A short piece on linearly dependent and independent sets of vectors.Autoregressive Serieshttp://y1zhou.com/series/time-series/2-arma/time-series-autoregressive-model/Sat, 12 Sep 2020 20:31:05 -0400http://y1zhou.com/series/time-series/2-arma/time-series-autoregressive-model/We talk about autoregressive models of different orders, and introduce their mean, variance, ACF and PACF values. 
Their stationarity is also briefly discussed.Definitions for Discrete Random Variableshttp://y1zhou.com/series/maths-stat/2-discrete-random-variables/mathematical-statistics-discrete-rv-definition/Sun, 06 Oct 2019 10:46:18 -0400http://y1zhou.com/series/maths-stat/2-discrete-random-variables/mathematical-statistics-discrete-rv-definition/The probability mass function, cumulative distribution function, expectation and variance for random variables.Location Inference for Single Sampleshttp://y1zhou.com/series/nonparam-stat/2-single-samples/nonparametric-methods-single-sample-location-inference/Tue, 26 Mar 2019 21:12:45 -0500http://y1zhou.com/series/nonparam-stat/2-single-samples/nonparametric-methods-single-sample-location-inference/The Wilcoxon signed rank test explained.Moving Average Modelhttp://y1zhou.com/series/time-series/2-arma/time-series-moving-average-model/Sat, 12 Sep 2020 20:31:13 -0400http://y1zhou.com/series/time-series/2-arma/time-series-moving-average-model/The mean, variance, ACF and PACF of moving average models. Instead of stationarity, a new property called invertibility is introduced.Common Discrete Random Variableshttp://y1zhou.com/series/maths-stat/2-discrete-random-variables/mathematical-statistics-common-discrete-random-variables/Sun, 06 Oct 2019 10:46:18 -0400http://y1zhou.com/series/maths-stat/2-discrete-random-variables/mathematical-statistics-common-discrete-random-variables/We introduce the binomial (Bernoulli), geometric and Poisson probability distributions and their properties. The properties include their expectations, variances and moment generating functions.Other Single Sample Inferenceshttp://y1zhou.com/series/nonparam-stat/2-single-samples/nonparametric-methods-other-single-sample-inferences/Fri, 26 Apr 2019 23:45:36 -0500http://y1zhou.com/series/nonparam-stat/2-single-samples/nonparametric-methods-other-single-sample-inferences/Explore whether the sample is consistent with a specified distribution at the population level. 
Kolmogorov’s test, the Lilliefors test and the Shapiro-Wilk test are introduced, as well as tests for runs or trends.ARMA Modelhttp://y1zhou.com/series/time-series/2-arma/time-series-arma-model/Sat, 12 Sep 2020 20:31:18 -0400http://y1zhou.com/series/time-series/2-arma/time-series-arma-model/The mean, variance, ACF and PACF of ARMA models. The backshift operator is introduced, and the stationarity and invertibility of the general ARMA(p, q) model are discussed.Model Fitting and Forecastinghttp://y1zhou.com/series/time-series/time-series-model-fitting-and-forecasting/Mon, 14 Sep 2020 12:06:41 -0400http://y1zhou.com/series/time-series/time-series-model-fitting-and-forecasting/The model-building strategy consists of three steps: model specification (identification), model fitting, and model diagnostics.Vector Spacehttp://y1zhou.com/series/linear-algebra/linear-algebra-vector-space/Mon, 31 Aug 2020 13:13:34 -0400http://y1zhou.com/series/linear-algebra/linear-algebra-vector-space/We introduce some basic terminology - vector space, subspace, span, basis, dimension, norm and distance. 
These concepts lay the foundation for future discussions on matrices and matrix properties.Definitions for Continuous Random Variableshttp://y1zhou.com/series/maths-stat/3-continuous-random-variables/mathematical-statistics-continuous-rv-definition/Wed, 25 Sep 2019 10:46:18 -0400http://y1zhou.com/series/maths-stat/3-continuous-random-variables/mathematical-statistics-continuous-rv-definition/The probability density function, cumulative distribution function, expectation and variance for a continuous random variable.Methods for Paired Sampleshttp://y1zhou.com/series/nonparam-stat/3-multiple-samples/nonparametric-methods-paired-samples/Mon, 29 Apr 2019 14:22:47 -0400http://y1zhou.com/series/nonparam-stat/3-multiple-samples/nonparametric-methods-paired-samples/An obvious extension of the one-sample procedures.Common Continuous Random Variableshttp://y1zhou.com/series/maths-stat/3-continuous-random-variables/mathematical-statistics-common-continuous-rvs/Fri, 01 Nov 2019 10:46:18 -0400http://y1zhou.com/series/maths-stat/3-continuous-random-variables/mathematical-statistics-common-continuous-rvs/The uniform distribution, normal distribution, exponential distribution and their properties.Two Independent Sampleshttp://y1zhou.com/series/nonparam-stat/3-multiple-samples/nonparametric-methods-two-independent-samples/Thu, 02 May 2019 12:09:42 -0400http://y1zhou.com/series/nonparam-stat/3-multiple-samples/nonparametric-methods-two-independent-samples/With two independent samples, we may ask about the centrality of the population distribution and see if there’s a shift. 
Wilcoxon-Mann-Whitney is here!Basic Tests for Three or More Sampleshttp://y1zhou.com/series/nonparam-stat/3-multiple-samples/nonparametric-methods-three-or-more-samples/Sat, 04 May 2019 12:09:42 -0400http://y1zhou.com/series/nonparam-stat/3-multiple-samples/nonparametric-methods-three-or-more-samples/Nonparametric analogues of the one-way classification ANOVA and the simplest two-way classifications, namely the Kruskal-Wallis test, the Jonckheere-Terpstra test, and the Friedman test.Multivariate Probability Distributionshttp://y1zhou.com/series/maths-stat/mathematical-statistics-multivariate-probability-distributions/Wed, 06 Nov 2019 09:57:16 -0400http://y1zhou.com/series/maths-stat/mathematical-statistics-multivariate-probability-distributions/Joint probability distributions of two or more random variables defined on the same sample space. Also covers independence, conditional expectation and total expectation.Correlation and Concordancehttp://y1zhou.com/series/nonparam-stat/4-association-analysis/nonparametric-methods-correlation-and-concordance/Sun, 05 May 2019 10:46:18 -0400http://y1zhou.com/series/nonparam-stat/4-association-analysis/nonparametric-methods-correlation-and-concordance/Measures for the strength of relationships between variables (two or more). The Spearman rank correlation coefficient, Kendall’s tau and Kendall’s W are introduced.Categorical Datahttp://y1zhou.com/series/nonparam-stat/4-association-analysis/nonparametric-methods-categorical-data/Mon, 06 May 2019 10:46:18 -0400http://y1zhou.com/series/nonparam-stat/4-association-analysis/nonparametric-methods-categorical-data/Dealing with contingency tables. Fisher’s exact test comes back, together with Chi-squared test and likelihood-ratio test. 
We also talk about testing goodness-of-fit.Functions of Random Variableshttp://y1zhou.com/series/maths-stat/mathematical-statistics-functions-of-random-variables/Sun, 08 Dec 2019 09:57:16 -0400http://y1zhou.com/series/maths-stat/mathematical-statistics-functions-of-random-variables/Finding the distribution of a real-valued function of multiple random variables. There’s the method of distribution functions, transformations and moment generating functions.Bootstraphttp://y1zhou.com/series/nonparam-stat/5-modern-methods/nonparametric-methods-bootstrap/Mon, 06 May 2019 10:46:18 -0400http://y1zhou.com/series/nonparam-stat/5-modern-methods/nonparametric-methods-bootstrap/The procedure and applications of the nonparametric bootstrap.Density Estimationhttp://y1zhou.com/series/nonparam-stat/5-modern-methods/nonparametric-methods-density-estimation/Mon, 06 May 2019 10:46:18 -0400http://y1zhou.com/series/nonparam-stat/5-modern-methods/nonparametric-methods-density-estimation/Wanna know more about histograms and density plots?Modern Nonparametric Regressionhttp://y1zhou.com/series/nonparam-stat/5-modern-methods/nonparametric-methods-modern-nonparametric-regression/Wed, 08 May 2019 10:46:18 -0400http://y1zhou.com/series/nonparam-stat/5-modern-methods/nonparametric-methods-modern-nonparametric-regression/LOWESS, penalized least squares and the cubic spline.Sampling Distribution and Limit Theoremshttp://y1zhou.com/series/maths-stat/mathematical-statistics-sampling-distribution-and-limit-theorems/Sat, 28 Dec 2019 09:57:16 -0400http://y1zhou.com/series/maths-stat/mathematical-statistics-sampling-distribution-and-limit-theorems/We observe a random sample from a probability distribution of interest and want to estimate its properties. 
The CLT also comes into play.Brief Review Before STAT 6520http://y1zhou.com/series/maths-stat/mathematical-statistics-brief-review-before-6520/Wed, 08 Jan 2020 09:57:16 -0400http://y1zhou.com/series/maths-stat/mathematical-statistics-brief-review-before-6520/A brief review of the probability theory and statistics we’ve learnt so far.Bias and Variancehttp://y1zhou.com/series/maths-stat/8-estimation/mathematical-statistics-bias-and-variance/Sat, 25 Jan 2020 10:46:18 -0400http://y1zhou.com/series/maths-stat/8-estimation/mathematical-statistics-bias-and-variance/The bias, variance and mean squared error of an estimator. The efficiency is used to compare two estimators.Consistencyhttp://y1zhou.com/series/maths-stat/8-estimation/mathematical-statistics-consistency/Mon, 27 Jan 2020 10:46:18 -0400http://y1zhou.com/series/maths-stat/8-estimation/mathematical-statistics-consistency/Introducing consistency, a concept about the convergence of estimators. We move from the convergence of non-random number sequences to convergence in probability, then to the consistency of estimators and its properties.The Method of Momentshttp://y1zhou.com/series/maths-stat/8-estimation/mathematical-statistics-method-of-moments/Tue, 28 Jan 2020 10:46:18 -0400http://y1zhou.com/series/maths-stat/8-estimation/mathematical-statistics-method-of-moments/A fairly simple method of constructing estimators that’s not often used now.Maximum Likelihood Estimatorhttp://y1zhou.com/series/maths-stat/9-estimation-under-parametric-models/mathematical-statistics-maximum-likelihood-estimator/Wed, 29 Jan 2020 10:46:18 -0400http://y1zhou.com/series/maths-stat/9-estimation-under-parametric-models/mathematical-statistics-maximum-likelihood-estimator/Under parametric families of distributions, there’s a much better way of constructing estimators - the maximum likelihood estimator.
Sufficiencyhttp://y1zhou.com/series/maths-stat/9-estimation-under-parametric-models/mathematical-statistics-sufficiency/Thu, 30 Jan 2020 10:46:18 -0400http://y1zhou.com/series/maths-stat/9-estimation-under-parametric-models/mathematical-statistics-sufficiency/Introducing sufficient statistics for the inference of parameters. The factorization theorem comes in handy!Optimal Unbiased Estimatorhttp://y1zhou.com/series/maths-stat/9-estimation-under-parametric-models/mathematical-statistics-optimal-unbiased-estimator/Sun, 02 Feb 2020 10:46:18 -0400http://y1zhou.com/series/maths-stat/9-estimation-under-parametric-models/mathematical-statistics-optimal-unbiased-estimator/Introducing the Minimum Variance Unbiased Estimator and the procedure of deriving it.Confidence Intervalshttp://y1zhou.com/series/maths-stat/mathematical-statistics-confidence-intervals/Sat, 08 Feb 2020 09:57:16 -0400http://y1zhou.com/series/maths-stat/mathematical-statistics-confidence-intervals/Confidence intervals and methods of constructing them.Statistical Decisionhttp://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-statistical-decision/Wed, 01 Apr 2020 16:55:50 -0400http://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-statistical-decision/Up till now we’ve made the assumption that the data is generated from a statistical model controlled by some parameter(s). We used estimation to determine a point or a range of possible values of parameters based on the sample. On the other hand, the goal of data analysis is often to help make decisions, which is not directly addressed by estimation.
Drug approval example: suppose a new drug can be approved only with a $\geq 90\%$ effective rate.Statistical Testhttp://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-statistical-test/Wed, 01 Apr 2020 16:55:50 -0400http://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-statistical-test/Here we introduce the elements of a statistical test, namely null and alternative hypotheses, test statistic, rejection region, and type I and type II errors. We then proceed to large-sample Z-tests and some small-sample tests derived from the small-sample CIs.p-valueshttp://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-p-values/Thu, 09 Apr 2020 18:26:34 -0400http://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-p-values/Introducing the definition of p-values, and why they are important in statistical tests.Optimal Testshttp://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-optimal-tests/Tue, 14 Apr 2020 18:26:34 -0400http://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-optimal-tests/Briefly introducing the optimality of a statistical test and showing why it’s a difficult problem to solve.Likelihood Ratio Testhttp://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-likelihood-ratio-test/Sat, 18 Apr 2020 22:15:07 -0400http://y1zhou.com/series/maths-stat/11-hypothesis-testing/mathematical-statistics-likelihood-ratio-test/In the previous section, we considered the situation where we
Test $H_0$: $\theta = \theta_0$ vs. $H_a$: $\theta = \theta_a$ using the rejection rule $\frac{L(\theta_0)}{L(\theta_a)} < k_\alpha$. Test $H_0$: $\theta = \theta_0$ vs. $H_a$: $\theta \in \Theta_a$ (typically one-sided) using the rejection rule $\frac{L(\theta_0)}{L(\theta_a)} < k_\alpha$ if it does not depend on $\theta \in \Theta_a$. Beyond these situations, there are many other cases, such as
What if $H_0: \theta \in \Theta_0$ is composite?Linear Modelshttp://y1zhou.com/series/maths-stat/mathematical-statistics-linear-models/Tue, 21 Apr 2020 13:05:20 -0400http://y1zhou.com/series/maths-stat/mathematical-statistics-linear-models/So far we’ve finished the main materials of this course - estimation and hypothesis testing. The starting point of all the statistical analyses is really modeling. In other words, we assume that our data are generated by some random mechanism, specifically we’ve been focusing on i.i.d. samples from a fixed population distribution.
Although this assumption can be regarded as reasonable for many applications, in practice there are other scenarios where this doesn’t make sense, e.Abouthttp://y1zhou.com/about/Tue, 30 Jun 2020 00:00:00 +0000http://y1zhou.com/about/{ "name": "Yi", "job": "PhD Student", "field": "Bioinformatics", "interests": [ "cancer", "systems biology", "metabolic reprogramming", "NLP", "data visualization" ], "skills": ["R", "Python", "linux"], "personality": "ENTP-A" }Co-expression based cancer staging and applicationhttp://y1zhou.com/publications/yu-2020-coexpression/Tue, 30 Jun 2020 00:00:00 +0000http://y1zhou.com/publications/yu-2020-coexpression/A novel method is developed for predicting the stage of a cancer tissue based on the consistency level between the co-expression patterns in the given sample and samples in a specific stage. The basis for the prediction method is that cancer samples of the same stage share common functionalities as reflected by the co-expression patterns, which are distinct from samples in the other stages. Test results reveal that our prediction results are as good as, or potentially better than, manually annotated stages by cancer pathologists.Metabolic Reprogramming in Cancer: the bridge that connects intracellular stresses and cancer behaviorshttp://y1zhou.com/publications/yi-2020-nsr-perspective/Thu, 30 Apr 2020 00:00:00 +0000http://y1zhou.com/publications/yi-2020-nsr-perspective/We outline in this perspective a novel framework for cancer study from the angle of stress-induced metabolic reprogramming. The driving question is: what may dictate the same or highly similar evolutionary trajectory across different cancers, consisting of cell proliferation, drug resistance, migration and metastasis? We have observed that cancer and cancer-forming cells are under a persistent intracellular alkaline stress, due to chronic inflammation and local iron overload. 
A wide range of reprogrammed metabolisms (RMs) are induced to keep the intracellular pH within a livable range for survival.Install Dependencies for Puppeteer on Manjaro Linuxhttp://y1zhou.com/posts/manjaro-puppeteer/Mon, 13 Apr 2020 21:31:56 -0400http://y1zhou.com/posts/manjaro-puppeteer/Elucidation of Functional Roles of Sialic Acids in Cancer Migrationhttp://y1zhou.com/publications/sun-2020-sialic-acid/Tue, 31 Mar 2020 00:00:00 +0000http://y1zhou.com/publications/sun-2020-sialic-acid/Sialic acids (SA), negatively charged nine-carbon sugars, have been implicated in cancer metastasis since the 1960s, but their detailed functional roles remain elusive. We present a computational analysis of transcriptomic data of cancer vs. control tissues of eight types in TCGA, aiming to elucidate the possible reason for the increased production and utilization of SAs in cancer and their possible driving roles in cancer migration. Our analyses have revealed for all cancer types:Automatic and Interpretable Model for Periodontitis Diagnosis in Panoramic Radiographshttp://y1zhou.com/publications/li-2020-miccai/Sat, 14 Mar 2020 00:00:00 +0000http://y1zhou.com/publications/li-2020-miccai/Periodontitis is a prevalent and irreversible chronic inflammatory disease in both developed and developing countries, and affects about 20% - 50% of the global population. A tool for automatically diagnosing periodontitis is in high demand for screening at-risk people, and early detection could prevent the onset of tooth loss, especially in local community and health care settings with limited dental professionals. 
In the medical field, doctors need to understand and trust the decisions made by computational models and proposing interpretable machine learning models is crucial for disease diagnosis.Neural Functions Play Different Roles in Triple Negative Breast Cancer (TNBC) and non-TNBChttp://y1zhou.com/publications/tan-2020-neural/Thu, 20 Feb 2020 00:00:00 +0000http://y1zhou.com/publications/tan-2020-neural/Triple negative breast cancer (TNBC) represents the most malignant subtype of breast cancer, and yet our understanding about its unique biology remains elusive. We have conducted a comparative computational analysis of transcriptomic data of TNBC and non-TNBC (NTNBC) tissue samples from the TCGA database, focused on genes involved in neural functions. Our main discoveries are:
While both subtypes involve neural functions, TNBC has substantially more up-regulated neural genes than NTNBC, suggesting that TNBC is more complex than NTNBC; Non-neural functions related to cell-microenvironment interactions and intracellular damage processing are key inducers of the neural genes in both TNBC and NTNBC, but the inducer-responder relationships are different in the two cancer subtypes; Key neural functions such as neural crest formation are predicted to enhance adaptive immunity in TNBC while glia development, along with a few other neural functions, induce both innate and adaptive immunity in NTNBC.Metabolic Reprogramming in Cancer is Induced to Increase Proton Productionhttp://y1zhou.com/publications/sun-2020-metabolic/Mon, 13 Jan 2020 00:00:00 +0000http://y1zhou.com/publications/sun-2020-metabolic/Considerable metabolic reprogramming has been observed in a conserved manner across multiple cancer types, but their true causes remain elusive. We present an analysis of around 50 such reprogrammed metabolisms (RMs) including the Warburg effect, nucleotide de novo synthesis and sialic acid biosynthesis in cancer.
Analyses of the biochemical reactions conducted by these RMs, coupled with gene expression data of their catalyzing enzymes, in 7,011 tissues of 14 cancer types, revealed that all RMs produce more H+ than their original metabolisms.Transcription regulation by DNA methylation under stressful conditions in human cancerhttp://y1zhou.com/publications/cao-2017-transcription/Thu, 23 Nov 2017 00:00:00 +0000http://y1zhou.com/publications/cao-2017-transcription/We aim to address one question: do cancer vs. normal tissue cells execute their transcription regulation essentially the same or differently, and why? We utilized an integrated computational study of cancer epigenomes and transcriptomes of 10 cancer types, by using penalized linear regression models to evaluate the regulatory effects of DNA methylations on gene expressions.