PSY2013 Lecture 07 Notes

PSY2013 Research Methods in Psychology

Statistics in Psychology


Part 1: Descriptive Statistics 描述性统计

Basic Concepts 基本概念

  1. Variables 变量

    • Dependent Variable (DV) 因变量: The variable that is measured.
    • Independent Variable (IV) 自变量: The variable that is manipulated.
    • Scales 测量尺度: Nominal, Ordinal, Interval, Ratio 名义、顺序、区间、比率
  2. Population and Sample 总体与样本

    • Population 总体: The entire group of individuals or instances about whom we hope to learn.
    • Sample 样本: A subset of the population examined in the study.

Descriptive Statistics 描述性统计

  • Definition 定义

    • Methods for organizing and summarizing data.
    • 组织和总结数据的方法。
    • Examples: Tables, graphs, and descriptive values (e.g., average score).
    • 例如:表格、图表和描述性值(如平均分)。
  • Parameters and Statistics 参数与统计量

    • Parameter 参数: A descriptive value for a population.
    • Statistic 统计量: A descriptive value for a sample.
    • 参数:总体的描述值。
    • 统计量:样本的描述值。

Four Types of Measurement Scales 四种测量尺度

  1. Nominal Scale 名义尺度

    • Unordered set of categories identified only by name.
    • 仅以名称标识的无序类别集合。
    • Example: Gender, ethnicity 性别、种族
  2. Ordinal Scale 顺序尺度

    • Ordered set of categories.
    • 有序类别集合。
    • Example: Class rankings 班级排名
  3. Interval Scale 区间尺度

    • Ordered series of equal-sized categories; arbitrary zero point.
    • 等间距有序系列;零点是任意的。
    • Example: Temperature in Celsius 摄氏温度
  4. Ratio Scale 比率尺度

    • Interval scale with a true zero point.
    • 具有真实零点的区间尺度。
    • Example: Height, weight 身高、体重

Frequency Distributions 频数分布

  • Definition 定义

    • Organized tabulation showing the number of individuals located in each category.
    • 显示每个类别中个体数量的有组织的列表。
    • Example: Frequency table 频数表
  • Types 类型

    • Histograms 直方图: Bars touch, used for interval or ratio scales.
    • 条形接触,用于区间或比率尺度。
    • Polygons 多边形图: Dots connected by lines.
    • 点通过线连接。
    • Bar Graphs 条形图: Bars do not touch, used for nominal or ordinal scales.
    • 条形不接触,用于名义或顺序尺度。
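
As a rough illustration of the frequency table described above, the sketch below tabulates a small set of scores. The `scores` list is hypothetical data invented for this example; it is not from the lecture.

```python
from collections import Counter

# Hypothetical quiz scores (illustrative data only).
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]

# Count how many individuals obtained each score: X -> f.
frequency = Counter(scores)

# Print the table from the highest score down, one row per category.
print(" X   f")
for x in sorted(frequency, reverse=True):
    print(f"{x:2d}  {frequency[x]:2d}")

# The frequencies should add up to the number of individuals.
total = sum(frequency.values())
print("N =", total)
```

The same counts could feed a histogram or a bar graph; which one is appropriate depends on the measurement scale, as listed above.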

Central Tendency 集中趋势

  • Definition 定义
    • Measures that determine a single value to describe the center of a distribution.
    • 确定单个值来描述分布中心的测量。
    • Mean 平均数: Sum of scores divided by the number of scores.
    • 总分数除以分数个数。
      • $$
        \text{Mean} (\mu) = \frac{\sum X}{N}$$
    • Median 中位数: Middle score in a distribution.
    • 分布中的中间分数。
    • Mode 众数: Most frequently occurring score.
    • 出现最频繁的分数。
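
A minimal sketch of the three measures, using Python's standard `statistics` module. The scores are made up for illustration and include one extreme value to show how the mean and median can differ.

```python
import statistics

# Hypothetical scores; the 11 is a deliberately extreme value.
scores = [2, 3, 3, 4, 5, 7, 11]

mean = sum(scores) / len(scores)      # sum of scores / number of scores
median = statistics.median(scores)    # middle score of the sorted list
mode = statistics.mode(scores)        # most frequently occurring score

print("mean:", mean)      # 5.0
print("median:", median)  # 4
print("mode:", mode)      # 3
```

Here the extreme score pulls the mean above the median, which is one reason the median is often reported for skewed distributions.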

Variability 变异性

  • Definition 定义
    • Measure of how spread out the scores are in a distribution.
    • 测量分数在分布中的分散程度。
    • Range 范围: Difference between the highest and lowest scores.
    • 最高分和最低分之间的差。
    • Standard Deviation 标准差: Square root of the variance; roughly the typical (average) distance between each score and the mean.
    • 方差的平方根;大致表示每个分数与平均数之间的典型(平均)距离。
      • $$
        \text{Standard Deviation} (\sigma) = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
    • Variance 方差: Mean of the squared deviations.
    • 平均平方偏差。
      • $$
        \text{Variance} (\sigma^2) = \frac{\sum (X - \mu)^2}{N}$$
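
The population formulas above can be checked by hand on a small data set. The sketch below uses hypothetical scores and cross-checks the result against `statistics.pvariance`, which also divides by N.

```python
import math
import statistics

# Hypothetical population of N = 5 scores.
scores = [1, 9, 5, 8, 7]
N = len(scores)
mu = sum(scores) / N                      # population mean

score_range = max(scores) - min(scores)   # highest minus lowest score
ss = sum((x - mu) ** 2 for x in scores)   # sum of squared deviations
variance = ss / N                         # mean of the squared deviations
sd = math.sqrt(variance)                  # standard deviation

print("range:", score_range)   # 8
print("variance:", variance)   # 8.0
print("sd:", round(sd, 3))     # 2.828

# Cross-check with the standard library's population variance.
assert math.isclose(variance, statistics.pvariance(scores))
```

Note that these are population formulas (dividing by N). For a sample, the usual estimate divides by n - 1 instead, which is what `statistics.variance` does.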

Correlation 相关性

  • Pearson Correlation 皮尔逊相关系数
    • Measures the direction and strength of the linear relationship between two variables.
    • 测量两个变量之间线性关系的方向和强度。
    • Formula 公式:
      • $$
        r = \frac{\sum (X - \overline{X})(Y - \overline{Y})}{\sqrt{\sum (X - \overline{X})^2 \sum (Y - \overline{Y})^2}}$$
    • Range 范围: -1 to 1
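
The formula above translates almost term by term into code. This is a sketch with made-up paired scores; the function name `pearson_r` is just a label chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the two sums of squares.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical paired scores.
X = [0, 10, 4, 8, 8]
Y = [2, 6, 2, 4, 6]

r = pearson_r(X, Y)
print("r =", r)  # 0.875
```

For these data the sum of products is 28 and the two sums of squares are 64 and 16, so r = 28 / 32 = 0.875, a strong positive linear relationship.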

Regression 回归分析

  • Linear Regression 线性回归
    • Determines the equation for the best-fitting line.
    • 确定最佳拟合线的方程。
    • Equation 方程:
      • $$
        Y = bX + a$$
      • $b$: Slope 斜率
      • $a$: Y-intercept 截距
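
The notes give the form of the line but not how b and a are found. A common approach is the least-squares solution, where b is the sum of products of deviations divided by the sum of squared deviations of X, and a = M_Y - b * M_X. The sketch below assumes that method and uses hypothetical data.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for the line Y = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x               # slope
    a = mean_y - b * mean_x     # Y-intercept
    return b, a

# Hypothetical paired scores.
X = [0, 10, 4, 8, 8]
Y = [2, 6, 2, 4, 6]

b, a = fit_line(X, Y)
print("slope b =", b)       # 0.4375
print("intercept a =", a)   # 1.375

# Predicted Y for a new X value.
predicted = b * 6 + a
print("predicted Y at X = 6:", predicted)  # 4.0
```

The fitted line always passes through the point (M_X, M_Y), which is why the prediction at X = 6 (the mean of X) equals the mean of Y.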

Part 2: Probability and Samples 概率与样本

Inferential Statistics 推论统计

  • Definition 定义
    • Methods for using sample data to make general conclusions about populations.
    • 使用样本数据对总体做出一般结论的方法。

z-Scores 标准分数

  • Formula 公式
    • $$
      z = \frac{X - \mu}{\sigma}$$
    • $$
      X = \mu + z\sigma$$
  • Purpose 目的
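    • Specifies the precise location of each X value within a distribution: the sign shows whether X is above or below the mean, and the magnitude gives the distance from the mean in standard deviation units.
    • 指定每个X值在分布中的精确位置:符号表示X高于还是低于平均数,数值表示以标准差为单位的与平均数的距离。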
    • Example 例子
      • Mean $\mu$ = 100, Standard Deviation $\sigma$ = 10, X = 130 → z = 3
      • 平均数 $\mu$ = 100,标准差 $\sigma$ = 10,X = 130 → z = 3
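
The two formulas above are inverses of each other, which a short sketch makes easy to verify. The function names are chosen for this example only.

```python
def z_score(x, mu, sigma):
    """Convert a raw score X to a z-score."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Convert a z-score back to a raw score X."""
    return mu + z * sigma

# The example from the notes: mu = 100, sigma = 10, X = 130.
z = z_score(130, mu=100, sigma=10)
print("z =", z)  # 3.0

# Going back the other way recovers the original score.
x = raw_score(z, mu=100, sigma=10)
print("X =", x)  # 130.0

# A negative z-score marks a score below the mean.
print(raw_score(-1.5, mu=100, sigma=10))  # 85.0
```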

Probability 概率

  • Definition 定义
    • Measure of how likely a specific outcome (for example, a particular sample) is to occur.
    • 衡量特定结果(例如某个特定样本)发生的可能性的度量。
    • Formula 公式:
      • $$
        P(A) = \frac{\text{number of outcomes classified as A}}{\text{total number of possible outcomes}}$$
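
As a small worked instance of the ratio above, the sketch below counts outcomes in a standard 52-card deck. The deck is an illustrative choice, not an example from the lecture, and the calculation assumes every card is equally likely to be drawn.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]

# All possible outcomes of drawing one card.
deck = [(rank, suit) for rank in ranks for suit in suits]

# Outcomes classified as "king" and as "heart".
kings = [card for card in deck if card[0] == "K"]
hearts = [card for card in deck if card[1] == "hearts"]

p_king = Fraction(len(kings), len(deck))
p_heart = Fraction(len(hearts), len(deck))

print("P(king) =", p_king)    # 1/13
print("P(heart) =", p_heart)  # 1/4
```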

Sampling and Probability 抽样与概率

  • Random Sampling 随机抽样

    • Each member of a population has an equal chance of being selected.
    • 总体的每个成员都有同等的被选中机会。
  • Normal Distribution 正态分布

    • Symmetrical, bell-shaped distribution.
    • 对称的钟形分布。
    • Properties 特性
      • Mean = Median = Mode
      • 平均数 = 中位数 = 众数
      • Empirical Rule 经验法则: ~68% within 1 SD, ~95% within 2 SDs, ~99.7% within 3 SDs
      • 约68%在1个标准差内,约95%在2个标准差内,约99.7%在3个标准差内
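
The empirical-rule percentages can be recovered from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later).

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")
```

The three proportions come out at roughly 0.6827, 0.9545, and 0.9973, matching the rounded figures in the rule.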

Central Limit Theorem 中心极限定理

  • Definition 定义
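    • For any population with mean μ and standard deviation σ, the distribution of sample means approaches a normal distribution as the sample size increases; by a common rule of thumb it is approximately normal once n ≥ 30.
    • 对于任何均值为 μ、标准差为 σ 的总体,样本均值的分布随样本量增大而趋近正态分布;按常用经验法则,当 n ≥ 30 时近似为正态分布。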
    • Properties 特性
      • Mean of sample means = Population mean
      • 样本均值的平均数 = 总体均值
      • Standard Error 标准误差:
        • $$
          \sigma_M = \frac{\sigma}{\sqrt{n}}$$
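
One way to see these properties is a small simulation. The sketch below draws repeated samples from a uniform population on [0, 1), which is clearly not normal, and compares the observed spread of the sample means with the standard error formula. The population, the seed, and the number of samples are arbitrary choices made for this illustration.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

n = 30               # size of each sample
num_samples = 5000   # how many samples to draw

# Uniform(0, 1) population: mu = 0.5, sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

# Draw many samples and record the mean of each one.
sample_means = [
    sum(random.random() for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.pstdev(sample_means)
predicted_se = sigma / math.sqrt(n)   # sigma_M = sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.4f} (population mean {mu})")
print(f"observed SE:  {observed_se:.4f}")
print(f"predicted SE: {predicted_se:.4f}")
```

With these settings the mean of the sample means should land very close to 0.5 and the observed standard error close to the predicted value of about 0.053. Exact figures depend on the random draws.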

Part 3: Hypothesis Testing 假设检验

Hypothesis Testing 假设检验

  • Definition 定义

    • Statistical method that uses sample data to evaluate a hypothesis about a population.
    • 使用样本数据评估总体假设的统计方法。
    • Steps 步骤
      1. State hypothesis about the population.
        • 陈述关于总体的假设。
      2. Use hypothesis to predict the characteristics the sample should have.
        • 使用假设预测样本应具有的特征。
      3. Obtain a sample from the population.
        • 从总体中获取样本。
      4. Compare data with the hypothesis prediction.
        • 将数据与假设预测进行比较。
  • Null Hypothesis (H0) 零假设

    • States that there is no change or effect.
    • 声明没有变化或效果。
  • Alternative Hypothesis (H1) 备择假设

    • States that there is a change or effect.
    • 声明存在变化或效果。

Critical Regions and Alpha Level 临界区和显著性水平

  • Critical Region 临界区
    • Extreme sample values that are very unlikely to occur if H0 is true.
    • 如果零假设成立,极不可能发生的样本值。
  • Alpha Level (α) 显著性水平
    • Probability value used to define “very unlikely” outcomes.
    • 用于定义“极不可能”结果的概率值。
    • Common α values 常用α值: 0.05, 0.01, 0.001
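
For a test based on the normal distribution, the boundaries of the critical region follow directly from the alpha level. The sketch below assumes a z test, which the notes do not specify at this point; for a t test the boundaries would come from the t distribution instead.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Boundary z value that cuts off the critical region for a z test."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: reject H0 if |z| > {critical_z(alpha):.3f}")
```

For the three common alpha levels the two-tailed boundaries are approximately 1.960, 2.576, and 3.291. A smaller alpha pushes the boundary further into the tails, so more extreme data are needed to reject H0.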

Errors in Hypothesis Testing 假设检验中的错误

  • Type I Error (α) 一类错误

    • Rejecting H0 when it is true.
    • 拒绝成立的零假设。
    • False positive 假阳性
  • Type II Error (β) 二类错误

    • Failing to reject H0 when it is false.
    • 未拒绝错误的零假设。
    • False negative 假阴性

Statistical Power 统计功效

  • Definition 定义
    • Probability that the test will correctly reject a false null hypothesis.
    • 检验正确拒绝错误的零假设的概率。
    • Formula 公式:
      • $$
        \text{Power} = 1 - \beta$$
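
Power can be computed exactly only under a specific scenario. The sketch below assumes a one-tailed (upper-tail) z test with a known population standard deviation and a hypothetical true mean; none of these numbers come from the lecture.

```python
import math
from statistics import NormalDist

def power_one_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tail z test, assuming mu_true > mu0 and known sigma."""
    std_normal = NormalDist()
    sigma_m = sigma / math.sqrt(n)             # standard error
    z_crit = std_normal.inv_cdf(1 - alpha)     # boundary of the critical region
    shift = (mu_true - mu0) / sigma_m          # true effect in standard-error units
    beta = std_normal.cdf(z_crit - shift)      # probability of a Type II error
    return 1 - beta                            # power = 1 - beta

# Hypothetical scenario: H0 says mu = 100, the true mean is 105, sigma = 10.
power_small = power_one_tailed_z(mu0=100, mu_true=105, sigma=10, n=25)
power_large = power_one_tailed_z(mu0=100, mu_true=105, sigma=10, n=100)

print(f"power with n = 25:  {power_small:.3f}")
print(f"power with n = 100: {power_large:.3f}")
```

In this scenario power is about 0.80 with n = 25 and rises to nearly 1 with n = 100, illustrating that a larger sample makes a false H0 easier to reject.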

Measuring Effect Size 测量效果大小

  • Cohen’s d 科恩的d
    • Measures the size of the mean difference in terms of standard deviation units.
    • 以标准差单位衡量平均差异的大小。
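
Following the definition above, Cohen's d is the mean difference divided by the standard deviation. A minimal sketch with hypothetical numbers:

```python
def cohens_d(sample_mean, population_mean, sd):
    """Mean difference expressed in standard deviation units."""
    return (sample_mean - population_mean) / sd

# Hypothetical values: the treated sample averages 105 against a
# population mean of 100 with a standard deviation of 10.
d = cohens_d(sample_mean=105, population_mean=100, sd=10)
print("d =", d)  # 0.5
```

A d of 0.5 means the sample mean sits half a standard deviation above the comparison mean. When the population standard deviation is unknown, the sample standard deviation is typically used in its place.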

Part 4: t-Tests t检验

The t Statistic t统计量

  • Definition 定义
    • Used to test hypotheses about an unknown population mean when the population standard deviation is also unknown.
    • 用于在总体标准差未知的情况下检验未知总体均值的假设。
    • Formula 公式:
      • $$
        t = \frac{M - \mu}{s_M}$$
      • $s_M$: Estimated standard error 估计标准误差

Degrees of Freedom 自由度

  • Definition 定义
    • Number of scores in a sample that are independent and free to vary.
    • 样本中独立且自由变化的分数数量。
    • Formula 公式:
      • $$
        df = n - 1$$

Hypothesis Tests with the t Statistic 使用t统计量的假设检验

  • Steps 步骤
    1. State the hypotheses and select an alpha level.
      • 陈述假设并选择显著性水平。
    2. Locate the critical region using df and alpha level.
      • 使用自由度和显著性水平找到临界区。
    3. Calculate the test statistic.
      • 计算检验统计量。
    4. Make a decision.
      • 做出决定。
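
The four steps can be sketched for a single-sample t test as follows. The scores and the hypothesized mean are hypothetical. The estimated standard error is taken as s / sqrt(n), with s computed using n - 1 in the denominator, and the critical value 2.306 is the tabled two-tailed value for df = 8 at α = 0.05.

```python
import math

def one_sample_t(scores, mu0):
    """Return (t, df) for a single-sample t test against mu0."""
    n = len(scores)
    m = sum(scores) / n                        # sample mean M
    ss = sum((x - m) ** 2 for x in scores)     # sum of squared deviations
    s = math.sqrt(ss / (n - 1))                # sample standard deviation
    s_m = s / math.sqrt(n)                     # estimated standard error
    return (m - mu0) / s_m, n - 1

# Step 1: H0: mu = 10, H1: mu != 10, alpha = 0.05 (two-tailed).
mu0 = 10
scores = [8, 10, 12, 12, 13, 14, 14, 16, 18]   # hypothetical sample, n = 9

# Step 2: critical region for df = 8, looked up in a t table.
t_critical = 2.306

# Step 3: calculate the test statistic.
t, df = one_sample_t(scores, mu0)
print(f"t({df}) = {t:.2f}")

# Step 4: make a decision.
reject_h0 = abs(t) > t_critical
print("reject H0" if reject_h0 else "fail to reject H0")
```

For this sample M = 13 and the estimated standard error is 1, so t = 3.00, which falls in the critical region and leads to rejecting H0.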

Independent-Measures t Test 独立样本t检验

  • Purpose 目的
    • Determine whether the sample mean difference indicates a real mean difference between two populations.
    • 确定样本均值差异是否表示两个总体之间的真实均值差异。

Pooled Variance 合并方差

  • Definition 定义
    • Averages the two sample variances, weighting each by its degrees of freedom, to provide an unbiased basis for calculating the standard error.
    • 按各自的自由度对两个样本方差加权平均,为计算标准误差提供无偏的基础。
    • Formula 公式:
      • $$
        s_p^2 = \frac{SS_1 + SS_2}{df_1 + df_2}$$
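
The sketch below applies the pooled variance formula to two hypothetical groups and carries it through to the independent-measures t statistic. The notes do not spell out the remaining steps, so this assumes the standard form: the estimated standard error is sqrt(s_p²/n₁ + s_p²/n₂) and t is the mean difference divided by that standard error, with H0 stating no difference between the populations.

```python
import math

def sum_of_squares(scores):
    """Sum of squared deviations from the sample mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def independent_t(group1, group2):
    """Return (pooled variance, t, df) for two independent samples."""
    n1, n2 = len(group1), len(group2)
    df1, df2 = n1 - 1, n2 - 1
    # Pooled variance: (SS1 + SS2) / (df1 + df2).
    pooled = (sum_of_squares(group1) + sum_of_squares(group2)) / (df1 + df2)
    # Estimated standard error of the mean difference.
    se = math.sqrt(pooled / n1 + pooled / n2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    return pooled, (m1 - m2) / se, df1 + df2

# Hypothetical scores for two separate groups of participants.
group1 = [1, 3, 5, 7, 9]
group2 = [4, 6, 8, 10, 12]

pooled, t, df = independent_t(group1, group2)
print("pooled variance:", pooled)  # 10.0
print(f"t({df}) = {t}")            # t(8) = -1.5
```

Each group has SS = 40 and df = 4, so the pooled variance is 80 / 8 = 10 and the standard error is 2, giving t = (5 - 8) / 2 = -1.5.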

Repeated-Measures t Test 重复测量t检验

  • Definition 定义
    • Evaluate the mean difference between two treatment conditions using data from a single sample.
    • 使用单个样本的数据评估两个处理条件之间的平均差异。
    • Difference Score 差异分数:
      • $$
        D = X_2 - X_1$$
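
Once the difference scores are computed, the test proceeds like a single-sample t test on D. The sketch below assumes the usual null hypothesis that the mean difference in the population is zero, and uses hypothetical before and after scores.

```python
import math

def repeated_measures_t(first, second):
    """Return (t, df) for paired scores, testing H0: mean difference = 0."""
    # Difference score for each participant: D = X2 - X1.
    diffs = [x2 - x1 for x1, x2 in zip(first, second)]
    n = len(diffs)
    m_d = sum(diffs) / n                        # mean difference
    ss = sum((d - m_d) ** 2 for d in diffs)     # SS of the difference scores
    s = math.sqrt(ss / (n - 1))                 # SD of the difference scores
    s_md = s / math.sqrt(n)                     # estimated standard error
    return m_d / s_md, n - 1

# Hypothetical scores for the same four participants in two conditions.
x1 = [10, 12, 9, 11]
x2 = [9, 15, 12, 14]

t, df = repeated_measures_t(x1, x2)
print(f"t({df}) = {t:.2f}")
```

The difference scores here are -1, 3, 3, and 3, with a mean of 2 and an estimated standard error of 1, so t = 2.00 with df = 3.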

Confidence Intervals 置信区间

  • Definition 定义
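    • An interval of values, centered on the sample mean, that estimates the unknown population mean with a stated level of confidence (e.g., 95%).
    • 以样本均值为中心、在给定置信水平(如95%)下估计未知总体均值的区间。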
    • Formula 公式:
      • $$
        \mu = M \pm t(s_M)$$
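
A minimal sketch of the interval formula. The sample values are hypothetical, and the t value of 2.306 is the tabled two-tailed value for df = 8 at 95% confidence; a different df or confidence level would need a different t value.

```python
def confidence_interval(m, s_m, t_value):
    """Return (lower, upper) limits: M plus or minus t times s_M."""
    margin = t_value * s_m
    return m - margin, m + margin

# Hypothetical sample: n = 9, M = 13, estimated standard error s_M = 1.
lower, upper = confidence_interval(m=13, s_m=1, t_value=2.306)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")  # [10.694, 15.306]
```

A higher confidence level calls for a larger t value and therefore a wider interval, while a larger sample shrinks the standard error and narrows it.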

Author

TosakaUCW

Posted on

2024-05-27

Updated on

2024-05-27
