t-test

  • Purpose of the tool

The t-test is used to determine whether the means of two groups differ statistically significantly.
It is used to assess whether an observed difference between two groups goes beyond random variation.

The decision is made by comparing the p-value with the specified significance level (usually α = 0.05):

  • p ≤ α → reject H₀ (accept H₁)
  • p > α → retain H₀
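
The decision rule above can be written as a small function. This is only a minimal sketch in Python; the function name `decide` and the returned strings are chosen for illustration and are not part of AlphadiTab.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 when p <= alpha, otherwise retain it."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return "reject H0" if p_value <= alpha else "retain H0"


print(decide(0.001))  # p <= alpha
print(decide(0.05))   # boundary case p == alpha also rejects H0
print(decide(0.20))   # p > alpha
```

Note that the boundary case p = α counts as a rejection, matching the "≤" in the rule.
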
  • Example: Tomato Sauce

A new tomato sauce recipe is being tested in product development.
The goal is to determine whether the average viscosity of the new recipe differs from that of the previous recipe.

To this end, viscosity measurements are taken on samples of the old recipe and the new recipe.
The measured values for both groups are collected independently of one another and considered as separate samples.

Using the t-test (two samples), we aim to determine whether the observed difference in the mean values is statistically significant or can be explained by random variation.

Interpretation of the results:

The calculated p-value is well above the significance level of 0.05, so the null hypothesis is retained. This does not prove that the average viscosities of the old and new formulations are identical; it only means that the data do not contradict that assumption.

There is therefore no evidence that the average viscosity of the new formulation differs from that of the previous formulation.

Explanation of the graph:

  • The points mark the mean viscosity values for the old and new formulations.
  • The error bars represent the 95% confidence interval of the mean.
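  • The substantial overlap of the confidence intervals is consistent with the test result that there is no statistically significant difference between the mean values. The overlap is only a visual aid; the decision itself is based on the p-value.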
  • Procedure

    (How was this graphic created?)

Preliminary Work

  1. Select a continuous measurement parameter (e.g., viscosity).
  2. Define two groups whose mean values are to be compared (e.g., viscosity of old vs. new formulation).
  3. Set the significance level (usually α = 0.05).
  4. Check whether the data show any signs of significant deviations from the normal distribution.
  5. Ensure that the measured values were collected independently of one another.

Use in AlphadiTab

  1. In the Analyze phase, select the 2-Sample t-Test tool.
  2. For sample 1, enter “Viskositaet_mPas_Alt”
  3. For sample 2, enter “Viskositaet_mPas_Neu”
  4. Perform the analysis by clicking “Create New.”

Interpretation

  1. Check whether the p-value is less than or equal to the significance level.
    p ≤ α → statistically significant difference in the means.
    p > α → no statistically significant difference in the means.

Important: The interpretation refers exclusively to the means, not to the dispersion or to individual values.

  • Settings

Data

Manual entry:

The comparison is based on manually entered means, standard deviations, and sample sizes for two samples.

Non-manual entry:

The comparison is based on the selected data columns.

Direction (hypothesis type)

The direction determines what type of difference between the two samples should be tested.

Two-tailed

Null hypothesis
H0: μ1 − μ2 = Δ0

Alternative hypothesis
HA: μ1 − μ2 ≠ Δ0

Select “two-tailed” if you want to test whether the means of the two samples differ, without specifying a particular direction.

  • The test detects a difference in either direction: the mean of the first sample may be greater or less than that of the second.
  • This setting is useful when there is no specific expectation regarding the direction of the difference.

Example:
Do the average test scores of Group A and Group B differ?

Greater

Null hypothesis
H0: μ1 − μ2 = Δ0

Alternative hypothesis
HA: μ1 − μ2 > Δ0

Select “Greater” if you want to test whether the mean of the first sample is greater than that of the second sample.

  • The test only checks whether Sample 1 has significantly higher values than Sample 2.
  • Differences in the other direction are not taken into account.

Example:
Is the average revenue higher after an advertising campaign than before the campaign?

Smaller

Null hypothesis
H0: μ1 − μ2 = Δ0

Alternative hypothesis
HA: μ1 − μ2 < Δ0

Select “Smaller” if you want to test whether the mean of the first sample is smaller than that of the second sample.

  • The test only checks whether Sample 1 has significantly lower values than Sample 2.
  • Differences in the opposite direction are not taken into account.

Example:
Is the average processing time lower after optimization than before?
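
The three directions differ only in which tail of the t-distribution counts toward the p-value. The following sketch illustrates this in Python using only the standard library. The helper names (`t_pdf`, `t_cdf`, `p_value`), the numerical integration, and the example values t = 2.0 and df = 10 are assumptions for illustration; they do not describe how AlphadiTab computes its results.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t: float, df: float, steps: int = 2000) -> float:
    """P(T <= t), integrating the density from 0 to t with Simpson's rule.

    steps must be even; the step width is signed, so negative t works too.
    """
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def p_value(t: float, df: float, direction: str) -> float:
    """p-value for the chosen test direction."""
    if direction == "two-tailed":
        return 2 * (1 - t_cdf(abs(t), df))  # both tails
    if direction == "greater":
        return 1 - t_cdf(t, df)             # upper tail only
    if direction == "smaller":
        return t_cdf(t, df)                 # lower tail only
    raise ValueError("direction must be 'two-tailed', 'greater' or 'smaller'")


# Hypothetical test statistic, not taken from any example in this document
for direction in ("two-tailed", "greater", "smaller"):
    print(direction, round(p_value(2.0, 10, direction), 4))
```

For a positive t-value, the "greater" p-value is half the two-tailed one, while the "smaller" p-value is close to 1. This is why the direction should be fixed before the data are examined.
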

  • Requirements

Two groups

There must be exactly two groups whose means are to be compared (e.g., old vs. new formula).

Why is this important?

The t-test is a method for comparing two means.

Independent samples

The measured values of the two groups must not influence each other (no pairing of the same items).

Why is this important?

The test assumes that the groups were collected independently of one another.

Continuous measurement data

The measured values must be continuous.

Why is this important?

The t-test compares means of numerical measurement data.

Normally distributed data

The measured values in each group should show no signs of a significant deviation from the normal distribution.

Why is this important?

  • The t-test is based on assumptions of normal distribution. In the case of significant deviations, the test results may be unreliable.
  • The t-test is robust against slight deviations from the normal distribution. However, for highly skewed distributions or pronounced outliers, an alternative method should be used.
  • Tools

    (When are other options more suitable?)

If more than two groups are to be compared, an analysis of variance (ANOVA) is more appropriate.

If the data are heavily skewed or contain significant outliers, a nonparametric method should be used.

If the same items or individuals are being compared before and after an intervention, a paired t-test is appropriate.

If variances rather than means are to be compared, an F-test or Levene’s test is more appropriate.

If proportions rather than means are to be compared, a proportion test is the appropriate tool.

Production

Fill volume of tomato sauce

Two filling machines are used in production. The goal is to determine whether the average fill volume differs between Machine A and Machine B.

Aggregated measurement data is available for both machines.

  • Machine A: n = 25, mean = 500.2 ml, standard deviation = 1.1 ml
  • Machine B: n = 25, mean = 498.9 ml, standard deviation = 1.0 ml

The means are compared using a t-test (two-sample).
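
As a rough check, the formulas from the Formulas section can be applied to these summary statistics. The hypothesized difference Δ₀ = 0 and the two-tailed direction are assumptions, since the example does not state them; all values are rounded.

\( \mathrm{SE}=\sqrt{\frac{1.1^2}{25}+\frac{1.0^2}{25}}=\sqrt{0.0884}\approx 0.2973 \)

\( \mathrm{t}=\frac{(500.2-498.9)-0}{0.2973}\approx 4.37 \)

\( \mathrm{df}=\frac{0.0884^2}{\frac{0.0484^2}{24}+\frac{0.04^2}{24}}\approx 47.6 \)

With roughly 48 degrees of freedom, the two-tailed critical value for α = 0.05 is about 2.01, so a t-value of about 4.37 lies far inside the rejection region.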

Interpretation:

The t-test shows a statistically significant difference in the mean fill volume between the two machines.
The p-value is less than 0.05, so the null hypothesis is rejected.
The mean fill volumes of the two machines differ.

IT help desks

Response Time for Inquiries

Tickets are processed at multiple locations in the IT service desk.
Response times are regularly analyzed to identify differences in service quality.

In this example, data is available from three locations.
The two-sample t-test, however, compares exactly two groups at a time.

If there are more than two locations, there are two possible approaches:

Pairwise comparisons using the t-test
Each location can be compared in pairs with the other locations (e.g., Location A vs. B, A vs. C, B vs. C).
In each case, the analysis checks whether the mean response times between two locations differ statistically significantly.

Alternative: Analysis of Variance (ANOVA)
If all locations are to be considered simultaneously, an analysis of variance (ANOVA) is the more appropriate tool.
ANOVA tests whether there is at least one significant difference between the means of the locations without having to perform multiple individual tests.

Note on Interpretation

With multiple pairwise t-tests, the risk of false-positive results (chance hits) increases with every additional comparison. For an overall assessment of the locations, ANOVA is therefore generally preferable.
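
How quickly this risk grows can be estimated with a short calculation. The sketch below assumes that every null hypothesis is true and treats the tests as independent, which is a simplification because pairwise tests share data. The location names are placeholders, and the Bonferroni adjustment is one common correction that this page does not otherwise cover.

```python
from itertools import combinations

locations = ["A", "B", "C"]  # placeholder names for the three locations
alpha = 0.05                 # significance level per individual test

# All pairwise comparisons: A vs. B, A vs. C, B vs. C
pairs = list(combinations(locations, 2))
m = len(pairs)

# Probability of at least one false positive across m tests,
# assuming independent tests and that every null hypothesis is true
familywise_error = 1 - (1 - alpha) ** m

# Bonferroni adjustment: test each pair at alpha / m instead of alpha
alpha_per_test = alpha / m

print("comparisons:", m)
print("family-wise error rate:", round(familywise_error, 4))
print("Bonferroni-adjusted alpha:", round(alpha_per_test, 4))
```

Under these assumptions, three comparisons at α = 0.05 already raise the chance of at least one false positive to about 14%.
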

Interpretation:

The t-test shows no statistically significant difference between the mean response times at the DLZ Nord and DLZ Ost locations.
The p-value is above the significance level of 0.05, so the null hypothesis is retained.

There is therefore no statistical evidence that the mean response times at the two locations differ.

Sales

Sales ratio by region

In the sales department, customer inquiries are handled by two teams.
The goal is to determine whether the average processing time differs between Team A and Team B.

Interpretation:

The t-test shows no statistically significant difference in the average processing time between the two teams.
The p-value is greater than 0.05, so the null hypothesis is retained.
There is therefore no evidence that the two teams differ in their average processing time.

Logistics

Delivery time to the logistics center

In the logistics department, customer orders are picked and shipped.
New forklifts were introduced to increase efficiency.

The goal is to determine whether the average delivery time (in hours) has decreased following the introduction of the new forklifts.

The analysis is performed using a two-sample t-test as a one-tailed test; in this case, the direction “Greater” was selected because sample 1 (before) is expected to have the higher mean.

H₀: μ_(Before) − μ_(After) = 0
H₁: μ_(Before) − μ_(After) > 0

Interpretation:

The one-tailed t-test shows a statistically significant difference between the mean delivery times before and after the introduction of the new forklifts (t = 3.29; p = 0.001).

Since the p-value is below the significance level of 0.05, the null hypothesis is rejected.
The average delivery time before the introduction is significantly higher than after the introduction.

The data therefore support the conclusion that the average delivery time is lower after the introduction of the new forklifts than before.

Purchasing

Supplier Comparison

The purchasing department sources components from two suppliers. The goal is to determine whether the average scrap rate per delivery differs between Supplier A and Supplier B. The scrap rate is measured as a percentage for each delivery.

Note:

The t-test assumes approximately normally distributed, continuous data.

Percentage values such as the scrap rate can be discrete, as they are derived from count data.
For small delivery quantities (e.g., 10 parts per delivery), there are only a few possible percentage values (0%, 10%, 20% …). In such cases, the assumption of a normal distribution may be violated, and the t-test may not be appropriate.

For larger delivery quantities with many possible values, the t-test can generally be applied without issue in practice.
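
The effect of the delivery quantity on the number of possible percentage values can be illustrated with a few lines of Python. The function name `possible_rates` and the delivery size of 500 parts are hypothetical and serve only as a contrast to the 10-part example above.

```python
def possible_rates(parts_per_delivery: int) -> list[float]:
    """All scrap rates (in %) that can occur when each part is either scrap or good."""
    return [100 * k / parts_per_delivery for k in range(parts_per_delivery + 1)]


small = possible_rates(10)    # 10 parts per delivery, as in the note above
large = possible_rates(500)   # hypothetical larger delivery

print(len(small), "possible values, starting with", small[:3])
print(len(large), "possible values, in steps of", large[1], "%")
```

With only 11 possible values the data are coarsely stepped, whereas 501 possible values behave almost like a continuous measurement.
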

Interpretation:

The t-test shows a statistically significant difference in the average scrap rate between the suppliers (p < 0.05). The null hypothesis is rejected. The suppliers differ in their average scrap rates; Supplier A has the lower one.

Planning

Forecast deviation

In production planning, demand forecasts are created for different planning periods.
To evaluate the quality of the forecast, the forecast error is calculated.

The goal is to investigate whether the average forecast error differs between the short-term and long-term planning horizons.

Short-term planning horizon:

n = 30, mean = 0.0%, standard deviation = 1.5%

Long-term planning horizon:

n = 30, mean = 0.0%, standard deviation = 3.8%
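
Taking the rounded means at face value and assuming Δ₀ = 0, the formulas from the Formulas section give approximately the following (in percentage points):

\( \mathrm{SE}=\sqrt{\frac{1.5^2}{30}+\frac{3.8^2}{30}}=\sqrt{0.5563}\approx 0.746 \)

\( \mathrm{t}=\frac{(0.0-0.0)-0}{0.746}=0 \)

A t-value of 0 corresponds to a two-tailed p-value of 1. Because the reported means are rounded to one decimal place, the exact t-value may deviate slightly from 0.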

Interpretation:

The t-test for two independent samples shows that the mean forecast errors for the short-term and long-term planning horizons do not differ statistically significantly.

Since the p-value is above the significance level of 0.05, the null hypothesis is not rejected.

There is therefore no statistical evidence that the average forecast error differs between the two planning horizons. The clearly different standard deviations (1.5% vs. 3.8%) are not assessed by the t-test; comparing dispersion requires a test of variances.

  • Terms

Continuous data: Data collected using a measuring instrument that may include both units of measurement and decimal places.

Normally distributed data: Data that can be well described by a normal distribution. This can be verified, for example, using a normality test.

x̄ = sample mean: The average value of the collected measurement data.

s = sample standard deviation: A measure of the dispersion of the data around the mean.

n = sample size: The number of observations within a sample.

α = significance level: The maximum accepted probability of rejecting the null hypothesis although it is true (type I error). It is specified before the test.

p-value: Probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. It is compared with α to decide between the two hypotheses.

t-value: Test statistic of the t-test. It describes how large the observed difference in means (minus Δ₀) is relative to its standard error.

df = Degrees of freedom: A value derived from the sample sizes and standard deviations that determines the shape of the t-distribution.

Δ₀ = Hypothesis difference: Reference value against which the difference in means is tested. Typically, Δ₀ = 0.

Confidence level: Long-run proportion of confidence intervals, calculated in the same way from repeated samples, that cover the true parameter value (e.g., 95%).

Confidence interval: Range of plausible values for the true difference in means, constructed so that it covers the true value at the chosen confidence level.

Null hypothesis: Initial hypothesis assuming no difference or the hypothesized difference. It is tested in the hypothesis test.

Alternative hypothesis: The counter-hypothesis to the null hypothesis. It describes the substantive question, e.g., whether means differ significantly.

Test direction: Indicates whether a difference is tested without specifying the direction (two-tailed) or with a specific direction (greater/smaller).

Two-tailed: The test examines whether the means differ, regardless of the direction.

Greater: The test examines whether the mean of the first sample is greater than that of the second sample.

Smaller: The test examines whether the mean of the first sample is smaller than that of the second sample.

  • Formulas

Standard error of the difference of the means

\( \mathrm{SE}=\sqrt{\frac{\mathrm{s}_1^2}{\mathrm{n}_1}+\frac{\mathrm{s}_2^2}{\mathrm{n}_2}} \)

Test statistic (t-value)

\( \mathrm{t}=\frac{(\bar{\mathrm{x}}_1-\bar{\mathrm{x}}_2)-\Delta_0}{\mathrm{SE}} \)

Degrees of freedom (Welch)

\( \mathrm{df}=\frac{\left(\frac{\mathrm{s}_1^2}{\mathrm{n}_1}+\frac{\mathrm{s}_2^2}{\mathrm{n}_2}\right)^2}{\frac{\left(\frac{\mathrm{s}_1^2}{\mathrm{n}_1}\right)^2}{\mathrm{n}_1-1}+\frac{\left(\frac{\mathrm{s}_2^2}{\mathrm{n}_2}\right)^2}{\mathrm{n}_2-1}} \)

Confidence interval for the difference in means

\( (\bar{\mathrm{x}}_1-\bar{\mathrm{x}}_2)\pm \mathrm{t}_{1-\alpha/2,\mathrm{df}}\cdot \mathrm{SE} \)
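
The four formulas above can be combined into a short Python sketch that works on summary statistics and uses only the standard library. The function names, the Simpson integration of the t-density, and the bisection search for the quantile are illustrative choices and are not taken from AlphadiTab. The input values are those of Machine A and Machine B from the Production example, with Δ₀ = 0 and a two-tailed test assumed.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t: float, df: float, steps: int = 2000) -> float:
    """P(T <= t) via Simpson's rule on [0, t]; steps must be even."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(prob: float, df: float) -> float:
    """Invert t_cdf by bisection; the bracket [-50, 50] covers usual alpha and df."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def welch_t_test(n1, mean1, sd1, n2, mean2, sd2, delta0=0.0, alpha=0.05):
    """Two-tailed Welch t-test from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)                      # standard error of the difference
    diff = mean1 - mean2
    t = (diff - delta0) / se                     # test statistic
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))  # Welch df
    p = 2 * (1 - t_cdf(abs(t), df))              # two-tailed p-value
    margin = t_quantile(1 - alpha / 2, df) * se  # half-width of the interval
    return {"se": se, "t": t, "df": df, "p": p, "ci": (diff - margin, diff + margin)}


# Machine A vs. Machine B from the Production example
result = welch_t_test(25, 500.2, 1.1, 25, 498.9, 1.0)
for name, value in result.items():
    print(name, value)
```

In practice, a statistics library would normally replace the hand-written t-distribution helpers; they are spelled out here only to keep the sketch self-contained.
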