alphapepttools.tl.nan_safe_ttest_ind

alphapepttools.tl.nan_safe_ttest_ind(a, b, min_valid_values=None, **kwargs)

NaN-safe wrapper around scipy.stats.ttest_ind.

Performs an independent two-sample t-test, but returns (nan, nan) if either input has fewer than min_valid_values non-NaN values (two by default). Inputs are converted to pandas Series automatically if needed. By default, NaNs are omitted and equal variance is not assumed (Welch’s t-test); both defaults can be changed by passing “nan_policy” and “equal_var” through **kwargs.
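The behaviour described above can be sketched roughly as follows. This is a minimal illustration built only from the description on this page, not the alphapepttools source: the function name ttest_ind_or_nan is hypothetical, and the real implementation may differ in how it converts inputs and forwards keyword arguments.

```python
import math

import numpy as np
import pandas as pd
from scipy import stats


def ttest_ind_or_nan(a, b, min_valid_values=2, **kwargs):
    """Illustrative stand-in (hypothetical name), not the alphapepttools source."""
    # Coerce list-like inputs to float Series so NaNs can be counted uniformly.
    a = pd.Series(a, dtype=float)
    b = pd.Series(b, dtype=float)

    # Series.count() counts non-NaN entries only.
    if a.count() < min_valid_values or b.count() < min_valid_values:
        return (math.nan, math.nan)

    # Apply the documented defaults unless the caller overrides them.
    kwargs.setdefault("nan_policy", "omit")
    kwargs.setdefault("equal_var", False)

    result = stats.ttest_ind(a.to_numpy(), b.to_numpy(), **kwargs)
    return (float(result.statistic), float(result.pvalue))


# Both samples have at least two non-NaN values: a regular Welch test is run.
t_ok, p_ok = ttest_ind_or_nan([1, 2, 3, np.nan], [4, 5, 6, 7])

# The first sample has a single non-NaN value: the guard short-circuits.
t_nan, p_nan = ttest_ind_or_nan([1, np.nan], [4, 5, 6, 7])
```

Checking the counts before calling scipy means the caller gets a plain (nan, nan) pair for under-populated samples instead of relying on whatever scipy returns for degenerate input.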

Parameters:
  • a (pd.Series) – First sample for comparison.

  • b (pd.Series) – Second sample for comparison.

  • min_valid_values (int, optional) – Minimum number of non-NaN values required in each sample to perform the t-test. Because this function does not impute missing values, BOTH samples must have at least this many non-NaN values; otherwise (nan, nan) is returned. Default is 2.

  • **kwargs – Additional keyword arguments passed to scipy.stats.ttest_ind.

Return type:

tuple[float, float] | tuple[nan, nan]

Returns:

tuple (t_statistic, p_value) if both samples have at least min_valid_values (default 2) non-NaN values, otherwise (nan, nan).

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from alphapepttools.tl.stats import nan_safe_ttest_ind
>>> a = pd.Series([1, 2, 3, np.nan])
>>> b = pd.Series([4, 5, 6, 7])
>>> t_stat, p_val = nan_safe_ttest_ind(a, b)
>>> # Returns valid t-test results since both have >= 2 non-NaN values
>>> c = pd.Series([1, np.nan])  # Only 1 non-NaN value
>>> t_stat, p_val = nan_safe_ttest_ind(c, b)
>>> # Returns (nan, nan) since c has < 2 non-NaN values
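The effect of the “nan_policy” and “equal_var” keyword arguments mentioned in the description can be seen by calling scipy.stats.ttest_ind directly. The sketch below uses plain scipy calls rather than alphapepttools; it assumes, as the description states, that the wrapper forwards these keywords unchanged.

```python
import numpy as np
from scipy import stats

a = np.array([1.0, 2.0, 3.0, np.nan])
b = np.array([4.0, 5.0, 6.0, 7.0])

# Settings matching the documented defaults: drop NaNs, Welch's t-test.
welch = stats.ttest_ind(a, b, nan_policy="omit", equal_var=False)

# Override: pooled-variance (Student's) t-test.
student = stats.ttest_ind(a, b, nan_policy="omit", equal_var=True)

# Override: let NaNs propagate, which yields a NaN statistic here.
propagated = stats.ttest_ind(a, b, nan_policy="propagate", equal_var=False)

print(float(welch.statistic), float(student.statistic), float(propagated.statistic))
```

The Welch and pooled-variance statistics differ because the two samples have unequal variances and sizes, which is why the choice of “equal_var” matters for the reported p-value.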