Entropy and Mutual Information

bamt.mi_entropy_gauss.query_filter(data: DataFrame, columns: List, values: List)[source]
Filters the data according to the column-value list

Arguments

Effects

None
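As an illustration of what filtering by a column-value list can look like, here is a minimal sketch. It assumes that `columns` and `values` are paired by position and that a row is kept only when every listed column equals its paired value. The helper name `filter_by_values` is hypothetical and is not part of BAMT; the library's own matching rules may differ.

```python
import pandas as pd


def filter_by_values(data: pd.DataFrame, columns: list, values: list) -> pd.DataFrame:
    """Hypothetical helper: keep rows where each column equals its paired value."""
    # Start with an all-True mask and narrow it one column at a time.
    mask = pd.Series(True, index=data.index)
    for col, val in zip(columns, values):
        mask &= data[col] == val
    return data[mask]


df = pd.DataFrame(
    {"a": [0, 0, 1, 1], "b": ["x", "y", "x", "y"], "c": [1.0, 2.0, 3.0, 4.0]}
)
# Keep only the rows with a == 1 and b == "x".
subset = filter_by_values(df, ["a", "b"], [1, "x"])
```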

bamt.mi_entropy_gauss.entropy_gauss(pd_data)[source]
Calculate entropy for Gaussian multivariate distributions.

Arguments

Effects

None
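For reference, the differential entropy of a k-dimensional Gaussian with covariance matrix S is 0.5 * ln((2*pi*e)^k * det(S)). The sketch below estimates S from the sample and applies that formula. The function name `gaussian_entropy` is hypothetical, the result is in nats, and the logarithm base and covariance estimator used by `entropy_gauss` itself are not stated in this page, so treat this as an illustration of the formula rather than the library's implementation.

```python
import numpy as np
import pandas as pd


def gaussian_entropy(data: pd.DataFrame) -> float:
    """Hypothetical helper: entropy (nats) of a Gaussian fitted to the columns."""
    values = data.to_numpy(dtype=float)
    k = values.shape[1]
    # Sample covariance; atleast_2d covers the single-column case.
    cov = np.atleast_2d(np.cov(values, rowvar=False))
    # slogdet is numerically safer than log(det(cov)).
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return float(0.5 * (k * np.log(2 * np.pi * np.e) + logdet))


rng = np.random.default_rng(0)
sample = pd.DataFrame(rng.normal(size=(5000, 2)), columns=["x", "y"])
h = gaussian_entropy(sample)
# For two independent standard normal columns the exact value is ln(2*pi*e).
exact = float(np.log(2 * np.pi * np.e))
```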

bamt.mi_entropy_gauss.entropy_all(data, method='MI')[source]
For one variable, H(X) is equal to the following:

-1 * sum of p(x) * log(p(x))

For two variables, H(X|Y) is equal to the following:

sum over x,y of p(x,y)*log(p(y)/p(x,y))

For three variables, H(X|Y,Z) is equal to the following:

-1 * sum over x,y,z of p(x,y,z) * log(p(x|y,z)),

where p(x|y,z) = p(x,y,z) / p(y,z)

Arguments

data : pd.DataFrame

Returns

H : entropy value
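The discrete formulas above can be checked on a small table by estimating probabilities from row frequencies. This sketch uses the identity H(X|Y) = H(X,Y) - H(Y), which is equivalent to the sum over p(x,y)*log(p(y)/p(x,y)). The helper names are hypothetical, natural logarithms are assumed, and this is not the code behind `entropy_all`.

```python
import numpy as np
import pandas as pd


def discrete_entropy(data: pd.DataFrame) -> float:
    """Hypothetical helper: joint Shannon entropy (nats) from row frequencies."""
    # Only observed combinations appear, so every probability is positive.
    probs = data.value_counts(normalize=True).to_numpy()
    return float(-np.sum(probs * np.log(probs)))


def conditional_entropy(data: pd.DataFrame, target: str, given: list) -> float:
    """H(target | given) = H(target, given) - H(given)."""
    return discrete_entropy(data[[target] + given]) - discrete_entropy(data[given])


df = pd.DataFrame({"x": [0, 0, 1, 1], "y": [0, 0, 1, 1], "z": [0, 1, 0, 1]})

h_x = discrete_entropy(df[["x"]])                    # x is a fair coin
h_x_given_y = conditional_entropy(df, "x", ["y"])    # y determines x
h_x_given_z = conditional_entropy(df, "x", ["z"])    # z is independent of x
h_x_given_yz = conditional_entropy(df, "x", ["y", "z"])
```

In this toy table `y` is a copy of `x`, so conditioning on it removes all uncertainty, while `z` is independent of `x` and leaves the entropy unchanged.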

bamt.mi_entropy_gauss.entropy_cond(data, column_cont, column_disc, method)[source]

bamt.mi_entropy_gauss.mi_gauss(data, method='MI', conditional=False)[source]

Calculate Mutual Information based on entropy. For continuous variables, the entropy of a multivariate Gaussian distribution is used.

Arguments

Effects

None

Notes

Need to preprocess data with code_categories
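To show how mutual information follows from entropies in the continuous case, the sketch below uses I(X;Y) = H(X) + H(Y) - H(X,Y) with Gaussian entropies estimated from sample covariances. For two variables this reduces to -0.5 * ln(1 - r^2), where r is the sample correlation. The function names are hypothetical and the result is in nats; the behavior of `mi_gauss` for mixed or discrete data is not covered here.

```python
import numpy as np


def _gauss_entropy(values) -> float:
    """Hypothetical helper: entropy (nats) of a Gaussian fitted to the sample."""
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        values = values[:, None]
    k = values.shape[1]
    cov = np.atleast_2d(np.cov(values, rowvar=False))
    _, logdet = np.linalg.slogdet(cov)
    return float(0.5 * (k * np.log(2 * np.pi * np.e) + logdet))


def gaussian_mi(x, y) -> float:
    """I(X;Y) = H(X) + H(Y) - H(X,Y) under a Gaussian assumption."""
    joint = np.column_stack([x, y])
    return _gauss_entropy(x) + _gauss_entropy(y) - _gauss_entropy(joint)


rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = 0.8 * x + 0.6 * rng.normal(size=2000)  # correlated with x

mi_value = gaussian_mi(x, y)
# Closed form for the bivariate case, using the sample correlation.
r = np.corrcoef(x, y)[0, 1]
closed_form = -0.5 * np.log(1 - r**2)
```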

bamt.mi_entropy_gauss.mi(edges: list, data: DataFrame, method='MI')[source]

Traverses all nodes and sums their scores, taking into account the parent-child relationships.

Arguments

Returns

sum_score : float

Effects

None
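One way to picture a score summed over a graph is sketched below. It assumes that `edges` is a list of (parent, child) pairs over discrete columns and that each node with parents contributes I(child; parents) = H(child) + H(parents) - H(child, parents). Both the edge orientation and the treatment of nodes without parents (skipped here) are assumptions, and the helper names are hypothetical, so this is not a description of how `mi` computes its result.

```python
import numpy as np
import pandas as pd


def _entropy(frame: pd.DataFrame) -> float:
    """Joint Shannon entropy (nats) from empirical row frequencies."""
    probs = frame.value_counts(normalize=True).to_numpy()
    return float(-np.sum(probs * np.log(probs)))


def sum_node_scores(edges: list, data: pd.DataFrame) -> float:
    """Hypothetical helper: sum I(child; parents) over nodes that have parents."""
    # Collect the parents of every child node.
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, []).append(parent)

    total = 0.0
    for child, pars in parents.items():
        total += (
            _entropy(data[[child]])
            + _entropy(data[pars])
            - _entropy(data[[child] + pars])
        )
    return total


df = pd.DataFrame({"x": [0, 0, 1, 1], "y": [0, 0, 1, 1], "z": [0, 1, 0, 1]})

score_informative = sum_node_scores([("y", "x")], df)    # y determines x
score_independent = sum_node_scores([("z", "x")], df)    # z tells nothing about x
score_both = sum_node_scores([("y", "x"), ("z", "x")], df)
```

Under these assumptions an edge from an informative parent raises the total, while an edge from an independent parent adds nothing.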