The feature class module

The master template for a feature class.

Each feature class in the ./features folder can inherit the main feature class functionality.

The functions here are necessary to evaluate each individual feature found inside a feature class.

class hcga.feature_class.FeatureClass(graph=None)[source]

Main functionality to be inherited by each feature class

Initialise a feature class.

Parameters

graph (Graph) – graph for initialisation, converted to given encoding

classmethod __init_subclass__()[source]

Initialise class variables to default for each child class.

_clustering_statistics(community_partition, feat_name, feat_desc, feat_interpret)[source]

Compute quality of the community partitions.

_feature_statistics(feat_dist, feat_name, feat_desc, feat_interpret)[source]

Computes summary statistics of distributions.
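As a rough illustration of what reducing a distribution to summary statistics can look like, here is a minimal, standalone sketch. The helper name `basic_statistics`, the set of statistics chosen, and the `_mean`/`_min`/... suffixes are assumptions made for this example and are not taken from the hcga source.

```python
import statistics


def basic_statistics(feat_dist, feat_name):
    """Reduce a distribution to named scalar features (hypothetical helper)."""
    values = [float(v) for v in feat_dist]
    # Each statistic becomes its own scalar feature, keyed by an assumed suffix.
    return {
        f"{feat_name}_mean": statistics.fmean(values),
        f"{feat_name}_min": min(values),
        f"{feat_name}_max": max(values),
        f"{feat_name}_median": statistics.median(values),
        f"{feat_name}_std": statistics.pstdev(values),  # population std
    }


# Example: a small node-degree distribution.
degree_stats = basic_statistics([1, 2, 2, 3], "degree")
```

An empty distribution would raise an error in this sketch; a real pipeline would need to decide how to handle that case.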

_feature_statistics_advanced(feat_dist, feat_name, feat_desc, feat_interpret)[source]

Computes advanced summary statistics of distributions.

_feature_statistics_basic(feat_dist, feat_name, feat_desc, feat_interpret)[source]

Computes basic summary statistics of distributions.

_feature_statistics_medium(feat_dist, feat_name, feat_desc, feat_interpret)[source]

Computes medium summary statistics of distributions.

_list_statistics(feat_dist, feat_name, feat_desc, feat_interpret)[source]

Compute list statistics.

_node_feature_statistics(feat_dist, feat_name, feat_desc, feat_interpret)[source]

Computes summary statistics of each feature distribution.

_test_feature_exists(feature_name)[source]

Test if feature feature_name exists in description list.

add_feature(feature_name, feature_function, feature_description, feature_interpret, function_args=None, statistics=None)[source]

Adds a computed feature value and its description.

Parameters
  • feature_name (str) – name of the feature

  • feature_function (function) – function to evaluate to compute a feature

  • feature_description (str) – short description of the feature

  • feature_interpret (int) – interpretability score of the feature

  • function_args (list) – additional arguments to pass to feature_function

  • statistics (str) – type of statistics to apply to high dimensional features.

classmethod add_feature_description(feature_name, feature_desc, feature_interpret)[source]

Adds the description to the class variable if not already there.

compute_features()[source]

Main feature extraction function.

This function should be used by each specific feature class to add new features.
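The inheritance pattern described here can be sketched without the library itself. In the example below, `MiniFeatureClass` is a hypothetical stand-in for the base class, the graph is a plain dictionary, and passing the graph as the first argument of `feature_function` is an assumption of this sketch, not a statement about the hcga implementation.

```python
class MiniFeatureClass:
    """Hypothetical stand-in for a feature base class (not the hcga code)."""

    def __init__(self, graph=None):
        self.graph = graph
        self.features = {}
        self.descriptions = {}

    def add_feature(self, feature_name, feature_function, feature_description,
                    feature_interpret, function_args=None):
        # Assumption: the feature function receives the graph first,
        # followed by any additional arguments.
        args = function_args or []
        self.features[feature_name] = feature_function(self.graph, *args)
        self.descriptions[feature_name] = (feature_description, feature_interpret)

    def compute_features(self):
        raise NotImplementedError("child classes add their features here")


class BasicCounts(MiniFeatureClass):
    """Child class that registers its own features in compute_features."""

    def compute_features(self):
        self.add_feature("n_nodes", lambda g: len(g["nodes"]),
                         "Number of nodes", 5)
        self.add_feature("n_edges", lambda g: len(g["edges"]),
                         "Number of edges", 5)


toy_graph = {"nodes": [0, 1, 2], "edges": [(0, 1), (1, 2)]}
counts = BasicCounts(toy_graph)
counts.compute_features()
```

After the call, `counts.features` holds one value per registered feature and `counts.descriptions` holds the matching description and interpretability score.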

compute_normalize_features()[source]

Triple the number of features by normalising each one by the number of nodes and by the number of edges.
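To make the tripling concrete, the following standalone sketch keeps each raw feature and adds two normalised copies. The function name `normalize_features` and the `_N` and `_E` suffixes are assumptions chosen for illustration.

```python
def normalize_features(features, n_nodes, n_edges):
    """Return the raw features plus node- and edge-normalised copies."""
    normalized = dict(features)  # keep the raw values
    for name, value in features.items():
        normalized[f"{name}_N"] = value / n_nodes  # assumed suffix
        normalized[f"{name}_E"] = value / n_edges  # assumed suffix
    return normalized


raw = {"triangles": 6.0}
tripled = normalize_features(raw, n_nodes=3, n_edges=2)
```

Each input feature yields three outputs, so one feature becomes three. A graph with no edges would cause a division by zero in this sketch and would need separate handling.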

evaluate_feature(feature_function, feature_name, function_args=None, statistics=None)[source]

Evaluate a feature function.

Any error raised during the computation is caught and results in a NaN feature value. The evaluation must also finish before the timeout, otherwise it returns NaN.

Parameters
  • feature_function (function) – function to evaluate to compute a feature

  • feature_name (str) – name of the feature

  • function_args (list) – additional arguments to pass to feature_function

  • statistics (str) – type of statistics to apply to high dimensional features.
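The behaviour described above (errors and timeouts both yield NaN) can be sketched as follows. This is not the library's mechanism: the helper `evaluate_safely` is hypothetical, and a worker thread from `concurrent.futures` is swapped in to enforce the time limit.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def evaluate_safely(feature_function, function_args=None, timeout=10):
    """Return the function's result, or NaN on any error or timeout."""
    args = function_args or []
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(feature_function, *args)
        # result() raises a timeout error if the limit is exceeded and
        # re-raises any exception thrown inside the function.
        return future.result(timeout=timeout)
    except Exception:
        return math.nan
    finally:
        # Do not block on a worker that is still running.
        executor.shutdown(wait=False)


ok_value = evaluate_safely(lambda a, b: a + b, [1, 2])
error_value = evaluate_safely(lambda: 1 / 0)
slow_value = evaluate_safely(time.sleep, [0.3], timeout=0.05)
```

A thread cannot be forcibly stopped, so in this sketch a timed-out function keeps running in the background until it finishes; a process-based or signal-based approach would be needed to actually cancel the work.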

get_feature_description(feature_name)[source]

Returns the description of the feature feature_name.

get_feature_info(feature_name)[source]

Returns a dictionary of information about the feature feature_name.

get_feature_interpretability(feature_name)[source]

Returns interpretability score of the feature feature_name.

get_features(all_features=False)[source]

Compute all the possible features.

get_info()[source]

Return a dictionary of information about the feature class.

classmethod setup_class(normalize_features=True, statistics_level='basic', n_node_features=0, timeout=10)[source]

Initializes the class by adding descriptions for all features.

Parameters
  • normalize_features (bool) – normalise features by number of nodes and number of edges

  • statistics_level (str) – level of summary statistics (e.g. ‘basic’, ‘advanced’) to compute for features that return distributions

  • n_node_features (int) – dimension of node features for feature constructors

  • timeout (int) – number of seconds before the calculation for a feature is cancelled

Returns

dataframe with feature information

Return type

(DataFrame)

class hcga.feature_class.InterpretabilityScore(score)[source]

Class to represent interpretability scores of features.

Init function for InterpretabilityScore.

Parameters

score (int or {‘min’, ‘max’}) – value of the score to set

get_score()[source]

Get the interpretability score.

set_score(score)[source]

Set the interpretability score.
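A score that accepts either an integer or the keywords ‘min’ and ‘max’ could be modelled along these lines. The class name `ScoreSketch`, the bounds 0 and 5, and the clamping of out-of-range integers are all assumptions of this sketch and are not confirmed by the documentation above.

```python
class ScoreSketch:
    """Hypothetical bounded score accepting an int or 'min'/'max'."""

    MIN_SCORE = 0  # assumed lower bound
    MAX_SCORE = 5  # assumed upper bound

    def __init__(self, score):
        self.set_score(score)

    def set_score(self, score):
        if score == "min":
            score = self.MIN_SCORE
        elif score == "max":
            score = self.MAX_SCORE
        elif not isinstance(score, int):
            raise ValueError("score must be an int, 'min' or 'max'")
        # Assumption: integers outside the range are clamped to the bounds.
        self._score = max(self.MIN_SCORE, min(self.MAX_SCORE, score))

    def get_score(self):
        return self._score


highest = ScoreSketch("max").get_score()
middle = ScoreSketch(3).get_score()
clamped = ScoreSketch(9).get_score()
```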