PyGenStability module#

PyGenStability code to solve generalized Markov Stability including Markov stability.

The generalized Markov Stability is of the form

\[Q_{gen}(t,H) = \mathrm{Tr} \left [H^T \left (F(t)-\sum_{k=1}^m v_{2k-1} v_{2k}^T\right)H\right]\]

where \(F(t)\) is the quality matrix and \(v_k\) are null model vectors. The choice of the quality matrix and null model vectors are arbitrary in the generalized Markov Stability setting, and can be parametrised via built-in constructors, or specified by the user via the constructor module.

pygenstability.pygenstability.run(graph=None, constructor='linearized', min_scale=-2.0, max_scale=0.5, n_scale=20, log_scale=True, scales=None, n_tries=100, with_all_tries=False, with_NVI=True, n_NVI=20, with_postprocessing=True, with_ttprime=True, with_spectral_gap=False, exp_comp_mode='spectral', result_file='results.pkl', n_workers=4, tqdm_disable=False, with_optimal_scales=True, optimal_scales_kwargs=None, method='louvain', constructor_kwargs=None)[source]#

This is the main function to compute graph clustering across scales with Markov Stability.

This function needs a graph object as an adjacency matrix encoded with scipy.csgraph. The default settings will provide a fast and generic run with linearized Markov Stability, which corresponds to modularity with a scale parameter. Other built-in constructors are available to perform Markov Stability with matrix exponential computations. Custom constructors can be added via the constructor module. Additional parameters can be used to set the range and number of scales, number of trials for generalized Markov Stability optimisation, with Louvain or Leiden algorithm.

Parameters:

graph (scipy.csgraph) – graph to cluster, if None, the constructor cannot be a str
constructor (str/function) – name of the generalized Markov Stability constructor, or custom constructor function. It must have two arguments, graph and scale.
min_scale (float) – minimum Markov scale
max_scale (float) – maximum Markov scale
n_scale (int) – number of scale steps
log_scale (bool) – use linear or log scales for scales
scales (array) – custom scale vector, if provided, it will override the other scale arguments
n_tries (int) – number of generalized Markov Stability optimisation evaluations
with_all_tries (bools) – store all partitions with stability values found in different optimisation evaluations
with_NVI (bool) – compute NVI(t) between generalized Markov Stability optimisations at each scale t
n_NVI (int) – number of randomly chosen generalized Markov Stability optimisations to estimate NVI
with_postprocessing (bool) – apply the final postprocessing step
with_ttprime (bool) – compute the NVI(t,tprime) matrix to compare scales t and tprime
with_spectral_gap (bool) – normalise scale by spectral gap
exp_comp_mode (str) – mode to compute matrix exponential, can be expm or spectral
result_file (str) – path to the result file
n_workers (int) – number of workers for multiprocessing
tqdm_disable (bool) – disable progress bars
with_optimal_scales (bool) – apply optimal scale selection algorithm
optimal_scales_kwargs (dict) – kwargs to pass to optimal scale selection, see optimal_scale module.
method (str) – optimiation method, louvain or leiden
constructor_kwargs (dict) – additional kwargs to pass to constructor prepare method

Returns:

Results dict with the following entries

’run_params’: dict with parameters used to run the code
’scales’: scales of the scan
’number_of_communities’: number of communities at each scale
’stability’: value of stability cost function at each scale
’community_id’: community node labels at each scale
’all_tries’: all community node labels with stability values found in different
optimisation evaluations at each scale (included if with_all_tries==True)
’NVI’: NVI(t) at each scale
’ttprime’: NVI(t,tprime) matrix
’block_nvi’: block NVI curve (included if with_optimal_scales==True)
’selected_partitions’: selected partitions (included if with_optimal_scales==True)

pygenstability.pygenstability.evaluate_NVI(index_pair, partitions)[source]#

Evaluations of Normalized Variation of Information (NVI).

NVI is defined for two partitions \(p1\) and \(p2\) as:

\[NVI = \frac{E(p1) + E(p2) - 2MI(p1, p2)}{JE(p1,p2)}\]

where \(E\) is the entropy, \(JE\) the joint entropy and \(MI\) the mutual information.

Parameters:

index_pair (list) – list of two indices to select pairs of partitions
partitions (list) – list of partitions

Returns:

float, Normalized Variation Information