Matrix

To reduce overhead and avoid external dependencies, PyHPO uses an internal matrix class, pyhpo.Matrix. It is used for row-wise and column-wise comparisons of HPOSets.

Matrix should not be used for other purposes: it performs very little error handling and expects callers to pass well-formed input.

Matrix class

class pyhpo.matrix.Matrix(rows, cols, data=None)

Poor man’s implementation of a DataFrame/Matrix

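This is used to calculate similarities between HPO sets and is considerably faster than using pandas DataFrames: in the profile shown below, the same set comparison took about 6.6 seconds with Matrix versus about 19.7 seconds with pandas.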

Note

Pandas:

===== COMPARING SETS ======
23806489 function calls (23770661 primitive calls) in 19.705 seconds

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
....
9900    0.267    0.000   19.106    0.002 set.py:318(similarity)
9900    1.124    0.000   14.330    0.001 set.py:477(_sim_score)
....

Matrix:

===== COMPARING SETS ======
12870433 function calls in 6.642 seconds

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
....
9900    0.048    0.000    6.424    0.001 set.py:316(similarity)
9900    0.928    0.000    5.112    0.001 set.py:432(_sim_score)
....
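
The listings above have the layout of Python's cProfile/pstats reports. As a rough sketch of how a comparable report can be produced, the snippet below profiles a hypothetical stand-in workload; `compare_all` and `profile_call` are illustrative names and are not part of PyHPO, and the actual benchmark script behind the numbers above is not shown here.

```python
import cProfile
import io
import pstats


def compare_all(values):
    """Hypothetical stand-in workload: sum of all pairwise absolute differences."""
    total = 0
    for a in values:
        for b in values:
            total += abs(a - b)
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(compare_all, list(range(100)))
print(report)
```

Sorting by cumulative time puts the outermost, most expensive calls first, which is how entries such as `similarity` and `_sim_score` stand out in the listings above.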

Warning

This Matrix should not be used as a public interface. It’s only used internally for calculations.

Parameters:
  • rows (int) – The number of rows in the Matrix

  • cols (int) – The number of columns in the Matrix

  • data (list of values, default None) – A list with values to fill the Matrix.
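
To illustrate how the three parameters fit together, here is a minimal sketch of a matrix backed by a single flat list. `FlatMatrix` and its `get` method are hypothetical names used only for illustration, and the row-major layout is an assumption; this is not the pyhpo implementation.

```python
class FlatMatrix:
    """Hypothetical stand-in for illustration, not the pyhpo class."""

    def __init__(self, rows, cols, data=None):
        self.n_rows = rows
        self.n_cols = cols
        # Assumption: values are stored row by row in one flat list.
        if data is None:
            self._data = [None] * (rows * cols)
        else:
            self._data = list(data)

    def get(self, row, col):
        """Return the value at (row, col) using row-major indexing."""
        return self._data[row * self.n_cols + col]


values = [11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34]
matrix = FlatMatrix(3, 4, values)
print(matrix.n_rows, matrix.n_cols)  # 3 4
print(matrix.get(1, 2))              # 23
```

Like the class described here, the sketch does no validation: passing a `data` list whose length is not `rows * cols` would only surface later as a wrong value or an IndexError.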

n_rows

The number of rows in the Matrix

Type:

int

n_cols

The number of columns in the Matrix

Type:

int

rows

Iterator over all rows

Example:

print(matrix)

>>    ||   0|   1|   2|   3|
>> =========================
>> 0  ||  11|  12|  13|  14|
>> 1  ||  21|  22|  23|  24|
>> 2  ||  31|  32|  33|  34|

for row in matrix.rows:
    print(row)

>> [11, 12, 13, 14]
>> [21, 22, 23, 24]
>> [31, 32, 33, 34]

Type:

iterator
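
One way such a row iterator could work, assuming the values are held in a row-major flat list, is to slice the list in steps of the column count. `iter_rows` is a hypothetical helper for illustration, not a pyhpo function.

```python
def iter_rows(data, n_cols):
    """Yield each row of a row-major flat list as a new list."""
    for start in range(0, len(data), n_cols):
        yield data[start:start + n_cols]


# Same values as the 3 x 4 example matrix above, stored row by row.
data = [11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34]
for row in iter_rows(data, 4):
    print(row)
```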

columns

Iterator over all columns

Example:

print(matrix)

>>    ||   0|   1|   2|   3|
>> =========================
>> 0  ||  11|  12|  13|  14|
>> 1  ||  21|  22|  23|  24|
>> 2  ||  31|  32|  33|  34|

for col in matrix.columns:
    print(col)

>> [11, 21, 31]
>> [12, 22, 32]
>> [13, 23, 33]
>> [14, 24, 34]

Type:

iterator
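
Under the same assumption of a row-major flat list, a column iterator can be sketched with strided slicing: column `c` consists of every `n_cols`-th value starting at index `c`. `iter_columns` is a hypothetical helper for illustration, not a pyhpo function.

```python
def iter_columns(data, n_cols):
    """Yield each column of a row-major flat list as a new list."""
    for col in range(n_cols):
        # Step through the flat list one full row at a time.
        yield data[col::n_cols]


# Same values as the 3 x 4 example matrix above, stored row by row.
data = [11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34]
for column in iter_columns(data, 4):
    print(column)
```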