algorithms: reduce memory footprint of local UDFs

Kostas FILIPPOPOLITIS requested to merge dev/713/reduce-udf-memory into master

Created by: jassak

Several memory-expensive lines in local UDFs have been identified and replaced with equivalent, memory-efficient ones.

All changes fall into three categories:

  • Avoid memory copies when an operation can be done in place
  • Use numpy.einsum, where applicable, for summing over arrays
  • Avoid sklearn methods when re-implementing them is easy, because sklearn was not written with the above two considerations in mind
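
The three categories above can be sketched as follows. This is a minimal illustration, not the code changed in this merge request: the function names (center_columns_inplace, column_sum_of_squares, binary_confusion_counts) and the sample data are hypothetical.

```python
import numpy as np


def center_columns_inplace(X):
    """Subtract the column means, writing the result back into X itself."""
    X -= X.mean(axis=0)  # no second array of X's size is allocated
    return X


def column_sum_of_squares(X):
    """Sum of squares per column, without materializing X ** 2."""
    return np.einsum("ij,ij->j", X, X)


def binary_confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for boolean arrays, using numpy only."""
    tp = np.count_nonzero(y_true & y_pred)
    fp = np.count_nonzero(~y_true & y_pred)
    fn = np.count_nonzero(y_true & ~y_pred)
    tn = y_true.size - tp - fp - fn
    return tn, fp, fn, tp


rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))

# Allocating references, computed here only to check the results.
expected_centered = X - X.mean(axis=0)
expected_sum_sq = (X ** 2).sum(axis=0)

sum_sq = column_sum_of_squares(X)
center_columns_inplace(X)  # X is modified from here on

y_true = np.array([True, True, False, False, True])
y_pred = np.array([True, False, True, False, True])
counts = binary_confusion_counts(y_true, y_pred)
```

The in-place version trades away the original contents of X, so it only applies when the uncentered data is not needed afterwards.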

There is also one change in udfio.py. When constructing a pandas.DataFrame from separate columns for relational tables, the flag copy=False is now used. This avoids consolidating the columns into a single 2-dimensional array at construction time. However, if a later operation in the UDF requires the columns to be consolidated, pandas will still do it in the background, copying their data into a single contiguous block of memory. This is unavoidable: many numpy operations need contiguous blocks of memory to be efficient, because they rely on the CPU cache and on CPU vectorization.
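
A minimal sketch of this behaviour, assuming a recent pandas version and two hypothetical float columns named x and y (this is not the code in udfio.py):

```python
import numpy as np
import pandas as pd

n = 100_000
# Hypothetical columns standing in for the columns of a relational table.
columns = {
    "x": np.arange(n, dtype=np.float64),
    "y": np.ones(n, dtype=np.float64),
}

# copy=False asks pandas to wrap the given arrays instead of copying
# them into one consolidated 2-dimensional block.
df = pd.DataFrame(columns, copy=False)

# Check whether the frame still points at the original buffer.
shares_before = np.shares_memory(df["x"].to_numpy(), columns["x"])

# An operation that needs a single 2-dimensional array forces pandas
# to gather the columns into one contiguous block.
consolidated = df.to_numpy()
shares_after = np.shares_memory(consolidated, columns["x"])

print(shares_before, shares_after)
```

Whether the memory saving survives therefore depends on what the UDF does with the frame afterwards.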

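In most cases the memory reduction is accompanied by a reduction in execution time as well. The exceptions in the table below are pca.py (local2) and logistic_regression.py, which became slower.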

Results are presented in the following table. The ratio columns give the value after the change as a percentage of the value before it.

| file | method | data | mem before | mem after | mem after/before | time before | time after | time after/before |
|---|---|---|---|---|---|---|---|---|
| pca.py | local1 | 160MB | 172MB | 2KB | 0.001% | 250ms | 30ms | 12% |
| pca.py | local2 | 160MB | 305MB | 153MB | 50% | 300ms | 400ms | 133% |
| pearson.py | local1 | 320MB | 153MB | 5KB | 0.003% | 330ms | 300ms | 90% |
| ttest_independent.py | local_independet | 16MB | 24MB | 3KB | 0.012% | 360ms | 190ms | 53% |
| ttest_onesample.py | local_one_sample | 8MB | 16MB | 3KB | 0.018% | 380ms | 800μs | 0.21% |
| ttest_paired.py | local_paired | 16MB | 24MB | 8MB | 33% | 600ms | 4ms | 0.67% |
| metrics.py | _confusion_matrix_local | 16MB | 44MB | 2MB | 4.5% | 140ms | 6ms | 4.3% |
| metrics.py | _roc_curve_local | 16MB | 1.6GB | 2MB | 0.12% | 2.7s | 600ms | 22% |
| logistic_regression.py | LogistcRegression._fit_local_step | 24MB | 24MB | 8KB | 0.32% | 23ms | 233ms | 1000% |
| linear_regression.py | LinearRegression._compute_summary_local | 16MB | 8MB | 8MB | 100% | 360ms | 5ms | 1.3% |
| anova_oneway.py | local1 | 16MB | 106MB | 74MB | 70% | 200ms | 100ms | 50% |
| destriptive_stats.py | local | 168MB | 424MB | 358MB | 84% | 6s | 5.8s | 97% |
| udfio.py | from_relational_table | 2GB | 2GB | 8KB | ~0% | 1.6s | 150μs | 0.009% |
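
The description does not say how these figures were measured. One possible way to obtain comparable before/after numbers, shown purely as an assumption, is the standard-library tracemalloc module together with time.perf_counter. The helper measure and the two sample functions are hypothetical.

```python
import time
import tracemalloc

import numpy as np


def measure(func, *args):
    """Return (result, peak_bytes, seconds) for a single call of func."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = func(*args)
    seconds = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak, seconds


def sum_sq_naive(X):
    # Allocates a temporary array as large as X before summing.
    return (X ** 2).sum(axis=0)


def sum_sq_einsum(X):
    # Accumulates the products directly into the small output array.
    return np.einsum("ij,ij->j", X, X)


X = np.random.default_rng(0).standard_normal((200_000, 4))
res_naive, peak_naive, t_naive = measure(sum_sq_naive, X)
res_einsum, peak_einsum, t_einsum = measure(sum_sq_einsum, X)

print(f"naive : peak {peak_naive} bytes, {t_naive:.4f} s")
print(f"einsum: peak {peak_einsum} bytes, {t_einsum:.4f} s")
```

Tracing adds overhead, so timings taken this way are best compared with each other rather than read as absolute values.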
