[SYSTEMDS-3288] CLA SDC isolated DefaultTuple #1533

Merged
Baunsgaard merged 2 commits into apache:main from
Baunsgaard:SmarterSDC
Feb 8, 2022

Conversation

@Baunsgaard
Contributor

No description provided.

@Baunsgaard Baunsgaard marked this pull request as draft February 7, 2022 18:25
This commit changes the SDC groups with a default tuple to isolate the
default into an array separate from the dictionary. This enables cheap
"morphing" between SDC-like column groups, which improves the performance
of matrix multiplications and similar operations that benefit from
morphing column groups into more efficient types.
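
The idea of an isolated default can be illustrated with a minimal, single-column sketch. It is not the SystemDS implementation: the class `SdcSketch` and all of its fields and methods are hypothetical names chosen for illustration, and the real column groups handle multi-column tuples and specialized offset and mapping encodings. The sketch assumes that non-default rows are listed by offset and mapped into a dictionary that no longer contains the default.

```java
import java.util.Arrays;

/** Hypothetical single-column sketch of an SDC-like group; not the SystemDS classes. */
final class SdcSketch {
    final int numRows;
    final double defaultValue; // isolated default, kept outside the dictionary
    final double[] dict;       // distinct non-default values only
    final int[] offsets;       // sorted row indexes holding non-default values
    final int[] mapping;       // dictionary index for each entry in offsets

    SdcSketch(int numRows, double defaultValue, double[] dict, int[] offsets, int[] mapping) {
        this.numRows = numRows;
        this.defaultValue = defaultValue;
        this.dict = dict;
        this.offsets = offsets;
        this.mapping = mapping;
    }

    /** Materialize the column: fill with the default, then overwrite the exceptions. */
    double[] decompress() {
        double[] out = new double[numRows];
        Arrays.fill(out, defaultValue);
        for (int i = 0; i < offsets.length; i++)
            out[offsets[i]] = dict[mapping[i]];
        return out;
    }

    /**
     * "Morph" into a zero-default group by subtracting the default.
     * Only the small dictionary is rewritten; offsets and mapping are shared.
     */
    SdcSketch subtractDefault() {
        double[] shifted = new double[dict.length];
        for (int i = 0; i < dict.length; i++)
            shifted[i] = dict[i] - defaultValue;
        return new SdcSketch(numRows, 0.0, shifted, offsets, mapping);
    }

    /** Column sum without decompressing: default rows plus the listed exceptions. */
    double sum() {
        double s = defaultValue * (numRows - offsets.length);
        for (int i = 0; i < offsets.length; i++)
            s += dict[mapping[i]];
        return s;
    }
}
```

In this sketch the morph costs time proportional to the dictionary size rather than the number of rows, which is one plausible reading of why the change makes morphing cheap.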

Minor additions (speedups apply to the named operation only):

- Improved CLA rexpand (one hot encode) by ~10-50x
- BitSet DDC preAggregate to use Bit set operations ~10x
- MapToData getCounts specialization ~3x
- Full integration of centralMoment (previously extracted MatrixBlock)
- Hardening interface for AColGroup, reduce inefficient extractions
- Compression time for census_enc reduced by ~1 sec, now ~5.5-6.5 sec
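
The bit set and getCounts items above can be sketched for the two-value case. This is an assumption-laden illustration, not the SystemDS code: `BitMapSketch` and its methods are hypothetical, and the sketch assumes a mapping with exactly two dictionary entries stored as a `java.util.BitSet`, where a set bit means "row maps to entry 1".

```java
import java.util.BitSet;

/** Hypothetical helper for a two-entry mapping stored as a BitSet; not the SystemDS API. */
final class BitMapSketch {

    /** Rows per dictionary entry, using a population count instead of a per-row loop. */
    static int[] getCounts(BitSet map, int numRows) {
        int ones = map.cardinality();
        return new int[] {numRows - ones, ones};
    }

    /**
     * Pre-aggregate a vector by dictionary entry: result[k] is the sum of v[i]
     * over all rows i mapped to entry k. Only set bits are visited; the sum for
     * entry 0 is derived from the total.
     */
    static double[] preAggregate(double[] v, BitSet map) {
        double total = 0;
        for (double x : v)
            total += x;
        double one = 0;
        for (int i = map.nextSetBit(0); i >= 0 && i < v.length; i = map.nextSetBit(i + 1))
            one += v[i];
        return new double[] {total - one, one};
    }
}
```

For example, with `v = {1, 2, 3, 4, 5}` and bits 1 and 3 set, the counts are `{3, 2}` and the pre-aggregate is `{9.0, 6.0}`.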

Closes apache#1533
@Baunsgaard Baunsgaard marked this pull request as ready for review February 8, 2022 13:58
@Baunsgaard Baunsgaard closed this in b8d4897 Feb 8, 2022
@Baunsgaard Baunsgaard merged commit b8d4897 into apache:main Feb 8, 2022
@github-pages github-pages bot temporarily deployed to github-pages February 8, 2022 14:12 Inactive
@Baunsgaard Baunsgaard deleted the SmarterSDC branch March 16, 2022 13:45