For values of data type `String`, ClickHouse® applies a hashing algorithm before storing them in the internal array; otherwise the amount of space needed could become enormous.
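To make the space argument concrete, here is a minimal sketch of fixed-width hashing. It does not use `sipHash128`, which is not available in the Python standard library; `hashlib.blake2b` with a 16-byte digest is only a stand-in chosen for illustration, and the helper name `fixed_width_hash` is hypothetical.

```python
import hashlib


def fixed_width_hash(value: str) -> bytes:
    """Map a string of any length to 16 bytes (128 bits).

    Stand-in only: ClickHouse uses sipHash128, not BLAKE2b.
    """
    return hashlib.blake2b(value.encode("utf-8"), digest_size=16).digest()


short = fixed_width_hash("a")
long_ = fixed_width_hash("a" * 10_000)

# Both digests occupy the same 16 bytes, regardless of input length.
print(len(short), len(long_))
```

Whatever the length of the input, each stored entry costs the same 16 bytes, which is what keeps the internal array compact.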
The second task: it needs to read a state and split it into an array of values.
+#### Getting the Hash Values
+The second task: now that we know how the state is formed, how can we demangle it and convert it into an `Array` of values?
+Unfortunately, it is not possible to get the original values back, because `sipHash128` is a one-way conversion, but we can at least get an `Array` of hashes.
Luckily for us, ClickHouse® uses the exact same serialization (`LEB128` + list of values) for Arrays (in this case, when both `uniqExactState` and `Array` are serialized in the `RowBinary` format).
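The layout described above (an unsigned `LEB128` element count followed by the values) can be sketched outside of ClickHouse®. The following Python snippet is an illustration under assumptions, not ClickHouse® code: the function names `read_leb128` and `split_state` are hypothetical, the demo bytes are made up, and each value is assumed to be 16 bytes wide.

```python
def read_leb128(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:             # high bit clear: last byte
            return value, pos
        shift += 7


def split_state(state: bytes, width: int = 16) -> list[bytes]:
    """Split `LEB128 count + count * width bytes` into fixed-width chunks."""
    count, pos = read_leb128(state)
    expected = pos + count * width
    if len(state) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(state)}")
    return [state[pos + i * width : pos + (i + 1) * width] for i in range(count)]


# Made-up two-element state: count byte 0x02 followed by two 16-byte values.
demo = bytes([2]) + bytes(range(16)) + bytes(range(16, 32))
print([chunk.hex() for chunk in split_state(demo)])
```

Because an `Array` of fixed-width values uses the same count-then-values layout, the same bytes can be read either as the state or as the array.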
-We need a helper `UDF` function to do that conversion:
+One way to "convert" the `uniqExactState` to an `Array` of hashes would be via an external helper
Here is a full example of how to convert `uniqExactState(string)` to `uniqState(string)` or `uniqCombinedState(string)` using the `pipe` UDF and `arrayReduce('func', [..])`.
+This approach only works if you have direct access to your ClickHouse® installation.
+However, if you are on a managed platform like Altinity.Cloud, installing executable `UDF`s is typically not supported for security reasons.
+Luckily, we know that the internal representation of `sipHash128` is `FixedString(16)`, which is exactly 128 bits wide. `UInt128` also takes up exactly 128 bits.
+Therefore, we can treat the `uniqExactState(String)` as a representation of `Array(UInt128)`.
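The equivalence between a 16-byte hash and a 128-bit unsigned integer can be illustrated with a short Python sketch. It assumes little-endian byte order (as on x86-64); the function name `hashes_as_uint128` and the sample bytes are hypothetical and only serve the illustration.

```python
def hashes_as_uint128(chunks: list[bytes]) -> list[int]:
    """Interpret each 16-byte chunk as an unsigned 128-bit integer.

    Assumption: little-endian byte order.
    """
    result = []
    for chunk in chunks:
        if len(chunk) != 16:
            raise ValueError(f"expected 16 bytes, got {len(chunk)}")
        result.append(int.from_bytes(chunk, "little"))
    return result


# Made-up 16-byte values standing in for stored hashes.
sample = [b"\x01" + b"\x00" * 15, b"\xff" * 16]
numbers = hashes_as_uint128(sample)
print(numbers)

# The mapping is lossless: converting back yields the original bytes.
restored = [n.to_bytes(16, "little") for n in numbers]
```

No information is lost in either direction, which is why the same 16 bytes can be viewed as a `FixedString(16)` or as a `UInt128`.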
+
+Again, we can convert our state to an `Array`:
As you can see, the `Array` is identical to the one we created with the `pipe` function.
+
+#### Full Example of Conversion
+
+Here is a full example of how to convert `uniqExactState(string)` to any approximate `uniq` function, such as `uniqState(string)` or `uniqCombinedState(string)`, using `reinterpret` and `arrayReduce('func', [..])`.
```sql
-- Generate demo with random data, uniqs are stored as heavy uniqExact
@@ -89,25 +132,16 @@ GROUP BY id;
-- Let's add new columns to store optimized, approximate uniq & uniqCombined