(Filling out resources section) |
(Fixing l10l sampling methodology.) |
||
Line 24: | Line 24: | ||
SELECT * FROM longitudinal LIMIT 1000 ... | SELECT * FROM longitudinal LIMIT 1000 ... | ||
For a statistically sound sample, use TABLESAMPLE: | |||
SELECT * FROM longitudinal TABLESAMPLE BERNOULLI(xx) | |||
Here xx is an integer giving the percentage of rows to include in your sample (e.g. for a 10% sample, xx = 10). | |||
A couple of caveats: | |||
* This sampling method only reduces query run time when most of the time is spent processing the rows rather than reading them, because Bernoulli sampling still has to read the whole table before proceeding. | |||
* The sample is not deterministic, i.e. you will not get the same sample on every run. This can cause problems when using Presto views or logical tables. | |||
* Unlike LIMIT, this method does not guarantee a fixed number of results. | |||
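The caveats above can be illustrated with a small simulation. This is only a sketch of the general Bernoulli sampling idea (each row is kept independently with probability xx/100), not Presto's actual implementation. The helper name `bernoulli_sample`, the stand-in row list, and the seeds are all hypothetical and exist only for the demonstration.

```python
import random


def bernoulli_sample(rows, percent, rng):
    """Keep each row independently with probability percent/100.

    Every row is visited, which mirrors why Bernoulli sampling
    does not save any read time on its own.
    """
    p = percent / 100.0
    return [row for row in rows if rng.random() < p]


# Stand-in for table rows (hypothetical data, not the real dataset).
rows = list(range(10_000))

# Two runs with different random state: the selected rows differ,
# and the number of rows returned is only approximately 10%.
run_a = bernoulli_sample(rows, 10, random.Random(1))
run_b = bernoulli_sample(rows, 10, random.Random(2))

# LIMIT-style truncation, by contrast, always returns exactly n rows,
# but they are simply the first n rows rather than a random sample.
limited = rows[:1000]

print(len(run_a), len(run_b), len(limited))
```

The seeds appear here only so the simulation can be repeated. They do not imply that TABLESAMPLE accepts a seed.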
=== Arrays === | === Arrays === |