Near Cloud Storage Computing

Modern cloud platforms disaggregate computation and storage into separate services. In this project, we explore using the limited computation available inside the Simple Storage Service (S3) offered by AWS to accelerate data analytics. We use the existing S3 Select feature to accelerate not only simple database operators such as select and project, but also complex operators such as join, group-by, and top-K. We propose optimization techniques for each operator and demonstrate more than 6x performance improvement on a set of representative queries.
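To make the select and project pushdown concrete, here is a minimal sketch of how a filter and a column projection might be sent to S3 Select through boto3's `select_object_content` call, so that only matching rows and columns leave the storage service. The bucket name, object key, column names, and the helper functions `build_pushdown_request` and `run_pushdown` are hypothetical and chosen for illustration; this is not the project's actual implementation, and it does not cover the join, group-by, or top-K techniques.

```python
def build_pushdown_request(bucket, key, columns, predicate=None):
    """Build keyword arguments for boto3's S3 client select_object_content.

    Hypothetical helper: the projection (column list) and the selection
    (WHERE predicate) are both expressed in the S3 Select SQL expression,
    so they are evaluated inside S3 rather than on the compute node.
    """
    projection = ", ".join(f's."{c}"' for c in columns)
    sql = f"SELECT {projection} FROM S3Object s"
    if predicate:
        sql += f" WHERE {predicate}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # Assumes a CSV object whose first line holds the column names.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def run_pushdown(request):
    """Send the request to S3 Select and return the filtered rows as text.

    Requires the third-party boto3 package and valid AWS credentials.
    """
    import boto3  # imported lazily so the request builder has no dependencies

    client = boto3.client("s3")
    response = client.select_object_content(**request)
    chunks = []
    for event in response["Payload"]:  # results arrive as an event stream
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # Join the raw bytes before decoding so multi-byte characters that
    # straddle a chunk boundary are decoded correctly.
    return b"".join(chunks).decode("utf-8")


# Hypothetical bucket, key, and column names, for illustration only.
request = build_pushdown_request(
    "example-bucket",
    "lineitem.csv",
    ["l_orderkey", "l_extendedprice"],
    "CAST(s.l_quantity AS INT) > 40",
)
```

Calling `run_pushdown(request)` would then return only the two projected columns for rows that satisfy the predicate, which is where the reduction in data transferred from storage to compute comes from.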


Michael Stonebraker, Xiangyao Yu