2023 (35 publications)
AutoOD: Automatic Outlier Detection. Proc. ACM Manag. Data, 2023

Lei Cao, Yizhou Yan, Yu Wang, Samuel Madden, Elke A. Rundensteiner

Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning. Proc. ACM Manag. Data, 2023

Zihui Gu, Ju Fan, Nan Tang, Lei Cao, Bowen Jia, Sam Madden, Xiaoyong Du

FactorJoin: A New Cardinality Estimation Framework for Join Queries. Proc. ACM Manag. Data, 2023

Ziniu Wu, Parimarjan Negi, Mohammad Alizadeh, Tim Kraska, Samuel Madden

Extract-Transform-Load for Video Streams. Proc. VLDB Endow., 2023

Ferdinand Kossmann, Ziniu Wu, Eugenie Lai, Nesime Tatbul, Lei Cao, Tim Kraska, Sam Madden

Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing with Learned Automated Data Meshes. Proc. VLDB Endow., 2023

Tim Kraska, Tianyu Li, Samuel Madden, Markos Markakis, Amadou Ngom, Ziniu Wu, Geoffrey X. Yu

Robust Query Driven Cardinality Estimation under Changing Workloads. Proc. VLDB Endow., 2023

Parimarjan Negi, Ziniu Wu, Andreas Kipf, Nesime Tatbul, Ryan Marcus, Sam Madden, Tim Kraska, Mohammad Alizadeh

Pando: Enhanced Data Skipping with Logical Data Partitioning. Proc. VLDB Endow., 2023

Sivaprasad Sudhir, Wenbo Tao, Nikolay Pavlovich Laptev, Cyrille Habis, Michael J. Cafarella, Samuel Madden

Future of Database System Architectures. SIGMOD Conference Companion, 2023

Gustavo Alonso, Natassa Ailamaki, Sailesh Krishnamurthy, Sam Madden, Swami Sivasubramanian, Raghu Ramakrishnan

Interpretable Outlier Summarization. CoRR, 2023

Yu Wang, Lei Cao, Yizhou Yan, Samuel Madden

RITA: Group Attention is All You Need for Timeseries Analytics. CoRR, 2023

Jiaming Liang, Lei Cao, Samuel Madden, Zachary G. Ives, Guoliang Li

Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation. CoRR, 2023

Zihui Gu, Ju Fan, Nan Tang, Songyue Zhang, Yuxin Zhang, Zui Chen, Lei Cao, Guoliang Li, Sam Madden, Xiaoyong Du

SEED: Simple, Efficient, and Effective Data Management via Large Language Models. CoRR, 2023

Zui Chen, Lei Cao, Sam Madden, Ju Fan, Nan Tang, Zihui Gu, Zeyuan Shang, Chunwei Liu, Michael J. Cafarella, Tim Kraska

Extract-Transform-Load for Video Streams. CoRR, 2023

Ferdinand Kossmann, Ziniu Wu, Eugenie Lai, Nesime Tatbul, Lei Cao, Tim Kraska, Samuel Madden

R3: Record-Replay-Retroaction for Database-Backed Applications. Proc. VLDB Endow., 2023

Qian Li, Peter Kraft, Michael J. Cafarella, Çagatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Xiangyao Yu, Matei Zaharia

Transactions Make Debugging Easy. CIDR, 2023

Qian Li, Peter Kraft, Michael J. Cafarella, Çagatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Matei Zaharia

Causal Data Integration. Proc. VLDB Endow., 2023

Brit Youngmann, Michael J. Cafarella, Babak Salimi, Anna Zeng

On Explaining Confounding Bias. ICDE, 2023

Brit Youngmann, Michael J. Cafarella, Yuval Moskovitch, Babak Salimi

NEXUS: On Explaining Confounding Bias. SIGMOD Conference Companion, 2023

Brit Youngmann, Michael J. Cafarella, Yuval Moskovitch, Babak Salimi

Causal Data Integration. CoRR, 2023

Brit Youngmann, Michael J. Cafarella, Babak Salimi, Anna Zeng

Epoxy: ACID Transactions Across Diverse Data Stores. Proc. VLDB Endow., 2023

Peter Kraft, Qian Li, Xinjing Zhou, Peter Bailis, Michael Stonebraker, Xiangyao Yu, Matei Zaharia

Two is Better Than One: The Case for 2-Tree for Skewed Data Sets. CIDR, 2023

Xinjing Zhou, Xiangyao Yu, Goetz Graefe, Michael Stonebraker

Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28 - September 1, 2023. VLDB Workshops, CEUR Workshop Proceedings, 2023

Rajesh Bordawekar, Cinzia Cappiello, Vasilis Efthymiou, Lisa Ehrlinger, Vijay Gadepally, Sainyam Galhotra, Sandra Geisler, Sven Groppe, Le Gruenwald, Alon Y. Halevy, Hazar Harmouch, Oktie Hassanzadeh, Ihab F. Ilyas, Ernesto Jiménez-Ruiz, Sanjay Krishnan, Tirthankar Lahiri, Guoliang Li, Jiaheng Lu, Wolfgang Mauerer, Umar Farooq Minhas, Felix Naumann, M. Tamer Özsu, El Kindi Rezig, Kavitha Srinivas, Michael Stonebraker, Satyanarayana R. Valluri, Maria-Esther Vidal, Haixun Wang, Jiannan Wang, Yingjun Wu, Xun Xue, Mohamed Zaït, Kai Zeng

Unshackling Database Benchmarking from Synthetic Workloads. ICDE, 2023

Parimarjan Negi, Laurent Bindschaedler, Mohammad Alizadeh, Tim Kraska, Jyoti Leeka, Anja Gruenheid, Matteo Interlandi

Auto-WLM: Machine Learning Enhanced Workload Management in Amazon Redshift. SIGMOD Conference Companion, 2023

Gaurav Saxena, Mohammad Rahman, Naresh Chainani, Chunbin Lin, George Caragea, Fahim Chowdhury, Ryan Marcus, Tim Kraska, Ippokratis Pandis, Balakrishnan (Murali) Narayanaswamy

CorBit: Leveraging Correlations for Compressing Bitmap Indexes. VLDB Workshops, 2023

Xi Lyu, Andreas Kipf, Pascal Pfeil, Dominik Horn, Jana Giceva, Tim Kraska

Hyperspecialized Compilation for Serverless Data Analytics. VLDB Workshops, 2023

Leonhard F. Spiegelberg, Tim Kraska, Malte Schwarzkopf

2022 (36 publications)
ExSample: Efficient Searches on Video Repositories through Adaptive Sampling. ICDE, 2022

Oscar R. Moll, Favyen Bastani, Sam Madden, Mike Stonebraker, Vijay Gadepally, Tim Kraska

A Demonstration of AutoOD: A Self-tuning Anomaly Detection System. Proc. VLDB Endow., 2022

Dennis M. Hofmann, Peter VanNostrand, Huayi Zhang, Yizhou Yan, Lei Cao, Samuel Madden, Elke A. Rundensteiner

Self-Organizing Data Containers. CIDR, 2022

Samuel Madden, Jialin Ding, Tim Kraska, Sivaprasad Sudhir, David E. Cohen, Timothy G. Mattson, Nesime Tatbul

Ad-hoc Searches on Image Databases. Poly/DMAH@VLDB, 2022

Oscar R. Moll Thomae, Sam Madden, Vijay Gadepally

Tile-based Lightweight Integer Compression in GPU. SIGMOD Conference, 2022

Anil Shanbhag, Bobbi W. Yogatama, Xiangyao Yu, Samuel Madden

SeeSaw: interactive ad-hoc search over image databases. CoRR, 2022

Oscar R. Moll, Manuel Favela, Samuel Madden, Vijay Gadepally

FactorJoin: A New Cardinality Estimation Framework for Join Queries. CoRR, 2022

Ziniu Wu, Parimarjan Negi, Mohammad Alizadeh, Tim Kraska, Samuel Madden

Nonintrusive Measurements for Detecting Progressive Equipment Faults. IEEE Trans. Instrum. Meas., 2022

Daisy H. Green, Devin W. Quinn, Samuel Madden, Peter A. Lindahl, Steven B. Leeb

A Progress Report on DBOS: A Database-oriented Operating System. CIDR, 2022

Qian Li, Peter Kraft, Kostis Kaffes, Athinagoras Skiadopoulos, Deeptaanshu Kumar, Jason Li, Michael J. Cafarella, Goetz Graefe, Jeremy Kepner, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Matei Zaharia

Apiary: A DBMS-Backed Transactional Function-as-a-Service Framework. CoRR, 2022

Peter Kraft, Qian Li, Kostis Kaffes, Athinagoras Skiadopoulos, Deeptaanshu Kumar, Danny Cho, Jason Li, Robert Redmond, Nathan W. Weckwerth, Brian S. Xia, Peter Bailis, Michael J. Cafarella, Goetz Graefe, Jeremy Kepner, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Xiangyao Yu, Matei Zaharia

Transactions Make Debugging Easy. CoRR, 2022

Qian Li, Peter Kraft, Michael J. Cafarella, Çagatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Matei Zaharia

Infrastructure for Rapid Open Knowledge Network Development. AI Mag., 2022

Michael J. Cafarella, Michael R. Anderson, Iz Beltagy, Arie Cattan, Sarah E. Chasins, Ido Dagan, Doug Downey, Oren Etzioni, Sergey Feldman, Tian Gao, Tom Hope, Kexin Huang, Sophie Johnson, Daniel King, Kyle Lo, Yuze Lou, Matthew D. Shapiro, Dinghao Shen, Shivashankar Subramanian, Lucy Lu Wang, Yuning Wang, Yitong Wang, Daniel S. Weld, Jenny M. Vo-Phamhi, Anna Zeng, Jiayun Zou

Building a Shared Conceptual Model of Complex, Heterogeneous Data Systems: A Demonstration. CIDR, 2022

Michael R. Anderson, Yuze Lou, Jiayun Zou, Michael J. Cafarella, Sarah E. Chasins, Doug Downey, Tian Gao, Kexin Huang, Dinghao Shen, Jenny M. Vo-Phamhi, Yitong Wang, Yuning Wang, Anna Zeng

Debugging the OmniTable Way. OSDI, 2022

Andrew Quinn, Jason Flinn, Michael J. Cafarella, Baris Kasikci

On Explaining Confounding Bias. CoRR, 2022

Brit Youngmann, Michael J. Cafarella, Yuval Moskovitch, Babak Salimi

The Seattle report on database research. Commun. ACM, 2022

Daniel Abadi, Anastasia Ailamaki, David G. Andersen, Peter Bailis, Magdalena Balazinska, Philip A. Bernstein, Peter A. Boncz, Surajit Chaudhuri, Alvin Cheung, AnHai Doan, Luna Dong, Michael J. Franklin, Juliana Freire, Alon Y. Halevy, Joseph M. Hellerstein, Stratos Idreos, Donald Kossmann, Tim Kraska, Sailesh Krishnamurthy, Volker Markl, Sergey Melnik, Tova Milo, C. Mohan, Thomas Neumann, Beng Chin Ooi, Fatma Ozcan, Jignesh M. Patel, Andrew Pavlo, Raluca A. Popa, Raghu Ramakrishnan, Christopher Ré, Michael Stonebraker, Dan Suciu

Applying Machine Learning and Data Fusion to the "Missing Person" Problem. Computer, 2022

K. M. A. Solaiman, Tao Sun, Alina Nesen, Bharat K. Bhargava, Michael Stonebraker

Kyrix-J: Visual Discovery of Connected Datasets in a Data Lake. CIDR, 2022

Wenbo Tao, Adam Sah, Leilani Battle, Remco Chang, Michael Stonebraker

Machine Learning with DBOS. CoRR, 2022

Robert Redmond, Nathan W. Weckwerth, Brian S. Xia, Qian Li, Peter Kraft, Deeptaanshu Kumar, Çagatay Demiralp, Michael Stonebraker

Research Report: Progress on Building a File Observatory for Secure Parser Development. SP, 2022

Tim Allison, Wayne Burke, Dustin Graf, Chris Mattmann, Anastasija Mensikova, Mike Milano, Philip Southam, Ryan Stonebraker

SageDB: An Instance-Optimized Data Analytics System. Proc. VLDB Endow., 2022

Jialin Ding, Ryan Marcus, Andreas Kipf, Vikram Nathan, Aniruddha Nrusimha, Kapil Vaidya, Alexander van Renen, Tim Kraska

Can Learned Models Replace Hash Functions? Proc. VLDB Endow., 2022

Ibrahim Sabek, Kapil Vaidya, Dominik Horn, Andreas Kipf, Michael Mitzenmacher, Tim Kraska

SNARF: A Learning-Enhanced Range Filter. Proc. VLDB Endow., 2022

Kapil Vaidya, Tim Kraska, Subarna Chatterjee, Eric R. Knorr, Michael Mitzenmacher, Stratos Idreos

TreeLine: An Update-In-Place Key-Value Store for Modern Storage. Proc. VLDB Endow., 2022

Geoffrey X. Yu, Markos Markakis, Andreas Kipf, Per-Åke Larson, Umar Farooq Minhas, Tim Kraska

Bao: Making Learned Query Optimization Practical. SIGMOD Rec., 2022

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, Tim Kraska

LSI: a learned secondary index structure. aiDM@SIGMOD, 2022

Andreas Kipf, Dominik Horn, Pascal Pfeil, Ryan Marcus, Tim Kraska

LSI: A Learned Secondary Index Structure. CoRR, 2022

Andreas Kipf, Dominik Horn, Pascal Pfeil, Ryan Marcus, Tim Kraska

2021 (51 publications)
Inferring and improving street maps with data-driven automation. Commun. ACM, 2021

Favyen Bastani, Songtao He, Satvat Jagwani, Edward Park, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi

Replicated Layout for In-Memory Database Systems. Proc. VLDB Endow., 2021

Sivaprasad Sudhir, Michael J. Cafarella, Samuel Madden

RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation. Proc. VLDB Endow., 2021

Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Samuel Madden, Mourad Ouzzani

LANCET: Labeling Complex Data at Scale. Proc. VLDB Endow., 2021

Huayi Zhang, Lei Cao, Samuel Madden, Elke A. Rundensteiner

Updating Street Maps using Changes Detected in Satellite Imagery. SIGSPATIAL/GIS, 2021

Favyen Bastani, Songtao He, Satvat Jagwani, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi

Inferring high-resolution traffic accident risk maps based on satellite imagery and GPS trajectories. ICCV, 2021

Songtao He, Mohammad Amin Sadeghi, Sanjay Chawla, Mohammad Alizadeh, Hari Balakrishnan, Samuel Madden

ELITE: Robust Deep Anomaly Detection with Meta Gradient. KDD, 2021

Huayi Zhang, Lei Cao, Peter VanNostrand, Samuel Madden, Elke A. Rundensteiner

SkyQuery: an aerial drone video sensing platform. Onward, 2021

Favyen Bastani, Songtao He, Ziwen Jiang, Osbert Bastani, Sam Madden

Asynchronous Prefix Recoverability for Fast Distributed Stores. SIGMOD Conference, 2021

Tianyu Li, Badrish Chandramouli, Jose M. Faleiro, Samuel Madden, Donald Kossmann

TagMe: GPS-Assisted Automatic Object Annotation in Videos. CoRR, 2021

Songtao He, Favyen Bastani, Mohammad Alizadeh, Hari Balakrishnan, Michael J. Cafarella, Tim Kraska, Sam Madden

SkyQuery: An Aerial Drone Video Sensing Platform. CoRR, 2021

Favyen Bastani, Songtao He, Ziwen Jiang, Osbert Bastani, Michael J. Cafarella, Tim Kraska, Sam Madden

Updating Street Maps using Changes Detected in Satellite Imagery. CoRR, 2021

Favyen Bastani, Songtao He, Satvat Jagwani, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi

DBOS: A DBMS-oriented Operating System. Proc. VLDB Endow., 2021

Athinagoras Skiadopoulos, Qian Li, Peter Kraft, Kostis Kaffes, Daniel Hong, Shana Mathew, David Bestor, Michael J. Cafarella, Vijay Gadepally, Goetz Graefe, Jeremy Kepner, Christos Kozyrakis, Tim Kraska, Michael Stonebraker, Lalith Suresh, Matei Zaharia

Data Governance in a Database Operating System (DBOS). Poly/DMAH@VLDB, 2021

Deeptaanshu Kumar, Qian Li, Jason Li, Peter Kraft, Athinagoras Skiadopoulos, Lalith Suresh, Michael J. Cafarella, Michael Stonebraker

Technical Report on Data Integration and Preparation. CoRR, 2021

El Kindi Rezig, Michael J. Cafarella, Vijay Gadepally

ML-In-Databases: Assessment and Prognosis. IEEE Data Eng. Bull., 2021

Tim Kraska, Umar Farooq Minhas, Thomas Neumann, Olga Papaemmanouil, Jignesh M. Patel, Christopher Ré, Michael Stonebraker

DICE: Data Discovery by Example. Proc. VLDB Endow., 2021

El Kindi Rezig, Anshul Bhandari, Anna Fariha, Benjamin Price, Allan Vanterpool, Vijay Gadepally, Michael Stonebraker

Horizon: Scalable Dependency-driven Data Cleaning. Proc. VLDB Endow., 2021

El Kindi Rezig, Mourad Ouzzani, Walid G. Aref, Ahmed K. Elmagarmid, Ahmed R. Mahmood, Michael Stonebraker

FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. Proc. VLDB Endow., 2021

Yifei Yang, Matt Youill, Matthew E. Woicik, Yizhou Liu, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker

Kyrix-S: Authoring Scalable Scatterplot Visualizations of Big Data. IEEE Trans. Vis. Comput. Graph., 2021

Wenbo Tao, Xinli Hou, Adam Sah, Leilani Battle, Remco Chang, Michael Stonebraker

Flow-Loss: Learning Cardinality Estimates That Matter. Proc. VLDB Endow., 2021

Parimarjan Negi, Ryan Marcus, Andreas Kipf, Hongzi Mao, Nesime Tatbul, Tim Kraska, Mohammad Alizadeh

Davos: A System for Interactive Data-Driven Decision Making. Proc. VLDB Endow., 2021

Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Philipp Eichmann, Navid Karimeddiny, Charlie Meyer, Wesley Runnels, Tim Kraska

Towards a Benchmark for Learned Systems. ICDE Workshops, 2021

Laurent Bindschaedler, Andreas Kipf, Tim Kraska, Ryan Marcus, Umar Farooq Minhas

Partitioned Learned Bloom Filters. ICLR, 2021

Kapil Vaidya, Eric Knorr, Michael Mitzenmacher, Tim Kraska

LEA: A Learned Encoding Advisor for Column Stores. aiDM@SIGMOD, 2021

Lujing Cen, Andreas Kipf, Ryan Marcus, Tim Kraska

Instance-Optimized Data Layouts for Cloud Analytics Workloads. SIGMOD Conference, 2021

Jialin Ding, Umar Farooq Minhas, Badrish Chandramouli, Chi Wang, Yinan Li, Ying Li, Donald Kossmann, Johannes Gehrke, Tim Kraska

Bao: Making Learned Query Optimization Practical. SIGMOD Conference, 2021

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, Tim Kraska

Steering Query Optimizers: A Practical Take on Big Data Workloads. SIGMOD Conference, 2021

Parimarjan Negi, Matteo Interlandi, Ryan Marcus, Mohammad Alizadeh, Tim Kraska, Marc T. Friedman, Alekh Jindal

Tuplex: Data Science in Python at Native Code Speed. SIGMOD Conference, 2021

Leonhard F. Spiegelberg, Rahul Yesantharao, Malte Schwarzkopf, Tim Kraska

Flow-Loss: Learning Cardinality Estimates That Matter. CoRR, 2021

Parimarjan Negi, Ryan Marcus, Andreas Kipf, Hongzi Mao, Nesime Tatbul, Tim Kraska, Mohammad Alizadeh

LEA: A Learned Encoding Advisor for Column Stores. CoRR, 2021

Lujing Cen, Andreas Kipf, Ryan Marcus, Tim Kraska

When Are Learned Models Better Than Hash Functions? CoRR, 2021

Ibrahim Sabek, Kapil Vaidya, Dominik Horn, Andreas Kipf, Tim Kraska

PLEX: Towards Practical Learned Indexing. CoRR, 2021

Mihail Stoian, Andreas Kipf, Ryan Marcus, Tim Kraska

Bounding the Last Mile: Efficient Learned String Indexing. CoRR, 2021

Benjamin Spector, Andreas Kipf, Kapil Vaidya, Chi Wang, Umar Farooq Minhas, Tim Kraska

2020 (68 publications)
ExSample: Efficient Searches on Video Repositories through Adaptive Sampling. CoRR, 2020

Oscar R. Moll, Favyen Bastani, Sam Madden, Mike Stonebraker, Vijay Gadepally, Tim Kraska

Deductive optimization of relational data storage. Proc. ACM Program. Lang., 2020

John K. Feser, Sam Madden, Nan Tang, Armando Solar-Lezama

Vaas: Video Analytics At Scale. Proc. VLDB Endow., 2020

Favyen Bastani, Oscar R. Moll, Samuel Madden

Debugging Large-Scale Data Science Pipelines using Dagger. Proc. VLDB Endow., 2020

El Kindi Rezig, Ashrita Brahmaroutu, Nesime Tatbul, Mourad Ouzzani, Nan Tang, Timothy G. Mattson, Samuel Madden, Michael Stonebraker

Smartphone Placement Within Vehicles. IEEE Trans. Intell. Transp. Syst., 2020

Johan Wahlström, Isaac Skog, Peter Händel, Bill Bradley, Samuel Madden, Hari Balakrishnan

RoadTagger: Robust Road Attribute Inference with Graph Neural Networks. AAAI, 2020

Songtao He, Favyen Bastani, Satvat Jagwani, Edward Park, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Samuel Madden, Mohammad Amin Sadeghi

Dagger: A Data (not code) Debugger. CIDR, 2020

El Kindi Rezig, Lei Cao, Giovanni Simonini, Maxime Schoemans, Samuel Madden, Nan Tang, Mourad Ouzzani, Michael Stonebraker

Large-scale in-memory analytics on Intel® Optane™ DC persistent memory. DaMoN, 2020

Anil Shanbhag, Nesime Tatbul, David E. Cohen, Samuel Madden

Sat2Graph: Road Graph Extraction Through Graph-Tensor Encoding. ECCV, 2020

Songtao He, Favyen Bastani, Satvat Jagwani, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Mohamed M. Elshrif, Samuel Madden, Mohammad Amin Sadeghi

Kaskade: Graph Views for Efficient Graph Analytics. ICDE, 2020

Joana M. F. da Trindade, Konstantinos Karanasos, Carlo Curino, Samuel Madden, Julian Shun

BeeCluster: drone orchestration via predictive optimization. MobiSys, 2020

Songtao He, Favyen Bastani, Arjun Balasingam, Karthik Gopalakrishnan, Ziwen Jiang, Mohammad Alizadeh, Hari Balakrishnan, Michael J. Cafarella, Tim Kraska, Sam Madden

MIRIS: Fast Object Track Queries in Video. SIGMOD Conference, 2020

Favyen Bastani, Songtao He, Arjun Balasingam, Karthik Gopalakrishnan, Mohammad Alizadeh, Hari Balakrishnan, Michael J. Cafarella, Tim Kraska, Sam Madden

Human-in-the-loop Outlier Detection. SIGMOD Conference, 2020

Chengliang Chai, Lei Cao, Guoliang Li, Jian Li, Yuyu Luo, Samuel Madden

Starling: A Scalable Query Engine on Cloud Functions. SIGMOD Conference, 2020

Matthew Perron, Raul Castro Fernandez, David J. DeWitt, Samuel Madden

Continuously Adaptive Similarity Search. SIGMOD Conference, 2020

Huayi Zhang, Lei Cao, Yizhou Yan, Samuel Madden, Elke A. Rundensteiner

Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data. CoRR, 2020

Alana Marzoev, Samuel Madden, M. Frans Kaashoek, Michael J. Cafarella, Jacob Andreas

Sat2Graph: Road Graph Extraction through Graph-Tensor Encoding. CoRR, 2020

Songtao He, Favyen Bastani, Satvat Jagwani, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Mohamed M. Elshrif, Samuel Madden, Mohammad Amin Sadeghi

Relational Pretrained Transformers towards Democratizing Data Preparation [Vision]. CoRR, 2020

Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, Mourad Ouzzani

Toward a Harmonized WHO Family of International Classifications Content Model. AMIA, 2020

Samson W. Tu, Csongor Nyulas, Tania Tudorache, Mark A. Musen, Andrea Martinuzzi, Coen H. van Gool, Vincenzo Della Mea, Christopher G. Chute, Lucilla Frattura, Nicholas R. Hardiker, Huib ten Napel, Richard Madden, Ann-Helene Almborg, Jeewani Anupama Ginige, Catherine Sykes, Can Çelik, Robert Jakob

Toward a Harmonized WHO Family of International Classifications Content Model. MIE, 2020

Samson W. Tu, Csongor I. Nyulas, Tania Tudorache, Mark A. Musen, Andrea Martinuzzi, Coen H. van Gool, Vincenzo Della Mea, Christopher G. Chute, Lucilla Frattura, Nick Hardiker, Huib ten Napel, Richard Madden, Ann-Helene Almborg, Jeewani Anupama Ginige, Catherine Sykes, Can Celik, Robert Jakob

A Polystore Based Database Operating System (DBOS). Poly/DMAH@VLDB, 2020

Michael J. Cafarella, David J. DeWitt, Vijay Gadepally, Jeremy Kepner, Christos Kozyrakis, Tim Kraska, Michael Stonebraker, Matei Zaharia

Towards Data Discovery by Example. Poly/DMAH@VLDB, 2020

El Kindi Rezig, Allan Vanterpool, Vijay Gadepally, Benjamin Price, Michael J. Cafarella, Michael Stonebraker

DBOS: A Proposal for a Data-Centric Operating System. CoRR, 2020

Michael J. Cafarella, David J. DeWitt, Vijay Gadepally, Jeremy Kepner, Christos Kozyrakis, Tim Kraska, Michael Stonebraker, Matei Zaharia

Constructing Expressive Relational Queries with Dual-Specification Synthesis. CIDR, 2020

Christopher Baik, Zhongjun Jin, Michael J. Cafarella, H. V. Jagadish

Duoquest: A Dual-Specification System for Expressive SQL Queries. SIGMOD Conference, 2020

Christopher Baik, Zhongjun Jin, Michael J. Cafarella, H. V. Jagadish

A Method for Optimizing Opaque Filter Queries. SIGMOD Conference, 2020

Wenjia He, Michael R. Anderson, Maxwell Strome, Michael J. Cafarella

Duoquest: A Dual-Specification System for Expressive SQL Queries. CoRR, 2020

Christopher Baik, Zhongjun Jin, Michael J. Cafarella, H. V. Jagadish

Winds from Seattle: Database Research Directions. Proc. VLDB Endow., 2020

Peter Bailis, Magda Balazinska, Xin Luna Dong, Juliana Freire, Raghu Ramakrishnan, Michael Stonebraker, Joseph M. Hellerstein

Pattern Functional Dependencies for Data Cleaning. Proc. VLDB Endow., 2020

Abdulhakim Ali Qahtan, Nan Tang, Mourad Ouzzani, Yang Cao, Michael Stonebraker

Poly'19 Workshop Summary: GDPR. SIGMOD Rec., 2020

Michael Stonebraker, Timothy G. Mattson, Tim Kraska, Vijay Gadepally

The Role of Latency and Task Complexity in Predicting Visual Search Behavior. IEEE Trans. Vis. Comput. Graph., 2020

Leilani Battle, R. Jordan Crouser, Audace Nakeshimana, Ananda Montoly, Remco Chang, Michael Stonebraker

Big Data Visualization and Analytics: Future Research Challenges and Emerging Applications. EDBT/ICDT Workshops, 2020

Gennady L. Andrienko, Natalia V. Andrienko, Steven Mark Drucker, Jean-Daniel Fekete, Danyel Fisher, Stratos Idreos, Tim Kraska, Guoliang Li, Kwan-Liu Ma, Jock D. Mackinlay, Antti Oulasvirta, Tobias Schreck, Heidrun Schumann, Michael Stonebraker, David Auber, Nikos Bikakis, Panos K. Chrysanthis, George Papastefanatos, Mohamed A. Sharaf

PushdownDB: Accelerating a DBMS Using S3 Computation. ICDE, 2020

Xiangyao Yu, Matt Youill, Matthew E. Woicik, Abdurrahman Ghanem, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker

The Next 5 Years: What Opportunities Should the Database Community Seize to Maximize its Impact? SIGMOD Conference, 2020

Magda Balazinska, Surajit Chaudhuri, Anastasia Ailamaki, Juliana Freire, Sailesh Krishnamurthy, Michael Stonebraker

PushdownDB: Accelerating a DBMS using S3 Computation. CoRR, 2020

Xiangyao Yu, Matt Youill, Matthew E. Woicik, Abdurrahman Ghanem, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker

Kyrix-S: Authoring Scalable Scatterplot Visualizations of Big Data. CoRR, 2020

Wenbo Tao, Xinli Hou, Adam Sah, Leilani Battle, Remco Chang, Michael Stonebraker

Context-Aware Parse Trees. CoRR, 2020

Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Paul Petersen, Jesmin Jahan Tithi, Tim Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

MISIM: An End-to-End Neural Code Similarity System. CoRR, 2020

Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Paul Petersen, Timothy G. Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

ARDA: Automatic Relational Data Augmentation for Machine Learning. Proc. VLDB Endow., 2020

Nadiia Chepurko, Ryan Marcus, Emanuel Zgraggen, Raul Castro Fernandez, Tim Kraska, David R. Karger

Benchmarking Learned Indexes. Proc. VLDB Endow., 2020

Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, Tim Kraska

Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach. IEEE Trans. Knowl. Data Eng., 2020

Yeounoh Chung, Tim Kraska, Neoklis Polyzotis, Ki Hyun Tae, Steven Euijong Whang

Fast Mapping onto Census Blocks. HPEC, 2020

Jeremy Kepner, Andreas Kipf, Darren Engwirda, Navin Vembar, Michael Jones, Lauren Milechin, Vijay Gadepally, Chris Hill, Tim Kraska, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Andrew C. Kirby, Anna Klein, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Sid Samsi, Charles Yee, Peter Michaleas

Cost-Guided Cardinality Estimation: Focus Where it Matters. ICDE Workshops, 2020

Parimarjan Negi, Ryan Marcus, Hongzi Mao, Nesime Tatbul, Tim Kraska, Mohammad Alizadeh

Learned garbage collection. MAPL@PLDI, 2020

Lujing Cen, Ryan Marcus, Hongzi Mao, Justin Gottschlich, Mohammad Alizadeh, Tim Kraska

ALEX: An Updatable Adaptive Learned Index. SIGMOD Conference, 2020

Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Yinan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David B. Lomet, Tim Kraska

IDEBench: A Benchmark for Interactive Data Exploration. SIGMOD Conference, 2020

Philipp Eichmann, Emanuel Zgraggen, Carsten Binnig, Tim Kraska

DB4ML - An In-Memory Database Kernel with Machine Learning Support. SIGMOD Conference, 2020

Matthias Jasny, Tobias Ziegler, Tim Kraska, Uwe Röhm, Carsten Binnig

RadixSpline: a single-pass learned index. aiDM@SIGMOD, 2020

Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, Thomas Neumann

The Case for a Learned Sorting Algorithm. SIGMOD Conference, 2020

Ani Kristo, Kapil Vaidya, Ugur Çetintemel, Sanchit Misra, Tim Kraska

Learning Multi-Dimensional Indexes. SIGMOD Conference, 2020

Vikram Nathan, Jialin Ding, Mohammad Alizadeh, Tim Kraska

ARDA: Automatic Relational Data Augmentation for Machine Learning. CoRR, 2020

Nadiia Chepurko, Ryan Marcus, Emanuel Zgraggen, Raul Castro Fernandez, Tim Kraska, David R. Karger

Bao: Learning to Steer Query Optimizers. CoRR, 2020

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, Tim Kraska

Learned Garbage Collection. CoRR, 2020

Lujing Cen, Ryan Marcus, Hongzi Mao, Justin Gottschlich, Mohammad Alizadeh, Tim Kraska

RadixSpline: A Single-Pass Learned Index. CoRR, 2020

Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, Thomas Neumann

Fast Mapping onto Census Blocks. CoRR, 2020

Jeremy Kepner, Darren Engwirda, Vijay Gadepally, Chris Hill, Tim Kraska, Michael Jones, Andreas Kipf, Lauren Milechin, Navin Vembar

Partitioned Learned Bloom Filter. CoRR, 2020

Kapil Vaidya, Eric Knorr, Tim Kraska, Michael Mitzenmacher

Benchmarking Learned Indexes. CoRR, 2020

Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, Tim Kraska

Cortex: Harnessing Correlations to Boost Query Performance. CoRR, 2020

Vikram Nathan, Jialin Ding, Tim Kraska, Mohammad Alizadeh

Learned Indexes for a Google-scale Disk-based Database. CoRR, 2020

Hussam Abu-Libdeh, Deniz Altinbüken, Alex Beutel, Ed H. Chi, Lyric Doshi, Tim Kraska, Xiaozhou Li, Andy Ly, Christopher Olston

2019 (77 publications)
Towards Multiverse Databases. HotOS, 2019

Alana Marzoev, Lara Timbó Araújo, Malte Schwarzkopf, Samyukta Yagati, Eddie Kohler, Robert Tappan Morris, M. Frans Kaashoek, Sam Madden

Smile: A System to Support Machine Learning on EEG Data at Scale. Proc. VLDB Endow., 2019

Lei Cao, Wenbo Tao, Sungtae An, Jing Jin, Yizhou Yan, Xiaoyu Liu, Wendong Ge, Adam Sah, Leilani Battle, Jimeng Sun, Remco Chang, M. Brandon Westover, Samuel Madden, Michael Stonebraker

Efficient Discovery of Sequence Outlier Patterns. Proc. VLDB Endow., 2019

Lei Cao, Yizhou Yan, Samuel Madden, Elke A. Rundensteiner, Mathan Gopalsamy

Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics. Proc. VLDB Endow., 2019

El Kindi Rezig, Lei Cao, Michael Stonebraker, Giovanni Simonini, Wenbo Tao, Samuel Madden, Mourad Ouzzani, Nan Tang, Ahmed K. Elmagarmid

SageDB: A Learned Database System. CIDR, 2019

Tim Kraska, Mohammad Alizadeh, Alex Beutel, Ed H. Chi, Ani Kristo, Guillaume Leclerc, Samuel Madden, Hongzi Mao, Vikram Nathan

Unsupervised String Transformation Learning for Entity Consolidation. ICDE, 2019

Dong Deng, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Guoliang Li, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang

Raha: A Configuration-Free Error Detection System. SIGMOD Conference, 2019

Mohammad Mahdavi, Ziawasch Abedjan, Raul Castro Fernandez, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang

OLTP through the looking glass, and what we found there. Making Databases Work, 2019

Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, Michael Stonebraker

C-store: a column-oriented DBMS. Making Databases Work, 2019

Mike Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth J. O'Neil, Patrick E. O'Neil, Alex Rasin, Nga Tran, Stan Zdonik

The end of an architectural era: it's time for a complete rewrite. Making Databases Work, 2019

Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, Pat Helland

Deductive Optimization of Relational Data Storage. CoRR, 2019

John K. Feser, Samuel Madden, Nan Tang, Armando Solar-Lezama

SysML: The New Frontier of Machine Learning Systems. CoRR, 2019

Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Eric S. Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros G. Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim M. Hazelwood, Furong Huang, Martin Jaggi, Kevin G. Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konecný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Gordon Murray, Dimitris S. Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan R. Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric P. Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar

Kaskade: Graph Views for Efficient Graph Analytics. CoRR, 2019

Joana M. F. da Trindade, Konstantinos Karanasos, Carlo Curino, Samuel Madden, Julian Shun

Technical Report: Optimizing Human Involvement for Entity Matching and Consolidation. CoRR, 2019

Ji Sun, Dong Deng, Ihab F. Ilyas, Guoliang Li, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang

Machine-Assisted Map Editing. CoRR, 2019

Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden

Inferring and Improving Street Maps with Data-Driven Automation. CoRR, 2019

Favyen Bastani, Songtao He, Satvat Jagwani, Edward Park, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi

Starling: A Scalable Query Engine on Cloud Function Services. CoRR, 2019

Matthew Perron, Raul Castro Fernandez, David J. DeWitt, Samuel Madden

Dataset-On-Demand: Automatic View Search and Presentation for Data Discovery. CoRR, 2019

Raul Castro Fernandez, Nan Tang, Mourad Ouzzani, Michael Stonebraker, Samuel Madden

RoadTagger: Robust Road Attribute Inference with Graph Neural Networks. CoRR, 2019

Songtao He, Favyen Bastani, Satvat Jagwani, Edward Park, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Samuel Madden, Mohammad Amin Sadeghi

Context-specific Language Modeling for Human Trafficking Detection from Online Advertisements. ACL, 2019

Saeideh Shahrokh Esfahani, Michael J. Cafarella, Maziyar Baran Pouyan, Gregory J. DeAngelo, Elena Eneva, Andy E. Fano

Demonstration of a Multiresolution Schema Mapping System. CIDR, 2019

Zhongjun Jin, Christopher Baik, Michael J. Cafarella, H. V. Jagadish, Yuze Lou

CLX: Towards verifiable PBE data transformation. EDBT, 2019

Zhongjun Jin, Michael J. Cafarella, H. V. Jagadish, Sean Kandel, Michael Minar, Joseph M. Hellerstein

Physical Representation-Based Predicate Optimization for a Visual Analytics Database. ICDE, 2019

Michael R. Anderson, Michael J. Cafarella, Germán Ros, Thomas F. Wenisch

The Seattle Report on Database Research. SIGMOD Rec., 2019

Daniel Abadi, Anastasia Ailamaki, David G. Andersen, Peter Bailis, Magdalena Balazinska, Philip A. Bernstein, Peter A. Boncz, Surajit Chaudhuri, Alvin Cheung, AnHai Doan, Luna Dong, Michael J. Franklin, Juliana Freire, Alon Y. Halevy, Joseph M. Hellerstein, Stratos Idreos, Donald Kossmann, Tim Kraska, Sailesh Krishnamurthy, Volker Markl, Sergey Melnik, Tova Milo, C. Mohan, Thomas Neumann, Beng Chin Ooi, Fatma Ozcan, Jignesh M. Patel, Andrew Pavlo, Raluca A. Popa, Raghu Ramakrishnan, Christopher Ré, Michael Stonebraker, Dan Suciu

Kyrix: Interactive Pan/Zoom Visualizations at Scale. Comput. Graph. Forum, 2019

Wenbo Tao, Xiaoyu Liu, Yedi Wang, Leilani Battle, Çagatay Demiralp, Remco Chang, Michael Stonebraker

Choosing A Cloud DBMS: Architectures and Tradeoffs. Proc. VLDB Endow., 2019

Junjay Tan, Thanaa M. Ghanem, Matthew Perron, Xiangyao Yu, Michael Stonebraker, David J. DeWitt, Marco Serafini, Ashraf Aboulnaga, Tim Kraska

Rethinking Database High Availability with RDMA Networks. Proc. VLDB Endow., 2019

Erfan Zamanian, Xiangyao Yu, Michael Stonebraker, Tim Kraska

Kyrix: Interactive Visual Data Exploration at Scale. CIDR, 2019

Wenbo Tao, Xiaoyu Liu, Çagatay Demiralp, Remco Chang, Michael Stonebraker

How I Learned to Stop Worrying and Love Re-optimization. ICDE, 2019

Matthew Perron, Zeyuan Shang, Tim Kraska, Michael Stonebraker

Towards an End-to-End Human-Centric Data Cleaning Framework. HILDA@SIGMOD, 2019

El Kindi Rezig, Mourad Ouzzani, Ahmed K. Elmagarmid, Walid G. Aref, Michael Stonebraker

SchengenDB: A Data Protection Database Proposal. Poly/DMAH@VLDB, 2019

Tim Kraska, Michael Stonebraker, Michael L. Brodie, Sacha Servan-Schreiber, Daniel J. Weitzner

WIP - SKOD: A Framework for Situational Knowledge on Demand. Poly/DMAH@VLDB, 2019

Servio Palacios, K. M. A. Solaiman, Pelin Angin, Alina Nesen, Bharat K. Bhargava, Zachary Collins, Aaron Sipser, Michael Stonebraker, James MacDonald

The design and implementation of INGRES. Making Databases Work, 2019

Michael Stonebraker, Eugene Wong, Peter Kreps, Gerald Held

The implementation of POSTGRES. Making Databases Work, 2019

Michael Stonebraker, Lawrence A. Rowe, Michael Hirohama

How I Learned to Stop Worrying and Love Re-optimization. CoRR, 2019

Matthew Perron, Zeyuan Shang, Tim Kraska, Michael Stonebraker

Kyrix: Interactive Visual Data Exploration at Scale. CoRR, 2019

Wenbo Tao, Xiaoyu Liu, Çagatay Demiralp, Remco Chang, Michael Stonebraker

Neo: A Learned Query Optimizer. Proc. VLDB Endow., 2019

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul

The SIGMOD 2019 Research Track Reviewing System. SIGMOD Rec., 2019

Anastasia Ailamaki, Periklis Chrysogelos, Amol Deshpande, Tim Kraska

VizML: A Machine Learning Approach to Visualization Recommendation. CHI, 2019

Kevin Zeng Hu, Michiel A. Bakker, Stephen Li, Tim Kraska, César A. Hidalgo

VizNet: Towards A Large-Scale Visualization Learning and Benchmarking Repository. CHI, 2019

Kevin Zeng Hu, Snehalkumar (Neil) S. Gaikwad, Madelon Hulsebos, Michiel A. Bakker, Emanuel Zgraggen, César A. Hidalgo, Tim Kraska, Guoliang Li, Arvind Satyanarayan, Çagatay Demiralp

VizCertify: A Framework for Secure Visual Data Exploration. DSAA, 2019

Lorenzo De Stefani, Leonhard F. Spiegelberg, Eli Upfal, Tim Kraska

Slice Finder: Automated Data Slicing for Model Validation. ICDE, 2019

Yeounoh Chung, Tim Kraska, Neoklis Polyzotis, Ki Hyun Tae, Steven Euijong Whang

Sherlock: A Deep Learning Approach to Semantic Data Type Detection. KDD, 2019

Madelon Hulsebos, Kevin Zeng Hu, Michiel A. Bakker, Emanuel Zgraggen, Arvind Satyanarayan, Tim Kraska, Çagatay Demiralp, César A. Hidalgo

Park: An Open Platform for Learning-Augmented Computer Systems. NeurIPS, 2019

Hongzi Mao, Parimarjan Negi, Akshay Narayan, Hanrui Wang, Jiacheng Yang, Haonan Wang, Ryan Marcus, Ravichandra Addanki, Mehrdad Khani Shirkoohi, Songtao He, Vikram Nathan, Frank Cangialosi, Shaileshh Bojja Venkatakrishnan, Wei-Hung Weng, Song Han, Tim Kraska, Mohammad Alizadeh

Designing Distributed Tree-based Index Structures for Fast RDMA-capable Networks. SIGMOD Conference, 2019

Tobias Ziegler, Sumukha Tumkur Vani, Carsten Binnig, Rodrigo Fonseca, Tim Kraska

FITing-Tree: A Data-aware Index Structure. SIGMOD Conference, 2019

Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska

Democratizing Data Science through Interactive Curation of ML Pipelines. SIGMOD Conference, 2019

Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Ferdinand Kossmann, Philipp Eichmann, Yeounoh Chung, Carsten Binnig, Eli Upfal, Tim Kraska

Custodes: Auditable Hypothesis Testing. CoRR, 2019

Sacha Servan-Schreiber, Olga Ohrimenko, Tim Kraska, Emanuel Zgraggen

Neo: A Learned Query Optimizer. CoRR, 2019

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul

VizNet: Towards A Large-Scale Visualization Learning and Benchmarking Repository. CoRR, 2019

Kevin Zeng Hu, Snehalkumar (Neil) S. Gaikwad, Michiel A. Bakker, Madelon Hulsebos, Emanuel Zgraggen, César A. Hidalgo, Tim Kraska, Guoliang Li, Arvind Satyanarayan, Çagatay Demiralp

Sherlock: A Deep Learning Approach to Semantic Data Type Detection. CoRR, 2019

Madelon Hulsebos, Kevin Zeng Hu, Michiel A. Bakker, Emanuel Zgraggen, Arvind Satyanarayan, Tim Kraska, Çagatay Demiralp, César A. Hidalgo

LISA: Towards Learned DNA Sequence Search. CoRR, 2019

Darryl Ho, Jialin Ding, Sanchit Misra, Nesime Tatbul, Vikram Nathan, Vasimuddin Md, Tim Kraska

SOSD: A Benchmark for Learned Indexes. CoRR, 2019

Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, Thomas Neumann

Learning Multi-dimensional Indexes. CoRR, 2019

Vikram Nathan, Jialin Ding, Mohammad Alizadeh, Tim Kraska

2018 (53 publications)
Evaluating End-to-End Optimization for Data Analytics Applications in Weld. Proc. VLDB Endow., 2018

Shoumik Palkar, James Thomas, Deepak Narayanan, Pratiksha Thaker, Rahul Palamuttam, Parimarjan Negi, Anil Shanbhag, Malte Schwarzkopf, Holger Pirk, Saman P. Amarasinghe, Samuel Madden, Matei Zaharia

Optimally Leveraging Density and Locality for Exploratory Browsing and Sampling. HILDA@SIGMOD, 2018

Albert Kim, Liqi Xu, Tarique Siddiqui, Silu Huang, Samuel Madden, Aditya G. Parameswaran

MacroBase: Prioritizing Attention in Fast Data. ACM Trans. Database Syst., 2018

Firas Abuzaid, Peter Bailis, Jialin Ding, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong, Sahaana Suri

RoadTracer: Automatic Extraction of Road Networks From Aerial Images. CVPR, 2018

Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, David J. DeWitt

Machine-Assisted Map Editing. SIGSPATIAL/GIS, 2018

Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden

RoadRunner: improving the precision of road network inference from GPS trajectories. SIGSPATIAL/GIS, 2018

Songtao He, Favyen Bastani, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden

TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines. HPEC, 2018

Jeremy Kepner, Ron Brightwell, Alan Edelman, Vijay Gadepally, Hayden Jananthan, Michael Jones, Sam Madden, Peter Michaleas, Hamed Okhravi, Kevin T. Pedretti, Albert Reuther, Thomas L. Sterling, Mike Stonebraker

Aurum: A Data Discovery System. ICDE, 2018

Raul Castro Fernandez, Ziawasch Abedjan, Famien Koko, Gina Yuan, Samuel Madden, Michael Stonebraker

Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery. ICDE, 2018

Raul Castro Fernandez, Essam Mansour, Abdulhakim Ali Qahtan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang

Extracting Syntactical Patterns from Databases. ICDE, 2018

Andrew Ilyas, Joana M. F. da Trindade, Raul Castro Fernandez, Samuel Madden

Building Data Civilizer Pipelines with an Advanced Workflow Engine. ICDE, 2018

Essam Mansour, Dong Deng, Raul Castro Fernandez, Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang

Unthule: An Incremental Graph Construction Process for Robust Road Map Extraction from Aerial Images. CoRR, 2018

Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, David J. DeWitt, Sam Madden

Smallify: Learning Network Size while Training. CoRR, 2018

Guillaume Leclerc, Manasi Vartak, Raul Castro Fernandez, Tim Kraska, Samuel Madden

TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines. CoRR, 2018

Jeremy Kepner, Ron Brightwell, Alan Edelman, Vijay Gadepally, Hayden Jananthan, Michael Jones, Sam Madden, Peter Michaleas, Hamed Okhravi, Kevin T. Pedretti, Albert Reuther, Thomas L. Sterling, Mike Stonebraker

Ten Years of WebTables. Proc. VLDB Endow., 2018

Michael J. Cafarella, Alon Y. Halevy, Hongrae Lee, Jayant Madhavan, Cong Yu, Daisy Zhe Wang, Eugene Wu

Sledgehammer: Cluster-Fueled Debugging. OSDI, 2018

Andrew Quinn, Jason Flinn, Michael J. Cafarella

Beaver: Towards a Declarative Schema Mapping. HILDA@SIGMOD, 2018

Zhongjun Jin, Christopher Baik, Michael J. Cafarella, H. V. Jagadish

Unifacta: Profiling-driven String Pattern Standardization. CoRR, 2018

Zhongjun Jin, Michael J. Cafarella, H. V. Jagadish, Sean Kandel, Michael Minar

Physical Representation-Based Predicate Optimization for a Visual Analytics Database. CoRR, 2018

Michael R. Anderson, Michael J. Cafarella, Germán Ros, Thomas F. Wenisch

Demonstration of a Multiresolution Schema Mapping System. CoRR, 2018

Zhongjun Jin, Christopher Baik, Michael J. Cafarella, H. V. Jagadish, Yuze Lou

Beagle: Automated Extraction and Interpretation of Visualizations from the Web. CHI, 2018

Leilani Battle, Peitong Duan, Zachery Miranda, Dana Mukusheva, Remco Chang, Michael Stonebraker

P-Store: An Elastic Database System with Predictive Provisioning. SIGMOD Conference, 2018

Rebecca Taft, Nosayba El-Sayed, Marco Serafini, Yu Lu, Ashraf Aboulnaga, Michael Stonebraker, Ricardo Mayerhofer, Francisco Jose Andrade

FastDAWG: Improving Data Migration in the BigDAWG Polystore System. Poly/DMAH@VLDB, 2018

Xiangyao Yu, Vijay Gadepally, Stan Zdonik, Tim Kraska, Michael Stonebraker

Towards Quantifying Uncertainty in Data Analysis & Exploration. IEEE Data Eng. Bull., 2018

Yeounoh Chung, Sacha Servan-Schreiber, Emanuel Zgraggen, Tim Kraska

Investigating the Effect of the Multiple Comparisons Problem in Visual Analysis. CHI, 2018

Emanuel Zgraggen, Zheguang Zhao, Robert C. Zeleznik, Tim Kraska

SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks. PPoPP, 2018

Linnan Wang, Jinmian Ye, Yiyang Zhao, Wei Wu, Ang Li, Shuaiwen Leon Song, Zenglin Xu, Tim Kraska

Towards Interactive Curation & Automatic Tuning of ML Pipelines. DEEM@SIGMOD, 2018

Carsten Binnig, Benedetto Buratti, Yeounoh Chung, Cyrus Cousins, Tim Kraska, Zeyuan Shang, Eli Upfal, Robert C. Zeleznik, Emanuel Zgraggen

The Case for Learned Index Structures. SIGMOD Conference, 2018

Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis

SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks. CoRR, 2018

Linnan Wang, Jinmian Ye, Yiyang Zhao, Wei Wu, Ang Li, Shuaiwen Leon Song, Zenglin Xu, Tim Kraska

A-Tree: A Bounded Approximate Index Structure. CoRR, 2018

Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska

IDEBench: A Benchmark for Interactive Data Exploration. CoRR, 2018

Philipp Eichmann, Carsten Binnig, Tim Kraska, Emanuel Zgraggen

Slice Finder: Automated Data Slicing for Model Validation. CoRR, 2018

Yeounoh Chung, Tim Kraska, Neoklis Polyzotis, Steven Euijong Whang

VizML: A Machine Learning Approach to Visualization Recommendation. CoRR, 2018

Kevin Zeng Hu, Michiel A. Bakker, Stephen Li, Tim Kraska, César A. Hidalgo

Unknown Examples & Machine Learning Model Generalization. CoRR, 2018

Yeounoh Chung, Peter J. Haas, Eli Upfal, Tim Kraska

VizRec: A framework for secure data exploration via visual representation. CoRR, 2018

Lorenzo De Stefani, Leonhard F. Spiegelberg, Tim Kraska, Eli Upfal