Brent Gardner Named Official Apache Arrow Committer

Brent is Space and Time’s first full-time sponsored developer contributing to the project.

Space and Time

Decentralized Data Warehouse

Space and Time is proud to announce that Brent Gardner, a Principal Software Engineer building the Space and Time analytic data engine, is now an official committer to Apache Arrow. Committers, who are selected in recognition of their sustained contribution to the project, are authorized to merge code patches to the project’s repositories and serve as non-voting project maintainers. Brent joins an impressive roster of fewer than 50 official committers, including representatives from Google, IBM, NVIDIA, and others. Space and Time is thrilled to have a full-time sponsored developer contributing to the Apache Arrow project, with more developers on the way.

About Brent Gardner

Brent Gardner is a software engineer with extensive experience spanning big data, scientific modeling, streaming, and data analytics. Brent has been building columnar databases since 2009, working on cryptocurrency code since 2017, and using Rust since before its 1.0 release. In addition to development, Brent is skilled in public speaking, teaching, and leadership, and works across a variety of programming languages and domains. Before joining Space and Time, he used his guidance and technical vision to take multiple startups from zero to one. Brent has been an active contributor to Apache Arrow, Apache DataFusion, and Apache Ballista since joining Space and Time.

“I’m very honored to be an Apache committer,” said Brent. “I've been interested in contributing to open-source projects throughout my career. I'm really grateful that Space and Time has sponsored me full-time to contribute to the Arrow, DataFusion, and Ballista projects, as well as some upcoming research in those areas.”

Brent’s involvement with Apache

Brent began working with Apache projects several years ago, when he was responsible for porting his company’s database engine onto Apache Spark alongside his then-coworker Andy Grove. Andy went on to contribute the DataFusion in-memory SQL query engine and the Ballista distributed query scheduler to the Arrow project.

“It’s been great to see Brent's involvement in the Arrow community, and his progression from contributor to committer,” said Andy. “Brent is a talented engineer that I’ve worked with for a long time, and I'm excited to see his valuable contributions to Arrow, DataFusion, and Ballista.”

About Apache Arrow

Apache Arrow offers a language-agnostic, columnar in-memory format for sharing data between processes. Giving different databases a standard interchange format lays the foundation for robust vectorized operations, which modern databases increasingly require. Arrow includes vectorized operations that help keep CPU cache lines full and make full use of the wide SIMD lanes in modern processors.
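To make the columnar idea concrete, here is a minimal sketch in plain Python. It does not use the Arrow API; it only uses the standard library's typed arrays to contrast a row-oriented layout with one contiguous buffer per column. The names rows, columns, and column_sum are hypothetical and chosen for illustration.

```python
from array import array

# Row-oriented layout: each record is a separate object.
rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": 20.0},
    {"id": 3, "price": 12.25},
]

# Columnar layout: one contiguous, typed buffer per column.
# (Illustration only; real Arrow arrays also carry validity bitmaps and metadata.)
columns = {
    "id": array("q", (r["id"] for r in rows)),        # 64-bit signed integers
    "price": array("d", (r["price"] for r in rows)),  # 64-bit floats
}


def column_sum(cols, name):
    """Aggregate a single column by scanning only that column's buffer."""
    return sum(cols[name])


total_price = column_sum(columns, "price")
```

An aggregate over one column reads a single dense buffer of same-typed values rather than visiting every field of every record, which is the property that cache-friendly, vectorized execution depends on.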

Apache Arrow makes it easier to build high-performance systems and analytic pipelines, but it is a low-level building block optimized for fast, columnar data processing. To run SQL, or anything higher level, you need a query engine on top of it, such as DataFusion. Apache DataFusion is an in-memory SQL engine that allows you to query data stored in Arrow format without having to write low-level code to extract and process the data.

Finally, Apache Ballista is a distributed computing platform that stacks on top of Arrow and DataFusion to serialize execution plans and execute them in parallel across multiple nodes. Used together, these projects create a high-performance data processing platform that rivals Apache Spark for some workloads.

Apache Arrow for HTAP architecture

Query engines built on Arrow are proving to be much more efficient than those built on traditional JVM-based technologies, such as Apache Spark. Arrow allows data to be retrieved much faster for analytic queries and yields faster TPC-H benchmark results. But while Arrow is optimized for online analytical processing (OLAP), a hybrid transactional database and data warehouse (HTAP) like Space and Time is designed to be optimized for both high-throughput analytical processing and real-time transactional processing. Arrow record batches are generally immutable, which presents unique challenges for handling transactions. For the Space and Time team, building an HTAP system that leverages the high-performance analytic processing enabled by Arrow has required new, creative solutions for handling transactional processing.
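To illustrate why immutability complicates transactions, the sketch below shows one generic, widely used pattern: updates are recorded by appending a new batch, and reads resolve the latest value per key. This is not Space and Time's design, which the post does not describe, and it does not use the Arrow API. Batch, apply_update, and read_latest are hypothetical names written in plain Python.

```python
from typing import NamedTuple


class Batch(NamedTuple):
    """A hypothetical immutable batch: parallel tuples of keys and values."""
    keys: tuple
    values: tuple


def apply_update(batches, key, value):
    """Record an update by appending a one-row batch; existing batches are untouched."""
    return batches + (Batch((key,), (value,)),)


def read_latest(batches):
    """Resolve the current value per key; rows in later batches win."""
    latest = {}
    for batch in batches:
        for k, v in zip(batch.keys, batch.values):
            latest[k] = v
    return latest


base = (Batch(("a", "b"), (1, 2)),)
updated = apply_update(base, "a", 10)
```

The original batch is shared rather than rewritten, so the write is cheap, but every read now has to reconcile multiple batches. That trade-off between write cost and read cost is the kind of tension an HTAP design has to manage.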

Giving back to the open-source community

Open-source software has been a hallmark of development in the blockchain ecosystem. Decentralization, transparency, and collaboration are core values of Web3, and developers simply don’t want to trust closed-source code. Space and Time is committed to advancing and giving back to the open-source community. Sponsoring Brent as a committer to the Apache Arrow project is the first of many steps in our active support of open-source innovation.

“It’s very exciting to have a talented engineer like Brent contributing to the fast-growing Apache ecosystem,” said Space and Time CTO and Co-Founder Scott Dykstra. “We see extreme value in Arrow, Ballista, and DataFusion. These technologies are the future. We're excited to be building on top of them, and we're really thrilled that a Space and Time engineer is contributing.” 
