Analytics platform queries and maps billions of data points

People generally associate graphics processing units (GPUs) with image processing. First developed for video games in the 1990s, GPUs today are specialized circuits with thousands of small, efficient processing units, or “cores,” that work simultaneously to rapidly render graphics on screen.

But for the better part of a decade, GPUs have also found general-purpose computing applications. Because of their parallel-computing speed and high-performance memory, GPUs are today used for advanced lab simulations and deep learning, among other things.

Now, Todd Mostak, a former researcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), is using GPUs to develop an analytic database and visualization platform called MapD, which is the fastest of its kind in the world, according to Mostak.

MapD is essentially a commonly used type of database-management system, modified to run on GPUs instead of the central processing units (CPUs) that power most traditional database-management systems. As a result, MapD can process billions of data points in milliseconds, making it 100 times faster than traditional systems. Moreover, MapD visualizes all processed data points nearly instantaneously (plotting tweets on a world map, for example), and parameters can be modified on the fly to adjust the visualized display.

With its first product launched last March, MapD’s clients already include Verizon and other big-name telecommunications companies, a social media giant, and financial and advertising firms. In October, the investment arm of the U.S. Central Intelligence Agency, In-Q-Tel, announced that it had invested in MapD’s latest funding round to accelerate the development of certain features for the U.S. intelligence community.

“[The CIA has] a lot of geospatial data, and they need to be able to form, visualize, and query that data in real-time. It’s a real need across the intelligence community,” Mostak says.

“Making GPUs first-class citizens”

GPUs are designed specifically for parallel computing, with thousands of energy-efficient cores that can, for example, simultaneously determine the color of each pixel on a computer screen to render an image. GPUs also use high-bandwidth memory, a form of random access memory (RAM) that’s about an order of magnitude faster than the memory CPUs use.
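
The per-pixel parallelism described above can be illustrated with a small sketch. It is plain Python running on a CPU, not GPU code, and the `shade` function and the tiny image size are hypothetical, chosen only for illustration. The point it tries to show is that each pixel's color depends on nothing but its own coordinates, which is what allows a GPU to hand one pixel to each core.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 4  # hypothetical tiny "screen"

def shade(pixel):
    """Compute one pixel's color from its own coordinates only."""
    x, y = pixel
    red = 255 * x // (WIDTH - 1)     # horizontal gradient
    green = 255 * y // (HEIGHT - 1)  # vertical gradient
    return (red, green, 0)

pixels = [(x, y) for y in range(HEIGHT) for x in range(WIDTH)]

# Sequential version: one pixel after another.
sequential = [shade(p) for p in pixels]

# Parallel version: no pixel depends on another, so the work can be split
# across workers in any order. A GPU does the same with thousands of cores.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(shade, pixels))

# Both approaches produce the same image.
assert parallel == sequential
```

The thread pool here only stands in for the idea of independent workers; it is not meant to be faster than the sequential loop in Python.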

Today, some databases are being powered by GPUs. But these systems suffer from a major design flaw, Mostak says: “In most implementations, the data is initially stored on a CPU, moved to the GPU for a query, and results are moved back to the CPU for storage. Even if you speed up the computation time of a query [by using a GPU], you lose most of the speed by transferring from CPU to GPU and back.”
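
Mostak's point about transfer overhead can be made concrete with a back-of-the-envelope model. The sketch below is not a measurement of MapD or of any real system; the function name `query_time_ms` and every number in it (data size, compute time, transfer rate) are assumptions picked for illustration.

```python
def query_time_ms(data_gb, compute_ms, link_gb_per_s, result_gb=0.0):
    """Total query time when data must cross the CPU-GPU link each time."""
    transfer_in_ms = data_gb / link_gb_per_s * 1000     # CPU -> GPU
    transfer_out_ms = result_gb / link_gb_per_s * 1000  # GPU -> CPU
    return transfer_in_ms + compute_ms + transfer_out_ms

# Hypothetical numbers, for illustration only.
DATA_GB = 4.0         # size of the data the query scans
COMPUTE_MS = 20.0     # time the GPU spends on the query itself
LINK_GB_PER_S = 10.0  # assumed CPU-to-GPU transfer rate

with_transfer = query_time_ms(DATA_GB, COMPUTE_MS, LINK_GB_PER_S, result_gb=0.001)
resident_on_gpu = COMPUTE_MS  # no transfer if the data already lives on the GPU

# Fraction of the total time spent moving data rather than computing.
transfer_share = 1 - COMPUTE_MS / with_transfer

print(f"with transfers: {with_transfer:.1f} ms")
print(f"data resident on GPU: {resident_on_gpu:.1f} ms")
print(f"share of time spent on transfers: {transfer_share:.0%}")
```

Under these assumed figures, moving the data dominates the total: the GPU finishes its part quickly, but the query as a whole is only as fast as the link that feeds it.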