Undergraduate Upends a 40-Year-Old Data Science Conjecture

Undergraduate Upends a 40-Year-Old Data Science Conjecture

A tool chest with a tray full of different sized arrows against a red-orange backdrop.

Sometime in the fall of 2021, Andrew Krapivin, an undergraduate at Rutgers University, came across a paper that would change his life. At the time, Krapivin didn't think much of it. However, two years later, when he finally took the time to read the paper ("just for fun," as he said), his efforts led to a rethinking of a widely used tool in computer science.

The paper titled “Tiny Pointers” referred to arrow-like entities that can direct you to a piece of information or element in a computer’s memory. Krapivin soon thought of a way to make these pointers even smaller so they used less memory. However, to do that, he needed a better way to organize the data the pointers would point to.

He turned to a common method for storing data called a hash table. While experimenting, Krapivin realized he had invented a new kind of hash table, one that worked faster than expected, taking less time and fewer steps to find specific elements.

Martín Farach-Colton, a co-author of the “Tiny Pointers” paper and Krapivin’s former professor at Rutgers, was initially skeptical of Krapivin’s new design. Hash tables are among the most thoroughly studied data structures in computer science; the improvement seemed too good to be true. But to be sure, he asked a frequent collaborator and “Tiny Pointers” co-author, William Kuszmaul of Carnegie Mellon University, to review his student’s invention. Kuszmaul had a different reaction. “You didn’t just come up with a cool hash table,” he remembers telling Krapivin. “You’ve actually completely disproved a 40-year-old conjecture!”