Tomáš Faltín

I'm a computer systems researcher and Assistant Professor at Charles University, focused on building and analyzing high-performance, scalable software. My core research investigates large-scale distributed graph processing and the underlying performance mechanics of concurrent systems through rigorous analysis and benchmarking. I bring industry and academic experience from my past work at Oracle Labs and Inria Paris.

Posts by Collection

portfolio

Portfolio item number 1

Short description of portfolio item number 1

Portfolio item number 2

Short description of portfolio item number 2

projects

publications

BDgen: A Universal Big Data Generator

Published in IDEAS, 2017

This paper introduces BDgen, a generator of Big Data targeting various types of users, implemented as a general and easily extensible framework. It is divided into a scalable backend designed to generate Big Data on clusters and a frontend for user-friendly definition of the structure of the required data, or its automatic inference from a sample data set. In the first release we have implemented generators of two commonly used formats (JSON and CSV) and the support for general grammars. We have also performed preliminary experimental comparisons confirming the advantages and competitiveness of the solution.

Recommended citation: Tomáš Faltín, Michal Hanzeli, Vojtěch Šípek, Jan Škvařil, Dušan Variš, and Irena Holubová Mlýnková. 2017. BDgen: A Universal Big Data Generator. In Proceedings of the 21st International Database Engineering & Applications Symposium (IDEAS '17). Association for Computing Machinery, New York, NY, USA, 200–208. https://doi.org/10.1145/3105831.3105847
Download Paper

aDFS: An Almost Depth-First-Search Distributed Graph-Querying System

Published in USENIX ATC, 2021

In this paper, we introduce aDFS: A distributed graph-querying system that can process practically any query fully in memory, while maintaining bounded runtime memory consumption. To achieve this behavior, aDFS relies on (i) almost depth-first (aDFS) graph exploration with some breadth-first characteristics for performance, and (ii) non-blocking dispatching of intermediate results to remote edges. We evaluate aDFS against state-of-the-art graph-querying (Neo4J and GraphFrames for Apache Spark), graph-mining (G-Miner, Fractal, and Peregrine), as well as dataflow joins (BiGJoin), and show that aDFS significantly outperforms prior work on a diverse selection of workloads.

Recommended citation: Trigonakis, V., Lozi, J.P., Faltín, T., Roth, N.P., Psaroudakis, I., Delamare, A., Haprian, V., Iorgulescu, C., Koupy, P., Lee, J. and Hong, S., 2021. aDFS: An Almost Depth-First-Search Distributed Graph-Querying System. In 2021 USENIX Annual Technical Conference (USENIX ATC 21) (pp. 209-224).
Download Paper | Download Slides | Download Bibtex

Space-efficient methodology for representing label information in large graph data for fast distributed graph query

Published in US, 2021

Techniques are described herein for space-efficient encoding of label information of property graphs. In an embodiment, an input graph is received. The input graph comprises a plurality of entities and a plurality of label sets. Each entity of said plurality of entities is associated with a label set of the plurality of label sets and each label set of the plurality of label sets comprises zero or more labels of a plurality of labels. A first mapping is generated that maps each label of the plurality of labels to a label code. A second mapping is generated that maps each label integer set of a plurality of label integer sets to a label code. Each label integer set of the plurality of label integer sets corresponds to a label set of the plurality of label sets, wherein each label integer set of the plurality of label integer sets comprises label codes from the first mapping that are mapped to each label included in the corresponding label set. A compressed label set is generated for each entity of the plurality of entities. Each compressed label set comprises a plurality of bits that indicate a zeroth state, a first state, a second state, or a third state. The compressed label sets and the first and second mappings are used to efficiently evaluate graph label queries.

Recommended citation: Delamare, A., Trigonakis, V., Haprian, V.I., Van Rest, O., Hong, S., Chafi, H., Faltin, T. and Lozi, J.P., Oracle International Corp, 2021. Space-efficient methodology for representing label information in large graph data for fast distributed graph query. U.S. Patent 11,074,260.
Download Paper

Regular path queries (RPQS) for distributed graphs

Published in US, 2022

A pattern matching engine interprets a query into a data structure resembling a finite state machine. Vertices in the query pattern are treated as states or stages, while edges connecting them are treated as state transitions or hops. To match the full pattern, the first stage is first matched by applying vertex filters, if any. If the vertex is eligible, its edges that satisfy the edge filters, if any, are followed to move to the neighbors that can potentially produce results, thus progressing to the next stage. This process is repeated; if all stages are matched, then the whole pattern has been matched successfully.

Recommended citation: Tonkovic, P., Trigonakis, V., Faltin, T., Hong, S. and Chafi, H., Oracle International Corp, 2022. Regular path queries (RPQS) for distributed graphs. U.S. Patent 11,456,946.
Download Paper

Dynamic asynchronous traversals for distributed graph queries

Published in US, 2023

Techniques are described for enabling in-memory execution of any-sized graph data query by utilizing both depth first search (DFS) principles and breadth first search (BFS) principles to control the amount of memory used during query execution. Specifically, threads implementing a graph DBMS switch between a BFS mode of data traversal and a DFS mode of data traversal. For example, when a thread detects that there are less than a configurable threshold number of intermediate results in memory, the thread enters BFS-based traversal techniques to increase the number of intermediate results in memory. When the thread detects that there are at least the configurable threshold number of intermediate results in memory, the thread enters DFS mode to produce final results, which generally works to move the intermediate results that are currently available in memory to final query results, thereby reducing the number of intermediate results in memory.

Recommended citation: Trigonakis, V., Faltin, T., Lozi, J.P., Haprian, V.I., Hong, S. and Chafi, H., Oracle International Corp, 2023. Dynamic asynchronous traversals for distributed graph queries. U.S. Patent 11,675,785.
Download Paper

Better Distributed Graph Query Planning With Scouting Queries

Published in GRADES-NDA, 2023

This paper introduces scouting queries, a lightweight mechanism to gather runtime information about different query plans, which can then be used to choose the “best” plan. In a pipelined, depth-first-oriented graph processing engine, scouting queries typically execute for a brief amount of time with negligible overhead. Partial results can be reused to avoid redundant work. We evaluate scouting queries and show that they bring speedups of up to 8.7× for heavy queries, while adding low overhead for those queries that do not benefit.

Recommended citation: Tomáš Faltín, Vasileios Trigonakis, Ayoub Berdai, Luigi Fusco, Călin Iorgulescu, Sungpack Hong, and Hassan Chafi. 2023. Better Distributed Graph Query Planning With Scouting Queries. In Proceedings of the 6th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA) (GRADES-NDA '23). Association for Computing Machinery, New York, NY, USA, Article 3, 1–9. https://doi.org/10.1145/3594778.3594884
Download Paper

Distributed Asynchronous Regular Path Queries (RPQs) on Graphs

Published in Middleware, 2023

In this paper, we introduce a novel design for distributed RPQs that builds on top of distributed asynchronous pipelined traversals to enable (i) memory control of path explorations, with (ii) great performance and scalability. Through our evaluation, we show that with sixteen machines, it outperforms Neo4j by 91× on average and a relational implementation of the same queries in PostgreSQL by 230×, while maintaining low memory consumption.

Recommended citation: Tomáš Faltín, Vasileios Trigonakis, Ayoub Berdai, Luigi Fusco, Călin Iorgulescu, et al.. Distributed Asynchronous Regular Path Queries (RPQs) on Graphs. Middleware 2023: 24th International Middleware Conference, Dec 2023, Bologna, Italy, Italy. pp.35-41, ⟨10.1145/3626562.3626833⟩. ⟨hal-04355309⟩
Download Paper

Concurrency and cancellation in distributed asynchronous graph processing

Published in US, 2024

Techniques to efficiently assign available workers to executing multiple graph queries concurrently on a distributed graph database are disclosed. The techniques comprise a runtime engine assigning multiple workers to executing portions of multiple graph queries, each worker in each assignment asynchronously executing a portion of a graph query within a parallel-while construct that includes return statements at different locations, and the runtime engine reassigning a worker to executing another portion of the same or a different graph query to optimize the overall performance of all workers.

Recommended citation: Trigonakis, V., Iorgulescu, C., Faltin, T., Hong, S. and Chafi, H., Oracle International Corp, 2024. Concurrency and cancellation in distributed asynchronous graph processing. U.S. Patent 11,947,539.
Download Paper

Duplication elimination in depth based searches for distributed systems

Published in US, 2024

Systems and methods for improving evaluation of graph queries through depth first traversals are described herein. In an embodiment, a multi-node system evaluates against graph data a graph query that specifies a particular pattern to match by determining, at a first node of the multi-node system, in a particular instance of evaluating the graph query, that one or more first vertices on the first node match a first portion of the graph query and that a second vertex that is to be evaluated next is stored on a second node separate from the first node. In response to determining that the next vertex to be evaluated is stored on the second node separate from the first node, the first node generates a message to the second node comprising one or more results of the first portion of the graph query based on the one or more first vertices, an identifier of the next vertex, and a current stage of evaluating the graph query. In response to generating the message from the first node to the second node, the first node ceases the particular instance of evaluating the graph query.

Recommended citation: Faltin, T., Trigonakis, V., Lozi, J.P., Hong, S. and Chafi, H., Oracle International Corp, 2024. Duplication elimination in depth based searches for distributed systems. U.S. Patent 12,001,425.
Download Paper

Offloading graph components to persistent storage for reducing resident memory in distributed graph processing

Published in US, 2024

A storage manager for offloading graph components to persistent storage for reducing resident memory in a distributed graph processing engine is provided. The storage manager identifies a set of graph components required to execute a graph processing operation on a graph in a graph processing engine of a database system and reserves an amount of memory needed to load the set of graph components into memory. The storage manager loads the set of graph components into memory and initiates execution of the graph processing operation using the set of graph components in memory. The storage manager evicts one or more unused graph components from memory in response to receiving a request to free a requested amount of memory from memory.

Recommended citation: Delamare, A., Bunjaku, I., Trigonakis, V., Iorgulescu, C., Faltin, T., Hong, S. and Chafi, H., Oracle International Corp, 2024. Offloading graph components to persistent storage for reducing resident memory in distributed graph processing. U.S. Patent 12,174,835.
Download Paper

Subqueries in distributed asynchronous graph queries

Published in US, 2025

A graph processing engine is provided for executing a graph query comprising a parent query and a subquery nested within the parent query. The subquery uses a reference to one or more correlated variables from the parent query. Executing the graph query comprises initiating execution of the parent query, pausing the execution of the parent query responsive to the parent query matching the one or more correlated variables in an intermediate result set, generating a subquery identifier for each match of the one or more correlated variables, modifying the subquery to include a subquery aggregate function and a clause to group results by subquery identifier, executing the modified subquery using the intermediate result set and collecting subquery results into a subquery results table responsive to pausing execution of the parent query, and resuming execution of the parent query using the subquery results table.

Recommended citation: Trigonakis, V., Ragot, A., Ez-Zainabi, Y., Faltin, T., Hong, S. and Chafi, H., Oracle International Corp, 2025. Subqueries in distributed asynchronous graph queries. U.S. Patent 12,197,436.
Download Paper

Tapestry: Revealing Wait-For Dependencies Between Application Threads

Published in PLOS, 2025

Application slowdowns that accumulate on the critical path significantly impact performance. Identifying them requires visualizing inter-thread wait-for dependencies, a task neglected by existing tools, especially regarding ad-hoc, busywaiting synchronization. We present Tapestry, which efficiently traces these dependencies by dynamically switching between hardware watchpoints and software breakpoints. Using three use cases, we show that Tapestry is able to reveal NUMA effects and explain performance anomalies in oversubscribed and virtualized environments.

Recommended citation: Tomáš Faltin, Himadri Chhaya-Shailesh, Julia Lawall, and Jean-Pierre Lozi. 2025. Tapestry: Revealing Wait-For Dependencies Between Application Threads. In Proceedings of the 13th Workshop on Programming Languages and Operating Systems (PLOS '25). Association for Computing Machinery, New York, NY, USA, 93–100. https://doi.org/10.1145/3764860.3768339
Download Paper

Tomáš Faltín

Posts by Collection

portfolio

Portfolio item number 1

Portfolio item number 2

projects

publications

BDgen: A Universal Big Data Generator

aDFS: An Almost Depth-First-Search Distributed Graph-Querying System

Space-efficient methodology for representing label information in large graph data for fast distributed graph query

Regular path queries (RPQS) for distributed graphs

Dynamic asynchronous traversals for distributed graph queries

Better Distributed Graph Query Planning With Scouting Queries

Distributed Asynchronous Regular Path Queries (RPQs) on Graphs

Concurrency and cancellation in distributed asynchronous graph processing

Duplication elimination in depth based searches for distributed systems

Offloading graph components to persistent storage for reducing resident memory in distributed graph processing

Subqueries in distributed asynchronous graph queries

Tapestry: Revealing Wait-For Dependencies Between Application Threads

teaching

Bitcoin and Cryptocurrency Technologies

Programming in C++

Principles of Distributed Systems