Authors: Wenfei Fan, Tao He, Longbin Lai, Xue Li, Yong Li, Zhao Li, Zhengping Qian, Chao Tian, Lei Wang, Jingbo Xu, Youyang Yao, Qiang Yin, Wenyuan Yu, Jingren Zhou, Diwen Zhu, Rong Zhu
Name of Conference: International Conference on Very Large Data Bases (VLDB 2021)
Date of Publication: Aug, 2021
GraphScope is a system and a set of language extensions that enable a new programming interface for large-scale distributed graph computing. It generalizes previous graph processing frameworks (e.g., Pregel, GraphX) and distributed graph databases (e.g., JanusGraph, Neptune) in two important ways: by exposing a unified programming interface to a wide variety of graph computations such as graph traversal, pattern matching, iterative algorithms and graph neural networks within a high-level programming language; and by supporting the seamless integration of a highly optimized graph engine in a general purpose data-parallel computing system. A GraphScope program is a sequential program composed of declarative data-parallel operators, and can be written using standard Python development tools. The system automatically handles the parallelization and distributed execution of programs on a cluster of machines. It outperforms current state-of-the-art systems by enabling a separate optimization (or family of optimizations) for each graph operation in one carefully designed coherent framework. We describe the design and implementation of GraphScope and evaluate system performance using several real-world applications.