Yuanjing Shi

Software Engineer at Snowflake & MSCS at UIUC

Contact

LinkedIn shingjan
GitHub shingjan

Location

500 108th Ave NE
Bellevue, WA 98004 US

Work & Experience

Snowflake Inc.

Software Engineer

2020/06 — Present
  • Built a modern, cloud-built data lake on the Data Lake and Storage team, providing instant elasticity, high availability, and on-demand storage with transactional consistency and support for semi-structured data formats such as JSON, CSV, Parquet, and ORC
  • Worked closely with project managers, support engineers, and other functional team members to deliver key SnowflakeDB features, such as Data Export and External Table, for existing and potential customers

Purdue Center for Programming Principles and Software Systems (PurPL)

Research Assistant

2019/05 — 2019/08
  • Worked with Prof. Tiark Rompf on building Flare (OSDI ’18, SIGMOD ’18, and Spark + AI Summit 2018), an accelerator for Apache Spark with native compilation on a Scala front-end and support for distributed data processing, data pipelining, and streaming
  • Used the Lightweight Modular Staging (LMS) framework to add distributed data processing support with the Message Passing Interface (MPI), outperforming Apache Spark by up to 10 times on standalone, computation-intensive workloads such as TPC-H

Microsoft Research Lab - Asia

Summer Intern

2018/05 — 2018/08
  • Designed and developed the transactional branch of GraphView, a middleware and DLL for the Graph Gremlin API of Microsoft’s Azure Cosmos DB, in collaboration with the Intelligent Cloud & Edge Group and the Azure team
  • Built a testing framework based on YCSB and TPC-C to benchmark GraphView against comparable offerings on multi-node clusters

Education

University of Illinois at Urbana-Champaign

2018/08 — 2020/05
Master of Science, Computer Science

Advisor: Vikram Adve


The Hong Kong Polytechnic University

2013/08 — 2018/05
Bachelor of Science (Honours), Computing

Advisor: Zili Shao


Publications

An Optimizing Compiler For ONNX Models On Heterogeneous Systems

Published by IDEALS - University of Illinois, 2020

Yuanjing Shi

Leyenda: An Adaptive, Hybrid Sorting Algorithm for Large Scale Data with Limited Memory

Published by arXiv, 2019

Yuanjing Shi, Zhaoxing Li

An Efficient LSM-tree-based SQLite-like Database Engine for Mobile Devices

Published by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE TCAD), 2018

Zhaoyan Shen, Yuanjing Shi, Zili Shao, Yong Guan

SQLiteKV: An Efficient LSM-tree-based Lightweight Database Engine for Mobile Devices

Published by 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), 2018

Yuanjing Shi, Zhaoyan Shen, Zili Shao

Skills

Data Intensive Computing


Apache Arrow, Hadoop, Parquet, and Spark

Deep Learning Frameworks


TensorFlow/Keras, PyTorch, ONNX, and TVM

Programming Languages


Over 20,000 lines
C/C++, Java, Scala, and Python

Compiler Infrastructure


LLVM, TVM, and LMS