Software Engineer at Snowflake & MSCS at UIUC
Contact
Location
500 108th Ave NE
Bellevue, WA 98004 US
Work & Experience
Snowflake Inc.
Software Engineer
2020/06
— Present
- On the Data Lake and Storage team, built a modern, cloud-built data lake that provides instant elasticity, high availability, and on-demand storage with transactional consistency, plus support for formats such as JSON, CSV, Parquet, and ORC
- Collaborated closely with project managers, support engineers, and other cross-functional team members to deliver key SnowflakeDB features, such as Data Export and External Table, for existing and prospective customers
Purdue Center for Programming Principles and Software Systems (PurPL)
Research Assistant
2019/05
— 2019/08
- Worked with Prof. Tiark Rompf on building Flare (OSDI ’18, SIGMOD ’18, and Spark + AI Summit 2018), an accelerator for Apache Spark with native compilation on a Scala front-end and support for distributed data processing, data pipelining, and streaming
- Used the Lightweight Modular Staging (LMS) framework to add distributed data processing support with the Message Passing Interface (MPI), outperforming Apache Spark by up to 10 times on standalone, computation-intensive workloads such as TPC-H
Microsoft Research Lab - Asia
Summer Intern
2018/05
— 2018/08
- Designed and developed the transactional branch of GraphView, a middleware DLL for the Graph Gremlin API of Microsoft’s Azure CosmosDB, in collaboration with the Intelligent Cloud & Edge Group and the Azure team
- Built a testing framework based on YCSB and TPC-C to benchmark GraphView and comparable offerings on multi-node clusters
Education
University of Illinois at Urbana-Champaign
2018/08
— 2020/05
Master of Science, Computer Science
Advisor: Vikram Adve
The Hong Kong Polytechnic University
2013/08
— 2018/05
Bachelor of Science (Honours), Computing
Advisor: Zili Shao
Publications
An Optimizing Compiler For ONNX Models On Heterogeneous Systems
Published by IDEALS - University of Illinois, 2020
Yuanjing Shi
Leyenda: An Adaptive, Hybrid Sorting Algorithm for Large Scale Data with Limited Memory
Published by arXiv, 2019
Yuanjing Shi, Zhaoxing Li
An Efficient LSM-tree-based SQLite-like Database Engine for Mobile Devices
Published by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE TCAD), 2018
Zhaoyan Shen, Yuanjing Shi, Zili Shao, Yong Guan
SQLiteKV: An Efficient LSM-tree-based Lightweight Database Engine for Mobile Devices
Published by 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), 2018
Yuanjing Shi, Zhaoyan Shen, Zili Shao
Skills
Data Intensive Computing
Apache Arrow, Hadoop, Parquet, and Spark
Deep Learning Frameworks
TensorFlow/Keras, PyTorch, ONNX, and TVM
Programming Languages
Over 20,000 lines of C/C++, Java, Scala, and Python
Compiler Infrastructure
LLVM, TVM, and LMS