
Yefan Li

About

Detail

Data Engineer
Champaign, Illinois, United States

Jobs 0% verified
  • C
    Data Scientist & Engineer, Intern
    Caring.com
    Jun 2023 - Sep 2023 (4 months)
    Gathered requirements and performed source-to-target mapping so that data integration aligned with business objectives. Developed and monitored ELT data pipelines using Fivetran and dbt, optimizing data transformations and ensuring data accuracy. Designed and implemented database schemas and tables in Snowflake as part of the data warehouse infrastructure.
    Session Volume Forecasting
    • Used Snowflake and PySpark to analyze patterns (e.g., day of week, holiday, amenities impact) across multiple websites, supporting data-driven decisions on web performance.
    • Developed time series forecasting models (SARIMA, ARIMA, Prophet) in Python, achieving a remarkable average forecasting error of
  • O
    Data Engineer, Intern
    OVERTURE LLC
    Jan 2023 - Jun 2023 (6 months)
    Worked with the Databricks platform and AWS Glue, RDS, S3, and Redshift to design, develop, test, and monitor ETL processes. Used Python, SQL, and PySpark for both real-time and batch data processing and analytics.
    Spark Structured Streaming Job
    • Wrote a streaming application that pulled data from the API and published it to 5+ Kafka topics; built a Kafka producer and consumer to process the real-time stream.
    • Extracted data from specific Kafka topics, then cleaned and analyzed batch user data using the PySpark DataFrame API and Spark SQL commands, including aggregate and window functions.
    • Scheduled workflows in Databricks and manually ran the job to retrieve the last 30 minutes of user information. Ensured the operation of the workflow, an
  • D
    Data Engineer
    Jun 2020 - Dec 2021 (1 year 7 months)
    Database Migration
    • Migrated 1 TB of sales data from a MySQL database to Delta Lake and kept the two in sync using Delta table merge.
    • Used spark.read.jdbc to read MySQL database changes (delete, update, insert) and merge them into the Databricks SCD Type 1 table. Created a CDF table to track MySQL database modifications for input into the SCD Type 2 table.
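
For readers unfamiliar with the SCD Type 1 and Type 2 tables mentioned in the Database Migration entry, the difference can be sketched in a few lines of plain Python. This is only an illustration of the merge semantics, not the code used in the role: it uses in-memory dictionaries rather than Delta Lake or Spark, and the names `Change`, `apply_scd1`, `apply_scd2`, `valid_from`, and `valid_to` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """Hypothetical change record, loosely modeled on a change-feed row."""
    op: str               # "insert", "update", or "delete"
    key: int              # primary key of the source row
    value: Optional[str]  # new value (None for deletes)
    ts: int               # logical timestamp of the change


def apply_scd1(table: dict, changes: list) -> dict:
    """SCD Type 1: overwrite rows in place and keep no history."""
    result = dict(table)
    for c in sorted(changes, key=lambda ch: ch.ts):
        if c.op == "delete":
            result.pop(c.key, None)
        else:  # insert and update are both an upsert
            result[c.key] = c.value
    return result


def apply_scd2(history: list, changes: list) -> list:
    """SCD Type 2: close the current version and append a new one."""
    rows = [dict(r) for r in history]
    for c in sorted(changes, key=lambda ch: ch.ts):
        # Close the open version of this key, if there is one.
        for r in rows:
            if r["key"] == c.key and r["valid_to"] is None:
                r["valid_to"] = c.ts
        # Deletes only close the old version; other ops add a new one.
        if c.op != "delete":
            rows.append({"key": c.key, "value": c.value,
                         "valid_from": c.ts, "valid_to": None})
    return rows


# Assumed sample changes, for illustration only.
changes = [
    Change("insert", 1, "a", 1),
    Change("update", 1, "b", 2),
    Change("insert", 2, "x", 3),
    Change("delete", 2, None, 4),
]

scd1 = apply_scd1({}, changes)
scd2 = apply_scd2([], changes)
print(scd1)  # {1: 'b'}
for row in scd2:
    print(row)
```

In this sketch the Type 1 result holds only the latest state of each surviving key, while the Type 2 result keeps every version with its validity interval, which is why a change log is a natural input for it.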
Education 0% verified
  • University of Illinois at Urbana-Champaign
    Master of Science in Statistics
    Jan 2022 - Dec 2023 (2 years)
  • Northern Arizona University
    BSc in Chemistry and Minor in Math
    Sep 2017 - Jun 2021 (3 years 10 months)