• 5 years of professional experience across various domains, with expertise in big data engineering, software development, automation testing, API testing, and full stack development.
• Programming experience with Python, PySpark, JavaScript, and SQL.
• Experience with big data tools and related technologies: Hadoop, Spark, Hive, MapReduce, Kafka, and AWS.
• Exposure to AWS services and related tools including S3, AWS Glue, AWS Lambda, CloudWatch, Athena, EMR, Lake Formation, IAM roles and permissions, Secrets Manager, Apache Spark, Terraform, and Airflow.
• Experience working in AWS cloud environments.
• Experience working with GitHub for source and version control.
• Experience with deployments using CI/CD tools such as CircleCI.
• Used IDEs such as Eclipse, IntelliJ IDEA, PyCharm, Jupyter Notebook, and Visual Studio Code for development.
• Design and build pipelines that integrate data from various sources such as databases, APIs, and file systems.
• Build modern data platforms on AWS using services such as Lambda, Glue, Step Functions, and Athena.
• Led the implementation of a data lake ingestion project, overseeing the end-to-end process of extracting, transforming, and loading diverse data sources into the organization's data lake. Used Informatica Big Data Management to streamline the ingestion pipeline, ensuring data quality, scalability, and efficient integration of structured and unstructured data. Improved data accessibility for analytics and reporting teams, resulting in a 20% reduction in data latency and enhanced decision-making.
• Experience working on projects that conform to SDLC and Agile methodologies.
• Working knowledge of data architecture, including data ingestion pipeline design, data modeling, data mining, and machine learning.
• REST API design and development using Node.js, Express.js, and MongoDB.
• Front-end development using Angular, jQuery, and other libraries.
• Collaborated with teams of data scientists and analysts to deliver data solutions.
• Strong collaborator with experience working on cross-functional teams.
• Strong focus on documenting standard procedures, architecture, and data definitions using GitHub documentation tools.