Simera Professional Key (SPK)

Jyothi R

Senior Data Engineer

$9,100-$10,100/month

8 yrs exp

Skills

  • Ansible
  • Azure
  • Bash
  • C
  • C++
  • Docker
  • Git
  • GitHub
  • Google Analytics
  • Hadoop
  • Java
  • JavaScript
  • Kubernetes
  • MySQL
  • Oracle SQL
  • PostgreSQL
  • Python
  • QlikView
  • R
  • React
  • Redshift
  • SAS
  • Scala
  • Spark
  • SQL

Senior Data Engineer

M&T Bank
June 2023 - present

  • Engineered and orchestrated scalable ELT pipelines in Azure Data Factory to ingest data from diverse sources (Oracle/DB2, SFTP, REST APIs), incorporating incremental watermarking and schema-drift handling.
  • Transformed data with PySpark in Databricks across bronze/silver/gold Delta layers, publishing curated data marts to Azure Synapse Analytics for enterprise reporting.
  • Modeled star-schema marts in Synapse/SQL Server for Finance, Cards, and Deposits, building a robust semantic layer to support ad hoc analysis and certified Power BI datasets.
  • Partnered with Risk, Finance, and Product teams to translate regulatory and management reporting needs (Basel III/CCAR) into scalable data models and SLAs.
  • Delivered actionable Power BI dashboards and KPI scorecards for key business metrics, including delinquency and fraud rates, and implemented Row-Level Security (RLS) to protect sensitive PII/PCI data.
  • Developed and automated data quality validation in Python on Databricks to enforce audit/balance controls (row counts, referential integrity), ensuring data accuracy and reliability.
  • Tuned Spark and T-SQL queries using partitioning, Z-ORDER, and indexing to reduce runtimes and cut compute/storage costs while meeting tight end-of-day and T+1 SLAs.
  • Deployed and managed pipelines via Azure DevOps CI/CD, leveraging parameterized ARM templates and Azure Key Vault for secure secrets management.
  • Owned on-call rotations and monitored pipelines with Azure Monitor and Log Analytics, driving root cause analysis to prevent recurring incidents and maintain platform uptime.

Data Engineer

PepsiCo
February 2021 - January 2023

  • Managed GCP projects using Terraform for infrastructure as code (IaC).
  • Enhanced BigQuery-backed Tableau reports with column partitioning and scenario-based testing.
  • Built ELT processes integrating Ab Initio, Google Sheets, Dataprep, Dataproc (PySpark), and BigQuery.
  • Implemented CI/CD pipelines with Terraform and Git, automating cloud infrastructure deployment.
  • Migrated Oracle SQL ETL to GCP using Dataproc, BigQuery, and Pub/Sub, with Airflow-triggered jobs.
  • Developed analytics platforms with Presto, Hive, Spark SQL, and BigQuery, using Python libraries.
  • Built and optimized data pipelines with Apache Airflow in GCP Composer, using Bash and Python.
  • Automated infrastructure deployment with Terraform and Ansible, enabling reproducible, fault-tolerant pipeline delivery.
  • Conducted root cause analysis (RCA) to resolve data quality issues and pipeline failures.
  • Managed Airflow pipelines with environment variables and encrypted credentials.
  • Converted legacy SAS jobs to Python/Spark workflows on Dataproc and BigQuery.
  • Troubleshot ETL issues, optimizing Spark and SQL queries.
  • Integrated Pub/Sub and Cloud Functions for real-time workflow triggers.
  • Built scalable clustering algorithms with Dataflow and Composer.
  • Optimized Cloud Dataprep jobs feeding BigQuery for better performance.
  • Enhanced security by creating custom roles with Terraform in sandbox environments.

Data Engineer

Trinity Health
January 2018 - February 2021

  • Increased data ingestion efficiency by 30% using MSBI, Talend, Informatica, and Big Data tools.
  • Built data pipelines with AWS Lambda, Kinesis, and SQS for seamless processing.
  • Led metadata standardization and reference data frameworks to improve data discoverability and lineage across clinical pipelines.
  • Implemented ETL with SSIS, Talend, and Informatica, ensuring data accuracy.
  • Optimized SSRS reports, enhancing decision-making for business and healthcare stakeholders.
  • Orchestrated on-premises-to-AWS migration, reducing costs and improving scalability.
  • Designed real-time ETL systems with built-in data quality metrics.
  • Developed and optimized data integration pipelines using HL7 v2 and FHIR standards to enable real-time clinical data exchange between EHR systems, ensuring interoperability and HIPAA compliance.
  • Designed and implemented FHIR-based RESTful APIs to standardize access to patient records, lab results, and encounter data, improving care coordination and supporting value-based healthcare initiatives.
  • Tuned SQL and developed advanced scripts for profiling and performance optimization.
  • Built efficient dbt models in Snowflake for data transformation.
  • Implemented monitoring and logging practices, with familiarity in Prometheus/Grafana for observability.
  • Used Python, Scala, and LINQ for data processing and integrated data into HDFS with Flume and Sqoop.
  • Built real-time data pipelines on AWS (Lambda, Kinesis, SQS) with Hadoop and Spark for clinical data processing.
  • Developed data models and ETL workflows on Redshift and Snowflake, ensuring scalable analytics.
  • Automated ML workflows with SageMaker and delivered Power BI training.
  • Developed SSAS models and Excel/PowerPoint reports.
  • Contributed to Agile/Scrum ceremonies with regular updates and feedback.

Smart Scores

  • Communication: 70
  • Role Fit: 90
  • Loyalty: 60
  • Adaptability: 80
  • Problem-solving: 40
