Simera Professional Key (SPK)

Gerson V

Costa Rica

Data Engineer | Analytics Engineer

$6,700-$7,700/month

10+ yrs exp

Senior Data Engineer, Analytics Engineer and Consultant with over 13 years of experience modernizing data ecosystems and architecting scalable cloud solutions. Highly proficient in leveraging AWS and Azure services—including Azure Data Factory and Data Lake—along with dbt-driven transformation frameworks and cloud data warehouses (Snowflake, Redshift) to build robust ETL/ELT pipelines. Adept at using Python and PySpark for advanced data processing, analytics, and machine learning operationalizat…

Skills

AWS
C
Git
GitHub
Hadoop
Java
JavaScript
Jenkins
MongoDB
MySQL
Python
QlikView
Redshift
Spark
SQL
Tableau
JavaScript ES6
Airflow
C#
Data Engineer
Databricks
GitLab
Looker
Oracle
Pandas

Experience

Data Engineer | Analytics Engineer

Varsity Tutors

June 2023 - July 2025

Architected and maintained scalable AWS infrastructure for data platforms using Terraform, with heavy automation across EC2, EKS, DMS, and S3 to support ingestion, replication, and cross-region synchronization for MySQL, Redshift, and SingleStore environments. Led data migrations and pipeline deployments across MySQL, read replicas, Segment, and Confluent Kafka, ensuring high availability, low-latency delivery, and integration with downstream analytics systems and microservices. Designed real-time data ingestion pipelines using Kafka Connect and custom Python services to extract operational metrics and customer events from MongoDB, enabling downstream analytics in Redshift and SingleStore. Designed and managed dbt transformations and data models across Redshift, SingleStore, and Nexla, coordinating with analytics teams to enforce standards, automate testing, and optimize model performance in production. Developed infrastructure-as-code and CI/CD workflows for data systems and API integrations using Python, Terraform, and GitHub Actions, enabling consistent deployments and version-controlled environment management. Supported real-time data ingestion and routing across services using Confluent Platform, Kafka Connect, and custom Python microservices, integrating APIs, change data capture (CDC), and third-party tools in a secure, resilient architecture.

Data Engineer | Analytics Engineer

Freelance

June 2022 - present

Provided occasional, short-term consulting support between full-time roles, typically a few hours per week. Assisted with small ETL tasks in AWS Glue to move datasets into Snowflake for quicker analytics setup. Helped tune Spark and Python jobs for faster performance on specific workflows. Added simple Python Lambda checks to monitor data quality and flag issues early. Offered brief guidance on big data infrastructure and model deployment best practices in AWS. Created a few targeted Power BI dashboards to present data more clearly for decision-makers.

Data Engineer | Analytics Engineer

Swimply

April 2022 - June 2023

Founding member of the data team, responsible for meeting with sales representatives to select tools and define the overall architecture. Managed all migrations from on-premises and AWS Redshift instances to the new Snowflake environment. Served as Snowflake administrator, handling all permissions for new users and third-party sources. Designed pipelines to sync different sources using Python Lambda functions and AWS Glue, with Step Functions as the orchestrator. Built custom AWS Lambda pipelines to ingest property and booking data stored in MongoDB, transforming and syncing it into Snowflake for use in Power BI dashboards and machine learning workflows. Researched new data sources such as market data and district data. Created a data framework for error alerts and ensured data quality in synchronizations. Managed all dbt transformations and configurations. Assisted the Engineering team with maintenance of Elasticsearch indexes, performing upsert operations. Helped the Data Science team make machine learning models production-ready in Hex and Databricks. Managed all Power BI report infrastructure and development. Configured all Datadog alerts.

Data Engineer | Analytics Engineer

Customer Times

May 2020 - April 2022

Led migration and modernization of legacy ETL systems to Databricks, Snowflake, Azure Data Lake, and Redshift, utilizing Spark, Delta Lake, and dbt for scalable, cloud-native data transformations and model versioning. Developed and orchestrated end-to-end data pipelines using Apache Airflow (custom Python DAGs), integrating diverse data sources (REST APIs, MemSQL, Oracle, flat files) and deploying workflows via Jenkins, TeamCity, and CircleCI. Designed and maintained advanced Power BI dashboards with real-time metrics on pipeline health, leveraging DAX, data modeling, and connections to Snowflake, Azure Data Lake, and API layers to deliver unified analytics across stakeholders. Engineered secure data access workflows and user controls across services, including API quota enforcement and web service access governance; collaborated with DBAs on contingency plans and system-level access policies. Refactored and optimized legacy data logic across multiple stacks (C#, Python, Ruby on Rails, Java, JavaScript) for better performance, reliability, and compliance; coached new hires and participated in technical planning with management.

Data Engineer | Analytics Engineer

Gorilla Logic

December 2018 - April 2020

Led cloud migration planning and implementation for multiple clients, transitioning ETL pipelines from SSIS to AWS Glue, Databricks, Snowflake, and Redshift, with secure storage layers on S3 and bucket-based architecture. Developed distributed data processing pipelines using Apache Spark (Scala/PySpark), Hadoop, and Hive, migrating legacy Hive-based systems to modern cloud platforms for high-volume processing in healthcare, automotive, and delivery industries. Built real-time data streaming solutions using Apache Kafka and Spark Streaming, delivering low-latency analytics and event-driven triggers for production environments. Created robust ingestion frameworks in Python and Scala to integrate third-party APIs using Pandas, NumPy, and web service clients, powering downstream analytics and machine learning use cases. Acted as a technical lead across integration teams: coordinated test environments, roadmap planning, secure access provisioning (API rate-limiting, web service policies), and enforced data engineering standards through code reviews and CI/CD pipelines.

BI Developer | Data Analyst

Intel Corporation

July 2016 - December 2018

Led the migration of legacy on-premise SQL Server data infrastructure to Azure SQL and Data Lake, designing scalable architectures that improved data accessibility, performance, and integration with cloud-native tools. Built and managed reporting suites using Power BI, SSRS, and Tableau, integrating with Azure and on-premise sources to deliver interactive dashboards, scheduled reports, and ad hoc analysis. Developed distributed ETL/ELT pipelines using PySpark on Apache Spark, processing large-scale data for reporting, feature engineering, and downstream analytics in collaboration with data science and ML teams. Operationalized model outputs and real-time data feeds using Spark jobs and lightweight orchestration layers, supporting early-stage streaming workloads and automated reporting refreshes. Implemented reporting automation strategies that replaced manual processes, reduced SLA breaches, and increased the reliability and traceability of scheduled data flows.

BI Developer | Data Engineer

Indecomm Global Services

June 2014 - July 2016

Re-engineered and automated legacy ETL workflows using SSIS, streamlining data pipelines across multiple systems and reducing processing times and operational errors by over 60%. Designed and implemented complex stored procedures, views, and T-SQL scripts to support high-performance data transformations, aggregations, and business rule enforcement within SQL Server. Built and maintained SSRS reports and Power BI dashboards for operational and executive stakeholders, enabling real-time visibility into key metrics and reducing reporting turnaround time. Tuned long-running SQL queries and optimized indexing strategies, significantly improving query performance and reducing resource contention across large datasets. Collaborated with business analysts to perform data validation, root cause analysis, and anomaly detection, ensuring consistency and reliability across transactional and reporting layers.

Business Intelligence Developer | Data Analyst

Intel Corporation

January 2013 - June 2014

Architected and executed a hybrid data platform migration from Microsoft SQL Server to Cloudera Hadoop (CDH 4/5), integrating Sqoop, Hive, and Impala to support distributed processing and scalable data warehousing. Engineered and optimized SSIS-driven ETL pipelines for high-volume data ingestion from transactional systems into both SQL Server and Hadoop environments, improving pipeline performance and fault tolerance. Developed multidimensional (OLAP) and tabular models with SSAS, enabling low-latency analytics across billions of rows; exposed semantic layers to Power BI and Tableau via live connections and DAX/MDX measures. Automated enterprise reporting with SSRS and parameterized report packs, integrating with scheduling agents and Active Directory for secure, role-based delivery. Collaborated with infrastructure and analytics teams to standardize metadata, enforce data quality, and establish governance around semantic models and KPI definitions in a fully on-premise environment.
