A Senior Data Engineer specializing in Python and SQL is responsible for designing, developing, and maintaining scalable data pipelines, ensuring the efficient flow of data across systems, and enabling data-driven decision-making within the organization.
Key Responsibilities
Data Pipeline Development
- Design and implement robust ETL/ELT pipelines for structured, semi-structured, and unstructured data.
- Automate data workflows using Python, Airflow, and SQL-based tools (a brief sketch follows this list).
- Optimize pipelines for performance, scalability, and reliability.
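As an illustration of this kind of work, below is a minimal sketch of a Python/Airflow DAG automating a small daily ETL flow. It assumes Apache Airflow 2.4+ and pandas with pyarrow installed; the DAG name, file paths, and column names are hypothetical placeholders, not a prescribed implementation.

```python
# Minimal sketch of an automated daily ETL workflow (hypothetical paths/columns).
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract() -> None:
    # Hypothetical CSV source; in practice this might be an API call or a database query.
    df = pd.read_csv("/data/raw/orders.csv")
    df.to_parquet("/data/staging/orders.parquet", index=False)


def transform() -> None:
    # Derive a new column and write the curated output.
    df = pd.read_parquet("/data/staging/orders.parquet")
    df["order_total"] = df["quantity"] * df["unit_price"]
    df.to_parquet("/data/curated/orders.parquet", index=False)


with DAG(
    dag_id="orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```

In a production pipeline the extract step would typically pull from a source system and the final step would load into a warehouse table rather than a local Parquet file.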
Data Modeling
- Create and maintain efficient database schemas and data models for analytical and operational systems.
- Implement dimensional modeling (star and snowflake schemas) for data warehouses.
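For illustration, a minimal star-schema sketch (one fact table, two dimensions) is shown below, built in SQLite purely so it runs self-contained; the table and column names are hypothetical.

```python
# Minimal star schema sketch: fact_sales references dim_customer and dim_date.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,
        customer_name TEXT,
        region TEXT
    );

    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        full_date TEXT,
        year INTEGER,
        month INTEGER
    );

    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        customer_key INTEGER REFERENCES dim_customer (customer_key),
        date_key INTEGER REFERENCES dim_date (date_key),
        quantity INTEGER,
        amount REAL
    );
    """
)
conn.close()
```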
Data Integration
- Integrate data from multiple sources such as APIs, databases, and cloud platforms.
- Ensure seamless data flow across data lakes, warehouses, and reporting tools.
Performance Optimization
- Optimize SQL queries and database performance using indexing, partitioning, and query tuning techniques (see the sketch after this list).
- Monitor and improve pipeline performance, reducing latency and bottlenecks.
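The sketch below illustrates the indexing and query-tuning side of this work using SQLite's EXPLAIN QUERY PLAN, chosen only because it runs self-contained; the same idea applies to EXPLAIN/EXPLAIN ANALYZE in PostgreSQL or a cloud warehouse. Table and index names are hypothetical.

```python
# Minimal sketch: confirming that a query uses an index rather than a full scan.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, ts TEXT)")
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = ?", (42,)
).fetchall()
print(plan)  # should report a search using idx_events_user, not a full table scan
conn.close()
```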
Collaboration
- Work closely with data analysts, data scientists, and business teams to understand requirements and deliver solutions.
- Collaborate with DevOps teams for CI/CD pipeline integration.
Monitoring and Maintenance
- Implement logging, monitoring, and alerting for data pipelines and workflows (see the sketch after this list).
- Troubleshoot and resolve pipeline or data-related issues proactively.
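A minimal sketch of the logging side of this responsibility, using only Python's standard logging module; the pipeline step and its failure handling are illustrative placeholders.

```python
# Minimal sketch of structured logging around a pipeline step.
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("orders_etl")


def load_step(row_count: int) -> None:
    logger.info("load started, expecting %d rows", row_count)
    try:
        # ... load logic would go here ...
        logger.info("load finished")
    except Exception:
        logger.exception("load failed")  # alerting could key off ERROR-level logs
        raise


load_step(1000)
```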
Data Governance and Security
- Ensure data quality, integrity, and security by implementing best practices.
- Work with teams to enforce policies like encryption, masking, and access control.
Leadership and Mentoring
- Mentor junior data engineers and lead technical discussions within the team.
- Drive architectural decisions and recommend best practices.
Skills and Expertise
Core Technical Skills
Programming
- Expert in Python: Data manipulation (Pandas, NumPy, and related libraries), API integration, and scripting (a brief sketch follows this list).
- Knowledge of frameworks like PySpark for distributed data processing.
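A brief sketch of the kind of everyday Pandas/NumPy data manipulation implied here, with hypothetical column names and values:

```python
# Minimal sketch: fill missing values and aggregate by a grouping column.
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"region": ["east", "west", "east"], "revenue": [120.0, 95.5, np.nan]}
)
df["revenue"] = df["revenue"].fillna(0.0)
summary = df.groupby("region", as_index=False)["revenue"].sum()
print(summary)
```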
SQL
- Advanced SQL skills for query writing, optimization, and database management.
- Experience with relational (PostgreSQL, MySQL, SQL Server) and columnar databases (Snowflake, Redshift, ClickHouse).
Data Engineering Tools
- ETL Tools: Apache Airflow, Talend.
- Big Data Tools: Hadoop, Spark (preferred but not mandatory).
- Cloud Platforms: AWS (Glue, Redshift, S3), Azure, GCP.
Data Storage
- Proficiency with data lakes (e.g., S3, Delta Lake) and data warehouses (e.g., Snowflake, Redshift, ClickHouse).
Data Formats
- Expertise in handling file formats like Parquet, Avro, ORC, JSON, and CSV.
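For example, converting between two of these formats with Pandas might look like the sketch below; it assumes the pyarrow package is installed and uses hypothetical file paths.

```python
# Minimal sketch: round-trip a dataset between CSV and Parquet.
import pandas as pd

df = pd.read_csv("input.csv")
df.to_parquet("output.parquet", index=False)

round_trip = pd.read_parquet("output.parquet")
print(round_trip.dtypes)
```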
Data Pipeline Orchestration
- Build and manage workflows using Airflow and Talend.
Additional Skills
- CI/CD: Knowledge of CI/CD pipelines (Jenkins) and version control (Git).
- Monitoring: Familiarity with logging and monitoring tools like Grafana, Prometheus, or CloudWatch.
- APIs: Experience with REST and GraphQL APIs for data integration.
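A minimal sketch of REST-based data ingestion with the requests library; the endpoint URL, token, and query parameters are hypothetical placeholders.

```python
# Minimal sketch: pull records from a REST endpoint for downstream ingestion.
import requests

response = requests.get(
    "https://api.example.com/v1/orders",
    headers={"Authorization": "Bearer <token>"},
    params={"updated_since": "2024-01-01"},
    timeout=30,
)
response.raise_for_status()
records = response.json()
print(len(records))
```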
Experience Requirements
- Mid-Level: 5–8 years of data engineering experience.
- Senior-Level: 8+ years with proven expertise in designing and scaling data pipelines.
Tools/Platforms Knowledge
- Programming: Python, SQL.
- Data Processing: Pandas, NumPy, PySpark.
- Databases: PostgreSQL, Snowflake, Redshift, ClickHouse.
- Workflow Orchestration: Apache Airflow, Talend.
- Monitoring: CloudWatch, Grafana, Prometheus.
- DevOps: Git, Jenkins, Docker (optional).
Job Type: Full-time
Experience:
- Data Engineer: 5 years (Required)
- Python: 3 years (Required)
Work Location: In person