Principal Engineer - Data Analytics Engineering

  • Experience: 9 - 10 yrs
  • Salary: Not Mentioned
  • Location: Bengaluru
  • Skills: Python, PySpark, SQL
  • Job Type: Full Time
  • Education: Graduate
  • Openings: 3
  • Stargate

    Working Type: Work From Office
    Job Description:

    Sandisk

    Job Overview

    We are seeking a passionate candidate dedicated to building robust data pipelines and handling large-scale data processing. The ideal candidate will thrive in a dynamic environment, demonstrate a commitment to optimizing and maintaining efficient data workflows, and have hands-on experience with Python, MariaDB, SQL, Linux, Docker, Airflow administration, and CI/CD pipeline creation and maintenance. The application is built with Python Dash, and the role involves application deployment, server administration, and keeping the application running smoothly and up to date.
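
    A minimal sketch of the kind of Airflow job this role would administer, assuming Airflow 2.x with the standard PythonOperator; the DAG name, schedule, and task body are illustrative placeholders, not part of the actual application:

    # Hypothetical daily pipeline DAG (Airflow 2.x); all names are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_and_load():
        # Placeholder for a real pipeline step, e.g. pull from MariaDB,
        # transform in Python, and load into the warehouse.
        print("pipeline step ran")

    with DAG(
        dag_id="nightly_ingest",            # illustrative DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",         # run once per day
        catchup=False,                      # do not backfill missed runs
    ) as dag:
        PythonOperator(
            task_id="extract_and_load",
            python_callable=extract_and_load,
        )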

    Key Responsibilities:

    • 9+ years of experience developing data pipelines using Spark.
    • Ability to design, develop, and optimize Apache Spark applications for large-scale data processing.
    • Ability to implement efficient data transformation and manipulation logic using Spark RDDs and DataFrames (see the sketch after this list).
    • Manage server administration tasks, including monitoring, troubleshooting, and performance optimization. Administer and manage databases (MariaDB) to ensure data integrity and availability.
    • Ability to design, implement, and maintain Apache Kafka pipelines for real-time data streaming and event-driven architectures.
    • Strong development skills in Python, PySpark, Scala, and SQL/stored procedures.
    • Working knowledge of Unix/Linux tools such as awk, ssh, and crontab.
    • Ability to write Transact-SQL and to develop and debug stored procedures and user-defined functions in Python.
    • Working experience with Postgres and/or Redshift/Snowflake databases is required.
    • Exposure to CI/CD tooling such as Bitbucket, Jenkins, Ansible, Docker, and Kubernetes is preferred.
    • Solid understanding of relational database systems and their concepts.
    • Ability to handle large tables/datasets (2+ TB) in a columnar database environment.
    • Ability to integrate data pipelines with Splunk/Grafana for real-time monitoring and analysis, and with Power BI for visualization.
    • Ability to create and schedule Airflow jobs.
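
    A minimal sketch of the Spark DataFrame work referenced above, assuming a local PySpark session; the input path and column names ("events.parquet", "user_id", "amount") are illustrative, not taken from the actual pipelines:

    # Hypothetical PySpark aggregation; all paths and columns are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-aggregation").getOrCreate()

    events = spark.read.parquet("events.parquet")   # illustrative input

    # Typical DataFrame transformation: filter, group, and aggregate.
    totals = (
        events
        .filter(F.col("amount") > 0)
        .groupBy("user_id")
        .agg(F.sum("amount").alias("total_amount"))
    )

    totals.write.mode("overwrite").parquet("totals.parquet")
    spark.stop()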