Principal Engineer - Data Analytics Engineering

  • Experience: 9 - 10 yrs
  • Salary: Not Mentioned
  • Location: Bengaluru
  • Skills: Python, PySpark, SQL
  • Job Type: Full Time
  • Education: Graduate
  • Openings: 3
  • Stargate

    Working Type: Work From Office
    Job Description:

    Sandisk

    Job Overview

    We are seeking a passionate candidate dedicated to building robust data pipelines and handling large-scale data processing. The ideal candidate will thrive in a dynamic environment, demonstrate a commitment to optimizing and maintaining efficient data workflows, and have hands-on experience with Python, MariaDB, SQL, Linux, Docker, Airflow administration, and CI/CD pipeline creation and maintenance. The application is built with Python Dash, and the role involves application deployment, server administration, and keeping the application running smoothly and up to date.
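
    A minimal sketch of the kind of Airflow job this role would administer, assuming Airflow 2.x with the standard PythonOperator; the DAG name, schedule, and task body are illustrative placeholders, not part of the actual application:

    # Hypothetical daily pipeline DAG (Airflow 2.x); all names are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_and_load():
        # Placeholder for a real pipeline step, e.g. pull from MariaDB,
        # transform in Python, and load into the warehouse.
        print("pipeline step ran")

    with DAG(
        dag_id="nightly_ingest",            # illustrative DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",         # run once per day
        catchup=False,                      # do not backfill missed runs
    ) as dag:
        PythonOperator(
            task_id="extract_and_load",
            python_callable=extract_and_load,
        )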

    Key Responsibilities:

    • 9+ years of experience developing data pipelines using Spark.
    • Ability to design, develop, and optimize Apache Spark applications for large-scale data processing.
    • Ability to implement efficient data transformation and manipulation logic using Spark RDDs and DataFrames (see the sketch after this list).
    • Manage server administration tasks, including monitoring, troubleshooting, and performance optimization. Administer and manage databases (MariaDB) to ensure data integrity and availability.
    • Ability to design, implement, and maintain Apache Kafka pipelines for real-time data streaming and event-driven architectures.
    • Strong development skills in Python, PySpark, Scala, and SQL/stored procedures.
    • Working knowledge of Unix/Linux tools such as awk, ssh, and crontab.
    • Ability to write Transact-SQL and to develop and debug stored procedures and user-defined functions in Python.
    • Working experience with Postgres and/or Redshift/Snowflake databases is required.
    • Exposure to CI/CD tooling such as Bitbucket, Jenkins, Ansible, Docker, and Kubernetes is preferred.
    • Solid understanding of relational database systems and their concepts.
    • Ability to handle large tables/datasets (2+ TB) in a columnar database environment.
    • Ability to integrate data pipelines with Splunk/Grafana for real-time monitoring and analysis, and with Power BI for visualization.
    • Ability to create and schedule Airflow jobs.
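
    A minimal sketch of the Spark DataFrame work referenced above, assuming a local PySpark session; the input path and column names ("events.parquet", "user_id", "amount") are illustrative, not taken from the actual pipelines:

    # Hypothetical PySpark aggregation; all paths and columns are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-aggregation").getOrCreate()

    events = spark.read.parquet("events.parquet")   # illustrative input

    # Typical DataFrame transformation: filter, group, and aggregate.
    totals = (
        events
        .filter(F.col("amount") > 0)
        .groupBy("user_id")
        .agg(F.sum("amount").alias("total_amount"))
    )

    totals.write.mode("overwrite").parquet("totals.parquet")
    spark.stop()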