Experience: 3–6 Years
Location: Noida
Employment Type: Full-Time
Work Mode: 5 days WFO
We are looking for a skilled Data Engineer with 3–6 years of experience to design, build, and maintain scalable data pipelines and data processing systems. The ideal candidate has strong experience in PySpark, SQL, AWS services, and workflow orchestration tools such as Airflow, along with exposure to big data technologies such as Hadoop.
Key Responsibilities
Design, develop, and maintain scalable data pipelines for processing large datasets.
Build and optimize ETL/ELT workflows using PySpark and SQL.
Develop and manage data workflows using Apache Airflow for scheduling and orchestration.
Work with AWS data services to build robust and scalable data platforms.
Integrate and process structured and unstructured data from multiple sources.
Perform data transformation, cleansing, and aggregation to support analytics and reporting.
Optimize data processing jobs for performance, reliability, and scalability.
Collaborate with data scientists, analysts, and engineering teams to support data requirements.
Ensure data quality, governance, and security across pipelines.
Skills and Qualifications
Strong programming experience in PySpark and Python.
Strong knowledge of SQL and database concepts.
Hands-on experience with AWS services such as S3, Glue, EMR, Redshift, Lambda, or EC2.
Experience building data pipelines and ETL workflows.
Experience with Apache Airflow for workflow orchestration.
Knowledge of the Hadoop ecosystem (HDFS, Hive, Spark).
Experience handling large-scale data processing and distributed systems.
Understanding of data modeling and data warehousing concepts.
Experience with Kafka or streaming data pipelines.
Experience with Docker or containerized environments.
Exposure to CI/CD pipelines and DevOps practices.
Experience with data lake architecture.
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
