Senior Data Engineer #24-00017

Location: 100% REMOTE in India

Responsibilities:

• Designing and implementing large-scale, distributed data processing systems using technologies such as Apache Hadoop, Apache Spark, or Apache Flink.
• Developing and optimizing data pipelines and workflows for ingesting, storing, processing, and analyzing large volumes of structured and unstructured data (see the pipeline sketch after this list).
• Collaborating with data scientists, data analysts, and other stakeholders to understand data requirements and translate them into technical solutions.
• Building and maintaining data infrastructure, including data lakes, data warehouses, and real-time streaming platforms.
• Designing and implementing data models and schemas for efficient data storage and retrieval.
• Ensuring the scalability, availability, and fault tolerance of big data systems through proper configuration, monitoring, and performance tuning.
• Identifying and evaluating new technologies, tools, and frameworks to improve the efficiency and effectiveness of big data processing.
• Implementing data security and privacy measures to protect sensitive information throughout the data lifecycle.
• Collaborating with cross-functional teams to integrate data from various sources, including structured databases, unstructured files, APIs, and streaming data.
• Developing and maintaining documentation, including data flow diagrams, system architecture, and technical specifications.
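For illustration only, below is a minimal sketch of the kind of batch pipeline described above, written in PySpark. The input and output paths and the column names (event_id, event_ts) are hypothetical placeholders, not part of any specific stack referenced in this posting.

# Minimal sketch of a batch ingestion pipeline in PySpark.
# All paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("events-batch-pipeline")   # hypothetical job name
    .getOrCreate()
)

# Ingest raw, semi-structured event data (hypothetical input path).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Clean and enrich: drop malformed rows, derive a partition column.
cleaned = (
    raw.dropna(subset=["event_id", "event_ts"])        # hypothetical columns
       .withColumn("event_date", F.to_date("event_ts"))
)

# Persist to columnar storage, partitioned for efficient retrieval.
(cleaned.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-bucket/curated/events/"))

spark.stop()

Partitioning the curated output by date is one common way to keep downstream queries efficient; the actual layout would depend on the access patterns of the team's data consumers.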

Requirements:

• Bachelor's or higher degree in Computer Science, Engineering, or a related field.
• Proven experience as a big data engineer or in a similar role, with a deep understanding of big data technologies, frameworks, and best practices.
• Strong programming skills in languages such as Java, Scala, or Python for developing big data solutions.
• Experience with big data processing frameworks such as Apache Hadoop, Apache Spark, or Apache Flink.
• Proficiency in SQL and NoSQL databases, as well as data modeling and database design principles.
• Familiarity with cloud platforms and services, such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
• Knowledge of distributed computing principles and technologies, such as HDFS, YARN, and containerization (e.g., Docker, Kubernetes).
• Understanding of real-time streaming technologies and frameworks, such as Apache Kafka or Apache Pulsar (a brief streaming sketch follows this list).
• Strong problem-solving skills and the ability to optimize and tune big data processing systems for performance and scalability.
• Excellent communication and teamwork skills to collaborate with cross-functional teams and stakeholders.
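For illustration only, below is a brief sketch of consuming a Kafka topic with Spark Structured Streaming. The broker address, topic name, and storage paths are hypothetical, and the spark-sql-kafka connector package is assumed to be available on the Spark classpath.

# Minimal sketch of reading a Kafka topic with Spark Structured Streaming.
# Broker, topic, and paths below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Subscribe to a Kafka topic (requires the spark-sql-kafka connector).
stream = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
         .option("subscribe", "events")                     # hypothetical topic
         .load()
)

# Kafka delivers key/value as binary; cast the value to a string for parsing downstream.
decoded = stream.select(F.col("value").cast("string").alias("payload"))

# Write the decoded stream out, with checkpointing for fault tolerance.
query = (
    decoded.writeStream
           .format("parquet")
           .option("path", "s3://example-bucket/stream/events/")      # hypothetical sink
           .option("checkpointLocation", "s3://example-bucket/chk/")  # hypothetical checkpoint
           .start()
)

query.awaitTermination()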
