Job Title: Data Engineer
Experience: 10+ years
Job Description:
We're looking for a highly skilled and motivated Cloud Data Engineer to design, build, and manage scalable data infrastructure and pipelines on cloud platforms. You'll be a key player in transforming raw data into valuable business insights, leveraging a wide array of cloud-native and open-source tools for data storage, processing, and analysis.
Responsibilities:
∙Design, develop, and maintain robust and scalable ETL/ELT processes using cloud data integration services.
∙Implement and manage data lakes and data warehouses for storing and analyzing large datasets.
∙Develop and optimize big data solutions for processing massive volumes of data using frameworks like Spark.
∙Build systems for real-time data streaming and analysis.
∙Work with various database systems, including relational and NoSQL databases.
∙Utilize workflow orchestration tools like Apache Airflow to automate complex data pipelines.
∙Ensure data security and governance by implementing strong identity, access, and encryption controls.
∙Monitor cloud resources and data workflows to ensure reliability, performance, and cost-efficiency.
∙Collaborate with data scientists and business intelligence teams to prepare data for analysis and reporting.
∙Use programming languages like Python (including PySpark) to script and automate data tasks.
∙Manage data migration projects between different environments.
∙Leverage containerization technologies for deploying data processing applications.
∙Work with BI tools to create and maintain data models and dashboards.
∙Handle networking configurations and connectivity within the cloud environment.
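To give a flavor of the day-to-day scripting this role involves, here is a minimal extract-transform-load sketch in plain Python (standard library only). In practice this logic would typically run on Spark/PySpark against cloud storage rather than in-memory strings; the data, field names, and functions below are hypothetical illustrations, not a prescribed implementation.

```python
import csv
import io
import json

# Hypothetical raw "extract" payload; in a real pipeline this would be
# pulled from cloud object storage or a source database.
RAW_CSV = """order_id,amount,currency
1001,25.50,usd
1002,10.00,eur
1003,7.25,usd
"""

def extract(raw: str) -> list[dict]:
    """Parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> list[dict]:
    """Normalize types and uppercase the currency code."""
    return [
        {
            "order_id": int(row["order_id"]),
            "amount": float(row["amount"]),
            "currency": row["currency"].upper(),
        }
        for row in rows
    ]

def load(rows: list[dict]) -> str:
    """Serialize to JSON lines, the shape a warehouse loader might ingest."""
    return "\n".join(json.dumps(row) for row in rows)

if __name__ == "__main__":
    print(load(transform(extract(RAW_CSV))))
```

The same extract/transform/load separation scales up directly: swap the in-memory string for a cloud storage read, and the list comprehension for a Spark DataFrame transformation.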
Requirements:
∙Proven experience as a Data Engineer or in a similar role.
∙Strong hands-on experience with core cloud computing services for compute, storage, and identity management.
∙Proficiency in designing and implementing ETL/ELT processes and data pipelines.
∙Expertise in Python and PySpark for data manipulation and analysis.
∙Experience with relational databases such as MySQL and PostgreSQL, as well as NoSQL and data warehouse solutions.
∙Familiarity with big data processing frameworks (e.g., Spark) and analytics services.
∙Knowledge of data orchestration tools like Apache Airflow.
∙Understanding of data security, encryption, and access management best practices.
∙Experience with infrastructure as code for managing cloud resources.
∙Solid problem-solving skills and the ability to work in a collaborative environment.
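As context for the orchestration requirement: the toy scheduler below is not Airflow, but it sketches the core idea Airflow formalizes, tasks declared with upstream dependencies and executed in dependency order. The pipeline and task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on,
# mirroring how an Airflow DAG declares upstream dependencies.
PIPELINE = {
    "extract_orders": set(),
    "extract_customers": set(),
    "transform_joined": {"extract_orders", "extract_customers"},
    "load_warehouse": {"transform_joined"},
    "refresh_dashboard": {"load_warehouse"},
}

def run_order(dag: dict[str, set[str]]) -> list[str]:
    """Return a valid execution order for the task graph. Airflow's
    scheduler performs the same resolution, adding retries, scheduling
    intervals, and parallel execution on top."""
    return list(TopologicalSorter(dag).static_order())

if __name__ == "__main__":
    for task in run_order(PIPELINE):
        print("running", task)
```

The dictionary-of-dependencies shape is the essential mental model for Airflow DAGs: downstream tasks never run until every upstream task has completed.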