About the Role
Responsibilities
• Create and maintain optimal data pipeline architecture; assemble large, complex data sets that meet functional / non-functional requirements.
• Design the right schema to support the functional requirements and consumption patterns.
• Design and build production data pipelines from ingestion to consumption.
• Create the necessary preprocessing and postprocessing for various forms of data for training/retraining and inference ingestion as required.
• Create data visualization and business intelligence tools that give stakeholders and data scientists the necessary business and solution insights.
• Identify, design, and implement internal process improvements: automating manual data processes, optimizing data delivery, etc.
• Ensure our data is separated and secure across national boundaries through multiple data centers.
Requirements and Skills
• You should have a bachelor's or master's degree in Computer Science, Information Technology, or another quantitative field.
• You should have at least 8 years of experience as a data engineer supporting large data transformation initiatives related to machine learning, including building and optimizing pipelines and data sets.
• Strong analytical skills related to working with unstructured datasets.
• Experience with Azure cloud services: ADF, ADLS, HDInsight, Databricks, App Insights, etc.
• Experience handling ETL workloads using Spark.
• Experience with object-oriented/functional scripting languages: Python, PySpark, etc.
• Experience with big data tools: Hadoop, Spark, Kafka, etc.
• Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
• You should be a good team player, committed to the success of the team and the overall project.
About the Company