Hadoop Big Data Administrator/Engineer
Capco
Warsaw, Poland
We are seeking a highly skilled and motivated Big Data Administrator/Engineer with 5+ years of experience in Hadoop administration and expertise in automation using tools such as Ansible, shell scripting, or Python. The ideal candidate will have strong DevOps skills and proficiency in coding, particularly in Python. This is a dynamic role focused on managing and engineering Big Data solutions across multiple open-source platforms such as Hadoop, Kafka, HBase, and Spark.
You will be responsible for performing critical Big Data administration, troubleshooting, and debugging, and for ensuring the seamless operation of various data processing frameworks. If you are a hands-on, results-driven individual with a passion for Big Data technologies, this is the role for you.
THINGS YOU WILL DO
Big Data Administration:
- Administer and manage Hadoop clusters, ensuring high availability, performance, and scalability.
- Maintain and troubleshoot Hadoop ecosystem components such as HDFS, MapReduce, YARN, and related tools.
- Ensure Kafka, HBase, and Spark systems are optimized and running smoothly.
- Implement monitoring and alerting for Big Data infrastructure.
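For illustration only, the minimal Python sketch below shows the kind of monitoring check this work involves: it polls the YARN ResourceManager REST API for cluster metrics and flags unhealthy or lost NodeManagers. The ResourceManager host, port, and thresholds are assumptions, and a real deployment would route the alert into a proper alerting channel rather than printing it.

    # Illustrative sketch only: poll YARN ResourceManager cluster metrics and
    # alert on unhealthy or lost NodeManagers. Host/port below are hypothetical.
    import json
    import sys
    import urllib.request

    RM_METRICS_URL = "http://resourcemanager.example.com:8088/ws/v1/cluster/metrics"

    def fetch_cluster_metrics(url):
        # The ResourceManager REST API returns {"clusterMetrics": {...}}.
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)["clusterMetrics"]

    def main():
        metrics = fetch_cluster_metrics(RM_METRICS_URL)
        problems = []
        if metrics.get("unhealthyNodes", 0) > 0:
            problems.append("%d unhealthy NodeManager(s)" % metrics["unhealthyNodes"])
        if metrics.get("lostNodes", 0) > 0:
            problems.append("%d lost NodeManager(s)" % metrics["lostNodes"])
        if problems:
            # In practice this would feed an alerting channel (email, pager, etc.).
            print("ALERT: " + "; ".join(problems), file=sys.stderr)
            return 1
        print("OK: %d active NodeManager(s)" % metrics.get("activeNodes", 0))
        return 0

    if __name__ == "__main__":
        sys.exit(main())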
Automation and Scripting:
- Develop automation scripts using tools such as Ansible, shell scripting, or Python to streamline Big Data administration tasks (a brief illustrative sketch follows this list).
- Create reusable automation frameworks to reduce manual efforts and improve operational efficiency.
- Work on CI/CD pipelines for deployment automation and system integration.
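As a purely illustrative sketch of this kind of automation task, the short Python script below wraps the standard hdfs dfsadmin -report command, parses the cluster-wide DFS usage, and warns when it crosses a threshold. The 80% threshold is an assumption and would be tuned to cluster policy.

    # Illustrative sketch only: wrap `hdfs dfsadmin -report`, parse overall
    # DFS usage, and warn above an assumed threshold.
    import re
    import subprocess
    import sys

    USAGE_THRESHOLD_PCT = 80.0  # assumed value; tune per cluster policy

    def dfs_used_percent():
        # The first "DFS Used%" line in the report is the cluster-wide summary.
        report = subprocess.run(
            ["hdfs", "dfsadmin", "-report"],
            capture_output=True, text=True, check=True,
        ).stdout
        match = re.search(r"DFS Used%:\s*([\d.]+)%", report)
        if not match:
            raise RuntimeError("Could not find 'DFS Used%' in dfsadmin report")
        return float(match.group(1))

    def main():
        used = dfs_used_percent()
        if used >= USAGE_THRESHOLD_PCT:
            print("WARNING: DFS usage at %.1f%% (threshold %.1f%%)"
                  % (used, USAGE_THRESHOLD_PCT), file=sys.stderr)
            return 1
        print("OK: DFS usage at %.1f%%" % used)
        return 0

    if __name__ == "__main__":
        sys.exit(main())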
DevOps Practices:
- Apply DevOps principles to the Big Data environment, focusing on continuous integration and continuous delivery (CI/CD).
- Build and manage automated deployment processes for Big Data clusters and services.
- Collaborate with development teams to integrate automation in Big Data workflows.
Troubleshooting and Debugging:
- Identify, troubleshoot, and resolve issues related to Big Data platforms, including system performance, resource utilization, and service failures.
- Work with logs, monitoring tools, and other debugging techniques to diagnose and resolve complex issues.
Collaboration and Support:
- Work closely with other teams to support data pipelines, data quality checks, and performance optimizations.
- Provide ongoing technical support to ensure that Big Data systems are stable, secure, and aligned with business objectives.
SKILLS & EXPERIENCES TO GET THE JOB DONE
- 5+ years of hands-on experience in Big Data administration, automation, and DevOps practices, including Hadoop ecosystem tools such as HDFS, YARN, and MapReduce.
- Proficient in managing and troubleshooting Kafka, HBase, and Spark.
- Experience with performance tuning, cluster optimization, and high-availability configurations.
- Strong experience with automation tools such as Ansible, shell scripting, and Python scripting.
- Ability to automate deployment, monitoring, and administration tasks to increase operational efficiency.
- Solid understanding of DevOps concepts and practices, including CI/CD, version control, and deployment pipelines.
- Hands-on experience with automation of infrastructure provisioning and management using tools like Terraform, Docker, and Kubernetes (optional but desirable).
- Proficiency in at least one programming language, with a preference for Python.
- Ability to write efficient, maintainable code for automation and integration tasks.
- Strong debugging skills and experience in resolving complex issues across distributed systems.
- Ability to think critically and provide practical solutions in high-pressure situations.
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
Nice to Have:
- Experience with cloud platforms (AWS, Azure, GCP) for Big Data solutions.
- Knowledge of containerization and orchestration tools (Docker, Kubernetes).
- Familiarity with data pipeline frameworks and tools like Apache NiFi, Airflow, or similar.
Don't forget to mention EuroTechJobs when applying.