
Senior Data Engineer
- Hyderabad, India
- Full time
- Competitive
- 28th March 2025
Full Description
Job Overview:
We are looking for a Senior Data Engineer with a deep understanding of Apache Spark (Scala & PySpark), Kafka Streams (Java), AWS services, Snowflake, Apache Iceberg, Tableau, and Data Lake architectures. As a senior member of our team, you will be responsible for leading the design, implementation, and optimization of large-scale data systems, real-time streaming solutions, and cloud-based data platforms. You will work with other engineers to deliver high-quality data solutions, mentor junior team members, and collaborate closely with cross-functional teams to solve complex business problems.
Key Responsibilities:
- Lead the design and development of scalable, high-performance data architectures on AWS, leveraging services such as S3, EMR, Glue, Redshift, Lambda, and Kinesis. Architect and manage Data Lakes for handling structured, semi-structured, and unstructured data.
- Design and build complex data pipelines using Apache Spark (Scala & PySpark), Kafka Streams (Java), and cloud-native technologies for batch and real-time data processing. Optimize these pipelines for high performance, scalability, and cost-effectiveness.
- Develop and optimize real-time data streaming applications using Kafka Streams in Java. Build reliable, low-latency streaming solutions to handle high-throughput data, ensuring smooth data flow from sources to sinks in real time.
- Manage Snowflake for cloud data warehousing, ensuring seamless data integration, optimization of queries, and advanced analytics. Implement Apache Iceberg in Data Lakes for managing large-scale datasets with ACID compliance, schema evolution, and versioning.
- Design and maintain highly scalable Data Lakes on AWS using S3, Glue, and Apache Iceberg. Ensure data is easily accessible, stored in optimal formats, and well-integrated with downstream analytics systems.
- Work with business stakeholders to create actionable insights using Tableau. Build data models and dashboards that drive key business decisions, ensuring that data is easily accessible and interpretable.
- Continuously monitor and optimize Spark jobs, Kafka Streams processing, and other cloud-based data systems for performance, scalability, and cost. Implement best practices for stream processing, batch processing, and cloud resource management.
- Lead and mentor junior engineers, fostering a culture of collaboration, continuous learning, and technical excellence. Ensure high-quality code delivery, adherence to best practices, and optimal use of resources.
- Work closely with Data Scientists, Product Managers, and DevOps teams to understand business needs and deliver impactful data solutions. Participate in technical discussions, from system design to data governance.
- Ensure that data pipelines, architectures, and systems are thoroughly documented and follow coding and design best practices. Promote knowledge-sharing across the team to maintain high standards for quality and scalability.
Required Skills & Qualifications:
Education:
Bachelor's or Master's degree in Computer Science, Engineering, or related field (or equivalent work experience).
Experience:
- 5+ years of experience in Data Engineering or a related field, with a proven track record of designing, implementing, and maintaining large-scale distributed data systems.
- Proficiency in Apache Spark (Scala & PySpark) for distributed data processing and real-time analytics.
- Hands-on experience with Kafka Streams using Java for real-time data streaming applications.
- Strong experience in Data Lake architectures on AWS, using services like S3, Glue, EMR, and data management platforms like Apache Iceberg.
- Proficiency in Snowflake for cloud-based data warehousing, data modeling, and query optimization.
- Expertise in SQL for querying relational and NoSQL databases, and experience with database design and optimization.
Technical Skills:
- Strong experience in building and maintaining ETL pipelines using Spark (Scala & PySpark).
- Proficiency in Java, particularly in the context of building and optimizing Kafka Streams applications for real-time data processing.
- Experience with AWS services (e.g., Lambda, Redshift, Athena, Glue, S3) and managing cloud infrastructure.
- Expertise with Apache Iceberg for handling large-scale, transactional data in Data Lakes, supporting versioning, schema evolution, and partitioning.
- Experience with Tableau for business intelligence, dashboard creation, and data visualization is a plus.
- Knowledge of CI/CD tools and practices, particularly in data engineering environments.
- Familiarity with containerization tools like Docker and Kubernetes for managing cloud-based services.
Soft Skills:
- Excellent problem-solving skills, with a strong ability to debug and optimize large-scale distributed systems.
- Strong communication skills to engage with both technical and non-technical stakeholders.
- Proven leadership ability, including mentoring and guiding junior engineers.
- A collaborative mindset and the ability to work across teams to deliver integrated solutions.
Preferred Qualifications:
- Experience with stream processing frameworks like Apache Flink or Apache Beam.
- Knowledge of machine learning workflows and integration of ML models in data pipelines.
- Familiarity with data governance, security, and compliance practices in cloud environments.
- Experience with DevOps practices and infrastructure automation tools such as Terraform or CloudFormation.
The organisation
- Fanatics
- Data & Technology
- New York, USA
- 2000+ employees
Relentlessly Enhancing the Fan Experience