Senior Java Spark Developer or Big Data with Spark and Java

Company: Culinovo

Location: Austin, TX

Posted on: April 23

Job Title: Senior Java Spark Developer or Big Data with Spark and Java

Location: Austin, TX or Sunnyvale, CA (Hybrid, 3 days onsite)

Work Mode: Hybrid

Duration: Long Term

Note: We require candidates on our W2. Candidates with independent work authorization (EAD, Green Card, US Citizen, etc.) may apply.

Experience Level: 9+ years

Job Summary

We are seeking a Senior Java Spark Developer with expertise in Java, Apache Spark, and the Cloudera Hadoop ecosystem to design and develop large-scale data processing applications. The ideal candidate will have strong hands-on experience in Java-based Spark development, distributed computing, and performance optimization for big data workloads.

Key Responsibilities

Java & Spark Development:

  • Develop, test, and deploy Java-based Apache Spark applications for large-scale data processing.
  • Optimize and fine-tune Spark jobs for performance, scalability, and reliability.
  • Implement Java-based microservices and APIs for data integration.

Big Data & Cloudera Ecosystem:

  • Work with Cloudera Hadoop components such as HDFS, Hive, Impala, HBase, Kafka, and Sqoop.
  • Design and implement high-performance data storage and retrieval solutions.
  • Troubleshoot and resolve performance bottlenecks in Spark and Cloudera platforms.

Collaboration & Data Engineering:

  • Collaborate with data scientists, business analysts, and developers to understand data requirements.
  • Implement data integrity, accuracy, and security best practices across all data processing tasks.
  • Work with Kafka, Flume, Oozie, and NiFi for real-time and batch data ingestion.

Software Development & Deployment:

  • Implement version control (Git) and CI/CD pipelines (Jenkins, GitLab) for Spark applications.
  • Deploy and maintain Spark applications in cloud or on-premises Cloudera environments.

Required Skills & Experience

  • 8+ years of experience in application development, with a strong background in Java and Big Data processing.
  • Strong hands-on experience in Java, Apache Spark, and Spark SQL for distributed data processing.
  • Proficiency in Cloudera Hadoop (CDH) components such as HDFS, Hive, Impala, HBase, Kafka, and Sqoop.
  • Experience building and optimizing ETL pipelines for large-scale data workloads.
  • Hands-on experience with SQL and NoSQL data stores such as HBase, Hive, and PostgreSQL.
  • Strong knowledge of data warehousing concepts, dimensional modeling, and data lakes.
  • Proven ability to troubleshoot and optimize Spark applications for high performance.
  • Familiarity with version control tools (Git, Bitbucket) and CI/CD pipelines (Jenkins, GitLab).
  • Exposure to data streaming and ingestion technologies such as Kafka and Flume, and workflow tools such as Oozie and NiFi.
  • Strong problem-solving skills, attention to detail, and ability to work in a fast-paced environment.