Overview:
About the Role
We are seeking a skilled Data Engineer to join the PDP team and contribute to building scalable, reliable data pipelines that power PayPal's payments ecosystem. You will work on high-volume data processing, real-time eventing, and analytics infrastructure that directly impacts PayPal's financial operations and business intelligence.
Responsibilities:
Key Responsibilities
- Design, develop, and maintain large-scale data pipelines processing millions of payment events daily
- Build and optimize Apache Spark jobs for batch and streaming data processing
- Develop complex SQL queries and transformations for data analysis and reporting
- Implement data models and schemas in GCP BigQuery for analytics and downstream consumption
- Ensure data quality, completeness, and correctness through validation and reconciliation frameworks
- Collaborate with cross-functional teams (Finance, Risk, Analytics) to understand data requirements and deliver solutions
- Troubleshoot and resolve data pipeline issues with minimal supervision
- Contribute to platform modernization and cloud migration initiatives
- Participate in code reviews, design discussions, and technical documentation
Requirements:
Required Qualifications (Must Have)
- Experience: 3+ years of experience as a Data Engineer or in a similar role
- Apache Spark: Proven hands-on experience with Spark for big data processing (Spark SQL, DataFrames, Datasets)
- SQL: Strong ability to write complex SQL queries for data manipulation, transformation, and analysis
- Programming: Expertise in Scala (highly preferred) or Python
- Cloud Data Stores: Experience with Google Cloud Platform (GCP), particularly BigQuery for data warehousing and analytics
- Problem Solving: Excellent analytical and problem-solving skills, with the ability to work independently with minimal support
Preferred Qualifications (Nice to Have)
- GCP certification (e.g., Professional Data Engineer)
- Familiarity with distributed systems concepts and architecture
- Experience with big data processing tools and techniques (Kafka, Pub/Sub, Dataflow)
- Experience with real-time streaming data pipelines
- Knowledge of data modeling and schema design best practices
- Exposure to AI/ML concepts and applications
- Experience in the payments, fintech, or financial services domain