Position Description
The Cloud Data Engineer is part of a Data team responsible for supporting, modernizing, and transforming our data and reporting capabilities across our products by implementing a new, modern data architecture. The position is responsible for day-to-day data collection, transportation, maintenance/curation, and access to GOVX corporate data assets. The Cloud Data Engineer will work cross-functionally across the enterprise to centralize data and standardize it for use by business reporting, machine learning, AI, data science, and other stakeholders. This position plays a critical role in increasing awareness of available data and democratizing access to it across GOVX and our data partners.
This position will be under the supervision of the Business Intelligence Manager.
RESPONSIBILITIES
- Supporting and modernizing existing data integrations.
- Crafting and maintaining efficient, optimal data pipeline/flow architecture.
- Assembling large, complex data sets that meet business requirements.
- Identifying, crafting, and implementing internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Partnering with business analysts, data scientists, and IT teams to translate business requirements into scalable data solutions using Microsoft Fabric, Spark, and Delta Lake; developing Spark notebooks for both batch and streaming ETL pipelines leveraging Delta Lake.
- Implementing and optimizing Delta Lake features, including schema enforcement, schema evolution, and time travel, for robust data management; optimizing Delta Lake tables for performance using Z-ordering, compaction, and partitioning strategies.
- Working with the team to strive for clean, meaningful data and greater functionality and flexibility within the team's data systems.
- Designing processes supporting data transformation, data structures, metadata, dependency, and workload management.
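To illustrate the kind of metadata-driven transformation design the responsibilities above describe, here is a minimal sketch in plain Python: a JSON metadata document declares the pipeline steps, and a small dispatcher applies them to records. The pipeline name, operations, and field names (`orders_daily`, `rename`, `filter`, `cust`) are illustrative assumptions for this example, not part of any GOVX system.

```python
import json

# Hypothetical metadata document: each entry declares one pipeline step.
PIPELINE_CONFIG = """
{
  "pipeline": "orders_daily",
  "steps": [
    {"op": "rename", "mapping": {"cust": "customer_id"}},
    {"op": "filter", "column": "amount", "minimum": 0}
  ]
}
"""

def rename(rows, mapping):
    """Rename columns according to the metadata mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]

def filter_min(rows, column, minimum):
    """Drop rows where `column` falls below the configured minimum."""
    return [r for r in rows if r[column] >= minimum]

# Dispatch table: metadata op name -> implementation.
OPS = {"rename": rename, "filter": filter_min}

def run_pipeline(rows, config_text):
    """Apply each step named in the metadata, in order."""
    config = json.loads(config_text)
    for step in config["steps"]:
        op = step.pop("op")
        rows = OPS[op](rows, **step)
    return rows

rows = [{"cust": "a1", "amount": 10}, {"cust": "b2", "amount": -5}]
print(run_pipeline(rows, PIPELINE_CONFIG))
# -> [{'customer_id': 'a1', 'amount': 10}]
```

The design point is that new transformations can be added by editing metadata rather than pipeline code; in practice the same pattern would drive Spark notebook steps rather than in-memory lists.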
Requirements
SKILLS/REQUIREMENTS
- Hands-on experience developing, debugging, and optimizing Spark notebooks for ETL and analytics in Microsoft Fabric and Azure.
- Deep expertise in Microsoft Fabric, Dataflows Gen2, and Power BI integration.
- Hands-on experience with Delta Lake table management, including schema evolution, versioning, and data compaction.
- Experience with Data Lakehouse and Medallion Architecture.
- Experience with CI/CD and version control using Git.
- Advanced SQL and NoSQL query authoring; Python and Spark scripting.
- Proficiency with object-oriented/functional scripting languages: Python, PySpark, etc.
- Proficiency with metadata-driven design and JSON.
- Experience working with streaming platforms such as Event Hubs and with event-driven architectures.
- Experience building, maintaining, and optimizing 'big data' pipelines, architectures, and data sets.
- Experience cleaning, testing, and evaluating data quality from a wide variety of ingested data sources.
- Knowledge of Microsoft Power Platform, including Copilot Studio and Power Apps.
- Strong collaboration and communication skills with business and technical teams.
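As a concrete illustration of the data-quality evaluation skill listed above, here is a minimal sketch in plain Python that summarizes common issues (missing fields, nulls, wrong types) in a batch of records. The field names and rules (`order_id`, `amount`) are illustrative assumptions, not a GOVX schema; a production version would run as a pipeline validation step.

```python
# Minimal data-quality check: count missing fields, nulls, and type
# mismatches across a batch of records. Schema is passed as a mapping
# of field name -> expected Python type.

def check_quality(rows, required_fields):
    """Return a summary of quality issues found in a batch of records."""
    issues = {"missing_field": 0, "null_value": 0, "bad_type": 0}
    for row in rows:
        for field, expected_type in required_fields.items():
            if field not in row:
                issues["missing_field"] += 1
            elif row[field] is None:
                issues["null_value"] += 1
            elif not isinstance(row[field], expected_type):
                issues["bad_type"] += 1
    return issues

batch = [
    {"order_id": "A1", "amount": 25.0},   # clean record
    {"order_id": "A2", "amount": None},   # null value
    {"amount": "12.5"},                   # missing field, wrong type
]
print(check_quality(batch, {"order_id": str, "amount": float}))
# -> {'missing_field': 1, 'null_value': 1, 'bad_type': 1}
```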
SUPERVISORY RESPONSIBILITY
This position has no supervisory responsibilities.
WORK ENVIRONMENT
This job operates in a professional office environment. This role routinely uses standard office equipment such as computers, phones, photocopiers, filing cabinets, and fax machines, and occasionally requires lifting and carrying office equipment. Occasional evening, night, and weekend shifts are required.
Work Location
Due to state law and tax implications, remote work candidates must live and work in one of the following states: California, Texas, Washington, Florida, Tennessee, New York, or Colorado.
PHYSICAL/MENTAL DEMANDS
- Physical - This is a sedentary role.
- Mental - Problem solving, making decisions, interpreting data, organizing, and reading/writing.
- Reasonable accommodation may be made to enable individuals with disabilities to perform the essential functions.
TRAVEL
Annual travel to the San Diego office headquarters is expected for this position.
Preferred Education And Experience
- Bachelor's degree or equivalent experience.
- 7+ years of proven experience deploying and maintaining always-on data services.
- 2+ years building and maintaining Spark notebook-based pipelines in Microsoft Fabric.
- 2+ years of experience working with Microsoft Fabric.
- 2+ years of experience working with Microsoft Azure.
- 2+ years of experience with Delta Lake.
- 1+ years of experience with SQL Server.
Benefits
- Paid Time Off, Paid Sick Leave, Paid Holidays
- Competitive Medical, Dental, Vision, Short Term Disability, and Life Insurance
- 401(k) plan with discretionary match available
- Flexible Spending Account (FSA), Health Savings Account (HSA)
- Voluntary benefits including Critical Illness, Group Accident, and Voluntary Life
- Employee Referral Program
- Exposure to a growing ecommerce company
- Discounts on the GOVX website
Salary Range
$120,000 - $140,000 Annually
AAP/EEO Statement
EOE. Veterans/Disabled. Reasonable accommodation may be made to enable individuals with disabilities to perform the essential functions.
Position will require successful completion of a background check and drug testing prior to starting employment.
About GOVX, Inc.
Savings for Those Who Serve
GOVX was founded in 2011 to offer exclusive benefits to those who serve our country. The GOVX membership is comprised of current and former members of the United States military, law enforcement, firefighting, medical services, and government personnel. We are dedicated to supporting these communities and to offering unique value to our members, while delivering an authentic platform for brands to reach our growing customer base. As the largest and fastest growing digital platform serving this deserving audience, we are committed to stretching the limits of ecommerce to deliver the best assortment for our members' on-duty and off-duty needs.