We are seeking a highly skilled and motivated Full Stack Python Developer to join our team. The ideal candidate will be flexible, thrive in a demanding startup environment, and report directly to the Lead Engineer. This is a work-from-office position at our office in Bangalore.
Key Responsibilities
- Design and develop web scraping frameworks to extract structured and unstructured data from a variety of sources (government, regulatory, financial, and open-web).
- Build and maintain ETL pipelines for large-scale ingestion, transformation, and loading of data into data warehouses or document stores.
- Write efficient, scalable, and resilient Python code, including retry logic, job orchestration, and error handling (a retry sketch follows this list).
- Work with Celery / Airflow / Dagster (or equivalent) to automate and schedule scraping and ETL jobs.
- Implement anti-bot bypass mechanisms including CAPTCHA-solving integrations, headless browsers (e.g., Playwright), and session management (see the Playwright sketch after this list).
- Ensure data quality, deduplication, and validation at every stage of the pipeline.
- Monitor and optimize scraping & ETL performance for speed, cost, and reliability.
- Collaborate with DevOps and Infra teams to deploy and scale workloads on AWS (ECS/Lambda/Batch) or similar cloud environments.
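To give candidates a concrete flavor of the day-to-day work, here is a minimal sketch of the retry logic mentioned above, assuming a simple requests-based fetch. The URL, attempt count, and backoff values are illustrative assumptions, not part of our stack.

```python
import time

import requests


def fetch_with_retry(url: str, attempts: int = 3, backoff: float = 2.0) -> str:
    """Fetch a URL, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()  # treat 4xx/5xx responses as failures
            return response.text
        except requests.RequestException as exc:
            if attempt == attempts:
                raise  # out of retries: surface the error to the orchestrator
            sleep_for = backoff ** attempt  # 2s, 4s, 8s, ...
            print(f"Attempt {attempt} failed ({exc}); retrying in {sleep_for:.0f}s")
            time.sleep(sleep_for)


# Hypothetical usage; example.com stands in for a real data source.
html = fetch_with_retry("https://example.com/registry")
```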
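Similarly, a bare-bones Playwright example of the headless-browser work referenced above; the target URL is a placeholder, and no site-specific anti-bot handling is shown.

```python
from playwright.sync_api import sync_playwright

# Minimal headless-browser fetch; persisting cookies between runs is one
# simple form of the session management mentioned in the responsibilities.
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()  # isolated session (cookies, storage)
    page = context.new_page()
    page.goto("https://example.com/filings")  # placeholder URL
    html = page.content()  # fully rendered DOM, after JavaScript runs
    cookies = context.cookies()  # could be saved and reused across runs
    browser.close()

print(len(html), "bytes of rendered HTML;", len(cookies), "session cookies")
```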
Must Have
- 3+ years of hands-on Python experience with a strong focus on data engineering.
- Proven experience with web scraping frameworks (Playwright, Requests, BeautifulSoup, Selenium, Scrapy).
- Strong understanding of ETL design patterns and best practices.
- Experience with task orchestration tools (Celery, Airflow, Dagster); a minimal Celery sketch follows this list.
- Proficiency with databases and data stores (MySQL, PostgreSQL, MongoDB, S3, or Data Lakes).
- Working knowledge of Docker and containerized deployments.
- Familiarity with cloud environments (AWS preferred).
- Strong debugging, logging, and monitoring skills.
- Proficiency with Git for source code management.
- Continuous deployment: experience writing new application features so they can be deployed with zero downtime.
- Strong ability to articulate architectures and problem statements.
- Experience with Kubernetes deployment.
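For orientation, a minimal Celery sketch of the kind of task orchestration listed above. The broker URL, schedule, and task body are assumptions for illustration only.

```python
from celery import Celery
from celery.schedules import crontab

# Assumed local Redis broker; any broker Celery supports works the same way.
app = Celery("scrapers", broker="redis://localhost:6379/0")


@app.task(bind=True, max_retries=3, default_retry_delay=60)
def scrape_source(self, source_url: str) -> None:
    """Scrape one source; Celery re-queues the task on failure."""
    try:
        ...  # fetch, parse, and load the source here
    except Exception as exc:
        raise self.retry(exc=exc)


# Run the scrape nightly at 02:00 (illustrative schedule).
app.conf.beat_schedule = {
    "nightly-scrape": {
        # The task path depends on the module layout in a real project.
        "task": "scrapers.scrape_source",
        "schedule": crontab(hour=2, minute=0),
        "args": ("https://example.com/registry",),
    }
}
```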
Good to Have
- Experience with OCR tools (e.g., EasyOCR, Tesseract) for scraping complex sources.
- Knowledge of distributed scraping or proxy rotation strategies (see the sketch after this list).
- Familiarity with data modeling and schema evolution.
- Exposure to message queues (Redis, RabbitMQ, Kafka).
- Experience with API integrations and data ingestion from third-party services.
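As a small illustration of the proxy-rotation item above, a sketch that cycles requests through a pool; the proxy addresses are placeholders, and real pools would come from a provider or an internal fleet.

```python
import itertools

import requests

# Placeholder proxy pool for illustration only.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]
proxy_cycle = itertools.cycle(PROXIES)


def fetch_via_proxy(url: str) -> str:
    """Fetch a URL through the next proxy in the rotation."""
    proxy = next(proxy_cycle)
    response = requests.get(
        url,
        proxies={"http": proxy, "https": proxy},  # route both schemes
        timeout=10,
    )
    response.raise_for_status()
    return response.text
```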
Plus Points
- Prior experience working in a fast-paced startup environment.
- Domain knowledge in the financial/data space.
- Relevant external certifications.
- Hands-on knowledge of deployment and monitoring tools.
The Process
- Shortlisted candidates will attend a scheduled coding challenge at our office.
- Those who qualify will go through a second round of interviews.
- Suitable candidates will receive an offer as per company policy.
Key Skills
Pandas, Django Framework, NumPy, Python, Docker, Python Development, Kubernetes Deployment, Web Scraping, SQL, Kubernetes, Flask