Build data applications with high accuracy and performance across traditional and distributed computing platforms.
 
- Design, build, and maintain high-performance, reusable, and reliable code, ensuring quality and features are delivered efficiently and on time. Document everything.
- Develop database processes and gather and process raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries in MySQL, and handling cloud data).
- Administer data processing workflows with tools such as MySQL, Oozie, ZooKeeper, Sqoop, Hive, and Impala across the distributed platform.
- Work closely with our engineering team to integrate your innovations and algorithms into our production systems.
- Support business decisions with ad hoc analysis as needed; troubleshoot production issues and identify practical solutions.
- Perform routine check-ups, backups, and monitoring of the entire MySQL and Hadoop ecosystem.
- Take end-to-end responsibility for the traditional database (MySQL) and big data ETL, analysis, and processing life cycle in the organization, and manage deployments of big data clusters across private and public cloud platforms.
- 4+ years of experience with SQL (MySQL) is a must.
- 2+ years of hands-on experience with the Cloudera Hadoop distribution platform and Apache Spark.
 
- Strong understanding of the full development life cycle for backend database applications across RDBMS and distributed cloud platforms.
- Experience as a database developer: writing SQL queries and DDL/DML statements, managing databases, writing stored procedures, triggers, and functions, with knowledge of database internals.
- Knowledge of database administration, performance tuning, replication, backup, and data restoration.
- Comprehensive knowledge of Hadoop architecture and HDFS to design, develop, document, and architect Hadoop applications. Working knowledge of SQL, NoSQL, data warehousing, and database administration, along with MapReduce, Hive, Impala, Kafka, HBase, Pig, and Java.
- Experience processing large amounts of structured and unstructured data, and extracting and transforming data from remote data stores such as relational databases or distributed file systems.
- Working expertise with Apache Spark, Spark Streaming, Jupyter Notebook, and Python or Scala programming.
- Excellent communication skills, with the ability to tailor technical information for different audiences. Excellent teamwork skills: the ability to self-start, share insights, ask questions, and report progress.
- Working knowledge of general database architectures, trends, and emerging technologies. Familiarity with caching, partitioning, storage engines, query performance tuning, indexes, and distributed computing frameworks.
- Working knowledge and understanding of data analytics or BI tools such as Looker Studio, Power BI, or any other BI tool is a must.
About Affinity:
Affinity is an ad-tech company that creates user engagement products (branding and performance) for digital media. It is in the business of creating sustainable and scalable advertising/media products with special attention to user experience. Established in 2006, Affinity is a 400+ employee company that operates 7 business units, namely mCanvas, Siteplug, VEVE, AdopsOne, Yield Solutions, Nucleus-Links, and Affinity Germany. For more information, visit www.affinity.com.