Data Science - Job Description
Key Responsibilities
- Work with data from multiple internal and external sources and APIs, including Hadoop clusters
- Work with both structured and unstructured data
- Wrangle data: clean it, handle missing values, and standardize/scale features
- Explore and analyze data using visualizations to uncover hidden patterns
- Build models using ML techniques including linear/logistic regression, decision trees, classification, clustering, ensembles, text mining, and social network analysis
- Use these models to identify patterns and rules that surface business insights and support decision making
Education & Experience
- Bachelor's, Master's, or PhD in Data Science, Computer Science, Statistics, Mathematics, or a related quantitative field
- 3+ years of hands-on experience with NLP, recommendation systems, or generative AI
Technical Skills
- Programming: Expert-level Python; proficiency in SQL or Scala
- ML Frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face Transformers
- NLP Libraries: spaCy, NLTK, Gensim, transformers
- Big Data: Spark, Hadoop, or similar distributed computing frameworks
- Cloud Platforms: AWS/GCP/Azure ML services and infrastructure
- MLOps: Experience with model deployment, monitoring, and CI/CD pipelines
- Databases: SQL and NoSQL databases, vector databases (OpenSearch)
- Deep understanding of transformer architectures and attention mechanisms
- Experience with recommendation system architectures (matrix factorization, deep learning approaches)
- Knowledge of information retrieval and search relevance
- Familiarity with A/B testing and statistical significance testing
- Understanding of ML interpretability and fairness considerations