Position Summary:
We are seeking an experienced Machine Learning Engineer / Data Scientist specializing in industrial time-series analytics to develop and deploy advanced AI solutions using OSIsoft PI System and SQL Server data. The role involves building scalable ETL pipelines, engineering features from high-frequency sensor data, and developing predictive models for anomaly detection, predictive maintenance, and process optimization. The ideal candidate will have strong expertise in machine learning, time-series modeling, SQL Server integration, and MLOps practices, with the ability to operationalize models in production environments for real-time industrial applications.
Key Responsibilities:
1. Industrial Time-Series Data Engineering & Integration
- Design and implement robust ETL/ELT pipelines that extract high-volume, high-velocity data from OSIsoft PI Tags/Events using PI AF, ODBC, etc.
- Perform complex feature engineering on time-series data, including handling irregular sampling intervals, sensor gaps, outliers, and noise filtering.
- Synchronize PI System data with structured relational data in SQL Server to create rich training datasets.
- Optimize data retrieval strategies from PI System to ensure low-latency access for model training and real-time inference.
2. Machine Learning Model Development
- Develop, train, and validate machine learning models specifically for industrial time-series problems, such as:
  - Predictive Maintenance (remaining useful life, fault detection).
  - Anomaly Detection in sensor streams.
  - Process Parameter Optimization and Yield Prediction.
- Apply advanced statistical methods and ML algorithms (ARIMA, LSTM, XGBoost, Random Forest, Isolation Forests).
- Apply feature selection and dimensionality reduction techniques tailored to temporal dependencies.
3. SQL Server Integration & Deployment
- Write efficient T-SQL queries and stored procedures to aggregate, summarize, and join PI data with SQL Server tables.
- Deploy models into production environments, potentially within SQL Server itself or via REST APIs integrated with SQL backends.
- Ensure seamless data flow between the PI System (historical/time-series) and SQL Server (transactional/relational) for model retraining pipelines.
4. MLOps & Operationalization
- Implement MLOps best practices for versioning time-series datasets and models.
- Monitor model performance and data drift, particularly accounting for changes in sensor behavior or process conditions.
Qualifications & Requirements:
- Academic: Bachelor's degree in Computer Science or a related field.
- Experience: Minimum of 7 years of experience.