Shrey Jaradi

Data Scientist & Engineer

Transforming data into actionable insights through innovative solutions

Work Experience

Data Scientist

Global Action Alliance, Chicago, USA

Feb 2025 - Present

  • Developed NLP models to analyze 100K+ biomedical records, improving document classification accuracy by 22%
  • Designed and implemented multi-instance data processing on AWS EC2 for large-scale dataset extraction and cleaning, reducing processing time by 18%

Data Science Intern

Pure Data Consulting Inc, Chicago, USA

Sept 2024 - Feb 2025

  • Developed a Retrieval-Augmented Generation (RAG) pipeline using pre-trained Large Language Models (LLMs), improving response accuracy for knowledge-based systems and reducing query resolution time by 16%
  • Automated data preprocessing workflows using Python and SQL, improving data readiness for machine learning models
  • Delivered actionable insights through statistical analyses and data visualizations using Power BI, including A/B testing of new model features

Data Engineer

Accenture Solutions Private Limited (Client: Microsoft), India

June 2021 - Aug 2022

  • Built and optimized scalable data pipelines in Azure Data Factory and Synapse to enable efficient real-time data processing and predictive analytics
  • Reduced decision-making time by 8% for leadership teams by creating 5+ interactive Power BI dashboards
  • Boosted reporting efficiency by 12% through SQL optimization and 4 Azure Analysis Services tabular models

Data Engineer

Softtek India Private Limited, India

April 2019 - June 2021

  • Reduced manual ticketing workload by 9% by developing a ServiceNow automation app using Google Dialogflow APIs
  • Developed a predictive classification model using Python, NLP, and FastText to streamline IT support operations
  • Automated internal processes and cut manual effort by 8% for Nova University by leading a team to design an account termination solution

Projects

Cryptocurrency Predictive Analysis

Designed a predictive analytics pipeline leveraging Random Forest, CatBoost, and HuggingFace BERT models to achieve 67% accuracy, enabling data-driven investment insights with optimized model deployment workflows.

Python, Machine Learning, BERT, Random Forest, CatBoost

NLP Document Classification

Developed advanced NLP models for biomedical document classification, achieving a 22% improvement in accuracy through innovative preprocessing and model architecture.

NLP, Python, Deep Learning, AWS

RAG Pipeline Development

Implemented a Retrieval-Augmented Generation pipeline using pre-trained LLMs, reducing query resolution time by 16% while improving response accuracy.

LLM, RAG, Python, NLP

Skills

Programming

Python, R, SQL, Power BI

Machine Learning

Predictive Modelling, NLP, CNN, PyTorch, TensorFlow, Model Deployment, RAG, LangChain

Cloud & Big Data

Azure, ADF, Synapse, Data Explorer, Data Lake, SQL Server, MongoDB

Others & Methodology

Git, Agile, Databricks, REST APIs, Workflow Automation, DevOps, CI/CD

Get in Touch

Feel free to reach out for opportunities or collaborations