Ishita 

Driven to make AI safe, reliable, and accessible to all.

Overview

Founding Engineer and AI Lead @Storefox.ai

AI Researcher @Hashkraft

Computer Vision Research Intern @Coriolis

Research Intern @DESY

Hyderabad, India

Social Links

About

I'm an AI engineer with 3+ years of industry experience building robust, scalable, and intelligent systems at the intersection of machine learning, software engineering, and AI safety. My work spans the development of production-ready ML pipelines, large language model (LLM) applications, and scalable cloud infrastructure.

I currently work at Storefox.ai, where I lead the development of audio intelligence systems that convert raw retail conversations into actionable business insights through custom-built pipelines, robust LLM evaluations, and strong engineering practices. My contributions have not only driven measurable improvements in accuracy and efficiency, but also shaped product direction through client collaboration and team leadership.

Previously, I built LLM-driven hiring platforms and chatbots for the real estate sector at Hashkraft, developed face recognition systems at scale with Coriolis Technologies, and contributed to scientific ML research at institutes like CERN, DESY, and IISER Pune. I’ve worked across diverse domains—computer vision, NLP, anomaly detection, and particle physics—always focusing on applying AI in meaningful, interpretable, and responsible ways.

I'm also passionate about AI safety and alignment, having participated in the Whitebox Research AI Safety Fellowship, where I explored concepts in interpretability and worked on evaluating the unfaithfulness of chain-of-thought reasoning in LLMs.

My tech stack includes tools like Python, PyTorch, FastAPI, Docker, Kubernetes, and AWS, along with modern AI-focused platforms such as Portkey, Instructor, Claude, and ChatGPT. Whether it’s deploying deep learning models at scale, building RAG pipelines with vector DBs, or improving system reliability with tools like Ansible and Kafka, I enjoy working across the full stack to turn research into impact.

Let's connect and collaborate!

Stack

Experience

Storefox.ai

Current Employer
  • Led the end-to-end design and implementation of a scalable audio processing pipeline for analyzing customer-representative interactions in retail environments, delivering actionable insights to drive business performance in physical stores.
  • Developed a robust system where raw audio is captured and cleaned using Silero Voice Activity Detection (VAD) to remove silent segments, followed by transcription via Gemini LLM models optimized for noisy retail settings.
  • Designed a two-tiered processing architecture: internal clips were filtered and categorized using LLMs, while external clips were further analyzed to generate structured insights tailored to client needs.
  • Instrumental in the development, testing, and validation of each pipeline component—built custom evaluation datasets and benchmarks to compare model performance, prompts, and filtering strategies.
  • Achieved a consistent 80% accuracy across all insight-generation tasks, earning positive feedback from over 15 active clients.
  • Reduced audio processing time by 50% (from 3 minutes to 1.5 minutes) and cut cost per hour of audio from $100 to $20, enabling the introduction of multiple pricing tiers for scalability.
  • Acted as the technical lead and manager for 4 interns, guiding their workstreams and significantly improving team productivity through structured task delegation and mentoring.
  • Worked closely with clients to understand domain-specific needs and customized the pipeline accordingly, resulting in high client satisfaction and successful adoption across diverse use cases.
  • Collaborated cross-functionally with product, frontend, and backend engineering teams to align development with business goals, effectively multitasking across responsibilities and delivering high-impact results.
  • Azure Cloud
  • GCP
  • Speech-to-Text
  • AWS
  • AWS SageMaker
  • Prompt Engineering
  • LLMs
  • GPT/Gemini models
  • LLM evaluations
  • Instructor
  • MongoDB
  • Jinja2
  • Docker
  • Redis
  • Celery
  • Python
  • AI/ML
  • API design
  • Portkey
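The silence-removal step described above uses Silero VAD, a neural voice-activity model. As a rough illustration of the general idea only (not Storefox's actual code, which is not shown here), a minimal energy-based VAD can be sketched in pure Python; frame size, threshold, and the sample data are all hypothetical:

```python
# Illustrative energy-based voice-activity detection. The production
# pipeline uses Silero VAD (a neural model); this sketch only shows
# the underlying idea of dropping low-energy (silent) frames.

def frame_energies(samples, frame_size):
    """Mean squared amplitude per fixed-size frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def keep_voiced(samples, frame_size=4, threshold=0.01):
    """Keep only frames whose energy clears the threshold."""
    voiced = []
    for idx, energy in enumerate(frame_energies(samples, frame_size)):
        if energy >= threshold:
            voiced.extend(samples[idx * frame_size:(idx + 1) * frame_size])
    return voiced

# Near-silence followed by "speech" (larger amplitudes).
audio = [0.0, 0.001, -0.001, 0.0, 0.5, -0.4, 0.6, -0.5]
cleaned = keep_voiced(audio)
print(len(audio), "->", len(cleaned))  # 8 -> 4
```

A real VAD classifies frames with a trained model rather than a fixed threshold, which is what makes it robust in noisy retail settings.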

Hashkraft

  • Developed a scalable AI-driven hiring platform that streamlined candidate evaluation, matching, and communication, with an estimated 50% reduction in hiring time through automation and intelligent retrieval.
  • Leveraged cutting-edge LLMs (GPT-4) alongside Retrieval-Augmented Generation (RAG) pipelines using LangChain, integrating with Chroma vector database to enable real-time semantic candidate-job matching.
  • Conducted detailed benchmarking and evaluations of different LLM configurations, embeddings, and RAG architectures to optimize cost, latency, and accuracy.
  • Led the design and deployment of an AI-based real estate chatbot, allowing users to search, query, and retrieve property listings using natural language.
  • Utilized GPT-4 for conversational context handling and Elasticsearch for fast and accurate property data retrieval across diverse queries.
  • Designed the entire system architecture, ensuring efficient state management, query parsing, and document retrieval, resulting in highly responsive and accurate user interactions.
  • Deployed the chatbot to production using AWS services (EC2, S3, Lambda), ensuring scalability, security, and uptime.
  • Managed and mentored a team of two interns, overseeing their project pipelines, providing code reviews, and ensuring timely deliverables aligned with business goals.
  • Set up project tracking, documentation workflows, and weekly syncs, improving team collaboration and technical output.
  • Drove cross-functional collaboration between backend, frontend, and product teams to ensure seamless integration and user experience.
  • LLMs
  • GPT/Gemini models
  • RAG
  • Langchain
  • Pinecone
  • Chroma
  • NLP
  • AWS
  • Elasticsearch
  • Docker
  • Scrapy
  • Jira
  • GitHub
  • WhatsApp API
  • API design
  • Prompt Engineering
  • Python
  • Machine Learning
  • AI/ML
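The semantic candidate-job matching above ran on embeddings stored in a Chroma vector database. As a hedged stand-in for a real embedding model, the core retrieval idea can be sketched with bag-of-words vectors and cosine similarity; the function names and sample texts are hypothetical:

```python
# Sketch of vector-similarity matching. The production system embedded
# candidates and jobs with an LLM embedding model and queried Chroma;
# here a simple token-count vector stands in for real embeddings.
import math
from collections import Counter

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_match(job_description, candidates):
    jv = vectorize(job_description)
    return max(candidates, key=lambda c: cosine(jv, vectorize(c)))

candidates = [
    "python engineer with ML pipeline experience",
    "frontend developer react typescript",
]
print(best_match("machine learning engineer python", candidates))
```

Swapping the token-count vectors for dense embeddings and the `max` scan for an indexed nearest-neighbour query is what makes this scale to real candidate pools.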

Coriolis Technologies

  • Developed AI-based surveillance software with a team of 5-6 people.
  • Deployed face recognition algorithms at scale on Spark, achieving 75% accuracy.
  • Optimised object detection on Spark, increasing processing speed from 16 to 30 frames/s.
  • Improved UI loading time, reducing it from 1-2 minutes to 15 seconds.
  • Developed a scalable and fault-tolerant system that can process 60 million images per day by deploying it on Kubernetes.
  • Automated the entire process of setting up our Kubernetes cluster using Ansible, reducing the deployment time from 3 hours to 15 minutes.
  • Managed a team of 7 full time interns working on 4 different projects, overseeing their onboarding and mentoring, increasing the productivity of the entire team by 13%.
  • Computer Vision
  • Face Recognition
  • Object Detection
  • Apache Spark
  • PySpark
  • Docker
  • Ansible
  • Apache Kafka
  • Elasticsearch
  • OpenSearch
  • OpenCV
  • PyTorch
  • GitHub
  • Kubernetes
  • DevOps
  • Team Management
  • Python
  • Machine Learning
  • AI/ML
  • System Optimization
  • UI Development

Conseil européen pour la Recherche Nucléaire (CERN)

  • Contributed to a joint project between CERN and IISER Pune, focused on synthetic data generation for the Higgs boson decay process to enable more robust statistical analysis in particle physics experiments.
  • Implemented the RealNVP normalizing flow model to generate synthetic data points from latent space, matching the original distribution of Higgs decay data from particle physics experiments.
  • Statistical analysis of the synthetically generated data closely matched experimental values, confirming the viability of RealNVP models for data augmentation in scientific research.
  • Particle Physics
  • Synthetic Data Generation
  • RealNVP
  • Normalizing Flows
  • Generative Modeling
  • PyTorch
  • MADGraph
  • Statistical Analysis
  • Deep Learning
  • Python
  • Scientific Research
  • Data Augmentation
  • Machine Learning
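RealNVP builds an invertible map out of affine coupling layers: half the input passes through unchanged and conditions a scale and shift applied to the other half, so the transform can be inverted exactly. A minimal sketch of one coupling layer, with toy fixed functions standing in for the learned scale/shift networks:

```python
# Sketch of a RealNVP affine coupling layer, the invertible building
# block used to map latent samples onto the Higgs-decay distribution.
# s() and t() are toy stand-ins for the learned neural networks.
import math

def s(x1):  # toy scale network
    return [0.5 * v for v in x1]

def t(x1):  # toy shift network
    return [v + 1.0 for v in x1]

def coupling_forward(x):
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    # y1 = x1 passes through; y2 is scaled and shifted conditioned on x1.
    y2 = [b * math.exp(a) + c for b, a, c in zip(x2, s(x1), t(x1))]
    return x1 + y2

def coupling_inverse(y):
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    # Because y1 == x1, the exact scale and shift can be recomputed.
    x2 = [(b - c) * math.exp(-a) for b, a, c in zip(y2, s(y1), t(y1))]
    return y1 + x2

x = [0.2, -0.4, 1.0, 0.5]
y = coupling_forward(x)
recovered = coupling_inverse(y)
assert all(abs(a - b) < 1e-12 for a, b in zip(x, recovered))
```

Exact invertibility (plus a cheap log-determinant, the sum of the scale outputs) is what lets normalizing flows be trained by maximum likelihood and then sampled for data augmentation.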

Deutsches Elektronen-Synchrotron (DESY)

  • Developed a predictive anomaly detection algorithm to forecast system failures during data transfers between global university experiments and DESY's data unit.
  • Established and managed a distributed computing environment using Apache Spark to efficiently process terabytes of data, enabling scalable analysis for anomaly detection.
  • Applied logistic regression on high-dimensional data, achieving an 85% accuracy in predicting system downtimes.
  • Anomaly Detection
  • Predictive Analytics
  • Apache Spark
  • Distributed Computing
  • Logistic Regression
  • Big Data
  • Machine Learning
  • Python
  • Data Science
  • High-Dimensional Data
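The downtime predictor above is logistic regression over high-dimensional transfer data on Spark. The core model fits in a few lines of pure Python; this sketch uses a single hypothetical 1-D feature (transfer-error rate) and plain SGD rather than Spark's distributed fitting:

```python
# Minimal logistic regression for binary downtime prediction,
# trained with stochastic gradient descent on a toy 1-D feature.
# The real system ran on terabytes of data via Apache Spark.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.5, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x   # gradient of log-loss w.r.t. w
            b -= lr * (p - y)       # gradient of log-loss w.r.t. b
    return w, b

# Toy data: higher transfer-error rate (x) -> downtime (y = 1).
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
w, b = train(xs, ys)
predict = lambda x: sigmoid(w * x + b) > 0.5
print([predict(x) for x in xs])
```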

Indian Institute of Science Education and Research (IISER), Pune

  • Explored and evaluated various clustering algorithms to identify the most effective method for analyzing epigenetic data.
  • Developed and tested implementation pipelines for algorithms including K-Means, DBSCAN, Hierarchical Clustering, and others using Python and scikit-learn.
  • Designed sample test cases and applied clustering techniques to biological datasets, comparing results across algorithms for accuracy and reliability.
  • Assessed clustering quality using metrics such as the elbow method, silhouette score, and intra-cluster distance to determine optimal performance.
  • Gained experience in unsupervised machine learning, data preprocessing, and applying statistical metrics to real-world scientific datasets.
  • Clustering Algorithms
  • K-Means
  • DBSCAN
  • Hierarchical Clustering
  • Unsupervised Learning
  • Data Preprocessing
  • Statistical Analysis
  • Deep Learning
  • Python
  • Machine Learning
  • AI/ML
  • Data Science
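Of the quality metrics mentioned above, the silhouette score is the most self-contained: for each point it compares the mean distance to its own cluster (a) against the mean distance to the nearest other cluster (b), scoring (b - a) / max(a, b). A pure-Python sketch on a tiny 1-D dataset (scikit-learn's `silhouette_score` does this for real data):

```python
# Pure-Python silhouette score on a toy 1-D clustering, illustrating
# one of the metrics used to compare the clustering algorithms above.

def silhouette(points, labels):
    def dist(a, b):
        return abs(a - b)
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to other members of the same cluster.
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        a = sum(same) / len(same) if same else 0.0
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other)
            / sum(1 for l in labels if l == other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Two well-separated clusters -> score close to 1 (about 0.983 here).
points = [1.0, 1.1, 1.2, 9.0, 9.1, 9.2]
labels = [0, 0, 0, 1, 1, 1]
print(round(silhouette(points, labels), 3))
```

Scores near 1 indicate tight, well-separated clusters; scores near 0 or below suggest overlapping clusters or a wrong choice of k.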

Genmark.ai

  • Contributed to improving automation of client servicing tasks by streamlining API-based operations via the chatbot interface.
  • Designed and developed an agentic chatbot capable of understanding complex user queries and executing appropriate actions autonomously.
  • Integrated the chatbot with Genmark.ai's internal APIs, enabling real-time interaction and dynamic response generation based on API specifications.
  • Agentic AI
  • LLMs
  • Python
  • Flask
  • Firebase
  • GCP
  • API design
  • Prompt Engineering

University of Glasgow

  • Worked on a research-oriented project exploring how neural networks can be used to probe Effective Field Theory (EFT) couplings, inspired by the ATLAS experiment and the paper "Parameterized Machine Learning for High-Energy Physics".
  • Processed simulated event-level datasets using NumPy and Pandas, followed by exploratory data analysis (EDA) to understand feature distributions and parameter dependencies.
  • Developed and trained a parameterized neural network that incorporates both event features and physics parameters as inputs, enabling smooth interpolation across different EFT coupling values.
  • This project strengthened my knowledge of machine learning in high-energy physics, EFT parameterization, and designing neural networks that generalize over a range of theoretical parameters.
  • Machine Learning
  • Deep Learning
  • Python
  • Data Science
  • High-Energy Physics
  • Effective Field Theory
  • Parameterized Machine Learning
  • Neural Networks
  • AI/ML
  • Data Preprocessing

AI Safety (2)

Selected as one of 20 fellows from a pool of over 1000 applicants for the prestigious AI Safety Fellowship organized by Whitebox Research.

  • Participated in a rigorous 3-month program focused on foundational and advanced concepts in machine learning and AI alignment, with a strong emphasis on technical understanding and interpretability.
  • Completed structured coursework and discussions on topics such as Model evaluations, SAEs, AI Interpretability, RLHF and more.
  • Engaged in weekly mentor-led sessions and peer-group discussions, reinforcing technical concepts through collaborative learning and problem-solving.
  • Developed a capstone research project in the final phase, investigating unfaithfulness in Chain-of-Thought (CoT) reasoning—analyzing where and how model-generated reasoning deviates from true causal chains.
  • Gained hands-on experience with theoretical alignment techniques and interpretability tools, sharpening my understanding of how to evaluate and constrain model behavior.
  • Machine Learning
  • AI Safety
  • AI Interpretability
  • Chain-of-Thought
  • RLHF
  • Model Evaluation
  • Research
  • Technical Writing

Capstone research project as part of the AI Safety Fellowship (Whitebox Research) focused on evaluating the faithfulness of Chain-of-Thought (CoT) reasoning in large multimodal models.

  • Studied how multimodal LLMs (Claude 3.7 Sonnet and Gemini 2.0 Flash Experimental) reason over semantically equivalent math problems presented in both text and image modalities using a custom-curated subset of the PutnamBench dataset.
  • Built a 5-stage end-to-end evaluation pipeline.
  • Designed a normalized unfaithfulness metric to compare reasoning across problems with varying lengths and complexities.
  • Found that both models showed comparable reasoning patterns and accuracy across modalities, with very low incidence of fully unfaithful shortcuts.
  • Identified limitations including compute constraints, token limits, and reliance on LLM-based auto-raters for CoT step evaluation, and proposed future directions, including expanding the benchmark to other domains and developing an Unfaithful Shortcuts Benchmark for more comprehensive faithfulness testing.
  • AI Safety Evaluations
  • Chain-of-Thought
  • Python
  • Research
  • LLMs
  • Prompt Engineering
  • Model Evaluation
  • Model Safety
  • Technical Writing
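The project's exact metric is not specified here, but the normalization idea it describes, making short and long reasoning chains comparable, can be sketched as the fraction of CoT steps an auto-rater flags as unfaithful; the step labels below are hypothetical auto-rater outputs:

```python
# Hedged sketch of a length-normalized unfaithfulness score: the
# fraction of chain-of-thought steps flagged as not supporting the
# final answer. The capstone's actual metric may differ.

def unfaithfulness(step_flags):
    """step_flags: per-step booleans, True = step judged unfaithful.
    Dividing by chain length makes chains of different lengths
    directly comparable."""
    if not step_flags:
        return 0.0
    return sum(step_flags) / len(step_flags)

short_chain = [False, True]                      # 1 of 2 steps flagged
long_chain = [False, True, False, False, True]   # 2 of 5 steps flagged
print(unfaithfulness(short_chain), unfaithfulness(long_chain))  # 0.5 0.4
```

Without normalization, a raw count would penalize long chains; with it, a 10-step proof with one bad step scores lower than a 2-step answer with one.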

Projects (7)

Built an art analysis application using GPT-4 Vision API to generate comprehensive critiques of artwork across various media types.

  • Designed a structured analysis pipeline using OpenAI’s GPT-4 Vision, integrated through the OpenAI API for generating image-based art critiques.
  • Implemented prompt engineering with Jinja2 templating to craft dynamic, customizable prompts guiding GPT output toward detailed and coherent evaluations.
  • Utilised Pydantic models and the Instructor library to enforce strict type-safe, structured outputs from GPT responses—ensuring reliability and consistency.
  • Developed a clean, responsive Streamlit UI allowing users to upload artwork, enter their API key, and receive AI-powered evaluations with scoring and interpretation.
  • Provided detailed output including formal analysis, artistic interpretation, cultural context, constructive feedback, and granular scoring across multiple criteria.
  • Enhanced my skills in multimodal AI integration, prompt design, structured data handling, and deploying AI services in real-time interactive applications.
  • GPT-4 Vision
  • Prompt Engineering
  • Jinja2
  • Pydantic
  • Instructor
  • Streamlit
  • Python
  • LLMs

A movie and TV series recommendation website that suggests content based on users' previously watched titles, saving time and effort when searching for content on Netflix.

  • Utilizes a content-based recommendation system algorithm
  • Developed using the Netflix Movies and TV Shows dataset from Kaggle (3.4 MB of data)
  • Provides personalized recommendations based on viewing history
  • Saves users 10-15 minutes daily through intelligent content suggestions
  • Implements efficient algorithms for content matching and user preference analysis
  • Machine Learning
  • Content-Based Filtering
  • Data Analysis
  • Python
  • Kaggle Dataset
  • Algorithm Development
  • Netflix API
  • Recommendation Engine
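The content-based approach above can be sketched in a few lines: build a genre profile from the watch history, then recommend the unseen title with the highest set overlap (Jaccard similarity here, as one simple choice). Titles and genres below are made-up examples, not the Kaggle dataset the project used:

```python
# Minimal content-based filtering sketch: recommend the unseen title
# whose genre set best matches the user's watch history.
# Catalog entries are hypothetical illustrations.

catalog = {
    "Dark":            {"thriller", "sci-fi", "mystery"},
    "Stranger Things": {"sci-fi", "horror", "mystery"},
    "The Crown":       {"drama", "history"},
    "Black Mirror":    {"sci-fi", "thriller"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(watched):
    # Genre profile: union of genres across everything already watched.
    profile = set().union(*(catalog[t] for t in watched))
    unseen = [t for t in catalog if t not in watched]
    return max(unseen, key=lambda t: jaccard(profile, catalog[t]))

print(recommend(["Dark"]))  # "Black Mirror": shares thriller and sci-fi
```

A production system would use weighted feature vectors (genres, cast, descriptions) rather than flat sets, but the matching principle is the same.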

Personal project demonstrating my understanding of attention mechanisms and transformer architecture by recreating the Transformer model ("Attention Is All You Need") from scratch using PyTorch.

  • Built a modular encoder-decoder architecture with multi-head attention, positional encoding, and feed-forward layers, adhering closely to the original paper.
  • Designed each component (attention, layer norm, embeddings, etc.) as separate modules for clarity and flexibility.
  • Implemented a full training pipeline for English-to-French translation using the OPUS Books dataset, with masking, label smoothing, and BLEU score evaluation.
  • Integrated advanced training techniques such as learning rate scheduling, dropout regularization, and Adam optimizer with weight decay.
  • Used HuggingFace tokenizers for text preprocessing and sequence handling, ensuring vocabulary management and consistent batching.
  • Demonstrated ability to work with neural network layers, data loaders, optimizers, and attention mechanism directly in PyTorch.
  • Python
  • PyTorch
  • Transformers
  • Attention Mechanisms
  • NLP
  • Machine Translation
  • Deep Learning
  • Model Training
  • Model Evaluation
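The heart of the architecture recreated above is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A pure-Python sketch over small nested lists (the project itself used PyTorch tensors and batched multi-head attention):

```python
# Scaled dot-product attention over tiny nested lists, for clarity.
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax(row):
    m = max(row)                       # subtract max for stability
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_T)            # Q K^T
    weights = [softmax([v / math.sqrt(d_k) for v in row]) for row in scores]
    return matmul(weights, V)          # weighted sum of value rows

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out = attention(Q, K, V)
# The query aligns with the first key, so the output leans toward
# the first value row.
print(out)
```

Multi-head attention simply runs several of these in parallel over learned projections of Q, K, and V and concatenates the results.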

Worked on a real-world data analysis project for a competition organised by the Telangana Government in collaboration with Codebasics, extracting actionable insights from public datasets.

  • Performed in-depth analysis across multiple domains including document registration (e-Stamps), transportation, industrial investments (TS-iPASS), and government schemes.
  • Analysed district-level revenue growth from stamp registration, identifying key contributing districts like Rangareddy and Medchal-Malkajgiri based on proximity to Hyderabad and industrial zone development.
  • Evaluated transport sales trends across Telangana districts from 2019–2023, identifying growth in electric and petrol vehicle sales and highlighting regional variations.
  • Assessed sector-wise investments through TS-iPASS data, correlating them with employment, infrastructure, and proximity to special economic zones (SEZs).
  • Identified the top 5 districts for commercial property investment based on multi-factor data analysis, including revenue trends, document registration, and industrial activity.
  • Proposed actionable recommendations for improving e-Stamp adoption, targeting infrastructure investments, and boosting agricultural and industrial growth.
  • This project enhanced my skills in data analysis, storytelling with data, geospatial and sectoral insights generation, and converting public data into development-focused recommendations.
  • Python
  • Data Analysis
  • Exploratory Data Analysis
  • Data Storytelling
  • Geospatial Insights
  • Public Datasets
  • Statistical Analysis

Honors & Awards (4)

Certifications (5)

AI Trailblazer

Issued by
Verix
Issued on

Neural Networks and Deep Learning

Issued by
Coursera
Issued on

Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning

Issued by
Coursera
Issued on

Machine Learning

Issued by
Coursera
Issued on

Programming for Everybody (Getting Started with Python)

Issued by
Coursera
Issued on