Hello! I'm Rahul Purswani
Welcome to my corner of the internet! I'm a data-driven engineer with a passion for building across the entire data lifecycle, from data pipelines and infrastructure to analytics and machine learning models. I graduated in May 2025 and am open to full-time roles where I can help turn data into impact. Let’s connect!

About me 🙋
My journey in computer science began with a Bachelor of Science degree, where I laid the foundation for my career. Now, as I pursue my Master's in Computer Science, I'm diving deeper into specialized areas like advanced algorithms, computer vision, data mining, deep learning, and natural language processing. Along the way, I've realized that true mastery comes from actively building projects aligned with the skills I want to develop. My internship at ZeroEyes and my side projects have been invaluable, giving me practical experience across different areas and letting me apply theoretical knowledge in real-world scenarios.
With a focus on backend engineering and a passion for machine learning, I'm eager to bridge these domains. Graduating in December 2024, I'm actively seeking full-time roles in backend software engineering, machine learning, and data-related positions. Let's connect and explore how we can achieve our goals together.
Software Engineer Intern
ZeroEyes Inc.
I again worked on the Data Engineering team, where I contributed to projects focused on building scalable pipelines, automating workflows, and curating large datasets to support downstream tasks.
• Built a scalable ETL pipeline using Selenium, Scrapy, and AWS EC2 to collect 1.1M+ images for a client’s ML pipeline; improved ingestion efficiency by 20% using image hashing.
• Wrote advanced SQL queries to curate and update 500K+ datapoints for downstream ML workflows; also researched and documented data augmentation methods like AutoAugment, RandAugment, and AutoDA to support experimentation.
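The image-hashing step mentioned above can be illustrated with a minimal sketch. The hashing scheme used in the actual pipeline is not stated here, so this example assumes exact-duplicate detection with SHA-256 over the raw bytes; the function names `content_hash` and `deduplicate` are hypothetical. A perceptual hash would be needed to catch near-duplicates such as resized or re-encoded images.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def content_hash(image_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the raw image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def deduplicate(images: Iterable[Tuple[str, bytes]]) -> List[str]:
    """Keep the first file name seen for each distinct byte content.

    Assumption: two images are duplicates only if their bytes are identical.
    """
    seen: Dict[str, str] = {}  # digest -> first file name with that digest
    kept: List[str] = []
    for name, data in images:
        digest = content_hash(data)
        if digest not in seen:
            seen[digest] = name
            kept.append(name)
    return kept


# Toy usage: the third entry repeats the bytes of the first and is skipped.
batch = [("a.jpg", b"\x01\x02"), ("b.jpg", b"\x03"), ("a_copy.jpg", b"\x01\x02")]
unique_names = deduplicate(batch)
```

Skipping already-seen digests before any further processing is one plausible way hashing can reduce ingestion work, since duplicate downloads never reach later stages.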
February 2024 - May 2024
Research and Development Intern
ZeroEyes Inc.
I worked on the Data Engineering team, where I deepened my understanding of ML pipelines, large-scale data analytics, model training, and deployment-ready solutions.
• Leveraged OpenAI’s CLIP model in PyTorch to enable text-based image search. Improved search speed for context-specific images by 60% and improved the overall user experience of retrieving relevant images from a corpus of 250K+ images.
• Implemented a custom CNN and fine-tuned ResNet & YOLO models in TensorFlow to streamline data annotation, cutting annotation time by 15% and reducing manual effort for annotators. Generated embeddings and performed exploratory data analysis (EDA) on metadata for 100K+ datapoints.
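The retrieval side of text-based image search can be sketched in a few lines. This example assumes the text query and the images have already been embedded into the same vector space (for instance by a CLIP-style model) and only shows the ranking step; the toy vectors and the names `cosine_similarity` and `search` are hypothetical, not the production code.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vec: Sequence[float],
    image_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank images by similarity to the query embedding, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-D embeddings standing in for real model outputs (assumed values).
image_vecs = {"cat.jpg": [1.0, 0.0], "dog.jpg": [0.0, 1.0], "kitten.jpg": [0.9, 0.1]}
results = search([1.0, 0.0], image_vecs, top_k=2)
```

For a corpus of hundreds of thousands of images, a brute-force scan like this would typically be replaced by batched matrix operations or an approximate nearest-neighbor index.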
May 2023 - December 2023
Graduate Teaching Assistant
University of Kansas School of Engineering
I was a TA for the Senior Capstone and Compilers Construction courses at KU, where I developed strong leadership and mentoring skills while supporting student learning.
• Senior Capstone: Organized and led weekly agile sprint meetings for 9 teams to discuss progress, address obstacles, and set achievable sprint goals. Assisted students with technical challenges, boosting overall team productivity.
• Compilers Construction: Facilitated interactive lab sessions for over 40 students. Explained topics from lexical analysis and parsing to code generation with hands-on examples. Evaluated and provided feedback on students’ lab work.
August 2022 - December 2024
Projects 👨‍💻
KU-Chat
KU-Chat is a domain-specific Q&A chatbot fine-tuned on KU web data to deliver accurate, context-aware answers for students and faculty. I designed and implemented the full data pipeline behind it: I crawled university websites using Selenium and Airflow, extracting and cleaning over 1 million tokens, then generated structured Q&A pairs with OpenAI’s GPT-4o-mini and stored the processed data in AWS S3. I fine-tuned LLaMA 3.2 using PEFT-LoRA and 4-bit quantization, then tested performance with a Retrieval-Augmented Generation (RAG) setup.
This led to a 6% boost in ROUGE and BERTScore over the base model. The fine-tuned model delivered more specific, verbose, and context-aware answers, achieving over 90% accuracy on a custom KU evaluation set and significantly improving information access for students and faculty.
Web Scraping · ETL Pipeline · Synthetic Data Generation · LLM Finetuning and Evaluation · PEFT (LoRA) · Airflow · AWS EC2 · HuggingFace · Python
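To give a feel for what a ROUGE comparison measures, here is a simplified sketch of ROUGE-1 F1 (clipped unigram overlap between a reference answer and a model answer). It assumes lowercase whitespace tokenization with no stemming, so its numbers will differ from standard ROUGE tooling; the function name `rouge1_f1` and the sample sentences are hypothetical and unrelated to the actual KU evaluation set.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection clips each word's count to the smaller of the two.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Toy comparison: identical answers score 1.0, disjoint answers score 0.0.
same = rouge1_f1("the library opens at nine", "the library opens at nine")
partial = rouge1_f1("a b c d", "a b x y")
disjoint = rouge1_f1("a b", "c d")
```

Scoring the base model and the fine-tuned model against the same reference answers and comparing the averages is one common way to report a relative improvement.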
Car Damage Detection
In this project, I developed a model to automatically detect and classify car damages. The model was deployed on an ESP32S board, enabling real-time car damage detection. Utilizing the CarDD dataset, I initially created a Convolutional Neural Network (CNN) model to perform detection and classification tasks. I then used the TensorFlow Object Detection API to train a MobileNetV2-SSD model for enhanced accuracy in identifying car damages.
This project enhanced my skills in model development, TensorFlow API, model optimization, embedded systems deployment, and managing end-to-end machine learning workflows.
Image Processing · TF Object Detection API · Quantization · Embedded Device · TensorFlow · Python · PlatformIO
IMDb Reviews Analysis
The IMDb Reviews Analysis project uncovers patterns and trends in movie reviews, providing insights into how sentiments and ratings change over time. Through Sentiment Analysis, Temporal Analysis, and Correlational Analysis, we explore the dynamics of audience opinions and their evolution. Additionally, we developed a model to accurately predict the sentiment of a review.
I learned a lot about web scraping, text processing, and tokenization. I also gained experience with vectorization techniques and data visualization. Additionally, I developed skills in analyzing large datasets to uncover meaningful insights.
Web Scraping · Natural Language Processing · Text Classification · Data Analytics · BeautifulSoup · Selenium · NLTK · Python
Skin Disease Detection
In this project, I developed an automated system using deep learning models to accurately detect and classify skin lesions. I trained custom CNN models and fine-tuned state-of-the-art models like ResNet & MobileNet to assist in early diagnosis and improve patient outcomes.
Through this project, I learned to implement custom datasets and dataloaders, build and train deep learning models, handle class imbalances, and apply data augmentation techniques, significantly enhancing my understanding of deep learning frameworks.
Image Processing · CNN · Model Finetuning · PyTorch · Python
SoccerTact
SoccerTact is a web application that provides comprehensive soccer analysis, including match-based, team-based, and player-based analysis. The application utilizes event data from StatsBomb's open data repository. We built a robust data pipeline to automatically extract data from the StatsBomb GitHub repository and store it in an SQL database.
I gained full-stack development experience, built a robust ETL pipeline, and enhanced my data analysis and visualization skills.
WebApp · Data Pipeline · Soccer Analytics · Data Visualization · MPLSoccer · NodeJS · JavaScript · Python · MySQL
MAPKU
MAPKU is a web application developed to assist first-year students and staff in navigating the campus. Users can search for buildings by class number, mark multiple destinations for the fastest route, and view route details (estimated time, distance). The app also provides information about campus buildings.
This project enhanced my skills in web development, user experience design, and geolocation services integration. I also gained experience in optimizing algorithms for route calculation.
WebApp · Web Scraping · GoogleMaps API · JavaScript · HTML · CSS · BeautifulSoup · Python
SMS Filtering
Developed a spam detection system using Bernoulli Naive Bayes implemented from scratch in Python. The project involved data cleaning, feature extraction, and probabilistic modeling to classify SMS messages as spam or ham.
Gained skills in data preprocessing, binary feature matrix creation, and model implementation. Learned to calculate prior and likelihood probabilities and to evaluate model performance using metrics like accuracy.
Natural Language Processing · Model Implementation · Python
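The core of a from-scratch Bernoulli Naive Bayes classifier fits in a short sketch. This is an illustrative version rather than the project's code: it assumes whitespace tokenization, binary word-presence features, and Laplace (add-one) smoothing, and the class name `BernoulliNB` and the toy messages are hypothetical.

```python
import math
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Binary features: the set of distinct lowercase words in the message."""
    return set(text.lower().split())


class BernoulliNB:
    def fit(self, texts: List[str], labels: List[str]) -> "BernoulliNB":
        docs = [tokenize(t) for t in texts]
        self.vocab: List[str] = sorted(set().union(*docs))
        self.classes: List[str] = sorted(set(labels))
        self.log_prior: Dict[str, float] = {}
        self.word_prob: Dict[str, Dict[str, float]] = {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            n = len(class_docs)
            # Prior: fraction of training messages belonging to class c.
            self.log_prior[c] = math.log(n / len(docs))
            # Likelihood P(word present | c) with Laplace smoothing.
            self.word_prob[c] = {
                w: (sum(w in d for d in class_docs) + 1) / (n + 2) for w in self.vocab
            }
        return self

    def predict(self, text: str) -> str:
        present = tokenize(text)
        scores: Dict[str, float] = {}
        for c in self.classes:
            score = self.log_prior[c]
            # Bernoulli model: absent vocabulary words also contribute evidence.
            for w in self.vocab:
                p = self.word_prob[c][w]
                score += math.log(p if w in present else 1.0 - p)
            scores[c] = score
        return max(scores, key=lambda c: scores[c])


texts = ["win free prize now", "free cash win", "are we meeting today", "see you at lunch today"]
labels = ["spam", "spam", "ham", "ham"]
model = BernoulliNB().fit(texts, labels)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing keeps any single unseen word from forcing a probability of zero.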
My Portfolio Website
Built with Next.js, React, Tailwind CSS, and JavaScript, this site is my personal corner of the internet.
A fun side project that let me explore frontend development while showcasing my work — designed, developed, and deployed from scratch.
Web Development · NextJS · ReactJS · Tailwind CSS · JavaScript
Let's Connect!
Whether you have an exciting project in mind, want to discuss a potential collaboration, or simply wish to say hello, I'd love to hear from you! Drop me a message using the form below, shoot me an email, or connect through social media.
contactme@rahulp.dev