Class of 2025
Shubham Kumar earned his Bachelor of Technology (B.Tech) degree in Electrical and Electronics Engineering from Vellore Institute of Technology in 2018. He has close to six years of experience in the software and analytics industry, with core competencies in backend engineering and applied machine learning. During his B.Tech, he built a strong foundation in machine learning and data science, along with proficiency in advanced programming concepts in Python and Java. His final-year project explored different text-embedding methodologies for classification and search use cases. He began his career as a software engineer at Tredence Analytics Pvt. Ltd., a boutique analytics firm, where he took on several leadership opportunities while working on analytics applications such as churn prediction, demand forecasting, and inventory management within the broader domains of digital customer experience and e-commerce. Through the churn-prediction project, he learned about ML model interpretability, implementing complex logic to derive crucial features and analyze user behavior strongly indicative of future attrition, and ultimately delivered an ensemble model that performed well on the relevant evaluation metrics. He was recognized for his strong coding skills, evident in his use of appropriate design patterns to produce high-quality, maintainable code modules, and he demonstrated a solid grasp of multiple approaches to tackling ML challenges. He was also instrumental in designing the backend services (REST APIs, database modeling) that integrated the model into a digital customer experience platform (Flask, Django, MySQL, Google Cloud Platform) responsible for managing the customer lifecycle.
Within that platform, he implemented several key features that enabled users to craft customer-specific email templates and then compare the templates' effectiveness via A/B tests that tracked click-through rates. On the e-commerce project, he demonstrated his adaptability by ramping up on a different tech stack (ExpressJS, MS SQL, Azure, PySpark) to implement high-complexity batch analytics pipelines, including time-series demand-forecasting modules. He also worked on the server side of the web application through which the insights generated by the pipelines were consumed, and was widely appreciated for handling some of the most complex programming challenges of the overall solution. Along the way, he learned to navigate the scalability challenges of processing large datasets and improving the performance of REST APIs, while his work on time-series analysis and forecasting deepened his command of classical machine learning concepts. Throughout the project, he displayed important leadership qualities in managing stakeholder expectations and guiding junior engineers where needed. He was entrusted with mentoring two batches of fresh B.Tech graduates, helping these new joiners ramp up their programming skills through tutoring sessions, and he contributed to the team's growth by interviewing prospective candidates for software engineering and analyst roles. He then moved to Amazon.com as a Business Intelligence Engineer, where he built key data-ingestion (often real-time) and analytics pipelines for generating, and gauging the quality of, human-annotated image data critical to training production-grade AI models deployed at Amazon's robotics fulfillment centers. There he also learned about human-in-the-loop mechanisms, and his work influenced how Amazon manages packages at those centers.
In this role he was also more deeply involved in designing scalable architectures for data products, applying publisher-subscriber and event-driven architecture patterns. His effective use of data-lake storage strategies and Spark for distributed computing helped build high-volume, high-throughput data-processing pipelines. He took on greater responsibility for system robustness and data quality, incorporating data-quality monitoring mechanisms and contingency measures to handle production failures. On his own initiative, he applied ML techniques to analyze outliers in the time associates took to annotate images, incorporating additional features derived from the images themselves; the analysis drew interest from relevant stakeholders. He also automated program-level monitoring of the team's AWS expenses by leveraging AWS APIs, helping the team track the costs of different programs and keep unnecessary spending in check. His most recent role was at Cyncly (formerly 2020 Interior Design Software Pvt. Ltd.), an interior design software firm. There, as the first Senior Data Engineer in its newly formed AI Center of Excellence, he spearheaded the development of high-complexity data pipelines that processed many thousands of images to generate use-case-specific image embeddings using pre-existing deep learning vision models (Vertex AI), powering different search use cases. Working within a small team of data and AI engineers, he helped take the product from 0 to 1. His career so far has given him broad exposure to ML and AI concepts and sharpened his skills in developing high-quality solutions centered on machine learning.
Through the MS in MLDS program, he aims to deepen his knowledge of the tools and concepts he has been working with and take his skills in this domain to the next level. He also looks forward to forming new connections with peers from different backgrounds who share his interests. Outside of programming, he spends his free time on creative writing and reading a wide array of novels by international authors. Technical tools he has experience with include Python, Flask, Django, NodeJS/ExpressJS, PySpark, PyTorch and other common Python data science libraries, MySQL, DynamoDB, MongoDB, and several AWS and Azure services used for building data pipelines.