I am Pankesh Bamotra, a data scientist who loves all things Python, machine learning, and statistics.

Seattle, Washington


Research Engineer

Coupang USA

Feb 2017 - present

Working with the Retail Systems team in Seattle to ramp up the product catalog exponentially. Current projects involve building deep learning models using textual and visual data. In the initial months, I also worked with the e-commerce data science team to build machine learning models for customer lifetime value and user search behaviour.

Software Intern - Big Data analytics

Autodesk Inc.

May 2016 - Aug 2016

Worked with the cloud platform team to migrate data from a legacy system and ingest it into a new Apache Spark-based platform. The solution was built on AWS, Apache Spark, Apache Oozie, and Apache Hadoop. The new platform helped the organization tune its data ingestion, transport, and compute layers.

Software Engineer

PayPal India

Jun 2013 - Jul 2015

Worked with the Open Analytics Platform group to develop a self-service database provisioning tool for analysts and developers. My contributions included refactoring the legacy code, developing a plug-and-play solution for QlikView and Teradata, and delivering a Teradata SQL-based dashboard to analyse clickstream data.

Software Development Intern

PayPal India

Jan 2013 - Jun 2013

Worked with the seller risk management team to create early warning dashboards. My contributions included developing a Teradata SQL-based solution to analyse buyer account habits, creating early warning indicator reports for teams across NA and EMEA for fraud analysis, and modeling post-transaction risk mitigation.


Master of Computational Data Science

School of Computer Science
Carnegie Mellon University

2015 - 2016

GPA:  3.75

◃ Search Engines.
◃ Natural Language Processing.
◃ Probabilistic Graphical Models.
◃ Multimedia Databases and Data Mining.
◃ Deep Learning.
◃ Machine Learning for Large Datasets.

B.Tech, Computer Science and Engineering

Vellore Institute of Technology

2009 - 2013

GPA:  9.24

◃ Data structures and algorithms.
◃ Operating systems.
◃ Linear algebra.
◃ Computer programming and problem solving.
◃ Digital logic.
◃ Algorithm design and analysis.



Python, Pandas, Scikit-learn, SQL


Java, Apache Spark, Apache Hadoop, Keras+TensorFlow, EC2, S3, Redshift