Lakshmimanaswitha Chimakurthi
*B, Smith Street, Boston, MA 02120
Portfolio: manaswitha1001.github.io
ac8pic@r.postjobfree.com
617-***-****
Linkedin: manaswithachimakurthi
Github: manaswitha1001
Available for Full-time positions starting May 2019
PROFESSIONAL EXPERIENCE
Data Science Co-op
Brigham and Women’s Hospital, Boston, MA May - Dec 2018
Collaborated with physicians and bioinformaticians to build a pipeline that clusters lung-tissue expression and methylation profiles, identified the clinical associations of each cluster, and visualized the results using ggplot in R.
Built a Docker image for cheweb (a tool for visualizing Channing’s GWAS results).
Implemented an autoencoder neural network classifier to classify COPD cases/controls on dosage values and improved the AUROC to 0.78 using a stacked approach.
Extracted 1M genotype data records from multiple Oracle relational databases into a simplified JSON structure using SQL.
EDUCATION
Northeastern University, Boston, MA Jan 2017 - Present
Master of Science in Data Science Expected Graduation - May 2019
Relevant Courses: Machine Learning, Algorithms, Natural Language Processing, Data Management & Processing, Information Retrieval, Database Management Systems, Information Visualization
VR Siddhartha Engineering College, Vijayawada, India June 2012 - Apr 2016
Bachelor of Technology in Information Technology
Relevant Courses: Database Management Systems, Data Warehousing, Data Mining, Business Intelligence
TECHNICAL SKILLS
Key Strengths: Predictive Modelling, Text Mining, Market-basket Analysis, Web-Scraping, Recommendation Systems, Sentiment Analysis, Time Series Forecasting, Machine Learning, Deep Learning
Programming Languages: Python, R, SQL, Scala, C++, Java, Matlab, HTML, CSS, JavaScript
Databases: Oracle, MySQL, MongoDB
Machine Learning: Linear/Logistic Regression, SVM, Tree Based, Neural Networks, Clustering, Boosting
ML Tools: Scikit Learn, Pandas, Numpy, PySpark, Tensorflow, Keras, ARIMA, Flask
Data Visualization: Tableau, Excel, ggplot, R Shiny, Plotly, Matplotlib, d3.js
Big data Technologies: Hadoop, Spark, Kafka
Cloud Technologies: AWS, Elasticsearch
Containers: Docker
PROJECTS
Price Prediction of Used Cars Mar - May 2018
Scraped the car listings on attivo.com using BeautifulSoup in Python.
Implemented Linear Regression, Decision Trees, KNN, and Boosting to predict car prices from each car’s attributes.
Achieved the best RMSE on test data with a Gradient Boosting Regressor.
Deployed the prediction model as a Flask API and hosted the interactive web application on Heroku.
Credit Risk Prediction Jan - Mar 2018
Developed a binary classifier to classify good/bad loans from applicants’ details using PySpark.
Applied undersampling to address class imbalance.
Implemented Logistic Regression, Random Forest Classifier using Spark MLlib.
Achieved an AUROC of 0.732 with the ensemble model.
Sentiment Analysis on Customer Tweets Oct - Dec 2017
Processed customer tweets about the top 6 US airline carriers and encoded the text data into word vectors.
Implemented a multilayer neural network classifier on processed data using Keras in Python.
Classified the customer tweets into positive, negative, and neutral and achieved an average AUC of 0.74.
Movie Recommender System Aug - Nov 2017
Developed a movie recommender system using a collaborative filtering approach on IMDB movie ratings.
Suggested movies based on similar users’ past ratings of other movies.
Implemented using K-Means, KNN, SVM, and a neural network, and achieved the best Precision of 0.85 with SVD.
Search Engine May - Aug 2017
Developed a scalable engine in Python to store an Inverted Index for 85k documents.
Provided a ranked list of the top 1000 documents for a given set of queries.
Utilized ranking methods such as Okapi, BM25, and tf-idf.
ACTIVITIES
Winning team member for INFORMS Data Visualization Hackathon - Presented a poster on Boston Crime Data Analysis.