An engineer turned data scientist with strong data manipulation skills, a blend of creativity and logical thinking, and the ability to extract hidden patterns and meaningful insights from large datasets. Combines a solid grounding in mathematics with strong programming skills and a healthy sense of exploration.
Key Responsibilities:
• Design, maintain, and automate a BI dashboard by building a data pipeline in Talend Big Data (ETL tool): fetch data from MongoDB, export the required fields into a MySQL database, and surface them in a Power BI dashboard.
• Perform sentiment analysis on transcribed text obtained from Deepgram (a real-time voice-to-text service), and perform intent classification using NLP (natural language processing) techniques in Python.
Key Responsibilities:
• Extract data from lab analyzers (cobas c311, e411, and Sysmex) as raw ASC files.
• Clean the raw data and extract the required features (biomarkers) using Python.
• Sort and merge processed data into its respective master (per project and center).
• Detect outliers in each feature (biomarker) using Python to assess center performance in terms of the quality of samples received daily.
• Perform EDA (exploratory data analysis) using Python to assess center and project performance in terms of the count of samples received per week.
• Collect SPSS files (questionnaires entered in SPSS by a team of DEOs), then import, clean, and find anomalies across all fields and records using Python.
• Compare the entry 1 and entry 2 SPSS files using Python; on a mismatch, have the DEOs pull the questionnaire from the data room and re-enter the field until the entry is correct, then merge it into the master.
• Update the master data in the MySQL database.
• Coordinate with the QC team to acquire actionable data for generating data-oriented reports for the data science department and the lab.
• Apply and validate novel machine learning algorithms to ensure data integrity, data quality, and data uniformity.
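As an illustration of the double-entry check described above, here is a minimal sketch in plain Python. It assumes each entry has already been loaded into a dictionary keyed by record ID; the function name `find_mismatches`, the record IDs, and the field names are all hypothetical and do not reflect the actual questionnaire schema or the code used on the project.

```python
# Hypothetical sketch of a double-entry comparison; the record layout and
# field names are assumptions, not the actual questionnaire schema.

def find_mismatches(entry1, entry2):
    """Return (record_id, field, value1, value2) for every differing field.

    entry1 and entry2 map a record ID to a dict of field -> value,
    e.g. as loaded from two independently keyed SPSS files.
    A record or field missing from one entry is reported with None.
    """
    mismatches = []
    for record_id in sorted(set(entry1) | set(entry2)):
        row1 = entry1.get(record_id, {})
        row2 = entry2.get(record_id, {})
        for field in sorted(set(row1) | set(row2)):
            v1, v2 = row1.get(field), row2.get(field)
            if v1 != v2:
                mismatches.append((record_id, field, v1, v2))
    return mismatches


entry1 = {
    "P001": {"age": 34, "sex": "F"},
    "P002": {"age": 51, "sex": "M"},
}
entry2 = {
    "P001": {"age": 34, "sex": "F"},
    "P002": {"age": 15, "sex": "M"},  # transposed digits in the second entry
}

for record_id, field, v1, v2 in find_mismatches(entry1, entry2):
    print(f"{record_id}: {field} differs ({v1!r} vs {v2!r})")
```

Each reported tuple identifies one field to send back for re-entry; a record is ready to merge into the master once it no longer appears in the result.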
Key Responsibilities:
• Delivered hands-on Python training covering the following topics:
• Data cleaning and manipulation using NumPy and pandas.
• Data visualization and EDA using Matplotlib and Seaborn.
• Statistical analysis (Normal, Bayesian, ANOVA, Hypothesis Testing).
• Supervised machine learning using scikit-learn (Linear and Logistic Regression, KNN, Naive Bayes, SVM, Decision Tree).
• Unsupervised machine learning using scikit-learn (K-Means Clustering, Hierarchical Clustering, Dimensionality Reduction using PCA, Apriori Algorithm).
• Deep learning using TensorFlow and Keras (Artificial Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Autoencoders, Transfer Learning, Reinforcement Learning).