About Project

I successfully led and executed a multifaceted project integrating Apache Airflow, Machine Learning, Excel, and Power BI to streamline data processes and enhance analytical capabilities. The project included designing and implementing an Airflow pipeline for ETL processes, developing robust machine-learning models for predictive analytics, ensuring data integrity through advanced Excel data cleaning techniques, and creating dynamic Power BI dashboards for insightful visualizations. This comprehensive approach not only improved data efficiency and accuracy but also provided stakeholders with powerful tools for data-driven decision-making. The project showcased my proficiency in diverse technologies and my ability to deliver end-to-end solutions in the realm of data analytics.

AWS

Chatbot using AWS Bedrock, Kendra, and Serverless

AWS

• Amazon Bedrock Integration: Simplified the creation and scaling of generative AI applications using foundation models.
• Amazon Kendra Utilization: Implemented intelligent enterprise search across various content repositories for enhanced information retrieval.
• AWS Lambda Deployment: Leveraged serverless computing to execute code in response to events, ensuring efficient resource management.
Comprehensive Project Showcase: Detailed repository available on GitHub, demonstrating the practical application of AWS services in building a sophisticated chatbot.

OWN GPT Using Amazon Web Services (AWS) Bedrock

AWS

I've harnessed the power of AWS Bedrock to create a versatile AI platform that delivers:
• 📝 Text Generation (Meta.llama2): Engaging and coherent conversations like ChatGPT.
• 🎨 Image Generation (stability.stable-diffusion): Stunning visuals from text prompts, inspired by Midjourney.
• 💻 Code Generation (anthropic.claude): Efficient code solutions, similar to Codex.
Key technologies driving this project include Meta's Llama 2, Stability AI's Stable Diffusion, and Anthropic's Claude. I've developed a sleek web interface using Streamlit for easy public access.

Airflow Pipeline

Spotify ETL (Extract-Transform-Load) Pipeline

Airflow

• Developed an ETL pipeline using Apache Airflow to extract data from Spotify's API, transform it, and load it into AWS S3 storage.
• Utilized Python for scripting and orchestration to automate data extraction and transformation processes.
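The transform step of a pipeline like this can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function name `transform_recently_played`, the output column names, and the assumed payload shape (an `items` list whose entries carry a `track` object and a `played_at` timestamp) are all assumptions made for the example.

```python
from datetime import datetime


def transform_recently_played(payload):
    """Flatten a Spotify-style 'recently played' payload into row dicts.

    The payload layout used here is an assumption for illustration.
    """
    rows = []
    seen = set()
    for item in payload.get("items", []):
        track = item["track"]
        # Normalize the trailing 'Z' so datetime.fromisoformat accepts it.
        played_at = datetime.fromisoformat(item["played_at"].replace("Z", "+00:00"))
        key = played_at.isoformat()
        if key in seen:
            # Use the play timestamp as a primary key and drop duplicates.
            continue
        seen.add(key)
        rows.append({
            "song_name": track["name"],
            "artist_name": track["artists"][0]["name"],
            "played_at": key,
            "played_date": played_at.date().isoformat(),
        })
    return rows


sample = {
    "items": [
        {"track": {"name": "Song A", "artists": [{"name": "Artist A"}]},
         "played_at": "2024-01-01T10:00:00Z"},
        {"track": {"name": "Song A", "artists": [{"name": "Artist A"}]},
         "played_at": "2024-01-01T10:00:00Z"},
        {"track": {"name": "Song B", "artists": [{"name": "Artist B"}]},
         "played_at": "2024-01-01T10:05:00Z"},
    ]
}
rows = transform_recently_played(sample)
```

In an Airflow DAG, a function of this kind would typically run as one task between the extract and load tasks, with the resulting rows written to S3 as CSV or Parquet.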

Machine Learning

Road Accident Severity Prediction App

Machine Learning

• Built an ML model that predicts whether a person sustains a Slight Injury, Serious Injury, or Fatal Injury based on features such as experience, weekday, and driver age, achieving 83.3% accuracy.
Project Flow: Data Collection ➡️ EDA (Exploratory Data Analysis) ➡️ Data Preprocessing ➡️ Feature Engineering ➡️ Data Modelling

SONAR Rock vs Mine Prediction

Logistic Regression (ML)

• Achieved an accuracy of 83.4% in distinguishing between rock and mine signals using machine learning techniques.
• Implemented feature engineering and model optimization to enhance accuracy and efficiency.

Diabetes Prediction

Machine Learning

• Developed a predictive model with 78.6% accuracy to identify diabetes in patients.
• Employed feature selection and ensemble methods to improve model performance.

MNIST Handwritten Digit Classification

Neural Network (DL)

• Implemented a deep learning neural network to achieve accurate classification of handwritten digits in the MNIST dataset.
• Experimented with various architectures to optimize accuracy and reduce overfitting.

Dog vs Cat Classification

Transfer Learning (DL)

• Achieved 98.7% accuracy in classifying images of dogs and cats.
• Leveraged transfer learning to efficiently train a deep neural network on a limited dataset.

Spam Mail Prediction

Machine Learning

• Developed a highly accurate model with 96.6% precision to classify spam emails.
• Employed text preprocessing techniques and experimented with different algorithms to achieve exceptional results.

Parkinson's Disease Detection

Machine Learning

• Created a machine learning model with an 88.4% accuracy rate for detecting Parkinson's disease based on patient data.
• Utilized feature scaling and hyperparameter tuning to increase model accuracy.

Movie Recommendation System

Machine Learning

• Implemented a recommendation system that suggests movies to users based on their preferences.
• Utilized collaborative filtering techniques and evaluated the system's effectiveness using various metrics.
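One common form of collaborative filtering is user-based filtering with cosine similarity, sketched below. This is an illustrative example under stated assumptions rather than the project's implementation: the function names, the toy ratings, and the choice of cosine similarity over co-rated movies are all made up for the sketch.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users' ratings over co-rated movies."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sqrt(sum(a[m] ** 2 for m in common))
    norm_b = sqrt(sum(b[m] ** 2 for m in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank unseen movies by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for movie, rating in other_ratings.items():
            if movie in ratings[user]:
                continue  # only recommend movies the user has not rated
            scores[movie] = scores.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    predicted = {m: scores[m] / weights[m] for m in scores}
    # Sort by predicted rating (descending), then title for stable ties.
    return sorted(predicted, key=lambda m: (-predicted[m], m))[:top_n]


# Hypothetical toy ratings on a 1-5 scale.
ratings = {
    "alice": {"Inception": 5, "Titanic": 1},
    "bob": {"Inception": 5, "Titanic": 1, "Interstellar": 5, "The Notebook": 1},
    "carol": {"Inception": 1, "Titanic": 5, "Interstellar": 2, "The Notebook": 5},
}
suggestions = recommend(ratings, "alice")
```

Here the user most similar to "alice" contributes most to each predicted rating, so movies that user rated highly rank first.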

Power BI

Revenue Insights Hospitality Domain

Power BI

• Designed and developed a comprehensive report dashboard for revenue analysis in the hospitality domain.
• Utilized Power BI to create interactive visualizations, enabling stakeholders to identify trends and make informed decisions. Created calculated columns and measures for RevPAR (Revenue Per Available Room), DSRN (Daily Sellable Room Nights), ADR (Average Daily Rate), DBRN (Daily Booked Room Nights), and DURN (Daily Utilized Room Nights) to support the analysis.
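The arithmetic behind these measures can be sketched in Python. The dashboard itself would express them as Power BI measures; the function name `hospitality_kpis`, its parameter names, and the sample figures below are assumptions for illustration, and the formulas follow the commonly used definitions of each metric.

```python
def hospitality_kpis(revenue, sellable_room_nights, booked_room_nights,
                     utilized_room_nights, days):
    """Compute common hospitality KPIs from period totals."""
    if days <= 0 or sellable_room_nights <= 0 or booked_room_nights <= 0:
        raise ValueError("days and room-night totals must be positive")
    return {
        # Revenue earned per room night that was available to sell.
        "RevPAR": revenue / sellable_room_nights,
        # Average revenue per room night actually booked.
        "ADR": revenue / booked_room_nights,
        # Daily averages of sellable, booked, and utilized room nights.
        "DSRN": sellable_room_nights / days,
        "DBRN": booked_room_nights / days,
        "DURN": utilized_room_nights / days,
    }


# Hypothetical totals for a 10-day period.
kpis = hospitality_kpis(
    revenue=100000.0,
    sellable_room_nights=1000,
    booked_room_nights=800,
    utilized_room_nights=700,
    days=10,
)
```

With these sample totals, RevPAR comes out lower than ADR because not every sellable room night was booked.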

SS Retail Data Report Dashboard

Power BI

• Project Scope: Developed a comprehensive Power BI Retail Data Report Dashboard. Integrated data from various sources, ensuring accuracy and consistency.
• Key Achievements: Enabled data-driven decision-making through insightful sales analysis. Improved inventory management and customer engagement strategies. Defined and monitored key performance indicators for business success.
• Outcome: Empowered stakeholders with actionable insights. Enhanced sales strategies and overall business efficiency. Contributed to increased profitability through informed decision-making.

Attendance Data Analysis Dashboard

Power BI

• Developed a Power BI dashboard to analyze attendance patterns of employees.
• The dashboard provided insights into employee engagement and helped in optimizing attendance management processes.

Global Superstore Data Analysis

Power BI

• Created a dynamic Power BI dashboard to analyze sales data from a global superstore and compare quarterly sales from 2011 to 2014, yielding insights that improved inventory management and profitability.

IMDB Movies Analysis Dashboard

Power BI

• Created a Power BI dashboard to analyze IMDB movie data, exploring factors influencing movie ratings and votes.
• Implemented visualizations to showcase correlations between budget, genre, and performance.

Customer Data Analysis Dashboard

Power BI

• Designed a Power BI dashboard to analyze customer preferences and buying behavior for an e-commerce platform.
• Extracted insights into the total number of SKUs available to customers, the total number of categories, and the highest sales by age group.

Layoffs 2020-2022 Analysis Dashboard

Power BI

• Developed a Power BI dashboard to analyze layoff trends across various industries from 2020 to 2022.
• Provided visualizations depicting affected sectors, geographic distribution, and potential factors contributing to layoffs.

Covid-19 Data Analysis Dashboard

Power BI

• Created an informative Power BI dashboard to track Covid-19 statistics globally and regionally.
• Utilized data visualization to display infection rates, new cases, serious and critical cases, and deaths.

Excel

US President Data Cleaning

Excel

• Cleaned and preprocessed historical US President data using advanced Excel functions.
• Conducted data quality checks, corrected inconsistencies, and standardized formats.

Vrinda Store Data Analysis Dashboard

Excel

• Created a comprehensive Excel dashboard for analyzing sales data of Vrinda Store.
• Developed interactive visualizations and used pivot tables to extract insights.

Bike Sales Data Analysis Dashboard

Excel

• Designed an Excel dashboard to analyze bike sales trends and customer preferences.
• Utilized charts and graphs to showcase key performance indicators.

Objective

Highly motivated B.Tech graduate with a strong foundation in data analysis and engineering. Seeking a challenging position that combines the roles of Data Analyst and Data Engineer, where I can leverage my skills in Python, SQL, AWS, and data visualization to extract valuable insights and contribute to data-driven decision-making processes.