Developed an automated Apache Airflow pipeline for transferring pathology images,
reducing data processing time by 20% and ensuring reliable, on-time data delivery for research studies.
Engineered and deployed computer vision models, resulting in a 10% improvement
in pathology image analysis and biomarker identification and contributing to a 15% increase in study sponsors.
Established data quality control procedures, validation protocols, and data auditing
frameworks to improve data precision and ensure conformity with industry benchmarks, resulting in dependable
research outcomes and reports.
Collaborated with lab scientists to automate and standardize slide and file naming,
reducing naming errors by 17%.
Designed, built, validated, and deployed machine learning
models and software tools for analyzing and optimizing Anello therapeutics.
Provided analytical insights into binding sites and tissue and cell
specificity from patient protein sequence data collected by the Discovery team,
supporting the virology team's work and the development of Anello-based programmable medicines.
Leveraged NLP-based models on gene and protein sequences to predict tropism, informing the design of viral
vectors that can safely and effectively deliver therapies to target cells and tissues.
Implemented models with positional encoding to identify tissue-specific motifs and predict
tropism along with the most important binding-site positions.
Developed a graph neural network in TensorFlow, trained with CUDA on high-performance
computing clusters, to identify essential features of Anello protein sequences from
whole-genome sequencing data and improve tropism.
Reported to the head of the Genomics team and worked closely with the drug discovery and platform teams
to analyze biological data from various sources and build machine learning-based tools for drug vector design.
Built Dockerized ETL pipelines, orchestrated as Airflow DAGs, to convert experimental data into FASTA format
and eliminate manual data loading to the server.
Engaged with cross-functional project teams and external collaborators to support data-driven biological
modeling using statistics and data science.
Created an interactive front end that gives the whole company access to the ML models and tools
built by the Data Science team.
Developed and executed the 'Caching using Deep Learning' project,
focusing on time-series prediction of user preferences by leveraging RNNs and LSTM-based models.
Preprocessed and filtered a custom 12 GB dataset using Pandas, ensuring the effective training of deep
learning models to optimize caching policies.
Designed and implemented an LSTM-based caching policy that predicts future user
requests and preferences with a 90% success rate.
Conducted extensive benchmarking and performance analysis, demonstrating a 130% improvement
in hit rates compared to traditional caching policies such as LIFO, LRU, and LFU.
Collaborated with team members to integrate the LSTM-based caching policy into the existing system,
contributing to significant performance enhancements and improved user experience.
Continuously monitored and refined the deep learning models, ensuring optimal performance and
accuracy in predicting user preferences for caching purposes.
Documented and presented the project findings, showcasing the effectiveness of
the deep learning approach in outperforming traditional caching policies.
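For context on the hit-rate comparison above, the following is a minimal sketch of how baseline policies such as LRU and LFU could be benchmarked on a request trace, and how a relative improvement figure is computed. The function names, the toy trace, and the cache capacity are illustrative assumptions, not the project's actual code or data.

```python
from collections import Counter, OrderedDict


def lru_hit_rate(requests, capacity):
    """Fraction of requests served from an LRU cache of the given capacity."""
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = True
    return hits / len(requests)


def lfu_hit_rate(requests, capacity):
    """Fraction of requests served from an LFU cache of the given capacity."""
    cache = set()
    freq = Counter()  # request counts persist even after a key is evicted
    hits = 0
    for key in requests:
        freq[key] += 1
        if key in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                # evict the cached key that has been requested least often
                cache.remove(min(cache, key=lambda k: freq[k]))
            cache.add(key)
    return hits / len(requests)


def relative_improvement(new_rate, baseline_rate):
    """Percentage improvement of new_rate over baseline_rate."""
    return (new_rate - baseline_rate) / baseline_rate * 100


# Hypothetical toy trace, for illustration only.
trace = ["a", "b", "a", "c", "b", "a"]
print(lru_hit_rate(trace, capacity=2), lfu_hit_rate(trace, capacity=2))
```

A learned policy would be scored on the same trace with the same capacity, and its hit rate passed to `relative_improvement` against each baseline.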
Conducted extensive crop yield prediction and disease identification
research using remote sensing techniques.
Developed and tested machine learning models to predict crop yield and
identify diseases using satellite images and vegetation indices.
Leveraged Python and machine learning libraries such as scikit-learn
and TensorFlow to develop and train predictive models.
Tested and refined the models using real-world data from farms that
implemented computer vision and remote sensing.
Achieved a 90% accuracy rate in predicting crop diseases and yields,
significantly improving the efficiency and profitability of the farms that implemented the developed models.
Integrated the models with on-farm machinery, enabling real-time
monitoring and control of irrigation, fertilization, and other critical processes.
Conducted a project entitled "Multimodal Biometric System," combining iris,
facial, speech, and fingerprint recognition using convolutional neural networks (CNNs).
Integrated the various biometric factors to achieve an 85% precision rate
on multiple datasets and create a reliable biometric system.
Optimized the biometric model by fine-tuning the hyperparameters, resulting in a 10% increase in accuracy.
Developed a data preprocessing pipeline that cleaned and transformed
the data before feeding it into the model, resulting in a 20% improvement in its performance.
Conducted exploratory data analysis to understand the data
distribution and identify any outliers or anomalies.
Deployed the model on a cloud-based platform for
real-time biometric authentication, leveraging AWS and GCP.
Conducted rigorous testing and validation to ensure
the biometric system's reliability and accuracy, achieving an F1 score of 0.87 and a precision of 0.91.
Applied the model to the department's attendance system,
saving 15 minutes per class each day on taking attendance, and deployed it at the professor's office to enhance security.
Implemented a Kullback-Leibler divergence-based support vector machine (SVM)
for speech spoofing detection, increasing accuracy from 70% to 85%.
Used Mel-frequency cepstral coefficient (MFCC) feature extraction
to retrieve essential information from approximately 2 GB of speech samples.
Mapped the extracted features into a higher-dimensional space using kernel functions,
improving the separation between genuine and spoofed speech samples.
Optimized hyperparameters of the SVM model through a grid search method, further improving model accuracy.
Conducted rigorous data preprocessing, including noise reduction
and feature scaling, to improve the quality of speech inputs for machine learning models.
Developed and tested various machine learning models for speech and voice
recognition, including hidden Markov models and neural networks.
Collaborated with a team of researchers and engineers
to develop innovative speech and voice recognition solutions, including speech-based virtual assistants and speech-enabled devices.
Remained current with the latest advancements in machine learning
for speech and voice recognition by attending conferences and reading research papers.
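To illustrate the Kullback-Leibler divergence-based kernel mentioned for the spoofing-detection SVM, here is a minimal sketch of one common form, the exponential of the negated symmetric KL divergence. It is a simplification under stated assumptions: each speech sample is summarized as a discrete histogram, and the names `normalize`, `kl_kernel`, and `scale` are hypothetical. The source does not state how the divergence was actually applied to the MFCC features.

```python
import math


def normalize(hist, eps=1e-9):
    """Turn non-negative counts into a probability distribution, smoothing zeros."""
    smoothed = [h + eps for h in hist]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Symmetrized KL divergence: D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def kl_kernel(hist_a, hist_b, scale=1.0):
    """Similarity in (0, 1]: equals 1 for identical distributions, decays as they diverge."""
    p, q = normalize(hist_a), normalize(hist_b)
    return math.exp(-scale * symmetric_kl(p, q))


# Hypothetical feature histograms for two speech samples (assumed data).
sample_a = [4, 1, 1]
sample_b = [1, 1, 4]
print(kl_kernel(sample_a, sample_b))
```

A Gram matrix built from pairwise `kl_kernel` values could then be supplied to an SVM that accepts precomputed kernels, with `scale` treated as one of the hyperparameters explored in a grid search.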