Lab Lead
Nationality: Nepalese Date of birth: 25/07/1988 Gender: Male
Work: Kathmandu University Dhulikhel, 6250 Kavrepalanchok (Nepal)
Home: Suklaphata Municipality Jhalari-10, Kanchanpur (Nepal)
As an Assistant Professor and the Program Coordinator for Graduate and Undergraduate studies in Artificial Intelligence at Kathmandu University, I bring over a decade of expertise in IT education, research, and the development of smart systems. My focus lies in deep learning and machine learning technologies, which I apply to solve real-world problems in both research and development. Specializing in multimedia information processing, affective computing, and the creation of smart systems using deep neural networks, I lead research initiatives at Kathmandu University’s Artificial Intelligence and Smart System Research (AISSR) Lab. Through collaborations with international and national research groups, I strive to push the boundaries of AI technology development.
Kathmandu University [ 25/03/2022 – Current ]
City: Kavre | Country: Nepal
I foster engaging and effective learning experiences for Master’s and undergraduate students through teaching, research, and project work centered on artificial intelligence (AI).
Guru Technology [ 01/03/2018 – Current ]
City: Kathmandu | Country: Nepal
As founder and research leader of Guru Technology Pvt. Ltd., I spearhead the development of machine learning solutions to address real-world challenges in Nepal’s agricultural sector.
Fusemachines Nepal [ 01/08/2022 – 02/01/2023 ]
City: Kathmandu | Country: Nepal
I instructed over 100 students from diverse universities in Nepal as part of the Fusemachines Nepal AI Fellowship Program. This involved conducting teaching sessions, collaborating on projects, and assigning tasks that helped students gain a thorough grasp of emerging AI technologies.
Jeonbuk National University (Artificial Intelligence Laboratory) [ 01/02/2021 – 28/02/2022 ]
City: Jeonju | Country: South Korea
My research focused on leveraging deep learning for multimedia information retrieval and processing tasks. I explored both supervised and unsupervised methods to address a spectrum of real-world problems, ranging from human cultural analysis to animal vocalization and smart agriculture applications.
Government of Nepal (Ministry of Home Affairs) [ 27/04/2015 – 22/08/2016 ]
City: Kathmandu | Country: Nepal
My experience spans various aspects of information security and data management. I have a strong understanding of network security, database management, and firewall security.
NAST Engineering College [ 12/02/2013 – 27/04/2015 ]
City: Dhangadhi | Country: Nepal
I possess extensive experience in computer science education, having taught over 12 subjects to undergraduate students for the past 3 years.
Chonbuk National University [ 01/09/2017 – 02/2021 ]
Address: Jeonju 566, 54896 Jeonju (South Korea)
Website: http://jbnu.ac.kr/
Field(s) of study: AI and Deep Learning
Final grade: 4.43/4.5
Thesis: Deep Learning-Based Multimodal Methods for Emotion Classification in Music Video Contents
During my Ph.D. in computer engineering, I explored current research trends in AI technologies, focusing on multimedia affective computing, vegetable disease identification, and animal behavior analysis using acoustic information and music information retrieval. I developed over 12 new datasets to train deep neural networks and several optimized neural networks for supervised and unsupervised training. My key contributions to the AI research community include music video affective computing, cow sound event detection, domestic cat behavior analysis, plant disease identification, music rhythm segmentation, music source separation, and music transcription.
Pokhara University [ 01/08/2010 – 30/05/2013 ]
Address: Lakhnath, Kaski, 427 Pokhara (Nepal)
Website: https://pu.edu.np/
Field(s) of study: Computer Engineering
Final grade: 3.97/4 (CGPA)
Thesis: Power Optimization Methods for Wireless Communication
Pokhara University [ 01/08/2005 – 30/10/2010 ]
Address: Lakhnath, Kaski, 427 Pokhara (Nepal)
Website: https://pu.edu.np/
Final grade: 3.54/4 (CGPA)
Thesis: Home Security System
Guru Technology [ 01/08/2022 – 25/10/2022 ]
Address: New Baneshower, Kathmandu
Website: https://gurutech.com.np/
iSoft Consulting Solution [ 01/07/2013 – 25/07/2013 ]
Address: Maitidevi, Kathmandu
Website: www.isoftconsult.com
Excellent Research Award 2021
The award was given to the best graduating student based on their research.
Link: https://www.jbnu.ac.kr/
Korean Government Scholarship Award (Graduate study)
Korean Government Scholarship Award 2016 (graduate study) for Korean Language and Graduate Study (Duration 4 years).
Link: https://www.studyinkorea.go.kr/en/sub/gks/allnew_invite.do
Dean’s List Ranker
Dean’s Award for Recognition of an Outstanding Meritorious Achievement by Pokhara University.
Link: https://pu.edu.np/
Science & Technology Scholarship (Master in Computer Engineering) (2010-2013)
Pokhara University Science & Technology Scholarship (one seat in the whole program) for the Master in Computer Engineering.
[ 01/10/2023 – 28/02/2024 ]
International Centre for Integrated Mountain Development (ICIMOD)
Funded Late Blight Prediction System
The project was funded by ICIMOD (https://www.icimod.org/) and Green Resilient Agricultural Productive Ecosystems (GRAPE).
Title: Validation and Dissemination of Weather-based Forecasting Models to Manage Tomato Late Blight
Users: Local farmers of the Sudurpashchim and Karnali provinces of Nepal
PI: Dr. Yagya Raj Pandeya
[ 09/2022 – 01/2023 ]
Kathmandu University
Internally Funded Research Project for Community Development
Title: Deep Learning Based Vegetable Plant Disease Identification
Users: Local farmers of Nepal who use the system to identify diseases in their vegetable crops
PI: Dr. Yagya Raj Pandeya
[ 01/04/2021 – 28/02/2022 ]
Development of AI for Analysis and Synthesis of Korean Pansori
Title: Rhythm analysis of Korean Traditional Music (Pansori)
Details: Music structural analysis and transcription
Supervised dataset for segmentation and classification of rhythm
PI: Prof. Jhoonwhoan Lee
[ 01/09/2021 – 31/12/2021 ]
AI Data of Intelligent Smart Farm (mushroom)
Details: Smart mushroom farming using IoT
Automation of temperature, humidity, and carbon control
Dataset and system development
PI: Prof. Jhoonwhoan Lee
[ 01/03/2021 – 31/03/2021 ]
Development of Artificial Intelligence and Robot Technologies for Unmanned Farming
Details: Robot navigation control for grass weed detection
Dataset preparation
PI: Prof. Jhoonwhoan Lee
[ 01/03/2021 – 31/03/2021 ]
Enhancing the UI for Web-based Strawberry Disease Diagnosis and Installment of the System for Expert’s Applications
Details: Strawberry disease classification and segmentation
Dataset and annotation
PI: Prof. Jhoonwhoan Lee
Deep Learning for Multimedia
• Deep learning technology and algorithms.
• Multimedia data processing: Image, video, audio, music, text, EEG signal and Bio-sequence (RNA, DNA).
• CNN, RNN, Transformer, Reinforcement learning, Flutter for mobile app.
• A complete pipeline for AI system design, development and deployment.
An Essential Guide to Artificial Intelligence
• A book about AI fundamentals: knowledge representation, searching, inferencing, etc.
• Covers theory and practical guidance on search algorithms, neural networks, and expert systems.
An Essential Guide To Computer Networks
• Computer network and data communication algorithms.
• Covers the computer network and data communication university syllabus of Nepal.
• Covers theory and practical work.
• Second edition in press.
Link: https://heritagebooks.com.np/product/eureka-advanced-hardware-networking/
First Author: 12, Total: 17
Feb. 2024
Pandeya YR and Lee J
GlocalEmoNet: An optimized neural network for music emotion classification and segmentation using timbre and chroma features
• Novel method for music emotion classification and structural analysis.
• State-of-the-art results and optimized neural network.
Link: https://link.springer.com/article/10.1007/s11042-024-18246-4
Aug. 2023
Pandeya YR, Karki S, Dongol I, and Rajbanshi N
Deep Learning based Tomato Disease Detection and Remedy Suggestions using Mobile Application
• Novel dataset for vegetable diseases and their remedy prescriptions.
• Object detection system and mobile application for user interface.
• The system is designed for local farmers of Nepal who do not have a good understanding of English.
• The system interface, service requests, and disease suggestions are provided in the Nepali language.
• Easy to use, user-friendly, and real-time service system.
Link: https://arxiv.org/abs/2310.05929
Sep. 2022
Pandeya YR, Bhattarai B, and Lee J
Tracking the Rhythm: Pansori Rhythm Segmentation and Classification Methods and Datasets
• Music structural analysis using deep neural network.
• Novel dataset, network architecture.
Link: https://doi.org/10.3390/app12199571
Sep. 2022
Bhattarai B, Pandeya YR, Jie Y, Lamichhane AK, and Lee J
High-Resolution Representation Learning and Recurrent Neural Network for Singing Voice Separation
• Music source separation using HR network plus LSTM block.
• Multi-cultural music datasets for source separation.
Link: https://doi.org/10.1007/s00034-022-02166-5
Feb. 2022
Pandeya YR, Bhattarai B, Afzaal U, Kim JB, and Lee J
A monophonic cow sound annotation tool using a semi-automatic method on audio/video data
• Semi-automatic sound event annotation tool using audio and video as input.
• An automatic event detector is used to detect the audio event.
• Based on the automatic detector result, a human annotator refines the annotation boundaries.
• Easy to use, better audio visualization, Python-based, and output as a simple CSV file.
Link: https://doi.org/10.1016/j.livsci.2021.104811
Dec. 2021
Bhattarai B, Pandeya YR, and Lee J
An Incremental Learning for Plant Disease classification
• Plant disease classification using incremental learning.
• Novel dataset and methods.
Link: https://doi.org/10.1109/ICTC52510.2021.9621090
Oct. 2021
Pandeya YR, Bhattarai B, and Lee J
Music Video Emotion Classification Using Slow-fast Audio-video Network and Unsupervised Feature Representation
• Unsupervised and supervised music video emotion classification dataset.
• Autoencoder architecture with audio and video information.
• Slow-fast audio-video network to capture spatial and temporal information of music and video.
• Train-time information sharing and attention modules.
Link: https://doi.org/10.1038/s41598-021-98856-2
Oct. 2021
Pandeya YR, Jie Y, Bhattarai B, and Lee J
Multi-modal, Multi-task and Multi-label for Music Genre Classification and Emotion Regression
• Music classification using audio and lyrics.
• Classification and regression task.
• Single and multi-label data.
Link: https://doi.org/10.1109/ICTC52510.2021.9620826
Sep. 2021
Pandeya YR, Bhattarai B, and Lee J
Music Emotion Classification with Deep Neural Nets
• Music emotion analysis using deep learning technology.
• Session chair and oral presentation.
Link: https://doi.org/10.1145/3468891.3468911
July 2021
Afzaal U, Bhattarai B, Pandeya YR, and Lee J
An Instance Segmentation Model for Strawberry Diseases Based on Mask R-CNN
• Strawberry disease detection and segmentation using Mask R-CNN.
• Strawberry disease dataset and data augmentation.
• Ablation study based on network structure.
Link: https://doi.org/10.3390/s21196565
June 2021
Pandeya YR, Bhattarai B, and Lee J
Deep-Learning-Based Multimodal Emotion Classification for Music Videos
• Music video emotion classification dataset (Improved and Extended version).
• Ablation study on unimodal and multimodal using music, video, and facial expression.
• Network complexity reduction using novel channel and filter separable convolution.
• Train-time information sharing and boosting modules.
• End-to-end training, better result on visual and statistical analysis.
Link: https://doi.org/10.3390/s21144927
April 2021
Bhattarai B, Pandeya YR, and Lee J
Deep Learning-based Face Mask Detection Using Automated GUI for COVID-19
• Novel dataset with three classes: face mask, no mask, and incorrectly worn mask.
• A user interface for annotating additional data.
• Best paper award.
Link: https://doi.org/10.1145/3468891.3468899
Dec. 2020
Pandeya YR, Bhattarai B, and Lee J
Sound Event Detection in Cowshed using Synthetic Data and Convolutional Neural Network
• CNN-based sound event detection.
• Sound event annotation tool.
• Sound localization and classification.
Link: https://ieeexplore.ieee.org/abstract/document/9289545
Nov. 2020
Bhattarai B, Pandeya YR, and Lee J
Parallel Stacked Hourglass Network for Music Source Separation
• Prepared Korean traditional song (Pansori) dataset with 3 sources.
• Korean traditional music Pansori dataset, MIR-1K dataset, and DSD100 dataset used in the experiment.
• Proposed a novel parallel stacked hourglass network (PSHN) with multiple band spectrograms.
• Ablation study on proposed and past architecture.
Link: https://ieeexplore.ieee.org/document/9257356
Sep. 2020
Pandeya YR, Bhattarai B, and Lee J
Visual Object Detector for Cow Sound Event Detection
• Cow sound event detection dataset with 4 class categories.
• CNN was used for sound event detection using the Cow sound dataset and the UrbanSound8K dataset.
• Visual object detection architecture (F-RCNN, CF-RCNN, FPN, C-FPC) used for audio event detection (in Log Mel-Spectrogram).
• Comparison of the proposed CNN and the visual object detection architectures on three test datasets.
Link: https://ieeexplore.ieee.org/document/9187249
Sep. 2020
Pandeya YR, and Lee J
Deep learning-based late fusion of multimodal information for emotion classification of music video
• Music-Video emotion classification using audio and video multimodal network architecture.
• Uses pre-trained CNNs for audio and pre-trained 3D models (I3D and C3D) for video.
• The features learned by the networks are late-fused, and the impact of network feature fusion is compared.
• Cross-validation and network feature fusion.
Link: https://doi.org/10.1007/s11042-020-08836-3
March 2019
Pandeya YR, and Lee J
Music-Video Emotion Analysis Using Late Fusion of Multimodal
• An oral presentation on a multimodal approach to music video emotion classification using deep learning technology.
Link: https://doi.org/10.12783/dtcse/iteee2019/28738
Oct. 2018
Pandeya YR, Kim D, and Lee J
Domestic Cat Sound Classification Using Learned Features from Deep Neural Nets
• CNN and CDBN network architecture.
• Preparation of a cat sound dataset with 10 class categories.
• Frequency division average pooling (FDAP) technique instead of global average pooling (GAP) to make a robust prediction using various frequency band features.
• Audio augmentation and learned feature visualization.
Link: https://doi.org/10.3390/app8101949
June 2018
Pandeya YR, and Lee J
Domestic Cat Sound Classification Using Transfer Learning
• Cat sound dataset with 10 class categories.
• Uses a pre-trained CNN for feature extraction, followed by classification of the extracted features.
• Machine learning classifier and deep learning classifier comparison.
• Ensemble and data augmentation.
Link: https://doi.org/10.5391/IJFIS.2018.18.2.154
A Survey on Graph Representation and Graph Neural Networks for EEG Signal Analysis
Applied Sciences (MDPI)
• A review paper on graph-based EEG signal processing.
• Survey on methods, dataset, and code availability.
A Novel Approach: Graph Embedding and Independent Features for a Family of Weather Reconstruction
• Weather forecasting.
ImputeCast: Advancing Weather Data Imputation and Historical Prediction
• Weather data imputation.
Deep Dive into Music Videos: Hierarchical Emotion Recognition with Rich Audio and Visual Features
• Novel dataset for hierarchical multi-culture emotion classification in music video.
• Novel neural networks.
Late Blight Disease Prediction with Mobile Application using Spatio-temporal Graph Neural Network
• Late-blight prediction using weather data.
03/03/2024 – 07/03/2024
Validation and Dissemination of Weather-based Forecasting Models to Manage Tomato Late Blight
Invited speaker at a provincial-level knowledge-sharing workshop organized by the International Centre for Integrated Mountain Development (ICIMOD).
Link: https://www.icimod.org/
Nov. 5, 2022
Keynote Speaker at the Kathmandu Engineering College Conference (OKRP)
Presentation on Artificial Intelligence: A Game Changer for Agriculture of Nepal.
Link: https://www.kecktm.edu.np/category/research_publication/introduction_okrp
Dec. 15, 2022
Invited speaker at NAST Engineering College
A presentation on AI Applications and Trends.
Link: https://nast.edu.np/
Aug. 27, 2022
Invited speaker at D.A.V. College (BCA Program)
A presentation on Deep Learning for All.
Link: https://davcollege.edu.np/
June 24, 2022
One Day Workshop on AI for Social Impact: Exploring the Roles of Academia, Industry and the Government
Organizer and session moderator.
March 13-14, 2021
Oral presentation at the SONSIK 8th Educational Seminar 2021
Presentation on Deep Learning Based Face Mask State Detection for COVID-19.
Link: https://www.sonsik.org.np/events/sonsik-8th-educational-seminar/
Oct. 24-27, 2018
Poster presentation at ISITC 2018 (Conference)
A poster presentation on Convolutional Neural Network for Music Onset Detection.
Link: http://kism.or.kr/ISITC2018/