Dr. Yagya Raj Pandeya

Lab Lead

As an Assistant Professor and the Program Coordinator for Graduate and Undergraduate studies in Artificial Intelligence at Kathmandu University, I bring over a decade of experience in IT education, research, and the development of smart systems. My focus is on deep learning and machine learning, which I apply to real-world problems in both research and development. I specialize in multimedia information processing, affective computing, and smart systems built on deep neural networks, and I lead research initiatives at Kathmandu University’s Artificial Intelligence and Smart System Research (AISSR) Lab. Through collaborations with international and national research groups, I strive to push the boundaries of AI technology development.

Journal/Conference Paper Publications

Feb. 2024
Pandeya YR and Jee J
GlocalEmoNet: An optimized neural network for music emotion classification and segmentation using timbre and chroma features
• Novel method for music emotion classification and structural analysis.
• State-of-the-art results and an optimized neural network.

Aug. 2023
Pandeya YR, Karki S, Dongol I, and Rajbanshi N
Deep Learning based Tomato Disease Detection and Remedy Suggestions using Mobile Application
• Novel dataset of vegetable diseases and their prescribed remedies.
• Object detection system and mobile application for user interface.
• Designed for local farmers in Nepal who do not have a good understanding of English.
• System interface, service requests, and disease suggestions provided in the Nepali language.
• Easy to use, user-friendly, and real-time service system.

Sep. 2022
Pandeya YR, Bhattarai B, and Lee J
Tracking the Rhythm: Pansori Rhythm Segmentation and Classification Methods and Datasets
• Music structural analysis using a deep neural network.
• Novel dataset and network architecture.

Sep. 2022
Bhattarai B, Pandeya YR, Jie Y, Lamichhane AK, and Lee J
High-Resolution Representation Learning and Recurrent Neural Network for Singing Voice Separation
• Music source separation using a high-resolution (HR) network with an LSTM block.
• Multi-cultural music datasets for source separation.

Feb. 2022
Pandeya YR, Bhattarai B, Afzaal U, Kim JB, and Lee J
A monophonic cow sound annotation tool using a semi-automatic method on audio/video data
• Semi-automatic sound event annotation tool using audio and video as input.
• An automatic event detector is used to detect the audio event.
• Based on the automatic detector's result, a human annotator refines the annotation boundaries.
• Easy to use and Python-based, with improved audio visualization and output as a simple CSV file.

Dec. 2021
Bhattarai B, Pandeya YR, and Lee J
An Incremental Learning for Plant Disease classification
• Plant disease classification using incremental learning.
• Novel dataset and methods.

Oct. 2021
Pandeya YR, Bhattarai B, and Lee J
Music Video Emotion Classification Using Slow-fast Audio-video Network and Unsupervised Feature Representation
• Unsupervised and supervised music video emotion classification dataset.
• Autoencoder architecture with audio and video information.
• Slow-fast audio-video network to capture spatial and temporal information of music and video.
• Training-time information sharing and attention modules.

Oct. 2021
Pandeya YR, Jie Y, Bhattarai B, and Lee J
Multi-modal, Multi-task and Multi-label for Music Genre Classification and Emotion Regression
• Music classification using audio and lyrics.
• Classification and regression tasks.
• Single and multi-label data.

Sep. 2021
Pandeya YR, Bhattarai B, and Lee J
Music Emotion Classification with Deep Neural Nets
• Music emotion analysis using deep learning technology.
• Session chair and oral presentation.

July 2021
Afzaal U, Bhattarai B, Pandeya YR, and Lee J
An Instance Segmentation Model for Strawberry Diseases Based on Mask R-CNN
• Strawberry disease detection and segmentation using Mask R-CNN.
• Strawberry disease dataset and data augmentation.
• Ablation study based on network structure.

June 2021
Pandeya YR, Bhattarai B, and Lee J
Deep-Learning-Based Multimodal Emotion Classification for Music Videos
• Music video emotion classification dataset (Improved and Extended version).
• Ablation study on unimodal and multimodal models using music, video, and facial expressions.
• Network complexity reduction using novel channel and filter separable convolution.
• Training-time information sharing and boosting modules.
• End-to-end training, with improved results in visual and statistical analyses.

April 2021
Bhattarai B, Pandeya YR, and Lee J
Deep Learning-based Face Mask Detection Using Automated GUI for COVID-19
• Novel dataset with three classes: mask, no mask, and incorrectly worn mask.
• A user interface for annotating additional data.
• Best paper award.

Dec. 2020
Pandeya YR, Bhattarai B, and Lee J
Sound Event Detection in Cowshed using Synthetic Data and Convolutional Neural Network
• CNN-based sound event detection.
• Sound event annotation tool.
• Sound localization and classification.

Nov. 2020
Bhattarai B, Pandeya YR, and Lee J
Parallel Stacked Hourglass Network for Music Source Separation
• Prepared Korean traditional song (Pansori) dataset with 3 sources.
• Korean traditional music Pansori dataset, MIR-1K dataset, and DSD100 dataset used in the experiment.
• Proposed a novel parallel stacked hourglass network (PSHN) with multiple band spectrograms.
• Ablation study on the proposed and prior architectures.

Sep. 2020
Pandeya YR, Bhattarai B, and Lee J
Visual Object Detector for Cow Sound Event Detection
• Cow sound event detection dataset with 4 class categories.
• A CNN was used for sound event detection on the cow sound dataset and the UrbanSound8K dataset.
• Visual object detection architectures (F-RCNN, CF-RCNN, FPN, C-FPC) used for audio event detection on log Mel-spectrograms.
• Comparison of the proposed CNN and the visual object detection architectures on three test datasets.

Sep. 2020
Pandeya YR, and Lee J
Deep learning-based late fusion of multimodal information for emotion classification of music video
• Music video emotion classification using an audio and video multimodal network architecture.
• Uses a pre-trained CNN for audio and 3D models (I3D and C3D) for video.
• The features learned by the networks are late-fused, and the impact of feature fusion is compared.
• Cross-validation and network feature fusion.

March 2019
Pandeya YR, and Lee J
Music-Video Emotion Analysis Using Late Fusion of Multimodal
• An oral presentation on a multimodal approach to music video emotion classification using deep learning.

Oct. 2018
Pandeya YR, Kim D, and Lee J
Domestic Cat Sound Classification Using Learned Features from Deep Neural Nets
• CNN and CDBN network architectures.
• Prepared a cat sound dataset with 10 classes.
• Frequency division average pooling (FDAP) technique instead of global average pooling (GAP) to make a robust prediction using various frequency band features.
• Audio augmentation and learned feature visualization.

June 2018
Pandeya YR, and Lee J
Domestic Cat Sound Classification Using Transfer Learning
• Cat sound dataset with 10 classes.
• Uses a pre-trained CNN for feature extraction, followed by classification of the extracted features.
• Comparison of machine learning and deep learning classifiers.
• Ensemble and data augmentation.

Articles Under Review

A Survey on Graph Representation and Graph Neural Networks for EEG Signal Analysis
Applied Sciences (MDPI)
• A review paper on graph-based EEG signal processing.
• Survey of methods, datasets, and code availability.

A Novel Approach: Graph Embedding and Independent Features for a Family of Weather Reconstruction
• Weather forecasting.

ImputeCast: Advancing Weather Data Imputation and Historical Prediction
• Weather data imputation.

Deep Dive into Music Videos: Hierarchical Emotion Recognition with Rich Audio and Visual Features
• Novel dataset for hierarchical, multicultural emotion classification in music videos.
• Novel neural networks.

Late Blight Disease Prediction with Mobile Application using Spatio-temporal Graph Neural Network
• Late-blight prediction using weather data.


03/03/2024 – 07/03/2024
Validation and Dissemination of Weather-based Forecasting Models to Manage Tomato Late Blight
Invited speaker at a provincial-level knowledge-sharing workshop organized by the International Centre for Integrated Mountain Development (ICIMOD).

Nov. 5, 2022
Keynote speaker at the Kathmandu Engineering College Conference (OKRP)
Presentation on Artificial Intelligence: A Game Changer for Agriculture of Nepal.

Dec. 15, 2022
Invited speaker at NAST Engineering College
A presentation on AI Applications and Trends.

Aug. 27, 2022
Invited speaker at D.A.V. College (BCA Program)
A presentation on Deep Learning for All.

June 24, 2022
One Day Workshop on AI for Social Impact: Exploring the Roles of Academia, Industry and the Government
Organizer and session moderator.

March 13-14, 2021
Oral presentation at the SONSIK 8th Educational Seminar 2021
Presentation on Deep Learning Based Face Mask State Detection for COVID-19.

Oct. 24-27, 2018
Poster presentation at ISITC 2018 (Conference)
A poster presentation on Convolutional Neural Network for Music Onset Detection.