Amirshayan Nasirimajd
I'm a research fellow at Politecnico di Milano in Milan, Italy, where I work on the ARISE and ENFIELD EU projects.
I received my master's degree in Data Science and Engineering from the Polytechnic University of Turin, where my thesis was "Sequential Domain Generalisation for Egocentric Action Recognition". During my master's degree I won the EPIC@CVPR challenge at CVPR 2023. At Politecnico di Milano I work on applications of computer vision and AI in human-robot interaction for manufacturing.
Email /
CV /
Bio /
Scholar /
Twitter /
Github
Research
I'm interested in computer vision, video understanding, multi-modal learning, and robotics. Most of my research focuses on using vision-language models to improve domain generalization and adaptation for egocentric action recognition:
Sequential Domain Generalisation for Egocentric Action Recognition
Amirshayan Nasirimajd,
Master's Degree Thesis, 2024
Webthesis.POLITO
In this thesis, we present Sequential Domain Generalisation (SeqDG), a reconstruction-based architecture that improves the generalization of action recognition models. It combines a language model with a dual encoder-decoder that refines the feature representation.
EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge: Mixed Sequences Prediction
Amirshayan Nasirimajd,
Simone Alberto Peirone,
Chiara Plizzari,
Barbara Caputo
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop, 2023
CVPR Oral Presentation
/
arXiv
Winner of the EPIC-KITCHENS-100 Unsupervised Domain Adaptation (UDA) Challenge in Action Recognition. Our approach builds on the observation that the order in which actions are performed is similar across the source and target domains. Based on this, we generate modified sequences by randomly combining actions from the source and target domains. Since only unlabelled target data are available in the UDA setting, we use a standard pseudo-labelling strategy to obtain action labels for the target. The network is then asked to predict the resulting action sequence, which lets it integrate information from both domains during training and achieve better transfer to the target. Additionally, to better incorporate sequence information, we use a language model to filter out unlikely sequences.
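The mixed-sequence construction described above can be sketched roughly as follows. This is an illustrative toy, not the challenge implementation: the helper names, the 50/50 mixing ratio, and the `classify` pseudo-labelling callback are all assumptions for the sake of the example.

```python
import random

def mix_sequences(source_seq, target_seq, classify, mix_prob=0.5, seed=0):
    """Build a mixed action sequence by randomly interleaving labelled
    source actions with pseudo-labelled target actions.

    source_seq: list of (clip, label) pairs with ground-truth labels.
    target_seq: list of unlabelled clips from the target domain.
    classify:   model callback used to pseudo-label target clips
                (a stand-in for any standard pseudo-labelling strategy).
    """
    rng = random.Random(seed)
    mixed, labels = [], []
    for (src_clip, src_label), tgt_clip in zip(source_seq, target_seq):
        if rng.random() < mix_prob:
            # Keep the source action with its ground-truth label.
            mixed.append(src_clip)
            labels.append(src_label)
        else:
            # Swap in the target action, labelled by the model's own prediction.
            mixed.append(tgt_clip)
            labels.append(classify(tgt_clip))
    # The network is then trained to predict `labels` for `mixed`,
    # integrating supervision from both domains in one sequence.
    return mixed, labels
```

In training, the resulting mixed sequence would replace (or augment) the original source sequence, so each batch carries information from both domains.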