Davide Bonura

MSc Computer Engineering  ·  University of Palermo

About

I work on machine learning systems for speech and language, from self-supervised audio models and NLP to multimodal architectures. My MSc thesis built a multimodal screening system for Alzheimer's disease that combines handwriting and speech analysis. I am currently extending that work toward a peer-reviewed publication.

Research Interests

Speech Processing · Deep Learning · Natural Language Processing · Computer Vision

Experience

Speech & Audio Processing Lab — University of Palermo

Mar. 2026 – Present
Research Collaborator · Prof. Sabato Marco Siniscalchi
Extending the multimodal Alzheimer's screening dataset with additional participants and preparing a peer-reviewed publication.

Speech & Audio Processing Lab — University of Palermo

Sep. 2025 – Mar. 2026
MSc Thesis · Prof. Sabato Marco Siniscalchi, Dr. M. La Quatra
Multimodal Alzheimer's screening integrating raw handwriting sequences (CNN-Transformer + kinematic RF) with ASR-generated speech transcriptions (Qwen3-ASR + fine-tuned UmBERTo); decision-level late fusion over two binary tasks on a novel 140-participant Italian-speaking dataset.

Projects

CNN-Transformer on raw handwriting + Qwen3-ASR + fine-tuned UmBERTo on Italian speech. Late fusion over two binary diagnostic tasks (HC vs. Mild-AD and HC vs. MCI).
macro-F1 0.88 · sensitivity 0.91 (HC vs. Mild-AD)
PyTorch · HuggingFace · Qwen3-ASR · UmBERTo

Re-implemented all 5 models from Pan et al. (INTERSPEECH 2021), replacing Kaldi ASR with Wav2Vec2 and WhisperX. KenLM-guided hypothesis sweeping reconstructs lattice uncertainty without raw lattice access.
85.92% accuracy · original paper 84.51%
Wav2Vec2 · WhisperX · BERT · ADReSSo-2021

Systematic layer-level sweep across four SSL models (Wav2Vec2, WavLM, HuBERT, Whisper-small) to identify the most informative representations for depression detection on DAIC-WOZ and E-DAIC-WOZ.
PyTorch · Wav2Vec2 · WavLM · HuBERT · Whisper

ASR-Kit

2025–Present
Modular Python ASR library with a unified Transcriber API and pluggable driver registry. Word-level timestamps via forced alignment, multi-file batch transcription, pip-installable with optional per-backend dependencies.
Python · PyTorch · Qwen3-ASR

Fine-tuned BERTweet-base on TweetEval for irony detection and 4-class emotion classification. LLM-based data augmentation with DeepSeek-V3 for minority classes.
Irony F1 0.767 · Emotion F1 0.815
BERTweet · HuggingFace · DeepSeek-V3 · TweetEval

DETR-based detection with Hungarian assignment. Cost matrix combines IoU and cosine similarity on RoI-aligned CNN features.
HOTA 25.8 · MOTA 24.4 on MOT17 test set
PyTorch · DETR · OpenCV · MOT17

Custom 8-bit SPI counter peripheral across three abstraction levels: RTL in SystemVerilog with testbench, FPGA synthesis (Yosys/nextpnr on Tang Nano 9K), and a bit-banging bare-metal C driver on Raspberry Pi 3B+. Extended with an ARM32 single-cycle processor simulation.
SystemVerilog · C · ARM Assembly · Tang Nano 9K

Interactive equirectangular 360° image and video navigator: rectilinear projection from spherical coordinates, keyboard-driven FOV/latitude/longitude control, zoom, and screenshot capture. Tkinter GUI for file selection and initial parameters.
Python · OpenCV · Tkinter

Pedestrian detection using HOG feature extraction and an SVM classifier trained on the WiderPerson dataset. Includes a feature extraction pipeline, a comparison of preprocessing options, and sliding-window inference.
Python · OpenCV · scikit-learn · HOG · WiderPerson

Full-stack e-commerce app: Kotlin/Material Design 3 Android frontend (Retrofit2 + OkHttp) consuming a Django REST API backed by MySQL. Features product browsing, cart, wishlist, checkout with saved addresses and cards, order history, and reviews.
Kotlin · Android · Django · MySQL · Docker

Skills

Programming: Python, Java, C, SQL, JavaScript, ARM Assembly, SystemVerilog, Kotlin, C#, MATLAB
Frameworks: PyTorch, HuggingFace Transformers, scikit-learn, NumPy, Pandas, OpenCV, librosa, Spring Boot, Android SDK
Tools: Git, Linux, Jupyter, Docker, Unity, Neo4j, CometML
Languages: Italian (native), English (B2)