Data Science portfolio
M.Sc., Computer Science | Reichman University (2022 - 2025) |
MBA, Business Administration | Tel Aviv University (2024 - Present) |
B.Sc., Computer Science | College of Management (2019 - 2022) |
R&D Group Lead @ Israeli Military Intelligence - Unit 8200 (2024 - Present)
Software Engineer Team Lead @ Israeli Military Intelligence - Unit 8200 (2022 - 2024)
We developed a Sentence-BERT (SBERT) model tailored for Hebrew, applying representation learning and contrastive training techniques to generate high-quality sentence embeddings. Using the Hebrew Natural Language Inference (NLI) dataset for training and the Hebrew Semantic Textual Similarity (STS) benchmark for evaluation, we fine-tuned four transformer architectures (AlephBERT, mBERT, DictaBERT, and RoBERTa) within the SentenceTransformer framework. The best-performing configuration, DictaBERT with Contrastive Loss, achieved a Pearson correlation of 0.648 and a Spearman correlation of 0.661 on the STS benchmark, outperforming all other models in capturing subtle semantic relationships in Hebrew text. Beyond the research contribution, this project showcases the data science workflow for low-resource NLP, including data preprocessing, model evaluation, and optimization under computational constraints.
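The training objective and evaluation metrics named above can be illustrated with a small standard-library sketch. It is not the project's training code: the toy 2-D embeddings and gold scores are made up for illustration, and the loss assumes the common contrastive form over cosine distance with a margin of 0.5 (label 1 for similar pairs, 0 for dissimilar pairs).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss(distance, label, margin=0.5):
    """Assumed form: pull similar pairs together, push dissimilar pairs past the margin."""
    return 0.5 * (label * distance ** 2 + (1 - label) * max(0.0, margin - distance) ** 2)

def pearson(x, y):
    """Pearson correlation: linear agreement between predicted and gold scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical sentence-pair embeddings and human similarity scores (0 to 5).
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [1.0, 1.0]), ([1.0, 0.0], [0.0, 1.0])]
gold = [5.0, 3.0, 0.0]
predicted = [cosine(u, v) for u, v in pairs]

print("Pearson:", pearson(predicted, gold))
print("Spearman:", spearman(predicted, gold))
```

Pearson rewards predictions that track the gold scores linearly, while Spearman only checks that the pairs are ranked in the same order, which is why STS results usually report both.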
EyeVision is an AI-powered system that uses computer vision and video analytics to automatically detect signs of violence in daycares from regular CCTV footage. Leveraging deep learning models for both visual and motion analysis, our solution achieved over 90% accuracy in real-time detection and sends alerts through a dedicated app for parents and authorities. Beyond its technical innovation, EyeVision presents a scalable business solution with strong social impact potential, and it was awarded First Place in the College of Management’s Outstanding Final Project Competition.
ChatGPT The Tweets is a large-scale data science and machine learning pipeline designed to analyze public sentiment and trends about ChatGPT on Twitter. Using PySpark for scalable big data processing and logistic regression for sentiment classification, we analyzed over 190,000 tweets to uncover correlations between sentiment, geography, and user occupation. Additionally, we applied N-Gram and TF-IDF models to extract trending keywords and popular discussion topics. Beyond its analytical depth, this project demonstrates how AI-driven social listening can inform business and product strategy through real-time public opinion insights.
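The N-Gram and TF-IDF keyword extraction step can be sketched in plain Python. This is a simplified stand-in, not the PySpark pipeline itself: the three example tweets are invented, tokenization is a naive whitespace split, and the smoothed IDF formula log((m + 1) / (df + 1)) is assumed here because it is the one Spark MLlib documents for its IDF estimator.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf(docs, n=1):
    """Per-document TF-IDF scores over n-grams, using raw term counts as TF."""
    term_lists = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for terms in term_lists for term in set(terms))
    m = len(docs)
    scores = []
    for terms in term_lists:
        tf = Counter(terms)
        scores.append({t: c * math.log((m + 1) / (df[t] + 1)) for t, c in tf.items()})
    return scores

# Hypothetical tweets, for illustration only.
tweets = ["chatgpt writes code", "chatgpt writes poems", "chatgpt is down"]

unigram_scores = tfidf(tweets, n=1)
bigram_scores = tfidf(tweets, n=2)

# Highest-scoring unigram in the first tweet.
print(max(unigram_scores[0], key=unigram_scores[0].get))
```

A term that appears in every tweet, such as "chatgpt" in this toy corpus, receives a score of zero, so the ranking surfaces the words that distinguish one tweet from the rest.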
In this project, we applied Reinforcement Learning (RL) techniques to train an AI agent to play the game Flappy Bird using SARSA and Q-Learning algorithms. We built a custom state-space preprocessing pipeline that reduced over 10^21 possible states to only 10,000, enabling efficient tabular learning. The project incorporated reward shaping, epsilon decay exploration, and extensive hyperparameter experimentation to evaluate performance and stability. Results showed that SARSA with Epsilon Decay achieved the best balance between exploration and exploitation, outperforming other configurations with an average reward of 27.55 and a maximum score of 343.
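The difference between the two algorithms comes down to the update target, which a short tabular sketch makes concrete. The function names, the two-action encoding, and the hyperparameter values (alpha, gamma, and the decay schedule) are illustrative assumptions, not the settings used in the project.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # assumed encoding: 0 = do nothing, 1 = flap

def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def sarsa_update(q, s, a, r, s2, a2, alpha=0.1, gamma=0.99):
    """On-policy: bootstrap from the action a2 actually chosen in s2."""
    q[(s, a)] += alpha * (r + gamma * q[(s2, a2)] - q[(s, a)])

def q_learning_update(q, s, a, r, s2, alpha=0.1, gamma=0.99):
    """Off-policy: bootstrap from the best available action in s2."""
    best_next = max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

def decayed_epsilon(episode, start=1.0, end=0.01, decay=0.995):
    """Exponential epsilon decay with a floor, so some exploration remains."""
    return max(end, start * decay ** episode)

# One hand-built transition: in s2, action 1 is valued higher than action 0.
rng = random.Random(0)
q_sarsa = defaultdict(float)
q_qlearn = defaultdict(float)
q_sarsa[("s2", 1)] = 1.0
q_qlearn[("s2", 1)] = 1.0

# The agent took action 0 in "s", received reward 1.0, and then chose action 0 in "s2".
sarsa_update(q_sarsa, "s", 0, 1.0, "s2", 0)
q_learning_update(q_qlearn, "s", 0, 1.0, "s2")

print(q_sarsa[("s", 0)], q_qlearn[("s", 0)])
print(epsilon_greedy(q_qlearn, "s2", decayed_epsilon(10_000), rng))
```

Because SARSA uses the action the exploring policy really takes next, its value estimates account for exploratory mistakes, whereas Q-Learning always assumes the greedy follow-up.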