Hao Peng
Education
Master of Science in Computer Science Darmstadt, Germany
Oct. 2024 – Present
Technical University of Darmstadt, Faculty of Computer Science, Darmstadt
- Specialization: Visual Computing, Machine Learning, Multimodal AI and NLP
- Project: Development of a multimodal AI model for generating audio data based on image regions
Bachelor of Engineering in Computer Science and Technology Nanning, China
Sep. 2020 – Jun. 2024
Guangxi University, Faculty of Computer Science and Technology, Nanning Grade: 82
- Specialization: Science and Technology
- Thesis: Weed detection in rice fields and comparison of results, mainly using the YOLO model
- Related Coursework: Computer Vision and Image Processing, Artificial Intelligence, Operating Systems, Database Systems, Data Structures and Algorithms, Software Engineering, Object-Oriented Programming (Java/C++), Fundamentals of C Programming
Experience
1. Research Assistant Jul. 2023 – Oct. 2023
- Internship at the Institute of Automation, Chinese Academy of Sciences
- Digital image processing (measurement of object size, lane detection)
- Fundamentals of deep learning (MNIST dataset, neural networks)
- Weed detection in paddies and cornfields
Technical Skills: Python, PyTorch, OpenCV and Labelme
2. Summer School Jul. 2021 – Aug. 2021
- Summer school at TU Berlin
- Learned Python and basic data analysis modules, and practiced applying them.
- Applied OpenCV to perform fundamental image reading, processing, and feature extraction tasks, including grayscale conversion and edge detection techniques.
- Conducted web data extraction using Python with requests and BeautifulSoup, followed by data cleaning, transformation, statistical analysis, and basic visualization using NumPy and Pandas.
- Grade: 1.3
Technical Skills: Python, NumPy, Pandas and OpenCV
3. Web Development and Maintenance Jun. 2021 – Jun. 2023
- Django web development and maintenance at Konggu Campus Media
- Used the Django framework to develop a website for Konggu Campus Media.
- Tested and launched the website.
- Updated and published special topics.
- Maintained the website.
Technical Skills: Python, Django, HTML, CSS, JavaScript, Git and Markdown
Projects
1. Project: Multimodal Artificial Intelligence Oct. 2024 – Mar. 2025
- Project theme: Image-Region2Audio generation.
- Designed a region-aware image-to-audio generation system by integrating SAM2 segmentation and CLIP features into the IM2WAV framework, achieving improved semantic alignment and perceptual audio quality.
Technical Skills: Multimodal learning, Foundation model usage, Model customization & fine-tuning, Computer Vision, Audio Generation, PyTorch, Hugging Face and OpenCV
2. Project: Database and User Interface Development Aug. 2021 – Jul. 2022
- Developed a lightweight management system for a small supermarket.
- Followed the complete software development process, including requirements gathering and documentation, designing system architecture, implementing core functionalities, testing for reliability, and delivering a user-friendly interface based on the user specification document.
- Developed the frontend using the Vue.js framework and implemented the backend with a MySQL database for data storage and management.
Technical Skills: Java, Vue.js, JavaScript, HTML, CSS, MySQL, Git and Markdown
3. Project: Multi-View 3D Imaging System on Mobile Platform Dec. 2020 – Jun. 2023
- A multi-view 3D intelligent imaging and measurement system for mobile devices, developed for the College Students' Innovation and Entrepreneurship Competition.
- My primary responsibility in the project was developing the Android interface, focusing on user interaction, UI design, and integration with the 3D imaging modules.
Technical Skills: Java, Algorithms and Android Development
Language Skills
- English: Professional working proficiency
- Chinese: Native
- German: Elementary
Technical Skills
- Programming Languages: Python, C++, C, Java, Go
- AI: PyTorch, YOLO, Multimodal AI, NLP
- Web & Mini Program Development: Django, SSM (Spring + Spring MVC + MyBatis), WeChat Mini Program Development
- Database Systems: MySQL, SQL Server
- Tools & Platforms: Git, GitHub, Markdown, LaTeX
- Collaboration & Soft Skills: Team collaboration, strong communication, creative problem solving, innovation-driven mindset, open to feedback, proactive attitude