gogo2/vision/notes.md
2024-09-10 02:37:01 +03:00

Visual options:
-- Object detection (OD):
- object detection with fine-tuning: YOLOv5: https://learnopencv.com/custom-object-detection-training-using-yolov5/
-- Vision-aware:
- visual LLM: LLaVA: https://llava.hliu.cc/
-- Both detection and comprehension:
- Phi-3 Vision:
https://huggingface.co/microsoft/Phi-3-vision-128k-instruct
https://github.com/microsoft/Phi-3CookBook
- LLaVA chat (LLaVA-Interactive demo):
https://github.com/LLaVA-VL/LLaVA-Interactive-Demo?tab=readme-ov-file
git clone https://github.com/LLaVA-VL/LLaVA-Interactive-Demo.git
conda create -n llava_int -c conda-forge -c pytorch python=3.10.8 pytorch=2.0.1 -y
conda activate llava_int
cd LLaVA-Interactive-Demo
pip install -r requirements.txt
source setup.sh
- decision making based on the environment (RL): https://github.com/OpenGenerativeAI/llm-colosseum
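
Sanity check for the LLaVA-Interactive setup steps above: the `conda create` command pins `python=3.10.8` and `pytorch=2.0.1`, so after `conda activate llava_int` it may help to confirm the active environment matches. This is only a rough sketch. The helper names (`installed_versions`, `mismatches`) are made up for these notes, and it assumes the conda `pytorch` package registers itself under the distribution name `torch`.

```python
import platform
from importlib import metadata

# Versions pinned by the conda create command in these notes.
EXPECTED = {"python": "3.10.8", "torch": "2.0.1"}


def installed_versions():
    """Return the versions found in the active environment (None if missing)."""
    found = {"python": platform.python_version()}
    try:
        # Assumption: the PyTorch install exposes metadata under the name "torch".
        found["torch"] = metadata.version("torch")
    except metadata.PackageNotFoundError:
        found["torch"] = None
    return found


def mismatches(expected, found):
    """List (name, expected, found) for every pinned package that differs."""
    out = []
    for name, want in expected.items():
        have = found.get(name)
        # Ignore local build suffixes such as "2.0.1+cu117".
        if have is None or have.split("+")[0] != want:
            out.append((name, want, have))
    return out


if __name__ == "__main__":
    problems = mismatches(EXPECTED, installed_versions())
    for name, want, have in problems:
        print(f"{name}: expected {want}, found {have}")
    if not problems:
        print("environment matches the pinned versions")
```

Run it inside the activated `llava_int` environment, before `pip install -r requirements.txt`, so a wrong interpreter is caught early.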