From 736ef27852ad7d4d748ff80c708aeb39099212ad Mon Sep 17 00:00:00 2001
From: Dobromir Popov
Date: Tue, 10 Sep 2024 02:37:01 +0300
Subject: [PATCH] MISC

---
 vision/notes.md | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 vision/notes.md

diff --git a/vision/notes.md b/vision/notes.md
new file mode 100644
index 0000000..4d311c5
--- /dev/null
+++ b/vision/notes.md
@@ -0,0 +1,25 @@
+Visual options:
+ -- OD:
+ - object detection w/ fine-tuning: YOLOv5: https://learnopencv.com/custom-object-detection-training-using-yolov5/
+
+ -- V-aware:
+ - visual LLM: LLaVA: https://llava.hliu.cc/
+
+ -- BOTH detection and comprehension:
+ - Phi-3:
+ https://huggingface.co/microsoft/Phi-3-vision-128k-instruct
+ https://github.com/microsoft/Phi-3CookBook
+
+- LLaVA chat (interactive demo setup):
+https://github.com/LLaVA-VL/LLaVA-Interactive-Demo?tab=readme-ov-file
+git clone https://github.com/LLaVA-VL/LLaVA-Interactive-Demo.git
+conda create -n llava_int -c conda-forge -c pytorch python=3.10.8 pytorch=2.0.1 -y
+conda activate llava_int
+cd LLaVA-Interactive-Demo
+pip install -r requirements.txt
+source setup.sh
+
+
+
+
+- decision making based on ENV, RL: https://github.com/OpenGenerativeAI/llm-colosseum
\ No newline at end of file