quarterturn committed
Commit ee26be4 · 1 Parent(s): 837bad6

first commit

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +5 -0
  3. example.png +0 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
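This added line is the entry `git lfs track` generates; running ``` git lfs track "*.png" ``` from the repository root appends the same rule, so example.png below is stored as an LFS pointer rather than a raw git blob.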
README.md CHANGED
@@ -24,3 +24,8 @@ Install:
 ``` python3 caption.py ```
 1. make sure your images are in the "images" directory
 2. captions will be placed in the "images" directory
+
+Note:
+- The scripts are configured to load the model at bf16 precision, which gives maximum accuracy at lower memory utilization. This should fit on a single 24GB GPU.
+- You can edit the scripts to use a lower-precision quantization of the model, such as fp8, though accuracy may suffer.
+- If torch sees that your first GPU supports flash attention while the others do not, it assumes all the cards do and throws an exception. A workaround is to use CUDA_VISIBLE_DEVICES to restrict which GPUs torch can see, for example "CUDA_VISIBLE_DEVICES=0 python3 main.py" (or caption.py): either exclude the flash-attention-capable card so the remaining cards are used without it, or exclude the GPUs that lack flash attention support.
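
A minimal sketch of the bf16 load the Note describes, assuming a Hugging Face transformers-style loader; the model ID and class below are placeholders, not taken from caption.py:

```python
# Hypothetical sketch of loading a captioning model in bf16, as the Note
# above describes. The model ID and transformers API here are assumptions
# for illustration; see caption.py for the repository's actual code.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "org/placeholder-caption-model",  # hypothetical ID, not the repo's real model
    torch_dtype=torch.bfloat16,       # bf16: fp32's dynamic range at half the memory
    device_map="auto",                # shard across the GPUs torch can see
)
```

Because device_map="auto" only places layers on GPUs that torch can see, the CUDA_VISIBLE_DEVICES workaround in the last bullet composes directly with this kind of loader.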
example.png CHANGED

Git LFS Details

  • SHA256: b2661b5a5c847e7e5c0b1fc94cbdf78b39098540fc6d66f2adc1c7d366ab8923
  • Pointer size: 131 Bytes
  • Size of remote file: 269 kB
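
For context, what git itself stores for example.png is just the small text pointer in the standard Git LFS format; the 131-byte pointer above has this shape, where the size line holds the exact byte count of the real file:

```
version https://git-lfs.github.com/spec/v1
oid sha256:b2661b5a5c847e7e5c0b1fc94cbdf78b39098540fc6d66f2adc1c7d366ab8923
size <exact byte count of the ~269 kB file>
```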