---
license: apache-2.0
tags:
- Computer
- computervision
---

# Uses

This LLM is trained on data generated by my code for the YOLOv8 model ([GitHub code](https://github.com/bauerhartmut/yolov8-Computervision)).
The model can briefly describe what the YOLOv8 model has detected and can also execute a command (`/click`).
When the command is triggered, a dictionary is generated containing the key data of the object to be clicked.

# Testing
You can test the model by giving it the following information:

```json
{
    "Object": [
        {
            "index": "window_0",
            "label": "window",
            "property": "toplayer",
            "coords": [
                189.06007385253906,
                79.33326721191406,
                1156.018798828125,
                750.1478271484375
            ],
            "textes": 24,
            "interactions": [
                {
                    "label": "close_window",
                    "interaction_type": 1,
                    "coords": [
                        1114.04541015625,
                        84.65348815917969,
                        1149.1778564453125,
                        113.41248321533203
                    ]
                },
                {
                    "label": "maximize",
                    "interaction_type": 1,
                    "coords": [
                        1067.0111083984375,
                        84.82215118408203,
                        1099.86328125,
                        112.69491577148438
                    ]
                },
                {
                    "label": "minize_window",
                    "interaction_type": 1,
                    "coords": [
                        1024.7701416015625,
                        85.06327819824219,
                        1053.4327392578125,
                        111.52396392822266
                    ]
                }
            ]
        }
    ]
}
```
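
To act on a detection like this, a caller needs a point inside the target's bounding box. The following is a minimal sketch of one way to get it, not the model's own logic: it assumes `coords` holds `[x_min, y_min, x_max, y_max]` in pixels (the card does not state the layout), and `box_center` and `find_interaction` are hypothetical helper names.

```python
import json

# Trimmed copy of the example data; only the fields used here are kept.
DETECTIONS = json.loads("""
{
    "Object": [
        {
            "index": "window_0",
            "label": "window",
            "interactions": [
                {
                    "label": "close_window",
                    "interaction_type": 1,
                    "coords": [1114.04541015625, 84.65348815917969,
                               1149.1778564453125, 113.41248321533203]
                }
            ]
        }
    ]
}
""")


def box_center(coords):
    """Return the (x, y) center of a box.

    ASSUMPTION: coords is [x_min, y_min, x_max, y_max] in pixels.
    """
    x_min, y_min, x_max, y_max = coords
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def find_interaction(detections, index, label):
    """Return the interaction with `label` on the object `index`, or None."""
    for obj in detections["Object"]:
        if obj["index"] != index:
            continue
        for interaction in obj.get("interactions", []):
            if interaction["label"] == label:
                return interaction
    return None


target = find_interaction(DETECTIONS, "window_0", "close_window")
center = box_center(target["coords"])
print(center)
```

If the coordinates turn out to use a different layout (for example center plus width and height), only `box_center` needs to change.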

Pass this information to the model together with a prompt such as "Was siehst du" ("What do you see") or "Kannst du das Fenster schließen" ("Can you close the window").
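
One way to combine the detection data and the question into a single input string is sketched below. The exact input layout the model expects is not documented here, so the format (compact JSON, a newline, then the question) and the `build_prompt` helper are assumptions for illustration only.

```python
import json


def build_prompt(detections, question):
    """Join detection data and a German question into one input string.

    ASSUMPTION: the model accepts the JSON followed by the question on a
    new line. Adjust this to match the format used during training.
    """
    # ensure_ascii=False keeps umlauts and "ß" readable instead of escaping them.
    data = json.dumps(detections, ensure_ascii=False)
    return f"{data}\n{question}"


# Shortened stand-in for the full detection dictionary.
detections = {
    "Object": [
        {"index": "window_0", "label": "window", "property": "toplayer"}
    ]
}

prompt = build_prompt(detections, "Kannst du das Fenster schließen")
print(prompt)
```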

The model is currently trained only on German.