Space: Saving-Willy / saving-willy-dev

vancauwe committed 22 days ago

Commit 8209004 (unverified) · 2 parents: 821ac40, 836bd51

Merge pull request #38 from sdsc-ordes/fix/spoof-metadata


Files changed (8)

.github/workflows/python-pytest.yml +1 -0
.github/workflows/python-visualtests.yml +2 -0
docs/dev_notes.md +24 -7
src/input/input_handling.py +33 -12
tests/test_demo_input_sidebar.py +9 -2
tests/test_demo_multifile_upload.py +6 -2
tests/test_main.py +5 -2
tests/visual_selenium/test_visual_main.py +3 -0

.github/workflows/python-pytest.yml CHANGED

@@ -34,3 +34,4 @@ jobs:
     - name: Run quick tests with pytest
       run: |
         pytest -m "not slow and not visual" --strict-markers --ignore=tests/visual_selenium


.github/workflows/python-visualtests.yml CHANGED

@@ -51,3 +51,5 @@ jobs:
       # otherwise, not one step it consistently fails at.)
       run: |
             pytest -m "visual" --strict-markers tests/visual_selenium/ -s --demo

+      # DEBUG_AUTOPOPULATE_METADATA=True streamlit run src/main.py

docs/dev_notes.md CHANGED

@@ -13,7 +13,7 @@ Then use a web browser to view the site indiciated, by default: http://localhost
 # How to build and view docs locally
-We have a CI action to presesnt the docs on github.io.
 To validate locally, you need the deps listed in `requirements.txt` installed.
 Run
@@ -51,14 +51,15 @@ The CI runs with `--strict-markers` so any new marker must be registered in
 - the basic CI action runs the fast tests only, skipping all tests marked
   `visual` and `slow`
-- the CI action on PR runs the `slow` tests, but stil excluding `visual`.
-- TODO: a new action for the visual tests is to be developed.
 Check all tests are marked ok, and that they are filtered correctly by the
 groupings used in CI:
 ```bash
 pytest --collect-only -m "not slow and not visual" --strict-markers --ignore=tests/visual_selenium
 pytest --collect-only -m "not visual" --strict-markers --ignore=tests/visual_selenium
 ```
@@ -97,7 +98,8 @@ pytest --cov-report=lcov --cov=src
 We use seleniumbase to test the visual appearance of the app, including the
 presence of elements that appear through the workflow.  This testing takes quite
-a long time to execute and is not yet configured with CI.
 ```bash
 # install packages for app and for visual testing
@@ -106,14 +108,15 @@ pip install -r tests/visual_selenium/requirements_visual.txt
 ```
 **Running tests**
-The execution of these tests requires that the site/app is running already.
-In one tab:
 ```bash
 streamlit run src/main.py
 ```
-In another tab:
 ```bash
 # run just the visual tests
 pytest -m "visual" --strict-markers
@@ -132,3 +135,17 @@ pytest -m "not slow and not visual" --strict-markers --ignore=tests/visual_selen
 Initially we have an action setup that runs all tests in the `tests` directory, within the `test/tests` branch.
 TODO: Add some test report & coverage badges to the README.

 # How to build and view docs locally
+We have a CI action to present the docs on github.io.
 To validate locally, you need the deps listed in `requirements.txt` installed.
 Run
 - the basic CI action runs the fast tests only, skipping all tests marked
   `visual` and `slow`
+- the CI action on PR runs the `slow` tests, but still excluding `visual`.
+- a second action for the visual tests runs on PR.
 Check all tests are marked ok, and that they are filtered correctly by the
 groupings used in CI:
 ```bash
 pytest --collect-only -m "not slow and not visual" --strict-markers --ignore=tests/visual_selenium
 pytest --collect-only -m "not visual" --strict-markers --ignore=tests/visual_selenium
+pytest --collect-only -m "visual" --strict-markers tests/visual_selenium/ -s --demo
 ```
 We use seleniumbase to test the visual appearance of the app, including the
 presence of elements that appear through the workflow.  This testing takes quite
+a long time to execute. It is configured in a separate CI action
+(`python-visualtests.yml`).
 ```bash
 # install packages for app and for visual testing
 ```
 **Running tests**
+The execution of these tests requires that the site/app is running already, which
+is handled by a fixture (that starts the app in another thread).
+Alternatively, in one tab, run:
 ```bash
 streamlit run src/main.py
 ```
+In another tab, run:
 ```bash
 # run just the visual tests
 pytest -m "visual" --strict-markers
 Initially we have an action setup that runs all tests in the `tests` directory, within the `test/tests` branch.
 TODO: Add some test report & coverage badges to the README.
+## Environment flags used in development
+- `DEBUG_AUTOPOPULATE_METADATA=True` : Set this env variable to have the text
+  inputs autopopulated, to make stepping through the workflow faster during
+  development work.
+Typical usage:
+```bash
+DEBUG_AUTOPOPULATE_METADATA=True streamlit run src/main.py
+```

src/input/input_handling.py CHANGED

@@ -2,6 +2,7 @@ from typing import List, Tuple
 import datetime
 import logging
 import hashlib
 import streamlit as st
 from streamlit.delta_generator import DeltaGenerator
@@ -23,15 +24,31 @@ both the UI elements (setup_input_UI) and the validation functions.
 '''
 allowed_image_types = ['jpg', 'jpeg', 'png', 'webp']
 # an arbitrary set of defaults so testing is less painful...
 # ideally we add in some randomization to the defaults
-spoof_metadata = {
-    "latitude": 0.5,
-    "longitude": 44,
-    "author_email": "super@whale.org",
-    "date": None,
-    "time": None,
-}
 def check_inputs_are_set(empty_ok:bool=False, debug:bool=False) -> bool:
     """
@@ -50,12 +67,15 @@ def check_inputs_are_set(empty_ok:bool=False, debug:bool=False) -> bool:
         return empty_ok
     exp_input_key_stubs = ["input_latitude", "input_longitude", "input_date", "input_time"]
-    #exp_input_key_stubs = ["input_latitude", "input_longitude", "input_author_email", "input_date", "input_time",
     vals = []
     # the author_email is global/one-off - no hash extension.
     if "input_author_email" in st.session_state:
         val = st.session_state["input_author_email"]
         vals.append(val)
         if debug:
             msg = f"{'input_author_email':15}, {(val is not None):8}, {val}"
@@ -190,10 +210,11 @@ def metadata_inputs_one_file(file:UploadedFile, image_hash:str, dbg_ix:int=0) ->
     msg = f"[D] {filename}: lat, lon from image metadata: {latitude0}, {longitude0}"
     m_logger.debug(msg)
-    if latitude0 is None: # get some default values if not found in exifdata
-        latitude0:float = spoof_metadata.get('latitude', 0) + dbg_ix
-    if longitude0 is None:
-        longitude0:float = spoof_metadata.get('longitude', 0) - dbg_ix
     image = st.session_state.images.get(image_hash, None)
     # add the UI elements

 import datetime
 import logging
 import hashlib
+import os
 import streamlit as st
 from streamlit.delta_generator import DeltaGenerator
 '''
 allowed_image_types = ['jpg', 'jpeg', 'png', 'webp']
+def _is_str_true(v:str) -> bool:
+    ''' convert a string to boolean: if contains True or 1 (or yes), return True '''
+    # https://stackoverflow.com/questions/715417/converting-from-a-string-to-boolean-in-python
+    return v.lower() in ("yes", "true", "t", "1")
+def load_debug_autopopulate() -> bool:
+    return _is_str_true( os.getenv("DEBUG_AUTOPOPULATE_METADATA", "False"))
 # an arbitrary set of defaults so testing is less painful...
 # ideally we add in some randomization to the defaults
+dbg_populate_metadata = load_debug_autopopulate()
+# the other main option would be argparse, where we can run `streamlit run src/main.py -- --debug` or similar
+# - I think env vars are simple and clean enough, it isn't really a CLI that we want to offer debug options, it is for dev.
+if dbg_populate_metadata:
+    spoof_metadata = {
+        "latitude": 0.5,
+        "longitude": 44,
+        "author_email": "super@whale.org",
+        "date": None,
+        "time": None,
+    }
+else:
+    spoof_metadata = {}
 def check_inputs_are_set(empty_ok:bool=False, debug:bool=False) -> bool:
     """
         return empty_ok
     exp_input_key_stubs = ["input_latitude", "input_longitude", "input_date", "input_time"]
     vals = []
     # the author_email is global/one-off - no hash extension.
     if "input_author_email" in st.session_state:
         val = st.session_state["input_author_email"]
+        # if val is a string and empty, set to None
+        if isinstance(val, str) and not val:
+            val = None
         vals.append(val)
         if debug:
             msg = f"{'input_author_email':15}, {(val is not None):8}, {val}"
     msg = f"[D] {filename}: lat, lon from image metadata: {latitude0}, {longitude0}"
     m_logger.debug(msg)
+    if spoof_metadata:
+        if latitude0 is None: # get some default values if not found in exifdata
+            latitude0:float = spoof_metadata.get('latitude', 0) + dbg_ix
+        if longitude0 is None:
+            longitude0:float = spoof_metadata.get('longitude', 0) - dbg_ix
     image = st.session_state.images.get(image_hash, None)
     # add the UI elements
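
The env-flag gating in this file can be tried out in isolation with a minimal sketch like the one below. It is not the project's module: `is_str_true`, `load_flag`, `build_defaults` and `fill_missing` are hypothetical stand-ins that mirror the logic shown in the diff, and the `environ` parameter is an assumption added so the behaviour can be checked without touching the real process environment. Note that the truthiness check is an exact, case-insensitive match against a small set of tokens, not a substring test.

```python
import os

# Tokens treated as "true", matching the tuple used in the diff.
TRUTHY = ("yes", "true", "t", "1")


def is_str_true(value: str) -> bool:
    """Return True only if the whole string is one of the truthy tokens."""
    return value.lower() in TRUTHY


def load_flag(name: str = "DEBUG_AUTOPOPULATE_METADATA", environ=None) -> bool:
    """Read the flag from a mapping (defaults to os.environ); unset means off."""
    environ = os.environ if environ is None else environ
    return is_str_true(environ.get(name, "False"))


def build_defaults(enabled: bool) -> dict:
    """Spoofed defaults exist only when the flag is on; otherwise an empty dict."""
    if not enabled:
        return {}
    return {"latitude": 0.5, "longitude": 44}


def fill_missing(lat, lon, defaults: dict, ix: int = 0):
    """Fill missing coordinates from defaults, offset by ix; no-op if defaults is empty."""
    if defaults:
        if lat is None:
            lat = defaults.get("latitude", 0) + ix
        if lon is None:
            lon = defaults.get("longitude", 0) - ix
    return lat, lon
```

With the flag off, `fill_missing` leaves `None` values untouched, so a user running the app normally sees empty inputs rather than spoofed coordinates.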

tests/test_demo_input_sidebar.py CHANGED

@@ -3,6 +3,7 @@ from pathlib import Path
 from io import BytesIO
 from PIL import Image
 import numpy as np
 import pytest
 from unittest.mock import MagicMock, patch
@@ -12,7 +13,7 @@ import time
 from input.input_handling import spoof_metadata
 from input.input_observation import InputObservation
-from input.input_handling import buffer_uploaded_files
 from streamlit.runtime.uploaded_file_manager import UploadedFile
@@ -184,7 +185,13 @@ def test_no_input_no_interaction():
     at = AppTest.from_file(SCRIPT_UNDER_TEST, default_timeout=10).run()
     verify_initial_session_state(at)
-    assert at.session_state.input_author_email == spoof_metadata.get("author_email")
     # print (f"[I] whole tree: {at._tree}")
     # for elem in at.sidebar.markdown:

 from io import BytesIO
 from PIL import Image
 import numpy as np
+import os
 import pytest
 from unittest.mock import MagicMock, patch
 from input.input_handling import spoof_metadata
 from input.input_observation import InputObservation
+from input.input_handling import buffer_uploaded_files, load_debug_autopopulate
 from streamlit.runtime.uploaded_file_manager import UploadedFile
     at = AppTest.from_file(SCRIPT_UNDER_TEST, default_timeout=10).run()
     verify_initial_session_state(at)
+    dbg = load_debug_autopopulate()
+    #var = at.session_state.input_author_email
+    #_cprint(f"[I] input email is '{var}' type: {type(var)} | is None? {var is None} | {dbg}", PURPLE)
+    if dbg: # autopopulated
+        assert at.session_state.input_author_email == spoof_metadata.get("author_email")
+    else: # should be empty, the user has to fill it in
+        assert at.session_state.input_author_email == ""
     # print (f"[I] whole tree: {at._tree}")
     # for elem in at.sidebar.markdown:

tests/test_demo_multifile_upload.py CHANGED

@@ -26,7 +26,7 @@ from streamlit.testing.v1 import AppTest
 # for expectations
-from input.input_handling import spoof_metadata
 from input.input_validator import get_image_datetime, get_image_latlon
@@ -137,7 +137,11 @@ def test_no_input_no_interaction():
     at = AppTest.from_file("src/apptest/demo_multifile_upload.py").run()
     assert at.session_state.observations == {}
-    assert at.session_state.input_author_email == spoof_metadata.get("author_email")
 def test_bad_email():
     with patch.dict(spoof_metadata, {"author_email": "notanemail"}):

 # for expectations
+from input.input_handling import spoof_metadata, load_debug_autopopulate
 from input.input_validator import get_image_datetime, get_image_latlon
     at = AppTest.from_file("src/apptest/demo_multifile_upload.py").run()
     assert at.session_state.observations == {}
+    dbg = load_debug_autopopulate()
+    if dbg: # autopopulated
+        assert at.session_state.input_author_email == spoof_metadata.get("author_email")
+    else: # should be empty, the user has to fill it in
+        assert at.session_state.input_author_email == ""
 def test_bad_email():
     with patch.dict(spoof_metadata, {"author_email": "notanemail"}):
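
The tests above branch on whatever the environment holds when they run. One way to exercise both branches in a single run is to patch the environment, sketched below under stated assumptions: `flag_enabled`, `expected_initial_email` and `check_both_branches` are hypothetical helpers, and the `dev@example.org` address is a placeholder, not a value from the repository.

```python
import os
from unittest.mock import patch

FLAG = "DEBUG_AUTOPOPULATE_METADATA"


def flag_enabled() -> bool:
    """Re-read the flag from the environment on every call."""
    return os.getenv(FLAG, "False").lower() in ("yes", "true", "t", "1")


def expected_initial_email(spoof: dict) -> str:
    """Expected initial value of the email input: spoofed if the flag is on, else empty."""
    if flag_enabled():
        return spoof.get("author_email")
    return ""


def check_both_branches():
    """Evaluate the expectation with the flag on and then off, restoring the env after."""
    spoof = {"author_email": "dev@example.org"}  # placeholder value for illustration
    with patch.dict(os.environ, {FLAG: "True"}):
        on = expected_initial_email(spoof)
    with patch.dict(os.environ, {FLAG: "False"}):
        off = expected_initial_email(spoof)
    return on, off
```

One caveat when applying this to the real module: in the diff, `spoof_metadata` is built once at import time, while `load_debug_autopopulate()` reads the environment at call time. Patching the environment after `input.input_handling` has been imported would therefore change the flag but not the dictionary.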

tests/test_main.py CHANGED

@@ -3,7 +3,7 @@ from unittest.mock import MagicMock, patch
 from streamlit.testing.v1 import AppTest
 import time
-from input.input_handling import spoof_metadata
 from input.input_observation import InputObservation
 from input.input_handling import buffer_uploaded_files
@@ -72,7 +72,10 @@ def test_click_validate_after_data_entry(mock_file_rv: MagicMock, mock_uploadedF
     assert infer_button.disabled == True
-    # 2. upload files, and trigger the callback
     # put the mocked file_upload into session state, as if it were the result of a file upload, with the key 'file_uploader_data'
     at.session_state["file_uploader_data"] = mock_files

 from streamlit.testing.v1 import AppTest
 import time
+from input.input_handling import spoof_metadata, load_debug_autopopulate
 from input.input_observation import InputObservation
 from input.input_handling import buffer_uploaded_files
     assert infer_button.disabled == True
+    # 2. upload files, enter email, and trigger the callback
+    if not load_debug_autopopulate():
+        # fill the text box with a dummy email
+        at.session_state.input_author_email = "[email protected]"
     # put the mocked file_upload into session state, as if it were the result of a file upload, with the key 'file_uploader_data'
     at.session_state["file_uploader_data"] = mock_files
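
The test now has to supply an email because, with autopopulation off, the text input holds an empty string, which `check_inputs_are_set` converts to `None`. A minimal sketch of that normalisation follows. How the real function aggregates its `vals` list is not shown in the diff, so `all_inputs_set` is an assumed, simplified stand-in, and both helper names are hypothetical.

```python
from typing import Any, Iterable, Optional


def normalise_text_input(val: Any) -> Optional[Any]:
    """Map an empty string to None; leave every other value unchanged."""
    if isinstance(val, str) and not val:
        return None
    return val


def all_inputs_set(values: Iterable[Any]) -> bool:
    """Assumed aggregation: every value must be non-None after normalisation."""
    return all(normalise_text_input(v) is not None for v in values)
```

Numeric zeros still count as set under this sketch, since only empty strings are mapped to `None`.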

tests/visual_selenium/test_visual_main.py CHANGED

@@ -208,6 +208,7 @@ class RecorderTest(BaseCase):
         # - setup steps:
         #    - open the app
         #    - upload two images
         #    - validate the data entry
         #    - click the infer button, wait for ML
         # - the real test steps:
@@ -228,6 +229,8 @@ class RecorderTest(BaseCase):
             'input[data-testid="stFileUploaderDropzoneInput"]',
             "\n".join([str(img_f1), str(img_f2)]),
         )
         # advance to the next step, by clicking the validate button (wait for it first)
         wait_for_element(self, By.XPATH, "//button//strong[contains(text(), 'Validate')]")

         # - setup steps:
         #    - open the app
         #    - upload two images
+        #    - enter author email
         #    - validate the data entry
         #    - click the infer button, wait for ML
         # - the real test steps:
             'input[data-testid="stFileUploaderDropzoneInput"]',
             "\n".join([str(img_f1), str(img_f2)]),
         )
+        # enter author email
+        self.type('input[aria-label="Author Email"]', "[email protected]\n")
         # advance to the next step, by clicking the validate button (wait for it first)
         wait_for_element(self, By.XPATH, "//button//strong[contains(text(), 'Validate')]")