bsenst committed on
Commit
ca2f0f4
·
1 Parent(s): 30e2215

reorder content, add grid view

Files changed (24)
  1. src/{colab.qmd → 01_setup/erforderlich/colab.qmd} +0 -0
  2. src/{google-konto.qmd → 01_setup/erforderlich/google-konto.qmd} +0 -0
  3. src/{huggingface.qmd → 01_setup/erforderlich/huggingface.qmd} +0 -0
  4. src/{colab-github.qmd → 01_setup/optional/colab-github.qmd} +0 -0
  5. src/{quarto-lokal.qmd → 01_setup/optional/quarto-lokal.qmd} +0 -0
  6. src/{google-play-search.qmd → 02_basics/app_market/google-play-search.qmd} +0 -0
  7. src/{pdf-grouping.qmd → 02_basics/pdf/pdf-grouping.qmd} +0 -0
  8. src/{pdf-link-extractor.qmd → 02_basics/pdf/pdf-link-extractor.qmd} +0 -0
  9. src/{website-url-extractor.qmd → 02_basics/webspider/website-url-extractor.qmd} +0 -0
  10. src/{webspider.qmd → 02_basics/webspider/webspider.qmd} +0 -0
  11. src/03_low_code/app_market_scraping/app_market_scraping.qmd +1 -0
  12. src/{notebooks → 03_low_code/catalogue}/bookstoscrape.qmd +0 -0
  13. src/{notebooks → 03_low_code/catalogue}/quotes_scraper.ipynb +0 -0
  14. src/{notebooks → 03_low_code/video_transcripts}/get_videos_for_youtube_channels.ipynb +0 -0
  15. src/{social-media.qmd → 03_low_code/video_transcripts/social-media.qmd} +0 -0
  16. src/{notebooks → 03_low_code/video_transcripts}/youtube-transcript-extraction.ipynb +0 -0
  17. src/{notebooks → 04_use_case/forum}/buergergeld_forum.ipynb +0 -0
  18. src/{notebooks → 04_use_case/jobs}/Jobboerse_API.ipynb +0 -0
  19. src/{notebooks → 04_use_case/laws}/Gesetze_im_Internet_Aktualitätendienst.ipynb +0 -0
  20. src/_quarto.yml +79 -51
  21. src/{tools.qmd → basics.qmd} +24 -1
  22. src/index.qmd +13 -0
  23. src/low_code.qmd +22 -0
  24. src/use_case.qmd +22 -0
src/{colab.qmd → 01_setup/erforderlich/colab.qmd} RENAMED
File without changes
src/{google-konto.qmd → 01_setup/erforderlich/google-konto.qmd} RENAMED
File without changes
src/{huggingface.qmd → 01_setup/erforderlich/huggingface.qmd} RENAMED
File without changes
src/{colab-github.qmd → 01_setup/optional/colab-github.qmd} RENAMED
File without changes
src/{quarto-lokal.qmd → 01_setup/optional/quarto-lokal.qmd} RENAMED
File without changes
src/{google-play-search.qmd → 02_basics/app_market/google-play-search.qmd} RENAMED
File without changes
src/{pdf-grouping.qmd → 02_basics/pdf/pdf-grouping.qmd} RENAMED
File without changes
src/{pdf-link-extractor.qmd → 02_basics/pdf/pdf-link-extractor.qmd} RENAMED
File without changes
src/{website-url-extractor.qmd → 02_basics/webspider/website-url-extractor.qmd} RENAMED
File without changes
src/{webspider.qmd → 02_basics/webspider/webspider.qmd} RENAMED
File without changes
src/03_low_code/app_market_scraping/app_market_scraping.qmd ADDED
@@ -0,0 +1 @@
+# App Market Scraping
src/{notebooks → 03_low_code/catalogue}/bookstoscrape.qmd RENAMED
File without changes
src/{notebooks → 03_low_code/catalogue}/quotes_scraper.ipynb RENAMED
File without changes
src/{notebooks → 03_low_code/video_transcripts}/get_videos_for_youtube_channels.ipynb RENAMED
File without changes
src/{social-media.qmd → 03_low_code/video_transcripts/social-media.qmd} RENAMED
File without changes
src/{notebooks → 03_low_code/video_transcripts}/youtube-transcript-extraction.ipynb RENAMED
File without changes
src/{notebooks → 04_use_case/forum}/buergergeld_forum.ipynb RENAMED
File without changes
src/{notebooks → 04_use_case/jobs}/Jobboerse_API.ipynb RENAMED
File without changes
src/{notebooks → 04_use_case/laws}/Gesetze_im_Internet_Aktualitätendienst.ipynb RENAMED
File without changes
src/_quarto.yml CHANGED
@@ -5,15 +5,15 @@ website:
   navbar:
     left:
       - href: agenda.qmd
-        text: "Agenda"
+        text: "📅 Agenda"
       - href: index.qmd
         text: "1️⃣ Start"
-      - href: tools.qmd
-        text: "2️⃣ No-Code"
-      - href: notebooks/bookstoscrape.qmd
-        text: "3️⃣ Low-Code"
-      - href: notebooks/Gesetze_im_Internet_Aktualitätendienst.ipynb
-        text: "4️⃣ Use-Case"
+      - href: basics.qmd
+        text: "2️⃣ No Code"
+      - href: low_code.qmd
+        text: "3️⃣ Low Code"
+      - href: use_case.qmd
+        text: "4️⃣ Anwendungsfall"
     tools:
       - icon: chat-dots
         href: https://huggingface.co/spaces/datenwerkzeuge/CDL-Webscraping-Workshop-2025/discussions
@@ -24,54 +24,82 @@ website:
       contents:
         - href: index.qmd
           text: "Willkommen"
-        - href: google-konto.qmd
-          text: "Google Konto erstellen"
-        - href: colab.qmd
-          text: "Colab nutzen"
-        - href: colab-github.qmd
-          text: "Colab nach GitHub speichern"
-        - href: huggingface.qmd
-          text: "Huggingface Ressourcen"
-        - href: quarto-lokal.qmd
-          text: "Quarto lokal"
-    - title: "No-Code"
+        - section: "Erforderlich"
+          contents:
+            - href: 01_setup/erforderlich/google-konto.qmd
+              text: "Google Konto erstellen"
+            - href: 01_setup/erforderlich/colab.qmd
+              text: "Colab nutzen"
+            - href: 01_setup/erforderlich/huggingface.qmd
+              text: "Huggingface Ressourcen"
+        - section: "Optional"
+          contents:
+            - href: 01_setup/optional/colab-github.qmd
+              text: "Colab nach GitHub speichern"
+            - href: 01_setup/optional/quarto-lokal.qmd
+              text: "Quarto lokal"
+    - title: "No Code"
       contents:
-        - href: tools.qmd
-          text: "Werkzeuge"
-        - href: pdf-link-extractor.qmd
-          text: "PDF Link Extractor"
-        - href: pdf-grouping.qmd
-          text: "PDF Grouping"
-        - href: google-play-search.qmd
-          text: "Google Play Search"
-        - href: website-url-extractor.qmd
-          text: "URL Extractor"
-        - href: webspider.qmd
-          text: "Webspider"
-    - title: "Low-Code"
+        - href: basics.qmd
+          text: "No Code Übersicht"
+        - section: "PDF"
+          contents:
+            - href: 02_basics/pdf/pdf-link-extractor.qmd
+              text: "PDF Link Extractor"
+            - href: 02_basics/pdf/pdf-grouping.qmd
+              text: "PDF Grouping"
+        - section: "App Marketplace"
+          contents:
+            - href: 02_basics/app_market/google-play-search.qmd
+              text: "Google Play Search"
+        - section: "Webspider"
+          contents:
+            - href: 02_basics/webspider/website-url-extractor.qmd
+              text: "URL Extractor"
+            - href: 02_basics/webspider/webspider.qmd
+              text: "Webspider"
+    - title: "Low Code"
       contents:
-        - section: "Scrapen einer Beispielseite"
-          contents:
-            - href: notebooks/bookstoscrape.qmd
-              text: "Bücherliste scrapen"
-            - href: notebooks/quotes_scraper.ipynb
-              text: "Zitate scrapen"
-        - section: "Soziale Medien"
-          contents:
-            - href: social-media.qmd
-              text: "Hinweise Scraping Social Media"
-            - href: notebooks/buergergeld_forum.ipynb
-              text: "Buergergeld Forum"
-            - href: notebooks/get_videos_for_youtube_channels.ipynb
-              text: "YouTube Channel Videos"
-            - href: notebooks/youtube-transcript-extraction.ipynb
-              text: "YouTube Video Transcripts"
+        - href: low_code.qmd
+          text: "Low Code Übersicht"
+        - section: "Katalog"
+          contents:
+            - href: 03_low_code/catalogue/bookstoscrape.qmd
+              text: "Bücherliste scrapen"
+            - href: 03_low_code/catalogue/quotes_scraper.ipynb
+              text: "Zitate scrapen"
+        - section: "App Markt"
+          contents:
+            - href: 03_low_code/app_market_scraping/app_market_scraping.qmd
+              text: "App Markt scrapen"
+        - section: "Video Transkripte"
+          contents:
+            - href: 03_low_code/video_transcripts/social-media.qmd
+              text: "Hinweise Scraping Social Media"
+            - href: 03_low_code/video_transcripts/get_videos_for_youtube_channels.ipynb
+              text: "YouTube Channel Videos"
+            - href: 03_low_code/video_transcripts/youtube-transcript-extraction.ipynb
+              text: "YouTube Video Transcripts"
     - title: "Use-Case"
       contents:
-        - href: notebooks/Gesetze_im_Internet_Aktualitätendienst.ipynb
-          text: "Aktualitätendienst Gesetze"
-        - href: notebooks/Jobboerse_API.ipynb
-          text: "Jobbörse"
+        - href: use_case.qmd
+          text: "Anwendungsfall Übersicht"
+        - section: "Gesetze"
+          contents:
+            - href: 04_use_case/laws/Gesetze_im_Internet_Aktualitätendienst.ipynb
+              text: "Aktualitätendienst Gesetze"
+        - section: "Jobs"
+          contents:
+            - href: 04_use_case/jobs/Jobboerse_API.ipynb
+              text: "Jobbörse"
+        - section: "Forum"
+          contents:
+            - href: 04_use_case/forum/buergergeld_forum.ipynb
+              text: "Buergergeld Forum"
+    - title: "Blog"
+      contents:
+        - href: blog.qmd
+          text: "Blog"
 
 format:
   html:
src/{tools.qmd → basics.qmd} RENAMED
@@ -1,4 +1,27 @@
-Eine Sammlung interaktiver **Spaces**, die praktische Anwendungen rund um **Webscraping** und **lokale Datensammlung** demonstrieren. Ziel ist es, die Möglichkeiten der Datenerfassung zu illustrieren.
+---
+title: "No Code Übersicht"
+listing:
+  - id: pdf
+    contents: "02_basics/pdf"
+    type: grid
+  - id: app_market
+    contents: "02_basics/app_market"
+    type: grid
+  - id: webspider
+    contents: "02_basics/webspider"
+    type: grid
+---
+
+Eine Sammlung interaktiver **Spaces**, die praktische Anwendungen rund um **Webscraping** und **lokale Datensammlung** demonstrieren. Ziel ist es, die Möglichkeiten der Datenerfassung zu illustrieren.
+
+::: {#pdf}
+:::
+
+::: {#app_market}
+:::
+
+::: {#webspider}
+:::
 
 ### **Bereits verfügbares Werkzeug:**
 - **[Webspider](https://huggingface.co/spaces/datenwerkzeuge/webspider)**:
src/index.qmd CHANGED
@@ -1,9 +1,22 @@
 ---
 title: "Willkommen"
+listing:
+  - id: erforderlich
+    contents: "01_setup/erforderlich"
+    type: grid
+  - id: optional
+    contents: "01_setup/optional"
+    type: grid
 ---
 
 Herzlich willkommen zum Webscraping Workshop! In diesem Workshop lernen Sie, wie Sie Daten aus dem Web extrahieren und analysieren können. Egal, ob Sie Anfänger oder Fortgeschrittener sind, dieser Workshop bietet Ihnen wertvolle Einblicke und praktische Erfahrungen.
 
+::: {#erforderlich}
+:::
+
+::: {#optional}
+:::
+
 ## Navigation auf der Quarto-Webseite 🧭
 
 Unsere Quarto-Webseite ist so strukturiert, dass Sie sich leicht zurechtfinden und dem Workshop effizient folgen können.
src/low_code.qmd ADDED
@@ -0,0 +1,22 @@
+---
+title: "Low Code Übersicht"
+listing:
+  - id: catalogue
+    contents: "03_low_code/catalogue"
+    type: grid
+  - id: app_market_scraping
+    contents: "03_low_code/app_market_scraping"
+    type: grid
+  - id: video_transcripts
+    contents: "03_low_code/video_transcripts"
+    type: grid
+---
+
+::: {#catalogue}
+:::
+
+::: {#app_market_scraping}
+:::
+
+::: {#video_transcripts}
+:::
src/use_case.qmd ADDED
@@ -0,0 +1,22 @@
+---
+title: "Anwendungsfall Übersicht"
+listing:
+  - id: laws
+    contents: "04_use_case/laws"
+    type: grid
+  - id: jobs
+    contents: "04_use_case/jobs"
+    type: grid
+  - id: forum
+    contents: "04_use_case/forum"
+    type: grid
+---
+
+::: {#laws}
+:::
+
+::: {#jobs}
+:::
+
+::: {#forum}
+:::
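
The grid views added in this commit pair each `listing` entry's `id` in the front matter with a fenced div of the same id (for example `::: {#laws}`), which is where Quarto places the rendered listing. The following is a minimal sketch, not part of the commit, of how that pairing could be checked for one page. The function names (`listing_ids`, `placeholder_ids`, `unmatched_ids`) are hypothetical, the sample text is modelled on `src/use_case.qmd`, and its YAML indentation is an assumption because the diff view does not preserve it. The regular expressions only cover the simple layout used here, not YAML in general.

```python
import re

# Sample page modelled on src/use_case.qmd from this commit. The indentation of
# the YAML list is an assumption, since the diff view does not preserve it.
USE_CASE_QMD = """---
title: "Anwendungsfall Übersicht"
listing:
  - id: laws
    contents: "04_use_case/laws"
    type: grid
  - id: jobs
    contents: "04_use_case/jobs"
    type: grid
  - id: forum
    contents: "04_use_case/forum"
    type: grid
---

::: {#laws}
:::

::: {#jobs}
:::

::: {#forum}
:::
"""


def listing_ids(qmd_text):
    """Return the listing ids declared in the YAML front matter, in order."""
    front_matter = re.match(r"\A---\n(.*?)\n---\n", qmd_text, flags=re.DOTALL)
    if front_matter is None:
        return []
    # Matches lines of the form "- id: some_name" at any indentation.
    return re.findall(
        r"^[ \t]*-[ \t]*id:[ \t]*([\w-]+)[ \t]*$",
        front_matter.group(1),
        flags=re.MULTILINE,
    )


def placeholder_ids(qmd_text):
    """Return the ids of opening fenced divs such as `::: {#laws}`, in order."""
    return re.findall(r"^:::[ \t]*\{#([\w-]+)\}[ \t]*$", qmd_text, flags=re.MULTILINE)


def unmatched_ids(qmd_text):
    """Return (listing ids without a div, div ids without a listing), sorted."""
    declared = set(listing_ids(qmd_text))
    placed = set(placeholder_ids(qmd_text))
    return sorted(declared - placed), sorted(placed - declared)


if __name__ == "__main__":
    without_div, without_listing = unmatched_ids(USE_CASE_QMD)
    print("listings without a div:", without_div)
    print("divs without a listing:", without_listing)
```

The second list is only a hint: a page may contain fenced divs with ids that are unrelated to listings, so an entry there is not necessarily a mistake.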