Welcome to the Club de TeleMatique mast bash crash course!

Here you will learn how to make a Mastodon bot with a few lines of bash script (OK, almost 100, but that includes comments and blank lines).
Why bash? Besides the fact that I like it, you will find a shell on any Unix machine, and the script can easily be adapted to another shell. Most of the other scripts I found on the internet are Python scripts. Nothing wrong with that, but Python may not be installed on your machine, or not in a supported version...

Lesson 1: The Foundation

What you need:
- a plain-text editor: Vim, Emacs, whatever.
- a Linux/Unix machine with bash installed.
- a Mastodon account.
- a copy of madonctl installed on the machine (or anything else that can post to Mastodon).

Vocabulary: in programming, you need to understand the following 4 words:
- constant: a value that doesn't change
- variable: a value that does change
- type: the sort of value something holds (a number, a piece of text, a list...)
- function: a piece of code that returns a result

Some languages don't have constants, so constants are just variables that don't change (ideally). (*)

(*) Hence the famous programmer joke: constants aren't - variables won't.
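
To make the four words concrete, here is a minimal bash sketch. All names in it (SITE, count, total, urls, greet) are made up for illustration and do not come from the bot script.

```shell
#!/usr/bin/env bash
# Illustrative names only; none of these appear in the bot script.

# constant: 'readonly' makes bash refuse any later assignment
readonly SITE="https://www.lessentiel.lu"

# variable: can be reassigned freely
count=0
count=$((count + 1))

# type: bash values are plain text by default; 'declare' can mark
# a variable as an integer (-i) or an indexed array (-a)
declare -i total=2+3          # evaluated arithmetically, so total is 5
declare -a urls=("one" "two")

# function: a named piece of code; it "returns" text by printing it
greet () {
  echo "hello $1"
}
message=$(greet "mastodon")

echo "$SITE $count $total ${#urls[@]} $message"
```

Bash is one of those languages where constants are mostly a convention: readonly exists, but many scripts simply use upper-case names and never reassign them.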


Lesson 2: The basics

Install madonctl somewhere it will be found (i.e. in your PATH). That's where you'll place your bash script as well.
What we are going to write is a simple script that:

- gets the news from a newspaper website
- checks if it has been posted in the previous run
- posts whatever hasn't been posted before
- saves what has been posted for the next run to check

Nothing fancy: the idea is to give you the basics. You can of course improve and modify it to your taste.


Lesson 3: The script

First we download the page with curl (a standard Unix command-line client for URLs).
```
curl "https://www.lessentiel.lu/de"
```
What we get is an HTML page. It all arrives as one jumbled mass, so we first need a better idea of what is really in there. To do that, instead of dumping the content straight to the screen, we first filter it by splitting it into HTML tags.
We create a small function that does just that:
```
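# Read the input up to the next '<', then split that chunk at '>':
# the tag goes into TAG, the text that follows it into VALUE.
xmlgetnext () {
  local IFS='>'
  read -d '<' TAG VALUE
}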
```
Then we fetch the page again, this time piping it through the function:
```
curl "https://www.lessentiel.lu/de" | while xmlgetnext ; do echo $TAG ; done
```
Much better. We now have all the tags clearly on the screen, one per line, and we can see that every article we want starts with a href="/de/story.
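
If you want to see what the function does without hitting the website, the sketch below runs it on a made-up one-line HTML sample (the sample string is an assumption, not real page content). The helper is the same as above, with -r added to read so that backslashes are kept literally.

```shell
#!/usr/bin/env bash
# Same helper as in the lesson, plus -r so backslashes are kept as-is
xmlgetnext () {
  local IFS='>'
  read -r -d '<' TAG VALUE
}

# Made-up HTML standing in for the downloaded page
sample='<html><a href="/de/story/demo-123" class="x">Demo</a></html>'

# One tag per line. The very last tag is not printed, because read
# reaches the end of the input before finding another '<'.
tags=$(printf '%s' "$sample" | while xmlgetnext ; do echo "$TAG" ; done)
echo "$tags"
```

The first line of the output is empty (nothing comes before the first '<'), which is harmless here since grep filters the lines afterwards.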

Next we parse the output and keep only what we want, i.e. the news links.
So let's do just that:

```
curl "https://www.lessentiel.lu/de" | while xmlgetnext ; do echo $TAG ; done | grep '^a href="/de/stor'
```

grep is the standard Unix utility for finding lines that match a regular expression. The regular expression is the quoted text that follows the command; here it means: a href="/de/stor at the beginning of the line (^).
Almost there. Now all we need to do is store that in a variable instead of sending it to the screen (--silent keeps curl's progress meter out of the way), and we're done with this part:
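
A small offline sketch of what the ^ anchor changes. The four tag lines are invented for the example; only the pattern is the one used in the lesson.

```shell
#!/usr/bin/env bash
# Made-up tag lines, one per line, in the shape produced by the loop
lines='div class="teaser"
a href="/de/story/demo-123" class="x"
span data-a href="/de/story/not-a-link"
a href="/de/other/page"'

# '^' anchors the pattern to the start of the line, so the third
# line (same text, but in the middle of the line) is rejected
matches=$(printf '%s\n' "$lines" | grep '^a href="/de/stor')
echo "$matches"
```

Only the second line passes: the third contains the pattern but not at the start, and the fourth starts correctly but points outside /de/story.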

```
Info=`curl --silent "https://www.lessentiel.lu/de" | while xmlgetnext ; do echo $TAG ; done | grep '^a href="/de/stor' `
```

The result inside the Info variable is:

a href="/de/story/vollsperrung-auf-der-a6-nach-schwerem-unfall-928128456251" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/ukraine-newsticker-240655635662" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/flugzeug-der-luxemburgischen-mannschaft-wegen-unwetter-umgeleitet-991648887208" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/nadal-zum-14-mal-koenig-von-paris-finalsieg-gegen-norweger-ruud-681901613911" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/die-bilder-des-tages-867096646033" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/promi-ticker-500140628298" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/maskenpflicht-in-bus-und-bahn-soll-naechste-woche-fallen-901618527805" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/baby-und-mann-sterben-bei-frontalzusammenstoss-696983931160" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/unbekannte-feuern-in-us-stadt-in-menge-drei-tote-und-elf-verletzte-709366376325" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/petruss-kasematten-erstrahlen-wieder-in-vollem-glanz-832051921241" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/fuerstin-charlene-von-monaco-hat-sich-mit-corona-infiziert-488796398755" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/containerlager-brennt-nach-explosion-lichterloh-mindestens-49-tote-481372111006" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/nur-ein-index-wird-vorerst-auf-eis-gelegt-769164783135" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/punks-feiern-chaostage-ausflug-und-nehmen-ferieninsel-sylt-in-beschlag-678833531630" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/mann-erhaelt-am-steuer-oralsex-und-verliert-bei-unfall-fast-seinen-penis-951729532003" class="sc-1vcyo0a-1 hyTLuW"

Which is nice, but not quite what we want yet, which is a nicely formatted URL. However, we can see that the URLs we are looking for are (part of) the 2nd, 6th, 10th, 14th, etc. words of our huge string.
Time to cut it into pieces and format them nicely while storing them in an array.

We clean the result and format it:

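```
nbur=0   # number of URLs stored so far
for ind in 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 ; do
  # word number $ind of $Info looks like href="/de/story/..."
  t=`echo $Info | awk -v i=$ind '{print $i}'`
  # drop the leading href=" (6 characters) and the closing quote;
  # the path already starts with /de, so the prefix must not repeat it
  ur[$nbur]="https://www.lessentiel.lu${t:6:${#t}-7}"
  nbur=$((nbur+1))
done
```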
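
To see the two steps of that loop in isolation, here is a sketch on a made-up two-link sample shaped like $Info. The last part is an alternative that is not in the original script: it lets grep -o (available in GNU and BSD grep) find every href, so the list of word positions no longer has to be typed by hand.

```shell
#!/usr/bin/env bash
# Made-up two-link sample in the same shape as $Info
Info='a href="/de/story/first-111111111111" class="c1 c2"
a href="/de/story/second-222222222222" class="c1 c2"'

# Unquoted on purpose: word splitting flattens the newlines,
# so awk sees one long line of space-separated words
t=$(echo $Info | awk -v i=6 '{print $i}')
echo "$t"        # href="/de/story/second-222222222222"

# ${t:6} skips the 6 characters of href=" and a length of
# ${#t}-7 also leaves out the closing quote
path=${t:6:${#t}-7}
echo "$path"     # /de/story/second-222222222222

# Alternative sketch: collect every href without fixed positions
paths=()
while read -r h ; do
  paths+=("${h:6:${#h}-7}")
done < <(printf '%s\n' "$Info" | grep -o 'href="[^"]*"')
echo "${#paths[@]} paths, first is ${paths[0]}"
```

The fixed positions work as long as every link has exactly four words; the grep -o variant keeps working if the site adds or removes a class name.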

We compare with what was posted before (which we load first) and select the links that haven't been posted yet.
Loading:

```
# $previous holds the name of the file listing the URLs already posted
ReadDB () {
  nbdb=0
  while read -r line ; do
    db[$nbdb]=$line
    nbdb=$((nbdb+1))
  done < "$previous"
}
```
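
If your bash is version 4 or later, the same loading can be done in one call with the mapfile builtin. This is an alternative sketch, not what the script above uses; the temporary file and the example.org URLs are placeholders.

```shell
#!/usr/bin/env bash
# Throwaway file standing in for the database of posted URLs
previous=$(mktemp)
printf '%s\n' "https://example.org/a" "https://example.org/b" > "$previous"

# mapfile (bash 4 and later) reads every line into an array in one
# call; -t drops the trailing newline of each line
mapfile -t db < "$previous"
nbdb=${#db[@]}

echo "$nbdb entries, first is ${db[0]}"
rm -f "$previous"
```

The explicit read loop has the advantage of also working on older bash versions, such as the 3.2 shipped with macOS.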

For the selection, we take each URL and check whether it is in our database. If it is, we leave the inner loop; if it isn't, we put it on the pile to post, along with the text extracted from it. Then we move on to the next URL.

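```
SortRSS () {
  nbpost=0
  for (( i=0; i < nbur ; i++ )); do
    dejala=0   # "already there": set to 1 when the URL is in the database
    for (( j=0; j < nbdb ; j++ )); do
      if [[ "${ur[$i]}" = "${db[$j]}" ]] ; then
        dejala=1
        break
      fi
    done
    if [[ "$dejala" = "0" ]] ; then
      post[$nbpost]=${ur[$i]}
      # the headline slug starts at character 36 of the URL
      t=`echo ${ur[$i]} | cut -c36-`
      # drop the 12-digit article id, then turn hyphens into spaces
      txt[$nbpost]=`echo ${t:0:${#t}-12} | sed 's/-/ /g'`
      nbpost=$((nbpost+1))
    fi
  done
}
```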
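
The nested loops can also be replaced by asking grep whether a URL is already a line of the database file. The sketch below is an alternative under that assumption, with placeholder example.org URLs; it also derives the headline text with parameter expansion instead of fixed character positions.

```shell
#!/usr/bin/env bash
# Throwaway database holding one URL that was "already posted"
previous=$(mktemp)
printf '%s\n' "https://example.org/de/story/old-news-111111111111" > "$previous"

ur=("https://example.org/de/story/old-news-111111111111"
    "https://example.org/de/story/fresh-news-today-222222222222")
post=()
txt=()

for u in "${ur[@]}" ; do
  # -F: plain string, -x: the whole line must match, -q: exit status only
  if ! grep -Fxq -- "$u" "$previous" ; then
    post+=("$u")
    slug=${u##*/}          # keep what follows the last '/'
    slug=${slug%-*}        # drop the trailing '-<article id>'
    txt+=("${slug//-/ }")  # hyphens become spaces
  fi
done

echo "${post[0]} -> ${txt[0]}"
rm -f "$previous"
```

This runs grep once per URL, which is fine for a dozen links per run; the array version avoids starting extra processes.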

We post the selected links and save them for the next run:

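```
Post () {
  # post at most $maxpost links, and never more than we actually have
  for (( i=0; i < maxpost && i < nbpost ; i++ )); do
    line="L'Essentiel: ${txt[$i]} ${post[$i]} #news #Luxemburg"
    echo "${post[$i]}" >> "$previous"
    # quoted, so the whole message reaches madonctl as a single argument
    madonctl toot "$line"
  done
}
```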
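
Before letting the bot loose on a real account, it can help to do a dry run. In this sketch, send_toot is a hypothetical stand-in that prints the message instead of calling madonctl, and the value of maxpost, the URLs and the texts are all made up.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for 'madonctl toot': prints instead of posting
send_toot () {
  printf 'TOOT: %s\n' "$1"
}

previous=$(mktemp)
post=("https://example.org/a" "https://example.org/b" "https://example.org/c")
txt=("first story" "second story" "third story")
nbpost=${#post[@]}
maxpost=2     # assumed cap on the number of posts per run

sent=0
for (( i=0; i < nbpost && i < maxpost; i++ )); do
  line="L'Essentiel: ${txt[$i]} ${post[$i]} #news #Luxemburg"
  send_toot "$line"                   # quoted: one single argument
  echo "${post[$i]}" >> "$previous"   # remember it for the next run
  sent=$((sent+1))
done

saved=$(wc -l < "$previous")
echo "$sent sent, $saved saved"
rm -f "$previous"
```

Once the output looks right, swapping the body of send_toot for the real posting command is the only change needed.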

That's it.

You can download the full script. It is a bit longer because it also contains a procedure to purge the database every 3 days to prevent it from growing too big.
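
The purge procedure itself is not shown on this page. As a loosely related sketch, and explicitly not the method used in the full script, one simple way to stop the file from growing is to keep only its last few lines. The limit of 3 and the example.org URLs are arbitrary.

```shell
#!/usr/bin/env bash
# Throwaway database with five entries
previous=$(mktemp)
printf 'https://example.org/%s\n' 1 2 3 4 5 > "$previous"

keep=3     # hypothetical limit on the number of remembered URLs
tmp=$(mktemp)
# write the newest lines to a temporary file, then swap it in
tail -n "$keep" "$previous" > "$tmp" && mv "$tmp" "$previous"

remaining=$(cat "$previous")
echo "$remaining"
rm -f "$previous"
```

The limit should stay comfortably larger than the number of links on the front page, otherwise links that are still listed would be forgotten and posted again.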

xpost from X.