Welcome to the Club de TeleMatique mast bash crash course!

Here you will learn how to make a Mastodon bot with a couple of lines of bash script (OK, almost 100, but that includes comments and blank lines).
Why bash? Besides the fact that I like bash, you will find a shell on any Unix machine, and the script can easily be adapted to another shell. Most of the other bots I found on the internet are Python scripts. Nothing wrong with that, but if Python isn't on your machine, or isn't the supported version...

Lesson 1: The Foundation

  1. What you need:
    • a plain-text editor: Vim, Emacs, whatever.
    • a Linux/Unix machine with bash installed.
    • a Mastodon account...
    • a copy of madonctl installed on the machine (or anything else that can post to Mastodon).
  2. Vocabulary: in programming, you need to understand the following four words:
    • constant: a thing that doesn't change
    • variable: a thing that does change
    • type: the sort of value something holds (a number, a piece of text, ...)
    • function: a piece of code returning a result

Some languages don't have constants, so constants are just variables that don't change (ideally). (*)


(*) Hence the famous programmer joke: constants aren't - variables won't.
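
To make the four words concrete, here is a minimal bash sketch. All the names in it (SITE, count, total, greet) are made up for the illustration and are not part of the bot script.

```bash
#!/usr/bin/env bash
# constant: declared read-only, bash refuses to change it afterwards
readonly SITE="https://www.lessentiel.lu"

# variable: can be reassigned at any time
count=0
count=$((count + 1))

# type: bash values are plain strings by default; 'declare -i' marks an integer
declare -i total=5
total+=2            # integer addition because of -i (a plain string would get "2" appended)

# function: a named piece of code; in bash it hands back its result by printing it
greet () {
  echo "Hello, $1"
}
message=$(greet "Mastodon")

echo "$SITE $count $total $message"
```

Bash is one of those languages where constants are optional: without readonly, SITE would be an ordinary variable that we simply promise not to touch.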

Lesson 2: The basics


Install madonctl somewhere it will be found (i.e. in your PATH). That's where you'll place your bash script as well.
What we are going to write is a simple script that downloads a news page, picks out the article links, and posts the ones it hasn't posted before to Mastodon.
Nothing fancy: the idea is to give you the basics, and you can of course improve and modify it to your taste.

Lesson 3: The script

  1. First we download the page with curl (a standard Unix command-line client for URLs).
    curl "https://www.lessentiel.lu/de"
    What we get is an HTML page. Pretty much everything is mangled together, so we first need a better idea of what is really in there. To do that, instead of dumping the content straight to the screen, we first filter it by splitting it at the HTML tags.
    We create a small function that does just that:
    ##########################
    # Read up to the next '<' and split what was read at '>':
    # TAG gets the tag itself, VALUE the text that follows it.
    xmlgetnext () {
      local IFS='>'
      read -r -d '<' TAG VALUE
    }
    
    And we run the download again, this time through the function:
    curl "https://www.lessentiel.lu/de" | while xmlgetnext ; do echo $TAG ; done
    
    Much better. We now have all the tags clearly on the screen, and we can see that all the articles we want have an a href="/de/story in front of them.
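
If you want to see what xmlgetnext does without hitting the website, you can feed it a string. A small sketch: the one-line page in the sample variable is invented for the test, and the tags variable is just a place to keep the output.

```bash
#!/usr/bin/env bash
xmlgetnext () {
  local IFS='>'
  read -r -d '<' TAG VALUE
}

# Made-up miniature page standing in for the curl output
sample='<html><body><a href="/de/story/example-123456789012" class="x">Title</a></body></html>'

# One tag per line, exactly as in the curl pipeline
tags=$(printf '%s' "$sample" | while xmlgetnext ; do echo "$TAG" ; done)
echo "$tags"
```

The line starting with a href is the kind of line the next step picks out of the real page.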

  2. Then we parse it and get what we want, i.e. the news.
    So let's do just that:
    curl "https://www.lessentiel.lu/de" | while xmlgetnext ; do echo $TAG ; done | grep '^a href="/de/stor'
    grep is the standard Unix utility for finding lines that match a regular expression. The regular expression is what follows the command name, and it means: a href etc. at the beginning of the line (^).
    Almost there. Now all we need to do is send that to a variable instead of the screen (with --silent so that curl does not print its progress meter), and we're done with this part:
    Info=$(curl --silent "https://www.lessentiel.lu/de" | while xmlgetnext ; do echo $TAG ; done | grep '^a href="/de/stor')
    
    The result inside the Info variable is
    a href="/de/story/vollsperrung-auf-der-a6-nach-schwerem-unfall-928128456251" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/ukraine-newsticker-240655635662" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/flugzeug-der-luxemburgischen-mannschaft-wegen-unwetter-umgeleitet-991648887208" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/nadal-zum-14-mal-koenig-von-paris-finalsieg-gegen-norweger-ruud-681901613911" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/die-bilder-des-tages-867096646033" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/promi-ticker-500140628298" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/maskenpflicht-in-bus-und-bahn-soll-naechste-woche-fallen-901618527805" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/baby-und-mann-sterben-bei-frontalzusammenstoss-696983931160" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/unbekannte-feuern-in-us-stadt-in-menge-drei-tote-und-elf-verletzte-709366376325" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/petruss-kasematten-erstrahlen-wieder-in-vollem-glanz-832051921241" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/fuerstin-charlene-von-monaco-hat-sich-mit-corona-infiziert-488796398755" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/containerlager-brennt-nach-explosion-lichterloh-mindestens-49-tote-481372111006" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/nur-ein-index-wird-vorerst-auf-eis-gelegt-769164783135" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/punks-feiern-chaostage-ausflug-und-nehmen-ferieninsel-sylt-in-beschlag-678833531630" class="sc-1vcyo0a-1 hyTLuW" a href="/de/story/mann-erhaelt-am-steuer-oralsex-und-verliert-bei-unfall-fast-seinen-penis-951729532003" class="sc-1vcyo0a-1 hyTLuW"  
    Which is nice, but not quite what we want yet, which is a nicely formatted URL. However, we can see that the URLs we are looking for are (partially) the 2nd, 6th, 10th, 14th, etc. parts of our huge string.
    Time to cut it into pieces and format them nicely while storing them in an array.

  3. We clean the result and format it
  4. First we select the parts that we are interested in from the huge Info string. For that we use awk because it is faster than the standard Unix utility "cut". We send each piece to a temporary string, which we then clean and format as a proper URL before putting it in an array.
    We take off the first 6 characters (href=") and the last one (the closing quote) and prepend the website address. The path we extracted already starts with /de, so the prefix is the site root only:
    nbur=0
    for ind in 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 ; do
      t=$(echo $Info | awk -v i=$ind '{print $i}')
      ur[$nbur]="https://www.lessentiel.lu${t:6:${#t}-7}"
      nbur=$((nbur+1))
    done
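
The list of indexes is hard-coded for 15 articles. As a variant (a sketch, not what the full script does), you can ask awk how many fields there are and step through them four at a time. The two-entry Info string below is invented, and nf is a helper name introduced here. The prefix is the site root, on the assumption that the extracted path already starts with /de as in the output shown earlier.

```bash
#!/usr/bin/env bash
# Made-up two-entry sample in the same shape as the real Info string
Info='a href="/de/story/first-example-111111111111" class="c1 c2" a href="/de/story/second-example-222222222222" class="c1 c2"'

nf=$(echo $Info | awk '{print NF}')     # number of space-separated fields
nbur=0
for (( ind=2; ind<=nf; ind+=4 )); do    # fields 2, 6, 10, ... hold the href part
  t=$(echo $Info | awk -v i="$ind" '{print $i}')
  # skip the 6 characters of href=" and drop the closing quote
  ur[$nbur]="https://www.lessentiel.lu${t:6:${#t}-7}"
  nbur=$((nbur+1))
done
printf '%s\n' "${ur[@]}"
```

This only works as long as every entry has exactly four fields, i.e. as long as the site keeps two class names on each link.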
        
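  5. We compare with what was posted before (which we load first) and select the URLs that haven't been posted yet.
    Loading:
    # previous holds the name of the file listing the URLs already posted, one per line
    ReadDB () {
      nbdb=0
      while IFS= read -r line ; do
        db[$nbdb]=$line
        nbdb=$((nbdb+1))
      done < "$previous"
    }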
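
If your bash is version 4 or newer, the same loading can be sketched with the mapfile builtin. The temporary file and the example.org addresses are placeholders for the test, not the real database.

```bash
#!/usr/bin/env bash
previous=$(mktemp)                       # stand-in for the real history file
printf '%s\n' "https://example.org/a" "https://example.org/b" > "$previous"

mapfile -t db < "$previous"              # one array element per line, -t drops the newline
nbdb=${#db[@]}                           # number of entries loaded
echo "$nbdb entries, first: ${db[0]}"

rm -f "$previous"
```

The while loop has the advantage of also working on older bash versions, which is in the spirit of this course.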
        
    For the sorting, we take each URL and check whether it is in our database. If it is, we stop searching and skip it; if it isn't, we put it on the pile to post, along with the text extracted from it, and move on to the next one.
    SortRSS () {
      nbpost=0
      dejala=0                          # "already there": 1 once the URL is found in the database
      for (( i=0; i < nbur ; i++ )); do
        for (( j=0; j < nbdb ; j++ )); do
          if [[ "${ur[$i]}" = "${db[$j]}" ]] ; then
            dejala=1
            break
          fi
        done
        if [[ "$dejala" = "0" ]] ; then
          post[$nbpost]=${ur[$i]}
          # drop the 35 characters of https://www.lessentiel.lu/de/story/
          t=$(echo "${ur[$i]}" | cut -c36-)
          # drop the 12-digit id at the end and turn the dashes into spaces
          txt[$nbpost]=$(echo "${t:0:${#t}-12}" | sed 's/-/ /g')
          nbpost=$((nbpost+1))
        else
          dejala=0
        fi
      done
    }
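
The double loop is fine for a few dozen URLs. For comparison, here is a sketch of the same selection done with grep directly on the file and with bash string operations instead of cut and sed. It is an alternative, not the method of the full script; the file content, the two URLs and the slug variable are invented for the example.

```bash
#!/usr/bin/env bash
previous=$(mktemp)                       # stand-in for the real history file
printf '%s\n' "https://www.lessentiel.lu/de/story/old-news-111111111111" > "$previous"

ur=( "https://www.lessentiel.lu/de/story/old-news-111111111111"
     "https://www.lessentiel.lu/de/story/fresh-news-222222222222" )

nbpost=0
for u in "${ur[@]}"; do
  # -F fixed string, -x whole line, -q quiet: succeeds only if u is already stored
  if ! grep -Fxq -- "$u" "$previous"; then
    post[$nbpost]=$u
    slug=${u##*/}                        # keep what follows the last slash
    slug=${slug%-*}                      # drop the trailing -<id>
    txt[$nbpost]=${slug//-/ }            # dashes become spaces
    nbpost=$((nbpost+1))
  fi
done
rm -f "$previous"
echo "${txt[0]} -> ${post[0]}"
```

Cutting at the last dash instead of counting 12 characters means the text does not end with a stray space, and it keeps working if the id ever changes length.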
    

  6. We post the selected articles and save them for the next run.
    Post () {
      # maxpost is not set in this excerpt; it must hold the number of articles to post
      for (( i=0; i < maxpost ; i++ )); do
        line="L'Essentiel: ${txt[$i]} ${post[$i]} #news #Luxemburg"
        echo "${post[$i]}" >> "$previous"
        madonctl toot "$line"
      done
    }
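
Before letting the bot loose, it helps to do a dry run. The sketch below makes two assumptions that are not in the lesson: a cap called limit (a made-up name and value), and maxpost computed as the smaller of that cap and the number of new articles, so that the loop never posts empty lines. It prints the messages instead of calling madonctl.

```bash
#!/usr/bin/env bash
# Made-up result of the selection step
post=( "https://www.lessentiel.lu/de/story/fresh-news-222222222222" )
txt=( "fresh news" )
nbpost=${#post[@]}

limit=5                                          # hypothetical cap on posts per run
maxpost=$(( nbpost < limit ? nbpost : limit ))   # never more than what was found

for (( i=0; i < maxpost; i++ )); do
  line="L'Essentiel: ${txt[$i]} ${post[$i]} #news #Luxemburg"
  echo "would post: $line"                       # for real: madonctl toot "$line"
done
```

Once the output looks right, swap the echo for the madonctl line and add the append to the history file.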
    

That's it.
You can download the full script. It is a bit longer because it also contains a procedure to purge the database every 3 days to prevent it from growing too big.

xpost from X.