Testing Roccat Kain mouse failure with xev

I have been using Roccat mice for a while. Comfortable to hold and precise enough, I was quite satisfied, except that they do not live very long. One had a left-button failure. Another one, brand new, died instantly after a small spill of coffee. I spill a lot of coffee on such devices, yet it was the first time I managed to instakill a mouse that way. Nonetheless, I bought even more of them, two ROCCAT Kain 122 at once.

On one of them, the mousewheel was obviously misbehaving: scrolling down was inconsistent. Running xev is enough to catch the issue. You’ll find out that “button 4” is the event emitted when the mousewheel goes up and “button 5” when it goes down (each with a specific serial that is irrelevant here). So it is enough to run:

xev | grep "button 4"

and to turn the mousewheel down: a few erratic “button 4” presses will pop up.
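
To get a rough count instead of eyeballing the output, something like the following should do (a quick sketch: -event button should restrict xev to button events, and timeout simply stops the capture after ten seconds):

timeout 10 xev -event button | grep -c "button 4"

Keep scrolling down the whole time; any non-zero count means spurious wheel-up events.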

Worse, the second mouse, for which the bug was not obvious, did the same. Even worse, it looks like this hardware issue has existed for at least a year. So, to sum up, they still ship a buggy series of mice, with a bug you might not immediately notice. Despite their other good points, I’ll avoid this brand for now.

Generating static galleries with nanogallery2

As written earlier, I now use nanogallery2 to share pictures, due to Nextcloud’s removal of the earlier Gallery app and the fact that its replacement, called Photos, does not match my needs.

The pictures are copied directly from the cloud LXC container, with rsync over ssh, to another server that only serves static files via HTTPS – with access restricted through the usual basic authentication. It means that the HTTPS server is given no access at all to the cloud infrastructure.
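
For reference, the copy boils down to a one-liner of this kind (paths and hostname are made up for the example; the push goes from the cloud container to the static web host):

rsync -avz --delete -e ssh /srv/nextcloud/data/user/files/Pictures/ webhost:/var/www/html/pictures/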

nanogallery2 has associated “providers” to easily set up a gallery (directly from a local directory or from remote storage). But that implies on-the-fly index and thumbnail generation by a PHP/CGI server, while I prefer the server to serve only static files.

So I wrote a small Perl script called dir2nanog.pl that builds a single index plus thumbnails, treating each subdirectory as a distinct gallery:

#!/usr/bin/perl
#
# Copyright (c) 2020 Mathieu Roy <yeupou--gnu.org>
#      https://yeupou.wordpress.com
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
#   USA


use strict "vars";
use File::Find;
use File::Basename;
use Image::ExifTool;
use Image::Magick;
use Fcntl ':flock';
use CGI qw(:standard escapeHTML);

my $debug = 1;
my $path = $ARGV[0];
my $topdirstring = "ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ";   # hopefully no such directory exists
my @topimages;
my %subdirsimages;
my %comment;
my %gps;
my %model;
my %focalength;
my %flash;
my %exposure;
my %iso;
my $images_types = "png|gif|jpg|jpeg";
my @mandatory = ("https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/jquery.nanogallery2.js",
		 "https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/jquery.nanogallery2.core.min.js",
		 "https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/jquery.nanogallery2.min.js",
		 "https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/css/nanogallery2.woff.min.css",
		 "https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/css/nanogallery2.min.css"); 

my @mandatoryfont = ("https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/css/font/ngy2_icon_font.woff",
		 "https://raw.githubusercontent.com/nanostudio-org/nanogallery2/master/dist/css/font/ngy2_icon_font.woff2");



# silently forbid concurrent runs
# (http://perl.plover.com/yak/flock/samples/slide006.html)
open(LOCK, "< $0") or die "Failed to ask lock. Exit";
flock(LOCK, LOCK_EX | LOCK_NB) or exit;

### FUNCTIONS
sub thumbnail_dimensions {
        my ($x, $y, $ratio, $height, $width);
        ($x, $y) = @_;
        $ratio = $x / $y;	
	$width = 200;
	$height = int($width / $ratio);
        return $width, $height;
}

sub thumbnail_name {
    my ($name,$path) = fileparse($_[0]);
    return "$path.thumbnail.$name";
}

sub wanted {
    # skip directories and non-images in general
    return if -d $_;
    return unless lc($_) =~ /\.($images_types)$/i;
    # skip thumbnails
    return if $_ =~ /^\.thumbnail\./i;

    # create thumbnail if missing
    unless (-e thumbnail_name($File::Find::name)) {
	my ($image, $error, $x, $y, $size, $format, $width, $height);
	
        # https://www.perlmonks.org/?node_id=209235
        $image = Image::Magick->new;
        $error = $image->Read($File::Find::name);
	if (!$error) {
	    ($x, $y, $size, $format) = $image->Ping($File::Find::name);
	    ($width, $height) = thumbnail_dimensions($x, $y);
	    $image->Scale(width=>$width, height=>$height);
	    $error = $image->Write(thumbnail_name($File::Find::name));
        }
    }
    
    # identify URL, path removed 
    my $url = substr($File::Find::name, length($path)+1, length($File::Find::name));    

    # name will be based on file creation (using comment would require utf8 checks)
    # according to exif data, not real filesystem info
    my $exifTool = new Image::ExifTool;
    my %exifTool_options = (DateFormat => '%x');
    my $info = $exifTool->ImageInfo($File::Find::name, ("CreateDate","GPSLatitude","GPSLongitude","GPSDateTime", "Make", "Model", "Software", "FocalLength", "Flash", "Exposure", "ISO"), \%exifTool_options);
    $comment{$url} = $info->{"CreateDate"};
    $comment{$url} = $info->{"GPSDateTime"} if $info->{"GPSDateTime"};
    
    if ($info->{"GPSLatitude"} and $info->{"GPSLongitude"}) {
	$gps{$url} = $info->{"GPSLatitude"}." ".$info->{"GPSLongitude"};
	$gps{$url} =~ s/\sdeg/°/g;
    }

    $model{$url} = $info->{"Make"}." " if $info->{"Make"};
    $model{$url} .= $info->{"Model"}." " if $info->{"Model"};
    $model{$url} .= "(".$info->{"Software"}.")" if $info->{"Software"};

    $focalength{$url} = $info->{"FocalLength"} if $info->{"FocalLength"};
    $flash{$url} = $info->{"Flash"} if $info->{"Flash"};
    $exposure{$url} = $info->{"Exposure"} if $info->{"Exposure"};
    $iso{$url} = $info->{"ISO"} if $info->{"ISO"};

    
    # top directory image
    push(@{$subdirsimages{$topdirstring}}, $url) and return if ($File::Find::dir eq $path);

    # other images
    # identify subdir by removing the path and keeping only the resulting top directory
    my ($subdir,) = split("/", substr($File::Find::dir, length($path)+1, length($File::Find::dir)));
    push(@{$subdirsimages{$subdir}}, $url);
        
    return;
}


### RUN

chdir($path) or die "Unable to enter $path, exiting";

# list images in top dir and subdirectories
find(\&wanted, $path);

# create the output
# first check if mandatory files are there
foreach my $file (@mandatory) {
    next if -e "$path/".basename($file);
    system("/usr/bin/wget", $file);
}
mkdir("$path/font") unless -e  "$path/font";
chdir("$path/font");
foreach my $file (@mandatoryfont) {
    next if -e "$path/font/".basename($file);
    system("/usr/bin/wget", $file);
}
chdir($path);
    
# create index
open(INDEX, "> $path/index.html");
print INDEX '<!DOCTYPE html
	PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta name="generator" content="dir2nanog.pl" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <!-- jQuery -->
    <script type="text/javascript"  src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
    <!-- nanogallery2 -->
    <link href="nanogallery2.min.css" rel="stylesheet" type="text/css" />
    <script  type="text/javascript" src="jquery.nanogallery2.min.js"></script>
    <!-- other -->
    <link type="text/css" rel="stylesheet" href="style.css" />
    <title>'.basename($path).'</title>
  </head>
  <body>
';


my $galleriescount = 0;

for (reverse(sort(keys(%subdirsimages)))) {
    
    # top dir gallery has no specific ID or title
    if ($galleriescount > 0) {	
	print INDEX '    <h1>'.$_."</h1>\n";
	print INDEX '    <div id="nanogallery2'.$_;
    } else {
	print INDEX '    <div id="nanogallery2';
    }

    # add gallery setup
    print INDEX '" data-nanogallery2 = \'{
      "viewerToolbar":   {
        "display":    true,
        "standard":   "label, infoButton",
        "minimized":  "label, infoButton"
      },
      "viewerTools":     {
        "topLeft":    "pageCounter, playPauseButton",
        "topRight":   "downloadButton, fullscreenButton, closeButton"
      },
      "thumbnailWidth": "auto",
      "galleryDisplayMode": "pagination",
      "galleryMaxRows": 3,
      "thumbnailHoverEffect2": "borderLighter"
    }\'>
';

    # add image list
    for (sort(@{$subdirsimages{$_}})) {
	my $extra;
	$extra .= ' data-ngexiflocation="'.escapeHTML($gps{$_}).'"' if $gps{$_};
	$extra .= ' data-ngexifmodel="'.escapeHTML($model{$_}).'"' if $model{$_};
	$extra .= ' data-ngexiffocallength="'.escapeHTML($focalength{$_}).'"' if $focalength{$_};
	$extra .= ' data-ngexifflash="'.escapeHTML($flash{$_}).'"' if $flash{$_};
	$extra .= ' data-ngexifexposure="'.escapeHTML($exposure{$_}).'"' if $exposure{$_};
	$extra .= ' data-ngexifiso="'.escapeHTML($iso{$_}).'"' if $iso{$_};

	# if no comment set, check if the filename might be YYMMDD_
	$comment{$_} = "$3/$2/20$1" if (!$comment{$_} and /(\d{2})(\d{2})(\d{2})_[^\\]*$/);  	
	
	print INDEX '      <a href="'.$_.'" data-ngThumb="'.thumbnail_name($_).'"'.$extra.'>'.escapeHTML($comment{$_})."</a>\n";
    }
print INDEX '
	</div>
';
    $galleriescount++;
}

# finish index
print INDEX '</body>
</html>';

close(INDEX);

# EOF 

To run it:

chmod +x /usr/local/bin/dir2nanog.pl
/usr/local/bin/dir2nanog.pl /var/www/html/pictures/

I have the following crontab for the dedicated user:

*/15 * * * *     export LC_ALL=fr_FR.UTF-8 && /usr/local/bin/dir2nanog.pl /var/www/html/pictures/thistopic

Dealing with Nextcloud’s broken oc_flow_operations table after the 19 upgrade

I finally upgraded my Nextcloud server, using nanogallery2 instead of relying on the brand new Photos app, since it does not provide a simple way to share a gallery and this is clearly not about to be resolved (the developers, who at first considered there was no problem to fix, eventually said that people were too harsh in trying to convince them otherwise, so they lost interest in fixing it). Well, in any case, using static files for this purpose is much more efficient.

During the upgrade, I hit an issue with the oc_flow_operations table that is mentioned in numerous reports (the whole first page of a Google search).

I first fixed it, as suggested by some users, by adding a missing column. That allowed the upgrade to finish, but something was still off, with the logfile growing by a few MB every minute.
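
For the record, that first attempt was a single ALTER along these lines (entity is only an example here: use whichever column your own upgrade error complains about):

mysql -udebian-sys-maint -p nextcloud -e "ALTER TABLE oc_flow_operations ADD entity VARCHAR(256);"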

Finally, recreating the whole table fixed it for me:

cat /etc/mysql/debian.cnf | grep password
mysql -udebian-sys-maint -p
connect nextcloud;
DROP table oc_flow_operations ;
CREATE table oc_flow_operations (id   int(11) NOT NULL  AUTO_INCREMENT PRIMARY KEY, class  varchar(256), name     varchar(256) NOT NULL, checks   longtext, operation longtext,entity  varchar(256),events longtext);
su www-data -s /bin/bash
php occ maintenance:repair
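
Afterwards, it is worth checking that the instance reports a sane state and that the log stopped growing, along these lines (run from the Nextcloud directory; nextcloud.log sits in the data directory by default):

su www-data -s /bin/bash
php occ status
tail -f data/nextcloud.log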

Fetching mails from gmail.com with lieer/notmuch instead of fetchmail

Google is gradually making traditional IMAPS access to gmail.com impossible, in its usual opaque way. It is claimed to be a security plus; though, if your data is already on gmail.com, you have already accepted that it can be and is spied on, so whether the extra fuss is justified is not so obvious.

Nonetheless, I have a few old secondary boxes on gmail that I’d still like to extract mail from, so as not to miss anything, without bothering to connect with a web browser. But the usual fetchmail setup is no longer reliable.

The easiest alternative is to use lieer, notmuch and procmail together.

apt install links notmuch lieer procmail

# set up as regular user
su enduser

boxname=mygoogleuser
mkdir -p ~/mail/$boxname
notmuch
notmuch new
cd ~/mail/$boxname
gmi init $boxname@gmail.com

# at this point, you'll get a URL to authorize access to gmail.com
# with a web browser
#
# if it started links instead, exit it cleanly (no control-C)
# to get the URL
#
# the authorization then redirects to a localhost:8080 URL,
# which leads nowhere if you are on a distant server;
# in that case, just run, in another terminal:
links "localhost:8080/..."

The setup should now be OK. You can check and toy a bit with it:

gmi sync
notmuch new
notmuch search tag:unread
notmuch show --format=mbox thread:000000000000009c

Then you should mark all previous messages as read:

for thread in `notmuch search --output=threads tag:unread`; do echo "$thread" && notmuch tag -unread "$thread" ; done
gmi push

Finally, we set up the mail forwarding (in my case, the fetching is done in a dedicated container, so mail is forwarded to ‘user’ by SMTP on the destination host, but procmail allows any setup) and the fetch script: each new unread mail is forwarded and then marked as read:

echo ':0
! user' > ~/.procmailrc

echo '#!/bin/bash

BOXES=""

# no concurrent run
if pidof -o %PPID -x "fetchmail.sh">/dev/null; then
        exit 1
fi

for box in $BOXES; do
    cd ~/mail/$box/
    # get data
    gmi pull --quiet
    notmuch new --quiet >/dev/null
    for thread in `notmuch search --output=threads tag:unread`; do
	# send unread through procmail
	notmuch show --format=mbox "$thread" | procmail
	# mark as read
	notmuch tag -unread "$thread"
    done
    # send data
    gmi push --quiet
    cd
done

# EOF' > ~/fetchmail.sh
chmod +x ~/fetchmail.sh

Then it is enough to call the script (with BOXES="boxname" properly set) and to include it in a cronjob, for instance with crontab -e.
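
For instance, a crontab entry of this kind (the every-10-minutes schedule is arbitrary):

*/10 * * * *     $HOME/fetchmail.sh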

notmuch does not remove mails.

Downgrading Nextcloud 18.0.3.0 to 17.0.5

I upgraded Nextcloud to the latest version without realizing the Gallery app had been replaced by a half-baked “Photos” app, completely useless for sharing pictures in any way relevant to me, on top of a bug with uBlock Origin making the whole “sharing” interface disappear.

Rolling back is not as easy as it seems: the “occ” PHP tool checks whether the version in your config matches the software version and, if it does not, plainly refuses to work with:

An unhandled exception has been thrown:
OC\HintException: [0]: Downgrading is not supported and is likely to cause unpredictable issues (from 18.0.3.0 to 17.0.5.0) ()

So you need to update this version in config/config.php. Then, playing only with occ app:list, occ app:remove and occ app:install, you can get back to a working install.
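
Concretely, that means forcing the version back to the target release in config/config.php, something like (version numbers as in my case):

sed -i "s/'version' => '18.0.3.0'/'version' => '17.0.5.0'/" config/config.php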

Update: since then, nothing has changed; Nextcloud 20 has been released and it looks like the developers have no plan to deal with this issue. Since users were pissed at their refusal to consider that there was an issue to fix, they have now lost interest in fixing the issue they did not want to acknowledge in the first place.

Completely resizing a wordpress.com blog’s media gallery

I found no convenient way to resize a whole media gallery on wordpress.com with the free plan, which does not allow installing plugins. Aside from that, I find it strange that WordPress itself still does not prevent duplicate media using checksums or the like.

I had a blog whose media gallery was reaching the upload limit of the free plan, and it contained tons of very high-res pictures that could actually be downsized without any problem.

I found no convenient way to replace images (with the free plan and no plugins). If you re-upload the same file after deleting it, it gets an extra -1/-2 suffix, etc.: wordpress.com clearly keeps the deleted media names in the database and prevents them from being reused, so that is a no-go.

The only solution I found was to:

  • export both posts and media files;
  • delete all files and posts;
  • run scripts to update the image files and the XML export/import files;
  • reimport everything with new filenames.

It is not perfect; some data will be lost, namely old galleries, some posts’ front image choice, etc. Below is the first approach (1, 2, 3) and then a second one (1, 2+3, 4) that tries to do smarter things (but is therefore more likely to break soon).

1. Renaming and downsizing the images

# to run in the exported media directory (extracted in a single directory)    
date=`date +%H%M%S`
backup=`mktemp --directory --tmpdir=$PWD -t $date-XXX.bak`
    
for file in  *.png *.jpg *.jpeg; do
	# skip if not a file
	if [ ! -f "$file" ]; then continue; fi

	# rename:
	newfile="BLOGNAME.wordpress.com-"$file
	echo "$file => $newfile"
	
	# limit the max dimension to 1600px - adjust to whatever is decent for the full-size view
	convert "$file" -resize 1600x1600\> "$newfile"
	mv "$file" $backup/
	
done

2. Updating the post export/import with the new image filenames

#!/usr/bin/perl
# to be saved as a perl script file,
# edit it (especially THISBLOG and 2020/03, the YYYY/MM of the new upload that will
# automatically end up in the URL of the media files during upload)
# and then run it against the export files like:
# chmod +x ./thiscript.pl
# ./thiscript.pl EXPORT.xml

# note: wordpress.com renames -- into - during upload

use strict;


open(IN, "< $ARGV[0]");
open(OUT, "> edited.$ARGV[0]");
while (<IN>) {
    s@\.files\.wordpress.com\/\d{4}\/\d{2}\/@.files.wordpress.com/2020/03/THISBLOG.wordpress.com-re.up-@ig;
    print OUT $_;
}
close(IN);
close(OUT);

3. Checking that images appear in posts

#!/usr/bin/perl
#
# make sure every image downloaded actually exists in posts
# also to be saved as a script and run against the xml import/export files and the media directory
# ./thiscript.pl THISBLOG.wordpress.2020-03-29.001.xml #../images_updated

use strict;
use warnings;
use Regexp::Common qw/URI/;
use File::Basename;
use File::Copy;

my %images;

open(IN, "< $ARGV[0]");
while (<IN>) {
    my $url;
    next unless /($RE{URI}{HTTP}{-keep})/;
    $url = $1;
    next unless $url =~ /THISBLOG\.files\./;
    my ($basename, $parentdir, $extension) = fileparse($url, qr/\.[^.]*$/);

    if ($extension  =~ /^\.(png|jpg|jpeg|gif)$/i) {
	#print "$basename$extension ($url)\n";
	$images{"$basename$extension"} = "$basename$extension"
	    unless $images{"$basename$extension"};
    } else {
	#print "IGNORE $basename$extension ($url)\n";
    }

}

foreach my $image (sort(keys(%images))) {
    print "$image\n";
    if ($ARGV[1]) {
	if ((-d "$ARGV[1]") and (-e "$ARGV[1]/$image")) {
	    mkdir("$ARGV[1]/valide") unless -d "$ARGV[1]/valide";
	    copy("$ARGV[1]/$image", "$ARGV[1]/valide/") or print "failed to copy $image to $ARGV[1]/valide/\n";
	}
    }
}

These scripts are primitive (sure, even the blog name and the upload YYYY/MM are hardcoded). Since platforms like wordpress.com change often, this might no longer work later on. Many pages on the Internet claim you can simply erase an image and re-upload it with the same name: this clearly no longer works. This page could save you some trial and error by giving a solution that works as of today.

Note also that some wordpress.com themes have front-page image and other selections that won’t be carried over.

2+3. Updated script to update xml and check images

I wrote the following script for a smoother process. Being more complex, it is also more likely to be fragile. It still misses handling/removing some specific outdated wp:metadata. This one lists duplicate, unused and missing files.

#!/usr/bin/perl
#
# ./renew-url+checkimages.pl FOLDER_OF_XML FOLDER_OF_IMAGES (no subdirs)

use strict;
use warnings;
use Regexp::Common qw/URI/;
use File::Basename;
use File::Copy;
use File::Slurp;
use Cwd 'abs_path';
use MIME::Base64;
use URI;

my $blog = "zukowka";
my $newprefix = "zukowka.wordpress.com-re.up-";
my $upload_yyyymm = "2020/03"; # time of the new upload
my $images_types = "png|jpg|jpeg|gif";

my %oldurl_newurl;       # {original} = new  
my %imagebase64_newurl;    # {base64} = new_url


die "$0 FOLDER_OF_XML FOLDER_OF_IMAGES (no subdirs)" unless $ARGV[0] and$ARGV[1];



# get dir full path 
my $xmldir;
$xmldir = abs_path($ARGV[0])
    if $ARGV[0];
# get dir full path 
my $imgdir;
$imgdir = abs_path($ARGV[1])
    if $ARGV[1];


my ($unused,$dupes,$missing,$mapped);
open(DUPES, "> $0.duplicatedimages");  # same image found with different names/URL
open(MISSING, "> $0.missingimages");   # file listed in xml not found in the image directory
open(UNUSED, "> $0.unusedimages");   #  file found in the image directory but not listed
open(MAPPING, "> $0.mapping");


# test access to dir
print "XML dir (ARG1): $xmldir
images dir (ARG2): $imgdir\n";
chdir($xmldir) if -d $xmldir or die "unable to enter $xmldir, exiting"; 
chdir($imgdir) if -d $imgdir or die "unable to enter $imgdir, exiting";

# - slurp all urls in import files
# - check if the image listed exist, store with base64 so we keep only one
chdir($xmldir);
while (defined(my $file = glob("*.xml"))) {
    print "### slurp $file ###\n";
    open(IN, "< $file");
    while (<IN>) {
	while (/($RE{URI}{HTTP}{-scheme => qr@https?@}{-keep})/g) {
	    # remove the http/https and possible args to keep the most portable url 
	    my $uri = URI->new($1);
	    my $url = $uri->authority.$uri->path;

	    
	    # images URL always start with $blog.files.wordpress.com
	    next unless $url =~ /^$blog\.files\./;
	    
	    my ($basename, $parentdir, $extension) = fileparse($url, qr/\.[^.]*$/);
	    my $newurl = "$blog.files.wordpress.com/$upload_yyyymm/$newprefix$basename$extension";
	    
	    # skip if the url was already mapped
	    if ($oldurl_newurl{$url}) {
		#print "SEEN ALREADY (skipping) $url\n";
		next;
	    }
	    
	    # work only on images
	    next unless lc($extension) =~ /^\.($images_types)$/i;
	    
	    # check if the relevant image exists in the image folder
	    my $newimage = "$imgdir/$newprefix$basename$extension";
	    unless (-e $newimage) { 
		print "MISSING (skipping) $newimage\n";
		print MISSING "$newimage\n";
		$missing++;
		next;
	    }
	    
	    # get image base64
	    my $base64 = encode_base64(read_file($newimage));
	    
	    # find out if this exact image is already known
	    if ($imagebase64_newurl{$base64}) {
		# already known, will point to the first one found
		print "DUPES $newprefix$basename$extension\n";
		print DUPES "$newprefix$basename$extension:\n\t$url => $imagebase64_newurl{$base64}\n";
		$dupes++;
		
		# URLs will point to the first image found
		#   full form like http://blog.files.wordpress.com/YYYY/MM/file.jpg
		$oldurl_newurl{$url} = $imagebase64_newurl{$base64};
		#   short form like http://blog.wordpress.com/file/
		my ($base64_basename, $base64_parentdir, $base64_extension) = fileparse($imagebase64_newurl{$base64}, qr/\.[^.]*$/);
		$oldurl_newurl{"$blog.wordpress.com/$basename/"} = "$blog.wordpress.com/$base64_basename/" if
		    "$blog.wordpress.com/$basename/" ne "$blog.wordpress.com/$base64_basename/";
		
	    } else {
		# not yet known: store the base64 with its full form url
		$imagebase64_newurl{$base64} = $newurl;
		
		# URLs will point to the first image found
		#   full form like http://blog.files.wordpress.com/YYYY/MM/file.jpg
		$oldurl_newurl{$url} = $newurl;
		#   short form like http://blog.wordpress.com/file/
		$oldurl_newurl{"$blog.wordpress.com/$basename/"} = "$blog.wordpress.com/$newprefix$basename/" if
		    "$blog.wordpress.com/$basename/" ne "$blog.wordpress.com/$newprefix$basename/";
			
	    }
	}
    }
    close(IN);
}

# store mappings
my %used;
foreach my $oldurl (sort(keys(%oldurl_newurl))) {
    my $newurl = $oldurl_newurl{$oldurl};
    print MAPPING "$oldurl => $newurl\n";
    my ($basename, $parentdir, $extension) = fileparse($newurl, qr/\.[^.]*$/);
    $used{"$basename$extension"} = 1;
    $mapped++;
}


# build import xml 
chdir($xmldir);
mkdir("$xmldir/renew") unless -d "$xmldir/renew";
while (defined(my $file = glob("*.xml"))) {
    print "### rewrite $file ###\n";
    open(OUT, "> renew/renewed-$file");
    open(IN, "< $file");
    while (<IN>) {
	my $line = $_;
	# check every url
	while (/($RE{URI}{HTTP}{-scheme => qr@https?@}{-keep})/g) {
	    # remove the http/https and possible args to keep the most portable url 
	    my $uri = URI->new($1);
	    my $url = $uri->authority.$uri->path;
	    
	    # update if mapping registered
	    if ($oldurl_newurl{$url}) {
		$line =~ s/$url/$oldurl_newurl{$url}/g;
		#print "$url -> $oldurl_newurl{$url}\n"
	    }
	}
	print OUT $line;
    }
    close(OUT);
    close(IN);	
}

# finally list useless images
chdir($imgdir);
mkdir("$xmldir/renew") unless -d "$xmldir/renew";
while (defined(my $file = glob("*"))) {    
    # work only on images
    my ($basename, $parentdir, $extension) = fileparse($file, qr/\.[^.]*$/);
    next unless lc($extension) =~ /^\.($images_types)$/i;

    # check if registered yet
    next if $used{$file};

    # if we reach this point, this media is unknown
    $unused++;
    print UNUSED $file, "\n";
}

close(MAPPING);
close(DUPES);
close(MISSING);
close(UNUSED);


print "=============================
$mapped mapped URL
$missing missing files (!)
$dupes duplicated files/links
$unused unused files\n";

# EOF

Grabbing the new images’ post_id

They are used in galleries in the form <!-- wp:gallery {"ids":[...]} -->. To use this script, you must compare the new XML produced by the previous script with a new XML export made by wordpress.com AFTER uploading the new images.

The point is to get the post_id of each newly uploaded image and to map it to the post_id of the old, removed image.


#!/usr/bin/perl
#
# ./update-post_id.pl FOLDER_OF_XML_UPDATED FOLDER_OF_XML_AFTER_IMAGE_REUPLOAD
#
#
# galleries are <!-- wp:gallery {"ids":[27,...]} --> blocks
# referring to the <wp:post_id>27</wp:post_id> of image attachments.
#
# reuploaded files have new post_id along with new metadata hardcoded in the database
# 	<guid isPermaLink="false">http://XX.files.wordpress.com/2020/03/XXX-img_20160814_125558.jpg</guid>
#       <wp:post_type>attachment</wp:post_type>
# should match
#       <wp:attachment_url>https://XX.files.wordpress.com/2020/03/XXX-img_20160814_125558.jpg</wp:attachment_url>


use strict;
use warnings;
use Regexp::Common qw/URI/;
use File::Basename;
use File::Copy;
use File::Slurp;
use Cwd 'abs_path';
use MIME::Base64;
use URI;
use XML::LibXML;

die "$0 FOLDER_OF_XML_UPDATED FOLDER_OF_XML_AFTER_IMAGE_REUPLOAD" unless $ARGV[0] and$ARGV[1];



# get dir full path 
my $xmldir;
$xmldir = abs_path($ARGV[0])
    if $ARGV[0];
# get dir full path 
my $xmlafterreuploaddir;
$xmlafterreuploaddir = abs_path($ARGV[1])
    if $ARGV[1];

# test access to dir
print "XML dir (ARG1): $xmldir
XML after reupload dir (ARG2): $xmlafterreuploaddir\n";
chdir($xmldir) if -d $xmldir or die "unable to enter $xmldir, exiting"; 
chdir($xmlafterreuploaddir) if -d $xmlafterreuploaddir or die "unable to enter $xmlafterreuploaddir, exiting";


my %guid_postid;  # {guid} = postid



chdir($xmlafterreuploaddir);
# get postid after reupload
while (defined(my $file = glob("*.xml"))) {
    print "### read $file ###\n";
    my $dom = XML::LibXML->load_xml(location=>$file);

    foreach my $e ($dom->findnodes('//item')) {	
	#	print $e->to_literal();

	# only care about attachment-type posts
	next unless $e->findvalue('./wp:post_type') eq 'attachment';
	
	# store new post_id with the guid as key 	
	$guid_postid{$e->findvalue('./guid')} = $e->findvalue('./wp:post_id');
    }   
}


my %old2new_postid; # {old} = new


chdir($xmldir);
# get postid in first export
while (defined(my $file = glob("*.xml"))) {
    print "### read $file ###\n";
    my $dom = XML::LibXML->load_xml(location=>$file);

    foreach my $e ($dom->findnodes('//item')) {
	# only care about attachment-type posts
	next unless $e->findvalue('./wp:post_type') eq 'attachment';

	# ignore if this guid was not found/replaced
	next unless $guid_postid{$e->findvalue('./guid')};

	# map postids
	$old2new_postid{$e->findvalue('./wp:post_id')} = $guid_postid{$e->findvalue('./guid')};	
	print $e->findvalue('./wp:post_id')." -> ".$guid_postid{$e->findvalue('./guid')}."\n";
    }   
}


# finally, with this mapping, edit xml gallery entries:
# build import xml 
chdir($xmldir);
mkdir("$xmldir/newpostid") unless -d "$xmldir/newpostid";
while (defined(my $file = glob("*.xml"))) {
    print "### rewrite $file ###\n";
    open(OUT, "> newpostid/$file");
    open(IN, "< $file");
    while (<IN>) {
	my $line = $_;
	# older galleries (shortcode form)
	# [gallery ids="..."]
	while (/\[gallery ids="([\d|,]*)".*]/g) {
	    print "$1\n";
	    my $original = $1;
	    my @new_ids;
	    foreach my $id (split(",", $original)) {
		if ($old2new_postid{$id}) {
		    push(@new_ids, $old2new_postid{$id});
		} else {
		    # if not found, push back the original one
		    push(@new_ids, $id);
		}
	    }
	    my $new = join(",", @new_ids);
	    print " => $new\n";

	    $line =~ s/\,"columns":2} -->	
	while (/\<\!\-\- wp\:gallery \{\"ids\"\:\[([\d|,]*)\].*\}/g) {
	    print "$1\n";
	    my $original = $1;
	    my @new_ids;
	    foreach my $id (split(",", $original)) {
		if ($old2new_postid{$id}) {
		    push(@new_ids, $old2new_postid{$id});
		} else {
		    # if not found, push back the original one
		    push(@new_ids, $id);
		}
	    }
	    my $new = join(",", @new_ids);
	    print " => $new\n";

	    $line =~ s/\<\!\-\- wp\:gallery \{\"ids\"\:\[$original\]/<!-- wp:gallery {"ids":[$new]/g;
	    print $line;
	}

	print OUT $line;
    }
    close(OUT);
    close(IN);	
}


# EOF

SPF-aware greylisting with Exim and memcache

This is a follow-up to my 2011 article avoiding Spams with SPF and greylisting within Exim. What has changed since then? I am not harassed by spam any more than I was back then: it works, and I have been spam-free for a decade now. However, several important mail providers have a tendency to send mail through multiple SMTPs, so many of them that it takes a while before any single one makes at least two attempts. So some mails take ages to pass the greylist.

While contemplating the idea of using OpenSMTPD, I incidentally found an interesting proposal to mix greylisting of IPs with SPF-validated domains.

The idea is that you greylist either a single SMTP IP, or a whole domain covering any SMTP IP approved by its SPF record.

I updated the memcached-exim.pl script previously used and described. It was simplified, because I do not think it useful to greylist per sender and recipient; per IP or domain is enough. Now it greylists either the IP alone, when not validated by SPF, or the domain and the IP on SPF success (to save a few further SPF tests).

I do not think it should have any noticeable impact on the server’s behaviour. SPF is checked anyway and, with a local caching DNS on my mail servers, the extra lookups cost next to nothing.

The earlier /etc/exim4/memcached.conf is actually no longer required (the defaults are enough). You still need the Exim configuration counterparts: /etc/exim4/conf.d/main/00_stalag13-config_0greylist and /etc/exim4/conf.d/acl/26_stalag13-config_check_rcpt.

Delisting an Exim4 server from the Office365 ban list

Ever tried to get delisted from the Office365 ban list, whatever the reason you ended up on it (a new IP for a server that was abused in the past, or something else – you won’t know, since they won’t tell, and it even looks like they probably don’t really know themselves)?

It is a funny process, because it involves receiving a mail from their servers, a mail that will probably be flagged as spam, with clues so big that it might even be blocked at SMTP time.

With Exim4, you’ll probably get something like this in the log:

2019-08-20 22:18:09 1i0Aa1-0004Hu-8h H=mail-eopbgr740042.outbound.protection.outlook.com (NAM01-BN3-obe.outbound.protection.outlook.com) [40.107.74.42] X=TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256 CV=no F=<no-reply@microsoft.com> rejected after DATA: maximum allowed line length is 998 octets, got 3172

Long story short (this line-length test is not welcomed by all users), add to /etc/exim4/conf.d/main/00_localoptions:

IGNORE_SMTP_LINE_LENGTH_LIMIT=1

and then restart the server.
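
In command form (update-exim4.conf regenerates the Debian split configuration before the restart):

echo 'IGNORE_SMTP_LINE_LENGTH_LIMIT=1' >> /etc/exim4/conf.d/main/00_localoptions
update-exim4.conf && systemctl restart exim4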

Try delisting again and check your spam folder: you should now get the relevant mail. Whatever we think of Exim4’s line-length test (based on an RFC, isn’t it?), you still end up with a mail sent by Office365 looking like this:

X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.2 required=3.4 tests=BASE64_LENGTH_79_INF,
	HTML_IMAGE_ONLY_08,HTML_MESSAGE,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,
	MPART_ALT_DIFF,SPF_HELO_PASS,SPF_PASS autolearn=no autolearn_force=no
	version=3.4.2
X-Spam-Report: 
	* -0.0 SPF_PASS SPF: sender matches SPF record
	* -0.0 SPF_HELO_PASS SPF: HELO matches SPF record
	*  0.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
	*  0.7 MPART_ALT_DIFF BODY: HTML and text parts are different
	*  0.0 HTML_MESSAGE BODY: HTML included in message
	*  1.8 HTML_IMAGE_ONLY_08 BODY: HTML: images with 400-800 bytes of
	*      words
	*  2.0 BASE64_LENGTH_79_INF BODY: base64 encoded email part uses line
	*      length greater than 79 characters
	*  0.0 MIME_HTML_ONLY_MULTI Multipart message only has text/html MIME
	*      parts

Considering the context, it screams incompetence.

No-fuss setting of user-specific locales (for instance for XFCE with lightdm or slim)

At the end of 2018, you’d think that, by now, locale setup should no longer be a concern. But still, in the case of a user-specific configuration mismatching the system locale (granted, that must not be so common), I got various odd results. Like lightdm not setting anything, no matter what you select in the login window. Or, worse, a half-assed setup, with LANG being set and then unset, or LANGUAGE not being set but still expected by some apps, on a desktop with no option to configure it, like XFCE.

After a few tests, it turns out that the user’s .xsessionrc works perfectly, independently of the desktop environment or the login manager:

echo "export LANG=fr_FR.UTF-8
export LANGUAGE=fr_FR.UTF-8" >> ~/.xsessionrc

with French (fr_FR) selected here.
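
To verify, after logging back in:

locale
echo $LANG $LANGUAGE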

Typing SSH passphrase(s) only once per session

Here’s a very simple way to type SSH passphrases only once per session. This function, to be added to your ~/.bashrc, makes sure that ssh-agent is always started and the key added before ssh is used, so you do not have to type your SSH passphrase more than once:

function sshwithauthsock {
 if [ ! -S ~/.ssh/ssh_auth_sock ]; then
   eval `ssh-agent`
   ln -sf "$SSH_AUTH_SOCK" ~/.ssh/ssh_auth_sock
 fi
 export SSH_AUTH_SOCK=~/.ssh/ssh_auth_sock
 ssh-add -l > /dev/null || ssh-add
 "$@" 
}

alias ssh='sshwithauthsock ssh'
alias scp='sshwithauthsock scp'

Check for a possibly updated version directly in my repository.