Downgrading Nextcloud 18.0.3.0 to 17.0.5

I upgraded Nextcloud to the latest release without realizing the gallery app had been replaced by a half-baked “photos” app, completely useless for sharing pictures in any way relevant to me, in addition to a bug with uBlock Origin making the whole “sharing” interface disappear.

Rolling back is not as easy as it seems: the “occ” PHP tool checks whether the version recorded in your config matches the software version, and if they do not match it just plainly refuses to work with:

An unhandled exception has been thrown:
OC\HintException: [0]: Downgrading is not supported and is likely to cause unpredictable issues (from 18.0.3.0 to 17.0.5.0) ()

So you need to edit the version recorded in config/config.php. Then, playing only with occ app:list, occ app:remove and occ app:install, you can get back to a working install.
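
For the record, the whole dance boils down to something like the following sketch (not a supported procedure; the /var/www/nextcloud path, the www-data user and the photos/gallery app names are assumptions to adapt to your own install):

# restore the 17.0.5 code and a matching database backup first,
# then work from the Nextcloud directory
cd /var/www/nextcloud

# occ refuses to run while config.php still claims 18.0.3.0,
# so lower the recorded version by hand
sudo -u www-data sed -i "s/'version' => '18.0.3.0'/'version' => '17.0.5.0'/" config/config.php

# then inspect and fix the apps until the install works again
sudo -u www-data php occ app:list
sudo -u www-data php occ app:remove photos      # e.g. drop the new app
sudo -u www-data php occ app:install gallery    # e.g. reinstall the old gallery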

Completely resizing a wordpress.com blog’s media gallery

I found no convenient way to resize a whole media gallery on wordpress.com with the free plan, which does not allow installing plugins. Aside from that, I find it strange that WordPress itself still does not prevent duplicate media using checksums or the like.

I had a blog with a media gallery reaching the upload limit of the free plan, and it contained tons of very high-resolution pictures that could be downsized without posing any problem.

I found no convenient way to replace images (with the free plan and no plugins). If you reupload the same file after deleting it, it gets an extra suffix -1/-2, etc.: wordpress.com clearly keeps the deleted media names in its database and prevents them from being reused, so that is a no-go.

The only solution I found was to:

  • export both posts and media files;
  • delete all files and posts;
  • run scripts to update the image files and the XML export/import files;
  • reimport everything with new filenames.

It is not perfect; some data will be lost, namely old galleries, the front image choice of some posts, etc. Below are the first approach (1, 2, 3) and a second one (1, 2+3, 4) that tries to do smarter things (but is then also more likely to break soon).

1. Renaming and downsizing the images

# to run in the exported media directory (extracted in a single directory)    
date=`date +%H%M%S`
backup=`mktemp --directory --tmpdir=$PWD -t $date-XXX.bak`
    
for file in *.png *.jpg *.jpeg; do
	# skip if not a file
	if [ ! -f "$file" ]; then continue; fi

	# rename:
	newfile="BLOGNAME.wordpress.com-"$file
	echo "$file => $newfile"
	
	# limit to 1600px on the longest side - adjust to whatever is a decent size for full view
	convert "$file" -resize 1600x1600\> "$newfile"
	mv "$file" "$backup"/
	
done

2. Updating the post export/import with the new image filenames

#!/usr/bin/perl
# to be saved as a perl script file,
# edit it (especially THISBLOG and 2020/03, the YYYY/MM of the new upload that
# will be added automatically to the URL of the media file during upload)
# and then
# run it against the export files like:
# chmod +x ./thiscript.pl
# ./thiscript.pl 

# note: wordpress.com renames -- to - during upload

use strict;


open(IN, "< $ARGV[0]");
open(OUT, "> edited.$ARGV[0]");
while (<IN>) {
    s@\.files\.wordpress.com\/\d{4}\/\d{2}\/@.files.wordpress.com/2020/03/THISBLOG.wordpress.com-re.up-@ig;
    print OUT $_;
}
close(IN);
close(OUT);

3. Checking if images appear in posts

#!/usr/bin/perl
#
# make sure every downloaded image actually appears in the posts
# this one too is to be saved and run against the xml import/export files and the media directory
# ./thiscript.pl THISBLOG.wordpress.2020-03-29.001.xml #../images_updated

use strict;
use warnings;
use Regexp::Common qw/URI/;
use File::Basename;
use File::Copy;

my %images;

open(IN, "< $ARGV[0]");
while (<IN>) {
    my $url;
    next unless /($RE{URI}{HTTP}{-keep})/;
    $url = $1;
    next unless $url =~ /THISBLOG\.files\./;
    my ($basename, $parentdir, $extension) = fileparse($url, qr/\.[^.]*$/);

    if ($extension  =~ /^\.(png|jpg|jpeg|gif)$/i) {
	#print "$basename$extension ($url)\n";
	$images{"$basename$extension"} = "$basename$extension"
	    unless $images{"$basename$extension"};
    } else {
	#print "IGNORE $basename$extension ($url)\n";
    }

}

foreach my $image (sort(keys(%images))) {
    print "$image\n";
    if ($ARGV[1]) {
	if ((-d "$ARGV[1]") and (-e "$ARGV[1]/$image")) {
	    mkdir("$ARGV[1]/valide") unless -d "$ARGV[1]/valide";
	    copy("$ARGV[1]/$image", "$ARGV[1]/valide/") or print "failed to copy $image to $ARGV[1]/valide/\n";
	}
    }
}

These scripts are primitive (sure, even the blog name and the upload YYYY/MM are hardcoded). Since platforms like wordpress.com change often, this may no longer work by the time you read it. Many pages on the Internet claim you can simply erase an image and reupload it with the same name: this clearly no longer works. This page may save you some of the trial-and-error needed to reach a solution that works as of today.

Note also that some wordpress.com themes have a front page image and other selections that won’t be carried over.

2+3. Updated script to update xml and check images

I wrote the following script for a smoother process. Being more complex, it is also more likely to be fragile. It still misses handling/removing some specific outdated wp:metadata. This one lists duplicate, unused and missing files.

#!/usr/bin/perl
#
# ./renew-url+checkimages.pl FOLDER_OF_XML FOLDER_OF_IMAGES (no subdirs)

use strict;
use warnings;
use Regexp::Common qw/URI/;
use File::Basename;
use File::Copy;
use File::Slurp;
use Cwd 'abs_path';
use MIME::Base64;
use URI;

my $blog = "zukowka";
my $newprefix = "zukowka.wordpress.com-re.up-";
my $upload_yyyymm = "2020/03"; # time of the new upload
my $images_types = "png|jpg|jpeg|gif";

my %oldurl_newurl;       # {original} = new  
my %imagebase64_newurl;    # {base64} = new_url


die "$0 FOLDER_OF_XML FOLDER_OF_IMAGES (no subdirs)" unless $ARGV[0] and$ARGV[1];



# get dir full path 
my $xmldir;
$xmldir = abs_path($ARGV[0])
    if $ARGV[0];
# get dir full path 
my $imgdir;
$imgdir = abs_path($ARGV[1])
    if $ARGV[1];


my ($unused,$dupes,$missing,$mapped) = (0,0,0,0);
open(DUPES, "> $0.duplicatedimages");  # same image found with different names/URL
open(MISSING, "> $0.missingimages");   # file listed in xml not found in the image directory
open(UNUSED, "> $0.unusedimages");   #  file found in the image directory but not listed
open(MAPPING, "> $0.mapping");


# test access to dir
print "XML dir (ARG1): $xmldir
images dir (ARG2): $imgdir\n";
chdir($xmldir) if -d $xmldir or die "unable to enter $xmldir, exiting"; 
chdir($imgdir) if -d $imgdir or die "unable to enter $imgdir, exiting";

# - slurp all urls in import files
# - check if the image listed exist, store with base64 so we keep only one
chdir($xmldir);
while (defined(my $file = glob("*.xml"))) {
    print "### slurp $file ###\n";
    open(IN, "< $file");
    while (<IN>) {
	while (/($RE{URI}{HTTP}{-scheme => qr@https?@}{-keep})/g) {
	    # remove the http/https and possible args to keep the most portable url 
	    my $uri = URI->new($1);
	    my $url = $uri->authority.$uri->path;

	    
	    # images URL always start with $blog.files.wordpress.com
	    next unless $url =~ /^$blog\.files\./;
	    
	    my ($basename, $parentdir, $extension) = fileparse($url, qr/\.[^.]*$/);
	    my $newurl = "$blog.files.wordpress.com/$upload_yyyymm/$newprefix$basename$extension";
	    
	    # skip if the url was already mapped
	    if ($oldurl_newurl{$url}) {
		#print "SEEN ALREADY (skipping) $url\n";
		next;
	    }
	    
	    # work only on images
	    next unless lc($extension) =~ /^\.($images_types)$/i;
	    
	    # check if the relevant image exists in the image folder
	    my $newimage = "$imgdir/$newprefix$basename$extension";
	    unless (-e $newimage) { 
		print "MISSING (skipping) $newimage\n";
		print MISSING "$newimage\n";
		$missing++;
		next;
	    }
	    
	    # get image base64
	    my $base64 = encode_base64(read_file($newimage));
	    
	    # find out if this exact image is already known
	    if ($imagebase64_newurl{$base64}) {
		# already known, will point to the first one found
		print "DUPES $newprefix$basename$extension\n";
		print DUPES "$newprefix$basename$extension:\n\t$url => $imagebase64_newurl{$base64}\n";
		$dupes++;
		
		# URLs will point to the first image found
		#   full form like http://blog.files.wordpress.com/YYYY/MM/file.jpg
		$oldurl_newurl{$url} = $imagebase64_newurl{$base64};
		#   short form like http://blog.wordpress.com/file/
		my ($base64_basename, $base64_parentdir, $base64_extension) = fileparse($imagebase64_newurl{$base64}, qr/\.[^.]*$/);
		$oldurl_newurl{"$blog.wordpress.com/$basename/"} = "$blog.wordpress.com/$base64_basename/" if
		    "$blog.wordpress.com/$basename/" ne "$blog.wordpress.com/$base64_basename/";
		
	    } else {
		# store the new full form url, keyed by the image content (base64)
		$imagebase64_newurl{$base64} = $newurl;
		
		# URLs will point to the first image found
		#   full form like http://blog.files.wordpress.com/YYYY/MM/file.jpg
		$oldurl_newurl{$url} = $newurl;
		#   short form like http://blog.wordpress.com/file/
		$oldurl_newurl{"$blog.wordpress.com/$basename/"} = "$blog.wordpress.com/$newprefix$basename/" if
		    "$blog.wordpress.com/$basename/" ne "$blog.wordpress.com/$newprefix$basename/";
			
	    }
	}
    }
    close(IN);
}

# store mappings
my %used;
foreach my $oldurl (sort(keys(%oldurl_newurl))) {
    my $newurl = $oldurl_newurl{$oldurl};
    print MAPPING "$oldurl => $newurl\n";
    my ($basename, $parentdir, $extension) = fileparse($newurl, qr/\.[^.]*$/);
    $used{"$basename$extension"} = 1;
    $mapped++;
}


# build import xml 
chdir($xmldir);
mkdir("$xmldir/renew") unless -d "$xmldir/renew";
while (defined(my $file = glob("*.xml"))) {
    print "### rewrite $file ###\n";
    open(OUT, "> renew/renewed-$file");
    open(IN, "< $file");
    while (<IN>) {
	my $line = $_;
	# check every url
	while (/($RE{URI}{HTTP}{-scheme => qr@https?@}{-keep})/g) {
	    # remove the http/https and possible args to keep the most portable url 
	    my $uri = URI->new($1);
	    my $url = $uri->authority.$uri->path;
	    
	    # update if mapping registered
	    if ($oldurl_newurl{$url}) {
		$line =~ s/$url/$oldurl_newurl{$url}/g;
		#print "$url -> $oldurl_newurl{$url}\n"
	    }
	}
	print OUT $line;
    }
    close(OUT);
    close(IN);	
}

# finally list useless images
chdir($imgdir);
mkdir("$xmldir/renew") unless -d "$xmldir/renew";
while (defined(my $file = glob("*"))) {    
    # work only on images
    my ($basename, $parentdir, $extension) = fileparse($file, qr/\.[^.]*$/);
    next unless lc($extension) =~ /^\.($images_types)$/i;

    # check if registered yet
    next if $used{$file};

    # if we reach this point, this media is unknown
    $unused++;
    print UNUSED $file, "\n";
}

close(MAPPING);
close(DUPES);
close(MISSING);
close(UNUSED);


print "=============================
$mapped mapped URL
$missing missing files (!)
$dupes duplicated files/links
$unused unused files\n";

# EOF

Grabbing the new images’ post_id

They are used in galleries in the form <!-- wp:gallery {"ids":[27,28]} -->. To use this script, you must compare the new XML produced by the previous script with a new XML export made by wordpress.com AFTER uploading the new images.

The point is to get the post_id of each newly uploaded image and map the post_id of the corresponding old, removed image to it.


#!/usr/bin/perl
#
# ./update-post_id.pl FOLDER_OF_XML_UPDATED FOLDER_OF_XML_AFTER_IMAGE_REUPLOAD
#
#
# galleries are <!-- wp:gallery {"ids":[27,28],"columns":2} -->
# referring to the <wp:post_id>27</wp:post_id> of image attachments.
#
# reuploaded files have new post_id along with new metadata hardcoded in the database
# 	<guid isPermaLink="false">http://XX.files.wordpress.com/2020/03/XXX-img_20160814_125558.jpg</guid>
#       <wp:post_type>attachment</wp:post_type>
# should match
#       <wp:attachment_url>https://XX.files.wordpress.com/2020/03/XXX-img_20160814_125558.jpg</wp:attachment_url>


use strict;
use warnings;
use Regexp::Common qw/URI/;
use File::Basename;
use File::Copy;
use File::Slurp;
use Cwd 'abs_path';
use MIME::Base64;
use URI;
use XML::LibXML;

die "$0 FOLDER_OF_XML_UPDATED FOLDER_OF_XML_AFTER_IMAGE_REUPLOAD" unless $ARGV[0] and$ARGV[1];



# get dir full path 
my $xmldir;
$xmldir = abs_path($ARGV[0])
    if $ARGV[0];
# get dir full path 
my $xmlafterreuploaddir;
$xmlafterreuploaddir = abs_path($ARGV[1])
    if $ARGV[1];

# test access to dir
print "XML dir (ARG1): $xmldir
XML after reupload dir (ARG2): $xmlafterreuploaddir\n";
chdir($xmldir) if -d $xmldir or die "unable to enter $xmldir, exiting"; 
chdir($xmlafterreuploaddir) if -d $xmlafterreuploaddir or die "unable to enter $xmlafterreuploaddir, exiting";


my %guid_postid;  # {guid} = postid



chdir($xmlafterreuploaddir);
# get postid after reupload
while (defined(my $file = glob("*.xml"))) {
    print "### read $file ###\n";
    my $dom = XML::LibXML->load_xml(location=>$file);

    foreach my $e ($dom->findnodes('//item')) {	
	#	print $e->to_literal();

	# only care about attachment type posts
	next unless $e->findvalue('./wp:post_type') eq 'attachment';
	
	# store new post_id with the guid as key 	
	$guid_postid{$e->findvalue('./guid')} = $e->findvalue('./wp:post_id');
    }   
}


my %old2new_postid; # {old} = new


chdir($xmldir);
# get postid in first export
while (defined(my $file = glob("*.xml"))) {
    print "### read $file ###\n";
    my $dom = XML::LibXML->load_xml(location=>$file);

    foreach my $e ($dom->findnodes('//item')) {
	# only care about attachment type posts
	next unless $e->findvalue('./wp:post_type') eq 'attachment';

	# ignore if this guid was not found/replaced
	next unless $guid_postid{$e->findvalue('./guid')};

	# map postids
	$old2new_postid{$e->findvalue('./wp:post_id')} = $guid_postid{$e->findvalue('./guid')};	
	print $e->findvalue('./wp:post_id')." -> ".$guid_postid{$e->findvalue('./guid')}."\n";
    }   
}


# finally, with this mapping, edit xml gallery entries:
# build import xml 
chdir($xmldir);
mkdir("$xmldir/newpostid") unless -d "$xmldir/newpostid";
while (defined(my $file = glob("*.xml"))) {
    print "### rewrite $file ###\n";
    open(OUT, "> newpostid/$file");
    open(IN, "< $file");
    while (<IN>) {
	my $line = $_;
	# older galleries
	# [gallery ids="27,28,..."]
	while (/\[gallery ids\=\"([\d|,]*)\".*\]/g) {
	    print "$1\n";
	    my $original = $1;
	    my @new_ids;
	    foreach my $id (split(",", $original)) {
		if ($old2new_postid{$id}) {
		    push(@new_ids, $old2new_postid{$id});
		} else {
		    # if not found, push back the original one
		    push(@new_ids, $id);
		}
	    }
	    my $new = join(",", @new_ids);
	    print " => $new\n";

	    $line =~ s/\[gallery ids\=\"$original\"/[gallery ids="$new"/g;
	    print $line;
	}

	# newer galleries
	# <!-- wp:gallery {"ids":[27,28],"columns":2} -->
	while (/\<\!\-\- wp\:gallery \{\"ids\"\:\[([\d|,]*)\].*\}/g) {
	    print "$1\n";
	    my $original = $1;
	    my @new_ids;
	    foreach my $id (split(",", $original)) {
		if ($old2new_postid{$id}) {
		    push(@new_ids, $old2new_postid{$id});
		} else {
		    # if not found, push back the original one
		    push(@new_ids, $id);
		}
	    }
	    my $new = join(",", @new_ids);
	    print " => $new\n";

	    $line =~ s/\<\!\-\- wp\:gallery \{\"ids\"\:\[$original\]/<!-- wp:gallery {"ids":[$new]/g;
	    print $line;
	}

	print OUT $line;
    }
    close(OUT);
    close(IN);	
}


# EOF

SPF-aware greylisting with Exim and memcache

This is a follow-up to my 2011 article avoiding Spams with SPF and greylisting within Exim. What changed since then? I am actually not harassed by spam any more than I was before. It works; I have been spam-free for a decade now. However, several important mail providers tend to send mail through multiple SMTP servers, so many of them that it can take a while before any single one makes at least two attempts. So some mail takes ages to pass the greylist.

Contemplating the idea of using OpenSMTPD, I incidentally found an interesting proposal to mix greylisting of IPs with SPF-validated domains.

The idea is that you greylist either a single SMTP IP, or a whole domain covering any SMTP IP approved by its SPF record.

I updated the memcached-exim.pl script previously used and described. It was simplified, because I do not think it useful to greylist per sender and recipient; per IP or domain is enough. Now it greylists only the IP when it is not validated by SPF, or both the domain and the IP on SPF success (to save a few further SPF tests).

I do not think it should have any noticeable impact on the server’s behavior: SPF is checked anyway, and the extra lookups are negligible since there is a local caching DNS on my mail servers.
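
To give an idea of the logic, here is a minimal sketch (not the actual memcached-exim.pl; the greylist function name, key prefix and delays are made up for the example, and the SPF result is assumed to have been computed earlier in the Exim ACL and passed along):

#!/usr/bin/perl
# minimal sketch: greylist per SPF-validated domain when possible, per IP otherwise
use strict;
use warnings;
use Cache::Memcached;

my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });
my $delay    = 300;      # seconds a brand new key has to wait
my $lifetime = 2592000;  # seconds a known key is remembered (30 days)

# $ip is the connecting SMTP IP, $domain the sender domain,
# $spf_result the outcome of the SPF check done in the ACL ("pass", "fail", ...)
sub greylist {
    my ($ip, $domain, $spf_result) = @_;

    # key on the domain if SPF vouches for this IP, on the bare IP otherwise
    my $key = ($spf_result eq 'pass') ? "grey-$domain" : "grey-$ip";

    my $first_seen = $memd->get($key);
    unless (defined $first_seen) {
        $memd->set($key, time(), $lifetime);
        # on SPF success also remember the IP itself, to spare a few further SPF tests
        $memd->set("grey-$ip", time(), $lifetime) if $spf_result eq 'pass';
        return 'defer';
    }
    return (time() - $first_seen >= $delay) ? 'accept' : 'defer';
}

The returned accept/defer string would then be consumed by the ACL to temporarily reject or let the message through.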

The earlier /etc/exim4/memcached.conf is actually no longer required (the defaults are enough). You still need the Exim configuration counterparts: /etc/exim4/conf.d/main/00_stalag13-config_0greylist and /etc/exim4/conf.d/acl/26_stalag13-config_check_rcpt.

Delisting an Exim4 server from Office365 ban list

Ever tried to get delisted from the Office365 ban list, for whatever reason you ended up on it (a new IP for a server that was abused in the past, or something else; you won’t know, since they won’t tell – and it even looks like they probably don’t really know themselves)?

It is a funny process, because it involves receiving a mail from their servers, a mail that will probably be flagged as spam, with clues so big that it might even be blocked at SMTP time.

With Exim4, you’ll probably get something like this in the log:

2019-08-20 22:18:09 1i0Aa1-0004Hu-8h H=mail-eopbgr740042.outbound.protection.outlook.com (NAM01-BN3-obe.outbound.protection.outlook.com) [40.107.74.42] X=TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256 CV=no F=<no-reply@microsoft.com> rejected after DATA: maximum allowed line length is 998 octets, got 3172

Long story short (this length test is not welcomed by all users), in /etc/exim4/conf.d/main/00_localoptions add

IGNORE_SMTP_LINE_LENGTH_LIMIT=1

and then restart the server.
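
On a Debian split configuration that boils down to regenerating the configuration and restarting, something like:

update-exim4.conf
systemctl restart exim4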

Try delisting again and check your spam folder: you should now get the relevant mail. Whatever we think of Exim4’s line length test (based on an RFC, isn’t it?), you still end up with a mail sent by Office365 like this:

X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.2 required=3.4 tests=BASE64_LENGTH_79_INF,
	HTML_IMAGE_ONLY_08,HTML_MESSAGE,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,
	MPART_ALT_DIFF,SPF_HELO_PASS,SPF_PASS autolearn=no autolearn_force=no
	version=3.4.2
X-Spam-Report: 
	* -0.0 SPF_PASS SPF: sender matches SPF record
	* -0.0 SPF_HELO_PASS SPF: HELO matches SPF record
	*  0.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
	*  0.7 MPART_ALT_DIFF BODY: HTML and text parts are different
	*  0.0 HTML_MESSAGE BODY: HTML included in message
	*  1.8 HTML_IMAGE_ONLY_08 BODY: HTML: images with 400-800 bytes of
	*      words
	*  2.0 BASE64_LENGTH_79_INF BODY: base64 encoded email part uses line
	*      length greater than 79 characters
	*  0.0 MIME_HTML_ONLY_MULTI Multipart message only has text/html MIME
	*      parts

Considering the context, it screams incompetence.

Typing SSH passphrase(s) only once per session

Here’s a very simple way to type SSH passphrases only once per session. The following function, to be added to your ~/.bashrc, makes sure ssh-agent is always set up before ssh is called, so you do not have to type your passphrase more than once:

function sshwithauthsock {
 # keep a stable socket path so every shell finds the same agent
 if [ ! -S ~/.ssh/ssh_auth_sock ]; then
   eval `ssh-agent`
   ln -sf "$SSH_AUTH_SOCK" ~/.ssh/ssh_auth_sock
 fi
 export SSH_AUTH_SOCK=~/.ssh/ssh_auth_sock
 # ask for the passphrase only if the agent has no identity loaded yet
 ssh-add -l > /dev/null || ssh-add
 "$@"
}

alias ssh='sshwithauthsock ssh'
alias scp='sshwithauthsock scp'

Check for possibly updated version directly in my repository.

 

Getting nginx’s wildcard-based server names to pass Exim HELO syntax checks

Many PHP-based apps, like webmails, when using SMTP functions, depend on nginx’s server_name value to set up the HELO they send.

But if your server_name value is wildcard-based, you’ll get “syntactically invalid argument(s)” from the SMTP server. Example with ownCloud.

Assuming that the SMTP server running on the same host as your webmail accepts mail only from the webmail itself, you can easily work around this. You can add

helo_allow_chars=^~

in, for example, /etc/exim4/conf.d/main/00_webmail, if your server name is something like ~^mx.
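
For illustration, this is the kind of nginx definition that triggers the problem (the exact regex is just an example); the leading ~ and ^ end up verbatim in the HELO, which is what Exim rejects until they are allowed by the line above:

server {
    listen 443 ssl;
    # regex-based name: PHP apps may reuse this string verbatim as their HELO
    server_name ~^mx.*$;
    # ... rest of the vhost
}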

 

Isn’t SRS breaking SPF itself, at least regarding spam?

Earlier on this blog, I proposed ways to implement SPF (Sender Policy Framework). I recently noticed mails forwarded by one of my servers being tagged as spam by gmail.com due to SPF checks. It means that while SPF works for my domains, with a user base near zero and no real forwarding business, it is a nuisance for forwarding in general. So you are advised to use SRS (Sender Rewriting Scheme). Strangely enough, it is not fully integrated in major servers, and some implementations (Exim in Debian) are based on an unmaintained library (the SRS C library).

Unmaintained?

Fact is, SRS is far from nice. It makes your own forwarding server vouch for forwarded mails. But why would you want that?

The SPF test fails because your forwarding server is not a registered valid source for (forwarded) mails sent from domain X. The SRS proposal is that your server alters the headers so that a mail forwarded from domain X appears to be sent from an address of your own domain, for which your server is a registered valid source.

I guess the logic is to make forwarders somehow responsible for filtering, which is not bad in principle.

But it also means that for each spam the forwarder fails to identify, it will be tagged as the spam’s originator. That is particularly annoying when forwarding is set up on public addresses bound to attract spam. So it seems better to get a failed SPF test on every forwarded message, including valid ones, than a valid SPF test on every forwarded message, including spam.

SPF without SRS breaks forwarding. But SPF with SRS, the workaround, breaks SPF itself regarding spam and will give you (your IPs, your domains) a bad reputation, which will put your legitimate mail at risk of being blacklisted, unless you apply an overly harsh policy on forwarded mails.

Annoying. I am thinking of removing SPF completely instead. For now, I am updating my SPF records to remove any Fail statement, since there is no way for me to know whether one of my mails can legitimately be forwarded through several servers. Funny enough, Google, which promotes SPF usage, recommends using SoftFail over Fail. But I might even reset to Neutral.
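
For reference, the difference between Fail, SoftFail and Neutral is only the qualifier on the final all mechanism of the SPF TXT record (example.org and its mechanisms here are hypothetical):

example.org.  IN TXT  "v=spf1 mx a:mail.example.org -all"   ; Fail: others should be rejected
example.org.  IN TXT  "v=spf1 mx a:mail.example.org ~all"   ; SoftFail: accepted but marked suspicious
example.org.  IN TXT  "v=spf1 mx a:mail.example.org ?all"   ; Neutral: asserts nothing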

Interesting links on the topic: Mail server setup with SRS; Why not SPF?

Alternative: I implemented DKIM on my servers. It seems much smarter to have a server signature.