Post processor script for flnews

Posted 04-15-2023 at 03:14 AM by Michael Uplawski
Updated 04-23-2023 at 03:40 PM by Michael Uplawski (screen shot, bugfix)

A styled version of this document: http://www.uplawski.eu/articles/flnews
The flnews newsreader is sufficient for Usenet access, i.e. to receive and read articles from newsgroups, as well as to write and post articles to them.
When you compare news clients, you will always notice the differences and choose the software that you prefer. Flnews, however, has the charm that you can not only influence how the program itself works but also modify the posts that flnews produces, just before the program transmits them to the chosen NNTP server.
On this page, I show you a post-processor script which can add and change details of a post in ways that are currently not possible with flnews alone. As the script is configurable, it can probably respond to the needs of some Usenet users. You should, however, rather take it as an example of what can be done and as an inspiration for your own creations.
Some background
These resources can help to give you further insight into Usenet (netnews, Newsgroups, the nntp-protocol) and news-readers like flnews:
The flnews newsreader
A fast and lightweight USENET newsreader for Unix by Michael Bäuerle
Netnews
What is Usenet and how does it work in 2023

The early history of Usenet: The evolution of Usenet when computer networks were in their infancy (a series of 9 blog posts).
Servers
Open-News-Network, an association of server-operators in Germany, providing high-quality Usenet access for free (registration required).
Commercial Usenet Service Providers – Many sites like this exist, although I fear that they share the same content.
Client software
A list of Usenet newsreaders on en.Wikipedia.org. A search for newsreaders can produce suggestions that are identical with the commercial Usenet providers, as these come with their respective Web-based application. Do not confuse a paid, time-limited account with a program that you wish to try for a while. Usenet is at its best when it is free.
The limits of a basic newsreader — what the script can do
While the articles that flnews creates are complete and ready to be posted, some users may not always agree with the result, for various reasons:
  • There may be inconveniences when you post to different newsgroups in different languages, as the introductory line which refers to a previous post can only be set once in the flnews configuration. As a consequence, your post to a French newsgroup may begin with an introduction in English.

    My post-processor script can set an introductory line specifically chosen for one or several newsgroups.

  • The same conflict arises when you have set a standard signature text and would like to replace it with another one, based on the newsgroup you are about to post to.

    The post-processor script sets specific signatures as configured for one or several newsgroups.

  • Some custom headers may serve to convey additional information to interested readers of your post, like GnuPG key IDs, your language skills or the like. The signature may be a better choice than custom headers. You are free to decide. I only mention Face and X-Face in passing and prefer that you forget I did.

    Custom headers may be defined in the configuration file for the script and will then be added to each outgoing post.

  • The X-No-Archive header is sometimes set to prevent an article from being saved and staying available to search engines (Google, notably). Test postings, for example, probably do not justify being referenced in search results at all.

    The post-processor script can impose the X-No-Archive header for all posts to certain newsgroups.

    One problem that my script does not yet address:
    A follow-up message to a post carrying the X-No-Archive header is not automatically exempted from archiving and might render X-No-Archive (as set by a previous poster) partly useless, because the quoted text is archived with the follow-up. This is a TODO and will probably be included in a future version of the script.
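The selection "per newsgroup" mentioned in the list above can be sketched as follows. This is only an illustration, not the code of the script: the helper pick_for_group and the sample table are invented here. An exact group name is preferred, otherwise the first pattern that matches is used. The patterns are not anchored, so "de.*" matches wherever "de" occurs in the group name.

```ruby
# Hypothetical helper: pick the value configured for a newsgroup.
# An exact group name wins; otherwise the first matching pattern is used.
def pick_for_group(table, group)
  return nil if table.nil? || table.empty?
  return table[group] if table.key?(group)

  # Keys such as "de.*" are treated as (unanchored) regular expressions.
  _pattern, value = table.find { |pattern, _| group.match?(Regexp.new(pattern)) }
  value
end

# Sample table, invented for this illustration.
sigs = {
  'fr.test' => 'signature for fr.test only',
  'de.*'    => 'Es ist an der Zeit',
  'fr.*'    => 'signature for all other fr groups'
}

puts pick_for_group(sigs, 'fr.test')           # exact name
puts pick_for_group(sigs, 'de.comp.lang.ruby') # pattern de.*
p    pick_for_group(sigs, 'alt.test')          # no entry at all
```

The same kind of lookup can serve for signatures, introductory lines and the X-No-Archive groups alike, as all three are keyed by newsgroup.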
The script and the configuration
My post-processor is written in Ruby. In my opinion, a scripting language (or interpreted language) is most suitable for a text-munging tool. Apart from making testing easier, it lets you handle the syntax of the text to process much like the syntax of the script that you are developing. Also, I have been sick of Java and have forgotten most of what I had ever known about C++, a few dozen standards ago.

But, of course, any programming language is okay for writing your own post-processor.
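To give an idea of how little is needed, here is a minimal sketch of such a filter in Ruby. It is not my script, only an illustration: the function add_header, the constant CRLF and the sample article are invented for this example, and I assume that the article uses "\r\n" as line break, with one empty line between headers and body.

```ruby
# Line break assumed for the article.
CRLF = "\r\n"

# Hypothetical minimal filter: add one header line, leave the rest alone.
def add_header(article, header_line)
  # The first empty line separates the headers from the body.
  head, separator, body = article.partition(CRLF + CRLF)
  head + CRLF + header_line + separator + body
end

# A sample article, invented for this illustration.
sample = ['From: me@example.invalid',
          'Newsgroups: alt.test',
          'Subject: test',
          '',
          'Hello world.'].join(CRLF)

print add_header(sample, 'X-My-Header: nothing fancy')
```

A real post-processor would take the article from standard input instead of the sample, for example with print add_header(STDIN.read, '...'), and write the result to standard output, as the script below does.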
Here is a link to the original script

The following code will not necessarily receive bug fixes and improvements, but the linked version should be up to date.
Code:
#!/usr/bin/env ruby
#encoding: UTF-8

=begin
/***************************************************************************
 *   ©2023-2023, Michael Uplawski <michael.uplawski@uplawski.eu>           *
 *                                                                         *
 *   This program is free software; you can redistribute it and/or modify  *
 *   it under the terms of the GNU General Public License as published by  *
 *   the Free Software Foundation; either version 3 of the License, or     *
 *   (at your option) any later version.                                   *
 *                                                                         *
 *   This program is distributed in the hope that it will be useful,       *
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
 *   GNU General Public License for more details.                          *
 *                                                                         *
 *   You should have received a copy of the GNU General Public License     *
 *   along with this program; if not, write to the                         *
 *   Free Software Foundation, Inc.,                                       *
 *   59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.             *
 ***************************************************************************/
=end

# 17/3/2023 Linux only
#
# 7/3/2023 Rewrite. Defined functions for routine work, condensed the
# main routine. Read configuration from file.
#
# 5/3/2023 cosmetics, superfluous module removed,
# orthography in comments, superfluous line-break removed.

#TODO: verify URLs for syntax conformance.

# This script is Linux only
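# '!~' is needed here: '! /linux/ =~ RUBY_PLATFORM' negates the Regexp
# first and never compares it to the platform.
if RUBY_PLATFORM !~ /linux/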
  msg = "ERROR ! This script #{$0} cannot be used with #{RUBY_PLATFORM}"
  msg << "\n\tABORTING! Bye."
  STDERR.puts msg
  exit false
end

require 'yaml'
require "ostruct"

# a frugal logging function
def debout(msg)
  if $log && !$log.empty?
    begin
      log = File.open($log, 'a') 
      log.puts "\n" << Time.now.to_s << "\t" << msg
      log.close
    rescue Exception => ex
      STDERR.puts "Cannot use the log-file (#{$log}): " << ex.message
      exit false
    end
  end
end
# called by the newsgroup_hook()
# choose a signature, if there is not already one.
def group_sig(groups)
  # .., either by comparing the entire group-name
  if $config.GROUP_SIGS.keys.include?(groups)
    $signature = "" << $config.GROUP_SIGS[groups] 
    # .., or by applying a regexp.
  else
    $config.GROUP_SIGS.each do |g, s|
      $signature = "" << s if groups.match(g)
    end
  end
end

# called by the newsgroup_hook()
def group_intro(groups, article)
  $intro = nil 
  debout "setting intro for group " << groups
  if $config.GROUP_INTROS.keys.include?(groups)
    $intro = $config.GROUP_INTROS[groups] 
  else
    $config.GROUP_INTROS.each do |gr, intro|
      $intro ||= intro if groups.match(gr)
      if $intro
        debout "matched group against " << gr
      end
    end
  end
  debout('group_intro is ' << $intro.to_s)
end

# called by the newsgroup_hook()
# Only 1 group!
def group_xnay(group)
  $XNAY = nil
  xgs = $config.XNAY_GROUPS
  if xgs && !xgs.empty? && xgs.detect {|g| group.match(g) }
    debout("setting XNAY")
    $XNAY = "X-No-Archive: YES" 
  end
end
# -----------> Hooks <-------------
def newsgroup_hook(groups, article)
  # .., if there is only one group 
  if groups.split(',').length == 1 
    groups.strip!
    # find some signature if need be
    group_sig(groups)
    # adapt the intro, too
    group_intro(groups, article)
    # set XNAY if needed
    group_xnay(groups)
  end  
end

# <---------- End Hooks ---------->

# set intro, if intro
def set_intro(body)
  # first line should be empty.
  new_body = Array.new
  fup_name = nil
=begin UNUSED
  fup_date = nil
  fup_time = nil
=end
  debout('FUP_NAME is ' << $config.FUP_NAME)
  body.each_with_index do |line, i|
    # find the name in the intro-line
    if !line.strip.empty? && !fup_name
      fup_name = line.match(Regexp.new($config.FUP_NAME) ) do |md|
        md.length == 2 ? md[1] : md[0]
      end
      # All that follows depends on the presence of a name
      # in the intro-string.
      if fup_name && !fup_name.strip.empty?
        # body[i+1] is nil, if this is the last line of the article.
        if body[i+1]&.start_with?('>')
          debout("\tfound intro " << line)
          debout "testing group " << $config.FUP_GROUP
          fup_group = line.match(Regexp.new($config.FUP_GROUP) ) { |md| md.length == 2 ? md[1] : nil} 
          debout "group is " << fup_group.to_s
=begin
UNUSED
  fup_date = line.match(Regexp.new($config.FUP_DATE) ) { |md| md.length == 2 ? md[1] : nil} 
  fup_time = line.match(Regexp.new($config.FUP_TIME) ) { |md| md.length == 2 ? md[1] : nil} 
  debout('FUP_DATE is not set!') if !fup_date
  debout('FUP_TIME is not set!') if !fup_time
=end
          debout("name is " << fup_name.to_s)
          # variables are part of the $intro.
          $intro.sub!('%fup_name%', fup_name) if fup_name && $intro && !$intro.empty?
          $intro.sub!('%fup_group%', fup_group)  if fup_group && $intro && !$intro.empty?
          if($intro && !$intro.strip.empty?)
            debout("\tsetting intro " << $intro.to_s)
            new_body << $intro 
          else
            debout("\tkeeping intro " << line)
            new_body << line
          end

        else
          # A name was matched, but no quote follows: keep the line.
          new_body << line
        end
      else
        # usual lines are just kept as they are.
        debout('no name in line ' << line)
        new_body << line
      end
    else
      # empty lines too
      new_body << line
    end

  end
  debout "new body is " << new_body.to_s
  new_body
end

# returns the header-lines of the article
def headers(article)
  debout "setting headers"
  hend = false
  headers = article.split(LN).collect do |line|
    # Take note of the empty line between
    # headers and body, skip remainder.
    hend ||= line.strip.empty? 
    if !hend
      header = line.split(':')
      if header && header.length == 2 && "Newsgroups" == header[0].strip
        # Newsgroups header found, react as you must.
        newsgroup_hook header[1], article
      end
      line
      # end of headers reached
    end
  end
  # X-No-Archive
  headers << $XNAY if $XNAY 
  # Custom-headers (Array() copes with an empty list in the configuration)
  headers += Array($config.CUSTOM_HEADERS)
  headers.compact
end

# returns the body-lines of the article
def body(article)
  bstart = false 
  abody = article.split(LN).collect do |l|
    # Start collecting only at the
    # empty line between header and body.
    bstart ||= l.strip.empty?
    # ... but include that empty line.
    l if bstart
  end
  abody.compact!
  # returns a new version of the body, with a
  # possibly altered intro-line
  set_intro(abody)
end

# The value in $signature is set by the newsgroup_hook.
# This will add the line to the current article body
def set_signature(article)
  nsig = article.split(LN).count{|s| s == "-- "}
  # several signatures should be avoided!
  if nsig > 1 
    debout "Found #{nsig} signatures"
    debout "PSE remove a few and leave ... like only 1 intact."
    exit false
  else
    article << LN.dup << "-- " << LN << $signature << LN if $signature
  end
end

############### main routine ##################
# remove previous log, if existing
# read configuration from file in the same directory as this script (Bugfix)!

CONFIG = File.dirname( __FILE__) << File::SEPARATOR << "flnews_post_proc.conf"
debout('reading config')
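if File.exist?(CONFIG) && File.readable?(CONFIG)
  begin
    $config = OpenStruct.new(Psych.load_file(CONFIG))
  rescue Exception => ex
    STDERR.puts("Cannot read from configuration-file: " << ex.message)
    STDERR.puts("Post-processing aborted!")
    exit false
  end
else
  # Without a configuration, $config would be nil further down.
  STDERR.puts("Cannot find or read the configuration-file #{CONFIG}")
  STDERR.puts("Post-processing aborted!")
  exit false
end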

$log = $config.DEBUG_LOG
if $log && !$log.empty?
  begin
    File.unlink $log if File.exist?($log)
  rescue Exception => ex
    STDERR.puts "Cannot delete the log-file (#{$log}): " << ex.message
    exit false
  end
end

debout("config is " << $config.to_s)
# line-break in the final article
LN = "\r\n"

# Default, no signature.
$signature = nil 


if (!STDIN.tty?)
  # read from STDIN
  artext = ARGF.read
  # There is content, get headers and body
  if !artext.strip.empty?
    # extract header- and body-lines
    # ... add custom headers
    # ... change intro, if need be
    ahead = headers(artext)
    abody = body(artext)
    # Join all together again.
    # TODO default signature (currently empty)
    newart = ahead.join(LN) + LN + abody.join(LN) 
    debout('new article: ' << newart)
    # add and/or alter signature
    set_signature(newart)

    # --------------> for debugging
    begin
      outfile = "/tmp/new_Article"
      File.open(outfile, 'w') do |f|
        f.write(newart)
      end
    rescue Exception => ex
      debout "Cannot write #{outfile}: " << ex.message
      exit false
    end
    # <------------------ 

    # The real thing. Write to STDOUT.
    # ------------ HEUREKA ! ---------
    puts newart
    # ------------ END HEUREKA -------
    # Am I a beast, or what ...
    exit true
  else
    debout "Cannot read the article"
    exit false
  end
else
  usage = "\nWhat do you want me to do? Where is the article to post-process?"
  usage << "\nUsage: "
  usage << "\n\t#{$0} < article.text"
  debout usage
  STDERR.puts usage
  exit false 
end
# EOF
The configuration file
The configuration file is in YAML syntax and full of explanations. The variables defined in this file can be classified as belonging to one of two categories:
  • Variables describing values originally set by flnews, which should be used or replaced. The important elements are usually matched in a capture group.
  • Variables defining the new or altered content.
Code:
#/***************************************************************************
# *   ©2023-2023, Michael Uplawski <michael.uplawski@uplawski.eu>           *
# *                                                                         *
# *   This program is free software; you can redistribute it and/or modify  *
# *   it under the terms of the GNU General Public License as published by  *
# *   the Free Software Foundation; either version 3 of the License, or     *
# *   (at your option) any later version.                                   *
# *                                                                         *
# *   This program is distributed in the hope that it will be useful,       *
# *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
# *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
# *   GNU General Public License for more details.                          *
# *                                                                         *
# *   You should have received a copy of the GNU General Public License     *
# *   along with this program; if not, write to the                         *
# *   Free Software Foundation, Inc.,                                       *
# *   59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.             *
# ***************************************************************************/

# This is a YAML file. Keep intact these three dashes.
---

# FUP_NAME 
# A Regular Expression, describing the string which contains the name of the
# previous poster who is the author of a quoted post.  This string is
# recognized in the original article and may be used with the fitting element
# from GROUP_INTROS, below. The Regexp-format is that of the Regexp class in
# Ruby, noted as a String. Be sure to escape a backslash '\' by another one,
# like in the example. A capture-group '()' serves to extract the name from the
# match result.
# Leave this field empty to keep the default from the flnews configuration
# intact.
# CONTENT: A String equivalent of a regular expression.
# DEFAULT: EMPTY
# EXAMPLE1: "Am \\d+.\\d+.\\d{2,4} um \\d+:\\d+ schrieb (.*):"
# EXAMPLE2: "(.*) wrote:"
FUP_NAME: '(.*) wrote in'

# FUP_DATE (unused)
# A Regular Expression, describing the string which contains the date of the
# previous post, that you are referring to in the followup.  Leave this field
# empty to ignore the date depending on your chosen GROUP_INTROS (see below).
# CONTENT: A String equivalent of a regular expression.
# DEFAULT: EMPTY
# EXAMPLE: "Am (\\d+.\\d+.\\d{2,4}) um"
FUP_DATE: ''

# FUP_TIME (unused)
# A Regular Expression, describing the string which contains the time of day,
# when the previous post, that you are referring to in the followup, had been
# published.  
# Leave this field empty to ignore the time, depending on your chosen
# GROUP_INTROS (see below).  
# CONTENT: A String equivalent of a regular expression.
# DEFAULT: EMPTY
# EXAMPLE: "um \\d+:\\d+ schrie[b]"
FUP_TIME: ''

#FUP_GROUP
# A Regular Expression, describing the string which contains the newsgroup
# where the previous post, that you are referring to in the followup, had been
# published.  
# Leave this field empty to ignore the precise group.
# CONTENT: A String equivalent of a regular expression.
# DEFAULT: EMPTY
# EXAMPLE: "wrote in (.*)"
FUP_GROUP: 'wrote in (.*):'

# GROUP_INTROS:
# Introductory strings, referring to the previous poster who is the author of a
# quoted post. If you match the newsgroup of the post (see FUP_GROUP), you can 
# use these variables in the result. 
# Currently only %fup_name% and %fup_group% are reproduced in the resulting 
# introductory string.
# CONTENT: A newsgroup or regexp, followed by a colon, a space and a String
# ending in \r\n.  
# DEFAULT: As configured in FLNews
# EXAMPLE: alt.test: "Thus spoke %fup_name% in %fup_group%:\r\n"
GROUP_INTROS:
  de.*: "%fup_name% hat geschrieben:\r\n"
  uk.*: "%fup_name% wrote:\r\n"
  fr.*: "%fup_name% a écrit:\r\n"

# GROUP_SIGS
# A signature line per Newsgroup. 
# CONTENT: A newsgroup or regexp, followed by a colon, a space and a String, ending in \r\n
# DEFAULT: As configured in flnews
# EXAMPLE: alt.test: "Signature for alt.test"
GROUP_SIGS:
  fr.test: "newsgroup_hook fr.test\r\n2ème ligne, guillemets"
  de.*: Es ist an der Zeit
  fr.*: "Le progrès, ce n'est pas l'acquisition de biens. C'est l'élévation de\r\n\nl'individu,
    son émancipation, sa compréhension du monde. Et pour ça il\r\n\nfaut du temps
    pour lire, s'instruire, se consacrer aux autres.\r\n\n(Christiane Taubira)\r\n"

# CUSTOM_HEADERS
# Additional headers for the outgoing article
# CONTENT: A dash and space, then a String, comprising the name of the header, ending in a 
#          colon and the value of the header
# DEFAULT: undefined
# EXAMPLE: - 'X-My-Header: nothing fancy'
CUSTOM_HEADERS:

# XNAY_GROUPS:
# The newsgroups, where a header X-No-Archive: YES shall be set.
# CONTENT: a dash and space, then a String, containing the name of the group
#          or a regexp.
# DEFAULT: empty
# EXAMPLE: - "alt.test"
XNAY_GROUPS:
- ".*.test"

# DEBUG_LOG:
# The name of a file, where debug messages are written. Setting this
# variable will enable the log. Leave empty to disable logging.
# CONTENT: The name of a writable file, which will be overwritten.
# DEFAULT: empty
# EXAMPLE: '/tmp/a_log-file.txt'
DEBUG_LOG: '/tmp/flnews_post_proc.log'
The original configuration file is here .
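How the two categories of variables work together may become clearer with a small sketch. It is an illustration only: the configuration fragment and the introductory line "Jane Doe wrote in uk.test:" are invented for this example, and I simply assume that flnews produces a line of that form. The capture groups in FUP_NAME and FUP_GROUP extract the values, which then replace the placeholder in the entry from GROUP_INTROS.

```ruby
require 'yaml'

# A small configuration fragment, invented for this illustration.
conf = YAML.safe_load(<<~'CONF')
  FUP_NAME: '(.*) wrote in'
  FUP_GROUP: 'wrote in (.*):'
  GROUP_INTROS:
    uk.*: "%fup_name% wrote:\r\n"
CONF

# An introductory line as flnews might produce it (assumed format).
line = 'Jane Doe wrote in uk.test:'

# The capture groups extract the name and the newsgroup.
name  = line[Regexp.new(conf['FUP_NAME']), 1]
group = line[Regexp.new(conf['FUP_GROUP']), 1]

# Fill the placeholder in a copy of the configured intro.
intro = conf['GROUP_INTROS']['uk.*'].sub('%fup_name%', name)

puts name
puts group
print intro
```

Trying your own expressions against a real introductory line in this way, before you put them into the configuration file, can save a few test postings.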
Usage
The post-processor must be known and accessible to flnews. To achieve this, set the value of post_proc in the configuration file for flnews (usually in ~/.config) to the path of your chosen routine, e.g. post_proc: /home/[user]/bin/flnews_post_proc
Testing
The effects that the execution of the script will have on a posting can be verified in two ways:
  1. By piping-in a post that had previously been saved to a file:
    :~$ [post-processor] < [test-article]
    In the case of my Ruby script, above, this will show the resulting new version of the article on screen, but you can also redirect the output to another file. This is a great way to test a program during development or to test your own configuration of the script.
  2. By posting directly into a test newsgroup (like alt.test or similar). This is mandatory before you really post to thematic newsgroups, whenever the settings of the post-processor will affect the article.
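For the first way of testing, a few lines of Ruby can show at a glance which headers the post-processor has added. The following is a sketch under assumptions: the helper added_headers and both sample articles are invented for this illustration. In practice, you would read the two texts from your saved test article and from the output of the post-processor.

```ruby
# Hypothetical helper: list the header lines that are present in the
# processed article but not in the original one.
def added_headers(original, processed)
  # Headers end at the first empty line; accept "\r\n" and "\n".
  head = ->(text) { text.split(/\r?\n\r?\n/, 2).first.to_s.split(/\r?\n/) }
  head.call(processed) - head.call(original)
end

# Two sample articles, invented for this illustration.
before = "Newsgroups: alt.test\r\nSubject: test\r\n\r\nJust a test.\r\n"
after  = "Newsgroups: alt.test\r\nSubject: test\r\nX-No-Archive: YES\r\n\r\nJust a test.\r\n"

p added_headers(before, after)
```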
Ω
15 April 2023