Creating web mirrors for newsgroups and mailing lists
Today I want to introduce the Korallenriff project. It is a piece of software I wrote that can fetch newsgroups and POP3 accounts. It can be used to mirror newsgroups and mailing lists.
Author: Dipl.-Inform. (FH) Steffen Wendzel (Date: 2009-05-02-19:36)
It works quite simply: download, install and configure Korallenriff. Korallenriff will store all the messages in a MySQL database table. You can then load these messages with your own code (for example a PHP web interface like the one provided by groups.google.com).
1. Download Korallenriff
Download the .tgz file from https://sourceforge.net/project/showfiles.php?group_id=236591 and unpack it using:
$ tar -xzf korallenriff-x.y.z.tgz (x.y.z is the version name, e.g. '0.2.2beta')
2. Compile and install it
Change into the korallenriff directory and run 'make'. Make sure you have 'make', the GNU C compiler, libc-dev, flex, bison and the MySQL C development files installed on your machine. On Ubuntu you would need to install the package 'libmysqlclient-dev' (or a similarly named package).
$ make
...
Now either:
$ su -
..
# make install
...
Or (e.g. Ubuntu Linux):
$ sudo make install
Okay, Korallenriff is now installed on your machine. Next, configure it by creating a MySQL database and editing /etc/korallenriff.conf.
3. Creating the MySQL database
Either use phpMyAdmin to create the database and the table, or run these commands:
$ mysql -u root -p
...
mysql> create database KorallenriffDB;
Query OK, 1 row affected (0.05 sec)
mysql> connect KorallenriffDB;
Connection id: 2
Current database: KorallenriffDB
mysql> create table Messages ( ID INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
`unixname` VARCHAR(56), `sourcetype` VARCHAR(10) NOT NULL, `from` VARCHAR(196),
`to` VARCHAR(196), `subject` VARCHAR(1024), `date_send` VARCHAR(100),
`date_recv` VARCHAR(100), `msglen` INTEGER, `buffer` LONGTEXT );
Query OK, 0 rows affected (0.02 sec)
mysql> quit;
Bye
4. Editing /etc/korallenriff.conf
Before you can really use korallenriff, you have to configure it. The
configuration is done via /etc/korallenriff.conf. The configuration is pretty
easy. Comments are introduced with a '#' character (like in shell scripts).
** Set up the database access: Configure the database access with a line
like this in the config file:
cfg database=mysql host=localhost port=3306 user=root pass=blue
This means that you want to connect to a MySQL database (other databases are
under development) and that the database runs on the host 'localhost' with
port 3306. The authentication is made with user 'root' and password 'blue'.
** Setting up load lines: Now the fun begins. The 'load' lines are needed
to fetch new data from servers. Every 'load' line contains a 'unixname' that
identifies the source. You can choose any unix-fs compatible name with a
maximum length of 56 characters. Later you can use the unixname in database
queries (like 'SELECT * FROM Messages where `unixname`="mypop3";').
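The unixname rule can be checked before a line goes into the config file. The following Python sketch is not part of Korallenriff. It assumes that "unix-fs compatible" means ASCII letters, digits, dots, underscores and hyphens, which is a conservative guess, and the function name is_valid_unixname is hypothetical.

```python
import re

# Assumption: "unix-fs compatible" is taken to mean ASCII letters, digits,
# '.', '_' and '-'. Korallenriff itself may accept more characters.
UNIXNAME_PATTERN = re.compile(r"[A-Za-z0-9._-]+")
MAX_UNIXNAME_LENGTH = 56  # limit stated in the text; the column is VARCHAR(56)


def is_valid_unixname(name: str) -> bool:
    """Return True if name is non-empty, at most 56 characters, and safe."""
    if not 0 < len(name) <= MAX_UNIXNAME_LENGTH:
        return False
    if name in (".", ".."):
        # These are directory references, not usable file names.
        return False
    return UNIXNAME_PATTERN.fullmatch(name) is not None
```

With this check, names such as "mypop3" pass, while a name containing a slash or one longer than 56 characters is rejected.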
Example #1: Fetch a POP3 account (Note: Mails are deleted after they were
received!)
load unixname=mypop3 proto=pop3 server=pop.t-online.de port=110 auth=true user=nobody pass=abc
This will fetch mails from pop.t-online.de (port 110) using the POP3 protocol.
The user 'nobody' is authenticated with the password 'abc' in this case. If you
have a POP3 account _without_ authentication, you can simply omit the auth,
user and pass parameters.
Example #2: Fetch an NNTP newsgroup:
load unixname=MyMirror proto=nntp server=news.t-online.de port=119 newsgroup=de.org.ccc
This will fetch the newsgroup 'de.org.ccc'.
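Both the 'cfg' and the 'load' lines follow the same shape: a keyword followed by key=value pairs. As a rough illustration, the Python sketch below splits such a line and runs a few sanity checks. It is not Korallenriff's own parser. It assumes that values contain no whitespace, that comments occupy whole lines, and that unixname, proto, server and port are required on every 'load' line (inferred from the two examples only). The function names are hypothetical.

```python
def parse_config_line(line: str):
    """Split a 'cfg' or 'load' line into (keyword, {key: value}).

    Returns None for blank lines and for comment lines starting with '#'.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    keyword, *pairs = stripped.split()
    options = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key] = value
    return keyword, options


def check_load_line(options: dict) -> list:
    """Return a list of problems found in the options of a 'load' line."""
    problems = []
    # Assumption: these four keys appear in both examples, so treat them
    # as required. Korallenriff itself may be more lenient.
    for required in ("unixname", "proto", "server", "port"):
        if required not in options:
            problems.append(f"missing {required}")
    if options.get("proto") == "nntp" and "newsgroup" not in options:
        problems.append("nntp source without newsgroup")
    if options.get("auth") == "true" and not options.keys() >= {"user", "pass"}:
        problems.append("auth=true without user and pass")
    return problems
```

Running check_load_line on the options of either example line above returns an empty list. Dropping newsgroup= from the NNTP example would be reported as a problem.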
Final Step: Test Korallenriff
Please note that Korallenriff is still beta-quality software (please send bug reports!), but in most cases it should work. A first test run can be done by simply starting 'korallenriff' as superuser:
$ korallenriff -v
If everything worked fine, you may want to run korallenriff regularly, every 60 minutes or so. Korallenriff will then add new messages from newsgroups and POP3 accounts to your database table. You have two choices to do that:
1. You use a cron job to execute it every 30 minutes or so.
2. You run Korallenriff in daemon mode. It fetches everything, sleeps 60 seconds and then fetches everything new again. To do so, use the '-d' parameter:
$ korallenriff -d
If 60 seconds is not the right interval for you, you can change the sleep value using "-s seconds". This would fetch new messages every hour (3600 seconds):
$ korallenriff -d -s 3600
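The fetch-then-sleep behaviour of daemon mode can be pictured with a small Python sketch. This is only an illustration of the pattern, not Korallenriff's actual code. The fetch callable is a hypothetical stand-in for "fetch everything new", and max_rounds exists only so the loop can be tried out without running forever.

```python
import time


def run_fetch_loop(fetch, sleep_seconds=60, max_rounds=None, sleep=time.sleep):
    """Call fetch(), sleep, and repeat. Returns the number of rounds run.

    fetch         -- callable that fetches new messages (hypothetical stand-in)
    sleep_seconds -- pause between rounds; 60 mirrors the default in the text
    max_rounds    -- stop after this many rounds; None means run forever
    sleep         -- injectable so the loop can be exercised without waiting
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        fetch()
        rounds += 1
        # Sleep only if another round will follow.
        if max_rounds is None or rounds < max_rounds:
            sleep(sleep_seconds)
    return rounds
```

Calling it with sleep_seconds=3600 corresponds to the "-s 3600" example: one fetch round per hour.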
Your Idea
Now it is time for your idea. You could write a PHP web interface for your website that uses the fetched data in the database and shows postings in newsgroups or new postings to mailing lists, or lets users browse newsgroup and mailing list archives.
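As a starting point, here is a minimal sketch of such a listing page, written in Python rather than PHP. The same idea carries over to PHP. To stay self-contained it uses Python's built-in sqlite3 module as a stand-in for MySQL, with an in-memory table that copies the column names of the Messages table. The sample row is made up, and all function names are hypothetical. With a real MySQL driver the query placeholders would typically be '%s' instead of '?'.

```python
import html
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in with the same column names as Messages."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Messages (ID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "`unixname` TEXT, `sourcetype` TEXT NOT NULL, `from` TEXT, `to` TEXT, "
        "`subject` TEXT, `date_send` TEXT, `date_recv` TEXT, "
        "`msglen` INTEGER, `buffer` TEXT)"
    )
    # Made-up sample row; the sourcetype value is a guess for illustration.
    conn.execute(
        "INSERT INTO Messages (`unixname`, `sourcetype`, `from`, `subject`, "
        "`date_send`) VALUES (?, ?, ?, ?, ?)",
        ("MyMirror", "nntp", "alice@example.org", "Hello <world>", "Sat, 02 May 2009"),
    )
    return conn


def list_subjects(conn, unixname, limit=20):
    """Return (from, subject, date_send) rows for one source, highest ID first."""
    cursor = conn.cursor()
    # Parameterized query: never paste user input into the SQL string.
    cursor.execute(
        "SELECT `from`, `subject`, `date_send` FROM Messages "
        "WHERE `unixname` = ? ORDER BY ID DESC LIMIT ?",
        (unixname, limit),
    )
    return cursor.fetchall()


def render_listing(rows):
    """Render rows as an HTML list, escaping all message-supplied text."""
    items = [
        "<li>{}: {} ({})</li>".format(
            html.escape(sender or ""),
            html.escape(subject or ""),
            html.escape(date or ""),
        )
        for sender, subject, date in rows
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Escaping matters here because subjects and sender names come straight from mails and postings written by strangers. Without html.escape, a subject containing markup would be rendered as part of your page.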
PS. The Korallenriff website can be found here.