Newsgroups: alt.sources
Subject: ninpaths - new inpaths reporting program
Followup-To: alt.sources.d,news.software.nntp

Archive-name: ninpaths
Submitted-by: Olaf Titz
Environment: INN

This is an efficient and space-saving inpaths reporting program. It
works as follows: you feed it the Path lines via an INN channel feed,
and from time to time the program writes all its internal counters
accumulated so far to a dump file. Another instance of the program
picks up all the dump files, adds them up and formats them into the
report.

#!/bin/sh
# This is a shell archive (produced by GNU sharutils 4.2).
# To extract the files from this archive, save it to some FILE, remove
# everything before the `!/bin/sh' line above, then type `sh FILE'.
#
# Made on 1999-02-24 13:48 CET by .
# Source directory was `/var/home/olaf/mysrc/ninpaths'.
#
# Existing files will *not* be overwritten unless `-c' is specified.
#
# This shar contains:
# length mode       name
# ------ ---------- ------------------------------------------
#   2536 -rw-rw-r-- ninpaths.README
#  11346 -rw-rw-r-- ninpaths.c
#    721 -rw-rw-r-- sendinpaths
#   1174 -rw-rw-r-- Checksums
#
save_IFS="${IFS}"
IFS="${IFS}:"
gettext_dir=FAILED
locale_dir=FAILED
first_param="$1"
for dir in $PATH
do
  if test "$gettext_dir" = FAILED && test -f $dir/gettext \
     && ($dir/gettext --version >/dev/null 2>&1)
  then
    set `$dir/gettext --version 2>&1`
    if test "$3" = GNU
    then
      gettext_dir=$dir
    fi
  fi
  if test "$locale_dir" = FAILED && test -f $dir/shar \
     && ($dir/shar --print-text-domain-dir >/dev/null 2>&1)
  then
    locale_dir=`$dir/shar --print-text-domain-dir`
  fi
done
IFS="$save_IFS"
if test "$locale_dir" = FAILED || test "$gettext_dir" = FAILED
then
  echo=echo
else
  TEXTDOMAINDIR=$locale_dir
  export TEXTDOMAINDIR
  TEXTDOMAIN=sharutils
  export TEXTDOMAIN
  echo="$gettext_dir/gettext -s"
fi
touch -am 1231235999 $$.touch >/dev/null 2>&1
if test ! -f 1231235999 && test -f $$.touch; then
  shar_touch=touch
else
  shar_touch=:
  echo
  $echo 'WARNING: not restoring timestamps.  Consider getting and'
  $echo "installing GNU \`touch', distributed in GNU File Utilities..."
  echo
fi
rm -f 1231235999 $$.touch
#
if mkdir _sh14042; then
  $echo 'x -' 'creating lock directory'
else
  $echo 'failed to create lock directory'
  exit 1
fi
# ============= ninpaths.README ==============
if test -f 'ninpaths.README' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'ninpaths.README' '(file already exists)'
else
  $echo 'x -' extracting 'ninpaths.README' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'ninpaths.README' &&
This is an efficient and space-saving inpaths reporting program. It
works as follows: you feed it the Path lines via an INN channel feed,
and from time to time the program writes all its internal counters
accumulated so far to a dump file. Another instance of the program
picks up all the dump files, adds them up and formats them into the
report.
X
A working setup goes as follows:
0. Make a directory MOST_LOGS/path
1. Make a channel feed like
X        !inpaths::Tc,WP:/var/lib/news/localbin/ninpaths -p\
X                -d MOST_LOGS/path/inpaths.%d
X   if your INN supports WP, else see below.
2. Enter in your crontab:
X        6 6 * * *   ctlinnd flush !inpaths
X   (the actual time doesn't matter)
X   This will generate the dump files, one a day.
3. Once a month run the "sendinpaths" script, which collects the dumps,
X   makes a report and deletes old dumps.
X   (You can generate a report without mailing it and without deleting
X   any old files with "sendinpaths -n".)
X
On my small test box, where the collected Path lines from the older
inpaths accumulated about 4MB daily, the daily dump file runs around
50-100k. The program is fast; writing the dump doesn't take more than
a few seconds, preparing a report a bit more.
X
ninpaths takes the following parameters:
-p      Read Path lines from standard input.
-d file Dump data to file. A "%d" in the file name will be replaced by
X        the current system time when the dump proceeds.
-u file Read a dump file.
X        The "-u file" argument can be repeated to collect several dumps.
-r site Produce a report for "site".
-v n    Set report verbosity (n=0,1,2; default=2).
No check is made (by now) whether the given arguments make any sense.
X
If your INN doesn't have the WP feed flag (1.5 has not, 1.6 has, 1.7 I
don't know, 2.0 has) use the following:
X        !inpaths::Tc,WH:/var/lib/news/localbin/ginpaths
where "ginpaths" is the following script:
#!/bin/sh
exec egrep '^Path: ' | ninpaths -p -d MOST_LOGS/path/inpaths.%d
X
(of course, replace MOST_LOGS with the right directory in all of the above)
X
The idea and some implementation details for ninpaths come from the
original inpaths program, but most of the code has been rewritten for
clarity. The copyright status of the original source is unclear, but as
far as I'm concerned, this package can be used and distributed under
the same terms as INN.
X
This program is posted to alt.sources and kept on my WWW archive under .
X
Olaf Titz
$Id: ninpaths.README,v 1.5 1999/02/24 12:47:57 olaf Exp $
SHAR_EOF
  $shar_touch -am 0224134799 'ninpaths.README' &&
  chmod 0664 'ninpaths.README' ||
  $echo 'restore of' 'ninpaths.README' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'ninpaths.README:' 'MD5 check failed'
78f257c742bca4d147afaab53d7980ea  ninpaths.README
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ninpaths.README'`"
    test 2536 -eq "$shar_count" ||
    $echo 'ninpaths.README:' 'original size' '2536,' 'current size' "$shar_count!"
  fi
fi
# ============= ninpaths.c ==============
if test -f 'ninpaths.c' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'ninpaths.c' '(file already exists)'
else
  $echo 'x -' extracting 'ninpaths.c' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'ninpaths.c' &&
/*
X   ninpaths.c - new inpaths reporting program
X
X   Idea, data structures and part of code based on inpaths 2.5
X   by Brian Reid, Landon Curt Noll
X
X   This version written by Olaf Titz, Feb. 1997
*/
X
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <ctype.h>
#include <time.h>
X
const char id[]="$Id: ninpaths.c,v 1.6 1999/02/24 12:47:58 olaf Exp $";
X
#define MAXFNAME 1024   /* max length of file name */
#define MAXLINE 1024    /* max length of Path line */
#define HASH_TBL 65536  /* hash table size (power of two) */
#define MAXHOST 128     /* max length of host name */
#define HOSTF "%127s"   /* scanf format for host name */
#define RECLINE 120     /* dump file line length softlimit */
X
#define VERSION "3.1"   /* this program */
X
/* structure used to tally the traffic between two hosts */
struct trec {
X    struct trec *rlink;         /* next in chain */
X    struct nrec *linkid;        /* pointer to... */
X    long tally;                 /* count */
};
X
/* structure to hold the information about a host */
struct nrec {
X    struct nrec *link;          /* next in chain */
X    struct trec *rlink;         /* start of trec chain */
X    char *id;                   /* host name */
X    long no;                    /* identifier for dump file */
X    long sentto;                /* tally of articles sent from here */
};
X
struct nrec *hosthash[HASH_TBL];
X
time_t starttime;       /* Start time */
double atimes=0.0;      /* Sum of articles times wrt. starttime */
long total=0,           /* Total articles processed */
X     sites=0;           /* Total sites known */
X
/* malloc and warn if out of mem */
void *
wmalloc(size_t s)
{
X    void *p=malloc(s);
X    if (!p)
X        fprintf(stderr, "warning: out of memory\n");
X    /* Out of mem is non-fatal here. We will just miss a few counters.
X       Better than aborting and missing the whole dump.
*/
X    return p;
}
X
/* Hash function due to Glenn Fowler / Landon Curt Noll / Phong Vo */
int
hash(const char *str)
{
X    unsigned long val;
X    unsigned long c;
X
X    for (val = 0; (c=(unsigned long)(*str)); ++str) {
X        val *= 16777619;        /* magic */
X        val ^= c;               /* more magic */
X    }
X    return (int)(val & (unsigned long)(HASH_TBL-1));
}
X
/* Look up a host in the hash table. Add if necessary. */
struct nrec *
hhost(const char *n)
{
X    struct nrec *h;
X    int i=hash(n);
X
X    for (h=hosthash[i]; h; h=h->link)
X        if (!strcmp(n, h->id))
X            return h;
X    /* not there - allocate */
X    h=wmalloc(sizeof(struct nrec));
X    if (!h)
X        return NULL;
X    h->link=hosthash[i];
X    h->rlink=NULL;
X    h->id=strdup(n);
X    if (!h->id) {
X        free(h); return NULL;
X    }
X    h->no=h->sentto=0;
X    hosthash[i]=h;
X    sites++;
X    return h;
}
X
/* Look up a tally record between hosts. Add if necessary. */
struct trec *
tallyrec(struct nrec *r, struct nrec *h)
{
X    struct trec *t;
X    for (t=r->rlink; t; t=t->rlink)
X        if (t->linkid==h)
X            return t;
X    t=wmalloc(sizeof(struct trec));
X    if (!t)
X        return NULL;
X    t->rlink=r->rlink;
X    t->linkid=h;
X    t->tally=0;
X    r->rlink=t;
X    return t;
}
X
X
/* Dump file format:
X   "!!NINP" <version> <starttime> <endtime> <sites> <total> <avgtime> "\n"
X   followed by <sites> S-records,
X   "!!NLREC\n"
X   [3.0]
X       followed by max. <sites>^2 L-records
X   [3.1]
X       followed by max. <sites> L-records
X   "!!NLEND" <nlrecs> "\n"
X   starttime, endtime, avgtime as UNIX date
X   the records are separated by space or \n
X   an S-record is "site count"
X   [3.0]
X       an L-record is "sitea!siteb!count"
X   [3.1]
X       an L-record is ":sitea" { "!siteb,count" }...
X       ",count" omitted if count==1
X   where sitea and siteb are numbers of the S-records starting at 0
*/
X
int
writedump(FILE *f)
{
X    int i, j;
X    long n;
X    struct nrec *h;
X    struct trec *t;
X
X    if (!total) {
X        return -1;
X    }
X    fprintf(f, "!!NINP " VERSION " %ld %ld %ld %ld %ld\n",
X            starttime, time(0), sites, total,
X            (long)(atimes/total)+starttime);
X    n=j=0;
X    /* write the S-records (hosts), numbering them in the process */
X    for (i=0; i<HASH_TBL; ++i)
X        for (h=hosthash[i]; h; h=h->link) {
X            h->no=n++;
X            j+=fprintf(f, "%s %ld", h->id, h->sentto);
X            if (j>RECLINE) {
X                j=0;
X                fprintf(f, "\n");
X            } else {
X                fprintf(f, " ");
X            }
X        }
X    if (n!=sites)
X        fprintf(stderr, "internal error: sites=%ld, dumped=%ld\n", sites, n);
X
X    fprintf(f, "\n!!NLREC\n");
X
X    n=j=0;
X    /* write the L-records (links) */
X    for (i=0; i<HASH_TBL; ++i)
X        for (h=hosthash[i]; h; h=h->link)
X            if ((t=h->rlink)) {
X                j+=fprintf(f, ":%ld", h->no);
X                for (; t; t=t->rlink) {
X                    j+=fprintf(f, "!%ld", t->linkid->no);
X                    if (t->tally>1)
X                        j+=fprintf(f, ",%ld", t->tally);
X                    n++;
X                }
X                if (j>RECLINE) {
X                    j=0;
X                    fprintf(f, "\n");
X                }
X            }
X    fprintf(f, "\n!!NLEND %ld\n", n);
X    return 0;
}
X
/* Write dump to a named file. Substitute %d in file name with system time. */
X
void
writedumpfile(const char *n)
{
X    char buf[MAXFNAME];
X    FILE *d;
X
X    if (n[0]=='-' && n[1]=='\0') {
X        (void) writedump(stdout);
X        return;
X    }
X    snprintf(buf, sizeof(buf), n, time(0));
X    d=fopen(buf, "w");
X    if (d) {
X        if (writedump(d)<0)
X            unlink(buf);
X    } else {
X        perror("writedumpfile: fopen");
X    }
}
X
/* Read a dump file.
*/
X
int
readdump(FILE *f)
{
X    int a, b;
X    long i, m, l;
X    time_t st, et, at;
X    long sit, tot;
X    struct nrec **n;
X    struct trec *t;
X    char c[MAXHOST];
X    char v[16];
X
X    #define formerr(i) {\
X        fprintf(stderr, "dump file format error #%d\n", (i)); return -1; }
X
X    if (fscanf(f, "!!NINP %15s %ld %ld %ld %ld %ld\n",
X               v, &st, &et, &sit, &tot, &at)!=6)
X        formerr(0);
X
X    n=calloc(sit, sizeof(struct nrec *));
X    if (!n) {
X        fprintf(stderr, "error: out of memory\n");
X        return -1;
X    }
X    for (i=0; i<sit; ++i) {
X        if (fscanf(f, HOSTF " %ld ", c, &l)!=2)
X            formerr(1);
X        n[i]=hhost(c);
X        if (!n[i])
X            return -1;
X        n[i]->sentto+=l;
X    }
X    if ((fscanf(f, HOSTF "\n", c)!=1) ||
X        strcmp(c, "!!NLREC"))
X        formerr(2);
X    m=0;
X    if (!strncmp(v, "3.0", 3)) {
X        /* Read 3.0-format L-records */
X        while (fscanf(f, "%d!%d!%ld ", &a, &b, &l)==3) {
X            t=tallyrec(n[a], n[b]);
X            if (t)
X                t->tally+=l;
X            ++m;
X        }
X    } else if (!strncmp(v, "3.1", 3)) {
X        /* Read L-records */
X        while (fscanf(f, " :%d", &a)==1) {
X            while ((i=fscanf(f, "!%d,%ld", &b, &l))>0) {
X                t=tallyrec(n[a], n[b]);
X                if (i<2)
X                    l=1;
X                if (t)
X                    t->tally+=l;
X                ++m;
X            }
X        }
X    } else {
X        fprintf(stderr, "version %s ", v);
X        formerr(9);
X    }
X    if ((fscanf(f, "!!NLEND %ld\n", &i)!=1)
X        || (i!=m))
X        formerr(3);
#ifdef DEBUG
X    fprintf(stderr, " dumped start %s total=%ld atimes=%ld (%ld)\n",
X            ctime(&st), tot, at, at-st);
#endif
X    /* Adjust the time average and total count */
X    if (starttime>st) {
X        atimes+=(double)total*(starttime-st);
X        starttime=st;
X    }
X    atimes+=(double)tot*(at-starttime);
X    total+=tot;
#ifdef DEBUG
X    fprintf(stderr, " current start %s total=%ld atimes=%.0f (%.0f)\n\n",
X            ctime(&starttime), total, atimes, atimes/total);
#endif
X    free(n);
X    return 0;
}
X
/* Read dump from a file. */
X
int
readdumpfile(const char *n)
{
X    FILE *d;
X    int i;
X
X    if (n[0]=='-' && n[1]=='\0')
X        return readdump(stdin);
X
X    d=fopen(n, "r");
X    if (d) {
X        /* fprintf(stderr, "Reading dump file %s\n", n); */
X        i=readdump(d);
X        fclose(d);
X        return i;
X    } else {
X        perror("readdumpfile: fopen");
X        return -1;
X    }
}
X
X
/* Process a Path line.
*/
X
void
pathline(char *c)
{
X    char *c2;
X    struct nrec *h, *r;
X    struct trec *t;
X
X    r=NULL;
X    while (*c) {
X        for (c2=c; *c2 && *c2!='!'; c2++);
X        if (c2-c>MAXHOST-1)
X            /* looks broken, dont bother with rest */
X            return;
X        while (*c2=='!')
X            *c2++='\0'; /* skip "!!" too */
X        h=hhost(c);
X        if (h) {
X            ++h->sentto;
X            if (r && r!=h) {
X                t=tallyrec(r, h);
X                if (t)
X                    ++t->tally;
X            }
X        }
X        c=c2;
X        r=h;
X    }
}
X
/* Take Path lines from file (stdin used here). */
X
void
procpaths(FILE *f)
{
X    char buf[MAXLINE];
X    char *c, *ce;
X    int v=1; /* current line is valid */
X
X    while (fgets(buf, sizeof(buf), f)) {
X        c=buf;
X        if (!strncmp(c, "Path: ", 6))
X            c+=6;
X        /* find end of line. Some broken newsreaders preload Path with
X           a name containing spaces. Chop off those entries. */
X        for (ce=c; *ce && !isspace(*ce); ++ce);
X        if (!*ce) {
X            /* bogus line */
X            v=0;
X        } else if (v) {
X            /* valid line */
X            for (; ce>c && *ce!='!'; --ce); /* ignore last element */
X            *ce='\0';
X            pathline(c); /* process it */
X            /* update average age and grand total */
X            atimes+=(time(0)-starttime);
X            ++total;
X        } else {
X            /* next line is valid */
X            v=1;
X        }
X    }
}
X
/* Output a report suitable for mailing.
   From inpaths 2.5 */
X
void
report(const char *hostname, int verbose)
{
X    double avgAge;
X    int i, columns, needHost;
X    struct nrec *list, *relay;
X    struct trec *rlist;
X    char hostString[MAXHOST];
X    time_t t0=time(0);
X
X    if (!total) {
X        fprintf(stderr, "report: no traffic\n");
X        return;
X    }
X    /* mark own site to not report it */
X    list=hhost(hostname);
X    if (list)
X        list->id[0]='\0';
X
X    avgAge=((double)t0 - (atimes/total + (double)starttime)) /86400.0;
X    printf("ZCZC begin inhosts %s %s %d %ld %3.1f\n",
X           VERSION,hostname,verbose,total,avgAge);
X    for (i=0; i<HASH_TBL; i++) {
X        list = hosthash[i];
X        while (list != NULL) {
X            if (list->id[0] != 0 && list->rlink != NULL) {
X                if (verbose > 0 || (100*list->sentto > total))
X                    printf("%ld\t%s\n",list->sentto, list->id);
X            }
X            list = list->link;
X        }
X    }
X    printf("ZCZC end inhosts %s\n",hostname);
X
X    printf("ZCZC begin inpaths %s %s %d %ld %3.1f\n",
X           VERSION,hostname,verbose,total,avgAge);
X    for (i=0; i<HASH_TBL; i++) {
X        list = hosthash[i];
X        while (list != NULL) {
X            if (verbose > 1 || (100*list->sentto > total)) {
X                if (list->id[0] != 0 && list->rlink != NULL) {
X                    columns = 3+strlen(list->id);
X                    sprintf(hostString,"%s H ",list->id);
X                    needHost = 1;
X                    rlist = list->rlink;
X                    while (rlist != NULL) {
X                        if (
X                            (100*rlist->tally > total)
X                            || ((verbose > 1)&&(5000*rlist->tally>total))
X                            ) {
X                            if (needHost) printf("%s",hostString);
X                            needHost = 0;
X                            relay = rlist->linkid;
X                            if (relay->id[0] != 0) {
X                                if (columns > 70) {
X                                    printf("\n%s",hostString);
X                                    columns = 3+strlen(list->id);
X                                }
X                                printf("%ld Z %s U ", rlist->tally, relay->id);
X                                columns += 9+strlen(relay->id);
X                            }
X                        }
X                        rlist = rlist->rlink;
X                    }
X                    if (!needHost) printf("\n");
X                }
X            }
X            list = list->link;
X        }
X    }
X    printf("ZCZC end inpaths %s\n",hostname);
X
}
X
extern char *optarg;
X
int
main(int argc, char *argv[])
{
X    int i;
X    int pf=0, vf=2;
X    char *df=NULL, *rf=NULL;
X
X    for (i=0; i<HASH_TBL; i++)
X        hosthash[i]=NULL;
X    starttime=time(0);
X
X    while ((i=getopt(argc, argv, "pd:u:r:v:"))!=-1)
X        switch (i) {
X        case 'p':
X            pf=1; break;
X        case 'd':
X            df=optarg; break;
X        case 'u':
X            if (readdumpfile(optarg)<0)
X                exit(1);
X            break;
X        case 'r':
X            rf=optarg; break;
X        case 'v':
X            vf=atoi(optarg); break;
X        default:
X            fprintf(stderr, "unknown option %c\n", i);
X        }
X    if (pf)
X        procpaths(stdin);
X    if (df)
X        writedumpfile(df);
X    if (rf)
X        report(rf, vf);
X    return 0;
}
SHAR_EOF
  $shar_touch -am 0224134799 'ninpaths.c' &&
  chmod 0664 'ninpaths.c' ||
  $echo 'restore of' 'ninpaths.c' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'ninpaths.c:' 'MD5 check failed'
d25764e3fba2bd5e0a49a614fc34a073  ninpaths.c
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ninpaths.c'`"
    test 11346 -eq "$shar_count" ||
    $echo 'ninpaths.c:' 'original size' '11346,' 'current size' "$shar_count!"
  fi
fi
# ============= sendinpaths ==============
if test -f 'sendinpaths' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'sendinpaths' '(file already exists)'
else
  $echo 'x -' extracting 'sendinpaths' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'sendinpaths' &&
#!/bin/sh
#
# Submit path statistics based on ninpaths
# $Id: sendinpaths,v 1.3 1999/02/24 12:47:58 olaf Exp $
X
# Assuming the ninpaths dump files are in ${MOST_LOGS}/path/inpaths.%d
X
X. /var/lib/news/innshellvars
cd ${MOST_LOGS}/path
ME=`innconfval pathhost`
report=30
keep=14
TMP=tmp$$
X
trap "rm -f $TMP; trap 0; exit" 0 1 2 15
# Make report from (up to) $report days of dumps
find . -name 'inpaths.*' -not -empty -mtime -$report -print |\
X        sed 's/^/-u /' > $TMP
X
if [ "$1" = "-n" ] ; then
X    ninpaths `cat $TMP` -r $ME
else
X    ninpaths `cat $TMP` -r $ME |\
X        $MAILCMD -s "inpaths $ME" pathsurvey@top1000.org
X    # remove dumps older than $keep days
X    find . -name 'inpaths.*' -mtime +$keep -print | xargs -r rm
fi
X
Xexit 0
SHAR_EOF
  $shar_touch -am 0224134799 'sendinpaths' &&
  chmod 0664 'sendinpaths' ||
  $echo 'restore of' 'sendinpaths' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'sendinpaths:' 'MD5 check failed'
6d2348747233d75da1398e68be213f1c  sendinpaths
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'sendinpaths'`"
    test 721 -eq "$shar_count" ||
    $echo 'sendinpaths:' 'original size' '721,' 'current size' "$shar_count!"
  fi
fi
# ============= Checksums ==============
if test -f 'Checksums' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'Checksums' '(file already exists)'
else
  $echo 'x -' extracting 'Checksums' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'Checksums' &&
# This file contains a list of the files in this package with checksums.
# If you run this as a /bin/sh script, it will verify the PGP signature on
# this file and verify the individual files' checksums.
#
# Packaged by olaf@bigred, Wed Feb 24 13:48:35 CET 1999
X
:<<'_marker_'
-----BEGIN PGP SIGNED MESSAGE-----
X
____size_ _______________date_ ___________________md5-checksum_ _name________
X    2536 Feb 24 13:47:57 1999 78f257c742bca4d147afaab53d7980ea  ninpaths.README
X   11346 Feb 24 13:47:58 1999 d25764e3fba2bd5e0a49a614fc34a073  ninpaths.c
X     721 Feb 24 13:47:58 1999 6d2348747233d75da1398e68be213f1c  sendinpaths
____size_ _______________date_ ___________________md5-checksum_ _name________
_marker_
X
echo "No output after the PGP signature check means everything is okay."
pgp -tf <$0 | sed -e '/^_marker_/q' -e '/^_/d' | cut -b32- | md5sum -c
exit $?
X
-----BEGIN PGP SIGNATURE-----
Version: 2.6.3ia
Charset: latin1
X
iQCVAwUBNtP1KcDBbWIyVM0dAQFIYAQAp3WZD+qL2EeYzcOgYunJ6jIadPWMq2ft
SpzmA6tZg7Gh7Yj1GCKI3UdS2YOGS1nSv5W3Pdc7Tsn6BCA2ihAMuDHq2nm7mIpZ
XXBhK5onhKjvCYgHbqEADqlY+LkYLnNWgxESPfirQp+yvOn4oouH1+EAUVKOa9p7U
WLvG0k1jGE0=
=TnWh
-----END PGP SIGNATURE-----
SHAR_EOF
  $shar_touch -am 0224134899 'Checksums' &&
  chmod 0664 'Checksums' ||
  $echo 'restore of' 'Checksums' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'Checksums:' 'MD5 check failed'
07e504608b1529e7dd4e8b02bbd848dd  Checksums
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'Checksums'`"
    test 1174 -eq "$shar_count" ||
    $echo 'Checksums:' 'original size' '1174,' 'current size' "$shar_count!"
  fi
fi
rm -fr _sh14042
exit 0
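
The two mechanisms that ninpaths.c relies on for every Path line can be tried on their own: hashing a host name into a power-of-two table, and splitting the Path value at "!" in place. What follows is a minimal sketch under stated assumptions, not the code from the archive. The names SKETCH_HASH_TBL, sketch_bucket and sketch_split_path are made up for illustration. The multiplier 16777619 and the table size 65536 are taken from the source above. One deliberate difference: the sketch reads each byte as unsigned char, whereas hash() casts a plain char.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; same value as HASH_TBL in ninpaths.c (a power of two). */
#define SKETCH_HASH_TBL 65536UL

/* Multiply-then-xor string hash, reduced to a bucket index by masking
   with (table size - 1). Masking only works for power-of-two sizes. */
unsigned long sketch_bucket(const char *s)
{
    unsigned long val = 0;

    for (; *s; ++s) {
        val *= 16777619UL;
        val ^= (unsigned long)(unsigned char)*s;
    }
    return val & (SKETCH_HASH_TBL - 1);
}

/* Split a Path value in place at '!' and store up to max host pointers
   in hosts[]. A run of '!' is treated as a single separator, so "a!!b"
   yields two hosts. Returns the number of hosts stored. */
size_t sketch_split_path(char *path, char *hosts[], size_t max)
{
    size_t n = 0;
    char *p = path;

    while (*p && n < max) {
        hosts[n++] = p;
        while (*p && *p != '!')
            ++p;
        while (*p == '!')
            *p++ = '\0';
    }
    return n;
}
```

Each adjacent pair hosts[k], hosts[k+1] returned by sketch_split_path corresponds to one link that pathline() would count with tallyrec(), and sketch_bucket picks the hash chain that hhost() would search for a host name.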