Ilich Ramiras
terrorist in reserve
As the saying goes, I'm passing it on exactly as I got it:
Code:
New beta build of the anti-spam plugin BayesIt! is available. The changes are:
+ One more filtering method: a "hybrid" of Paul Graham's and Gary Robinson's methods.
+ The "white list" of kludges now works with incomplete strings. For example, you can enter "x-spam-...", which matches any kludge that begins with "x-spam-" (such as "x-spam-level", "x-spam-grade" and so on).
- One more bug fixed (thanks to Alexander V. Hramov): the filter and the manager hung if the very last kludge in the letter header had an empty value.
+ Autotraining finally works! Below I describe how it is done.
+ If you run the BayesIt! manager (learnengine.exe) with the path to a letter saved as .eml (or .msg) as a command-line parameter, the manager inverts the grade of that letter in the classification base. This is exactly what you need when autotraining makes a mistake: find the wrongly trained letter in The Bat! by its MSID (which can be found in the log file), save it to disk as, for example, wrong.eml, and run "learnengine wrong.eml". The letter's classification is then inverted to the opposite category.

Now a couple of words about autotraining. A letter is autotrained only under these conditions:
a) Autotraining is set to "on" in the options.
b) The number of letters in the classification base (the "power" of the base) in each of its two parts (spam and non-spam) is greater than the number defined in the training options as "Size of base to autotrain".
If both conditions are met, the autotraining idle process runs together with The Bat!
Then, for every previously classified letter, if its grade is greater than or equal to "Minimal SPAM grade to autotrain" or less than or equal to "Maximal NON-SPAM grade to autotrain", the filter treats the letter as surely classified, confirms it into the corresponding corpus (spam or non-spam, depending on the classification), then recalculates and saves the classification base. Note that this grade is measured on the scale 0..1000000, where "0" is the grade of non-spam and "1000000" is the grade of spam. Also note that Gary Robinson's method gives softer values, so if you use this method, autotraining will be a really rare event. The safest values (from the viewpoint of the possibility of a mistake) are exactly 0 and 1000000. This operation is repeated until there are no more letters that are "surely" classified and can be confirmed automatically.
After all the letters are processed, the autotraining thread goes to sleep. When new letters arrive, it wakes up about 10 seconds after the last letter in the session is received and autotrains the appropriate letters again. When you exit The Bat!, the filter finally checks how many letters remain untrained, and if this number is greater than the parameter defined in "Autostart manual training...", it asks you to run the autotrain process.
The nearest "to-do" list now includes .bye export/import features and, of course, fixing bugs, if somebody finds them.
As usual, this version is available here: http://klirik.narod.ru/arc/bayesit02c.zip (95kb).

I believe this applies to versions 1.63x; I could not get it bolted onto 1.62r :)) I should note that the plugin for spampal from the same author showed terrible results and was killed... :))
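For anyone trying to picture the autotraining rule from the quoted changelog, here is a rough sketch in Python. It is not BayesIt!'s actual code: the names `AutotrainOptions` and `autotrain_target` are made up for illustration, and the only things taken from the changelog are the two conditions, the two thresholds and the 0..1000000 scale.

```python
from dataclasses import dataclass
from typing import Optional

# Grade scale as described in the changelog: 0 = non-spam, 1000000 = spam.
GRADE_NON_SPAM = 0
GRADE_SPAM = 1_000_000


@dataclass
class AutotrainOptions:
    """Hypothetical container mirroring the option names in the changelog."""
    enabled: bool                    # "Autotraining" on/off
    size_of_base_to_autotrain: int   # "Size of base to autotrain"
    min_spam_grade: int              # "Minimal SPAM grade to autotrain"
    max_nonspam_grade: int           # "Maximal NON-SPAM grade to autotrain"


def autotrain_target(grade: int, spam_count: int, nonspam_count: int,
                     opts: AutotrainOptions) -> Optional[str]:
    """Return the corpus a letter would be confirmed into, or None."""
    # Condition a): autotraining must be switched on.
    if not opts.enabled:
        return None
    # Condition b): both parts of the base must be larger than the limit.
    if (spam_count <= opts.size_of_base_to_autotrain
            or nonspam_count <= opts.size_of_base_to_autotrain):
        return None
    # Only "surely" classified letters are confirmed automatically.
    if grade >= opts.min_spam_grade:
        return "spam"
    if grade <= opts.max_nonspam_grade:
        return "non-spam"
    return None
```

With the thresholds set to exactly 1000000 and 0 (the safest values according to the changelog), only letters graded at the very ends of the scale would be picked up, which fits the remark that softer Robinson-style grades rarely trigger autotraining.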
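The "incomplete strings" in the kludge white list from the changelog can be pictured the same way. This is only a guess at the behaviour, not the plugin's code: the function name `kludge_whitelisted` is hypothetical, and the case-insensitive comparison is my assumption (the changelog only says that an entry like "x-spam-..." matches any kludge beginning with "x-spam-").

```python
from typing import Iterable

ELLIPSIS = "..."  # marker for an incomplete entry, as in "x-spam-..."


def kludge_whitelisted(kludge_name: str, whitelist: Iterable[str]) -> bool:
    """Check a header kludge name against white-list entries.

    An entry ending in "..." is treated as a prefix; any other entry
    must match the whole name. Comparison ignores case (an assumption).
    """
    name = kludge_name.strip().lower()
    for entry in whitelist:
        pattern = entry.strip().lower()
        if pattern.endswith(ELLIPSIS):
            # Incomplete entry: compare only the part before the dots.
            if name.startswith(pattern[:-len(ELLIPSIS)]):
                return True
        elif name == pattern:
            return True
    return False
```

So a single "x-spam-..." entry would cover "x-spam-level", "x-spam-grade" and the like, while a complete entry still has to match the full kludge name.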
----------
-=El pueblo unido jamás será vencido!=-
Pooh, the terrorist in reserve will not forget you. Eternal memory to you.