ffrinch
5 points
18 hours ago
I don't know how well it'd work for reddit, though.
It just seems to me that you'd get something a lot more effective with a little custom programming (and this is programming.reddit). How long would it really take to whip something up with Reverend or Orange?
|
|
joelthelion
3 points
14 hours ago
You raise some very good points. I agree that what I did is very crude, but I wanted a way to actually test my ideas without spending too much time on it. I did try to do something with Reverend and Straw, but understanding the RSS reader's existing code and hacking the GUI are two non-trivial tasks, especially if you don't know GTK. Anyway, I've been testing my experiment for a few hours now, and it already gives interesting results. So it would be very interesting to do something better, but I really don't have the time to do it.
|
|
joelthelion
2 points
13 hours ago
*
I just emailed the reddit staff about the content of the RSS feeds, asking them to add the submitter's nick, the domain name (e.g. slate.com) and a few words from the page, just like Google does in its search results. We'll see what they do about it :)
|
|
indigoviolet
4 points
19 hours ago
*
Opera does something like this out of the box: www.opera.com. Opera's RSS reader is integrated with its mail client, M2, which in turn has trainable filters.
|
|
indigoviolet
3 points
18 hours ago
I edited the comment above to add some information.
|
|
joelthelion
2 points
10 hours ago
I've created this little Python script to help explore bogofilter's database:
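A rough sketch of what such a script could look like (not the original). It assumes that `bogoutil -d` dumps one token per line as `token spam_count ham_count date`, and that the wordlist lives at `~/.bogofilter/wordlist.db`; both are assumptions to check against your own install. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumed default location of bogofilter's wordlist; adjust for your setup.
DEFAULT_WORDLIST = Path.home() / ".bogofilter" / "wordlist.db"


def parse_dump(lines):
    """Parse dump lines assumed to look like 'token spam_count ham_count date'."""
    rows = []
    for line in lines:
        parts = line.split()
        # Skip blank or short lines and meta tokens such as .MSG_COUNT.
        if len(parts) < 3 or parts[0].startswith("."):
            continue
        try:
            rows.append((parts[0], int(parts[1]), int(parts[2])))
        except ValueError:
            continue  # counts were not integers; ignore the line
    return rows


def most_spammy(rows, n=10):
    """Return the n tokens seen most often in spam relative to ham."""
    return sorted(rows, key=lambda r: r[1] - r[2], reverse=True)[:n]


def most_hammy(rows, n=10):
    """Return the n tokens seen most often in ham relative to spam."""
    return sorted(rows, key=lambda r: r[2] - r[1], reverse=True)[:n]


def dump_wordlist(path=DEFAULT_WORDLIST):
    """Run bogoutil to dump the database and parse its output."""
    result = subprocess.run(
        ["bogoutil", "-d", str(path)],
        capture_output=True, text=True, check=True,
    )
    return parse_dump(result.stdout.splitlines())
```

With bogofilter installed, something like `for token, spam, ham in most_spammy(dump_wordlist()): print(token, spam, ham)` would list the words it currently counts most heavily as spam.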
|
Pondering the idea of Bayesian filtering (a per-user Bayesian reddit being not the least of its possible applications!), I started looking for an easy way to implement it. After considering hacking existing RSS aggregators, I think I finally found a way to get Bayesian filtering of RSS feeds without coding anything: rss2email coupled with a mail client that has trainable spam filtering. Anybody can have it running in a few minutes, especially on Linux. Here's how you do it:
EDIT: don't forget to mark good messages as "ham". Bogofilter needs to be trained on both ham and spam.
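To see why both kinds of training matter, here is a toy word-based Bayesian filter for feed titles. It's only a sketch: the `TinyBayes` class and its method names are made up for illustration, and it uses plain naive Bayes with add-one smoothing, which is not bogofilter's actual algorithm.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy two-class naive Bayes filter over words (illustrative only)."""

    def __init__(self):
        self.words = {"ham": Counter(), "spam": Counter()}
        self.docs = {"ham": 0, "spam": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record the words of one item under 'ham' or 'spam'."""
        self.words[label].update(self.tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that text is spam, between 0 and 1."""
        if not (self.docs["ham"] and self.docs["spam"]):
            # With only one class seen there is nothing to compare against.
            raise ValueError("train on both ham and spam first")
        vocab_size = len(set(self.words["ham"]) | set(self.words["spam"]))
        spam_total = sum(self.words["spam"].values())
        ham_total = sum(self.words["ham"].values())
        # Start from the prior odds, then add one term per word.
        log_odds = math.log(self.docs["spam"]) - math.log(self.docs["ham"])
        for word in self.tokenize(text):
            # Add-one (Laplace) smoothing keeps unseen words from zeroing out.
            p_spam = (self.words["spam"][word] + 1) / (spam_total + vocab_size)
            p_ham = (self.words["ham"][word] + 1) / (ham_total + vocab_size)
            log_odds += math.log(p_spam) - math.log(p_ham)
        # Numerically stable logistic function.
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1 + e)
```

Train it with titles you liked as "ham" and titles you didn't as "spam", then call `spam_probability` on a new title. A score near 0.5 means the filter hasn't seen enough of either kind to have an opinion.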