Rkentjames (talk | contribs) (→rkent)
Line 128:
===== rkent =====
I have soft tags working well enough now that I am running them as part of my normal email processing, though I have to run a custom TB build with several bug fixes to do that. I'm particularly pleased that the UI is so easy to use for training: once you turn on soft tagging for a particular tag, you just continue to tag manually, including correcting any tagging errors, and that does all of the training. I need to get any backend fixes into beta2 so that the TaQuilla soft-tagging extension can run with the standard release.
I will need to touch the Bayesian tokenizer to fix some issues that are arising in soft tagging (bug 472005). While I am at it, I intend to revive my old spam corpus testing extension and make some minor changes to the tokenization of headers, after testing to confirm that the changes are a net improvement in performance.
I am tempted to add Bayesian filter processing to RSS and newsgroups soon, as I can see that a big win for soft tagging would be to mark certain blog or news posts as "interesting", and then read only the soft-tagged "interesting" posts when pressed for time, or sort them in "interesting" order and read the more interesting ones first. This would be really useful for things like planet.mozilla, where there are more regular posts than I have time to read.
===== rebron =====