[ic] Just upgraded 4.8.9->5.2 - RobotUA question
DB
DB at M-and-D.com
Sun Dec 12 12:57:14 EST 2004
I just upgraded my foundation-based catalog from 4.8.9 to 5.2.0. I
followed the UPGRADE file instructions and things went pretty smoothly.
My main reason for the upgrade was to take advantage of the RobotUA feature.
After the upgrade, I added the section below to the end of my
interchange.cfg; however, I still see entries like this in my Apache access_log:
"GET /unlisted.html?id=gAW3nswb HTTP/1.0" 200 17202 "-" "ia_archiver"
"GET /helpfaq.html?id=SRvEvzVq HTTP/1.0" 200 32017 "-" "msnbot/0.3 (+http://search.msn.com/msnbot.htm)"
I thought the RobotUA directive prevented spiders from obtaining session IDs.
Am I confused, or can someone tell me why these spiders still appear to be
obtaining session IDs?
Here's what I added to my interchange.cfg, and yes I did restart :)
Thanks for any input. - DB
# Robots stuff - 12/12/2004
RobotUA <<EOR
ATN_Worldwide, AltaVista, Arachnoidea, Aranha, Architext, Ask, Atomz,
BackRub, Builder, CMC, Contact, Digital*Integrity, Directory, EZResult,
Excite, Ferret, Fireball, Google, Gromit, Gulliver, Harvest, Hubater,
H?m?h?kki, INGRID, IncyWincy, Jack, KIT*Fireball, Kototoi, LWP, Lycos,
MegaSheep, Mercator, Nazilla, NetMechanic, NetResearchServer, NetScoop,
ParaSite, Refiner, RoboDude, Rover, Rutgers, Scooter, Slurp, Spyder,
T-H-U-N-D-E-R-S-T-O-N-E, Toutatis, Tv*Merc, Valkyrie, Voyager, WIRE,
Walker, Wget, WhizBang, Wire, Wombat, Yahoo, Yandex, ZyBorg, appie,
asterias, bot, contact, crawl, collector, fido, find, gazz, grabber,
griffon, archiver, legs, marvin, mirago, moget, newscan, seek, speedy,
spider, suke, tarantula, agent, topiclink, whowhere, winona, worm,
xtreme,
ia_archiver
EOR
RobotIP <<EOR
202.9.155.123, 204.152.191.41, 208.146.26.19,
208.146.26.233, 209.185.141.209, 209.185.141.211,
209.202.148.36, 209.202.148.41, 216.200.130.207,
216.35.103.6?, 216.35.103.70, 66.196.65.??,
209.237.238.173,
EOR
RobotHost <<EOR
*.crawler*.com, *.excite.com, *.googlebot.com,
*.infoseek.com, *.inktomi.com, *.inktomisearch.com,
*.lycos.com, *.pa-x.dec.com, add-url.altavista.com,
westinghouse-rsl-com-usa.NorthRoyalton.cw.net,
EOR
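
As a sanity check on the lists above, here is a minimal Python sketch that tests whether a given User-Agent string or IP address would be caught by a comma-separated wildcard list. It is not Interchange's own code. It assumes that each entry is a case-insensitive substring pattern in which `*` matches any run of characters and `?` matches a single character, and that empty entries (such as one left by a trailing comma) are ignored. The function name `wildcard_list_to_regex` is hypothetical.

```python
import re


def wildcard_list_to_regex(value):
    """Build one case-insensitive regex from a comma/whitespace separated
    wildcard list. ASSUMPTION: '*' means any run of characters, '?' means
    exactly one character, and entries match as substrings."""
    # Split on commas and whitespace, dropping empty entries.
    entries = [e for e in re.split(r"[\s,]+", value) if e]
    if not entries:
        return None  # nothing to match against
    parts = []
    for entry in entries:
        # Escape regex metacharacters, then restore the two wildcards.
        escaped = re.escape(entry)
        escaped = escaped.replace(r"\*", ".*").replace(r"\?", ".")
        parts.append(escaped)
    return re.compile("|".join(parts), re.IGNORECASE)


# A few entries copied from the RobotUA and RobotIP lists in this post.
robot_ua = wildcard_list_to_regex("bot, archiver, Google, Tv*Merc, ia_archiver")
robot_ip = wildcard_list_to_regex("216.35.103.6?, 66.196.65.??, 209.237.238.173,")

# The two user agents seen in the access_log excerpt.
for agent in ("ia_archiver", "msnbot/0.3 (+http://search.msn.com/msnbot.htm)"):
    print(agent, "->", bool(robot_ua.search(agent)))
```

Under those assumptions both user agents from the log match the list (via `archiver` and `bot`), which would suggest the patterns themselves are not what lets these requests through. Whether Interchange matches entries in exactly this way should be confirmed against its own documentation.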