One of my clients is using BP Group Calendar on their MU site. I don't know what changed or when, but over the last couple of weeks the calendars (one for each 'group', and there are at least 100 groups) have been getting abused by various robots, especially Google's. At one point last week there were over 800 simultaneous connections from one of Google's IP addresses to this site. All of the requests from that IP (over 600k that day) were to group calendars, and only a tiny fraction of those (fewer than 200) were for calendars that are remotely current. Almost all of them were for calendars from 2020 to 2080, though a number of calendars from the 1990s were being harvested as well.
Obviously, this is something that needs to be addressed. I'd like to be able to prevent indexing of many of these calendar pages through a robots meta tag, and to add nofollow to any links to these pages, so that they're not wasting resources or encouraging robots to get into this spider trap.
Ideally it would be configurable, but at a minimum it must allow me to set a rule such as "any calendar older than 5 years or more than 3 years in the future should not be indexed or followed". The important thing is not simply that the page itself not be indexed (robots meta noindex), but that the links TO this page from previous calendars ALSO have the rel=nofollow attribute. Without this, the robots will keep following those links from one calendar to the next, so they stay trapped and continue requesting the content.
I haven't looked at the code (or even logged into her site to check out the options yet), but is this something that can be done - or is it even already a feature that she's simply failed to enable?
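To make the rule above concrete, here is a rough sketch of the logic I have in mind. It's written in Python purely for illustration (the plugin itself is PHP), and the function names, the thresholds, and the example URL are all made up for this post, not anything taken from BP Group Calendar.

```python
from datetime import date
from html import escape

# Hypothetical thresholds; ideally these would be plugin settings.
MAX_YEARS_PAST = 5
MAX_YEARS_FUTURE = 3


def is_crawlable(year: int, month: int, today: date) -> bool:
    """Return True if the calendar month falls inside the allowed window."""
    # Compare in whole months so the day of the month does not matter.
    target = year * 12 + (month - 1)
    current = today.year * 12 + (today.month - 1)
    oldest = current - MAX_YEARS_PAST * 12
    newest = current + MAX_YEARS_FUTURE * 12
    return oldest <= target <= newest


def robots_meta(year: int, month: int, today: date) -> str:
    """Meta tag for the calendar page itself (empty when it may be indexed)."""
    if is_crawlable(year, month, today):
        return ""
    return '<meta name="robots" content="noindex, nofollow">'


def calendar_link(url: str, label: str, year: int, month: int, today: date) -> str:
    """Link to a calendar month; gets rel="nofollow" when out of the window."""
    rel = "" if is_crawlable(year, month, today) else ' rel="nofollow"'
    return f'<a href="{escape(url)}"{rel}>{escape(label)}</a>'


if __name__ == "__main__":
    today = date.today()
    # Made-up URL layout, only to show the two outputs side by side.
    far_year = today.year + 60
    print(robots_meta(far_year, 1, today))
    print(calendar_link(f"/groups/example/calendar/{far_year}/01/",
                        f"Jan {far_year}", far_year, 1, today))
```

The point is that the same check drives both outputs: the page's own robots meta tag and the rel attribute on every "next month" / "previous month" link that points at it.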