AutoBlog - Ignores entries in feed. Thinks all is good.

So we're trying to make use of the AutoBlog plugin without much success. Here is what happens!

We've launched an app for Android and iOS in which people can take a photo, add text and hit send, with the option to authenticate with Twitter or Facebook or to stay anonymous. Every accepted entry from the app is then pushed to an RSS feed. We purchased the AutoBlog plugin so that entries from the app are inserted into the website under /app (and tagged with app). So basically, the entire idea is based upon AutoBlog fetching the entries from the feed and creating a post for each entry.

Now, for the problem(s)...
It skips posts. We currently have about 50 entries, of which only 29 have been posted to the website. When running the feed processor or tester, it doesn't even search for entries that haven't been added. It simply checks the most recent post, and if that exists, it stops. For some reason it can even say it has added 'the latest post' without it having been added to WP. As you might have figured out, the problem is that some contributions are lost even though they are in the feed, and we can't have that.

Here is the feed we've set up AutoBlog to use:
http://apps.impera.se/thisisorebro/ParseRSSFeed.aspx

What can we do to make sure it doesn't skip any entries from the feed? The process/test procedure shouldn't simply check whether the last entry has been added as a post (a check which also fails); it should look for GUIDs that haven't been added yet, with a fallback on title or something similar.

Thanks in advance,
NPP

  • SOUL NPP

    Addition:
    Notice: I have already imported the first post in the feed "Örebro i vintersk..." so I would have stopped processing.

    This is false; that post is nowhere to be found (not even in the database). This is very irritating. First it stops because it thinks it has imported the post, and even if it had imported it, it would have stopped even though there are entries in the feed that have yet to be added.

    @Barry

  • aecnu

    Greetings NPP Reklambyr,

    Sorry to see that you are having an issue with Autoblog. I have been digging in a bit to offer some help.

    I tested this on a brand new install on my production server, but for some reason I was only able to import 9 entries on the first pass: http://autoblog.aecnuwpmu.us/category/npp-feed/

    I set the date range to 1 January 2012 through 1 January 2014 for the test and expected every post within that range to be pulled in.

    In addition, it is certain that there is not a timeout issue on my end.

    Patrick was spot on in tagging the lead developer on this issue, since I can reproduce it, at least with this particular feed.

    Hopefully the lead developer will be able to check this out sooner rather than later. I will let the feed run and see what response we get; the feed is set to run again in 4 hours.

    Last but not least, here is exactly what the Autoblog feed report indicates for this feed and the manual process:

    2013-01-18 at 08:39
    • Notice: Reached an already imported entry so stopped processing the feed - http://apps.impera.se/thisisorebro/ParseRSSFeed.aspx
    • Processed: I have processed 9 of the 53 posts in the feed - http://apps.impera.se/thisisorebro/ParseRSSFeed.aspx

    It appears that it thinks two of the posts are duplicates.

    Cheers, Joe

  • SOUL NPP

    Thanks for replying. It is indeed a matter we need fixed as soon as possible. However, here are some thoughts that would benefit the plugin:

    Allow for multiple posts with the same name
    Use other fields to determine uniqueness. If the feed provides a GUID tag for the entries (items), use it as the unique identifier. If there is no GUID tag, fall back to other metadata, in order: author/enclosure/link/description, then title. This way, unique posts will be added and identical ones will not. This, I believe, is what users would expect.

    Processing or testing a feed
    This feature needs to be tweaked so that it doesn't just check for the latest added item and then stop. It should, as above, check for uniqueness. If there are unique posts that have yet to be added, either prompt the user to add them or just add them. Another problem is that the check itself fails. I've gotten the message that post X has been fetched and that the script has therefore stopped when that isn't the case at all. X is nowhere to be found, but the plugin thinks all is good.
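A minimal sketch of the behaviour asked for above, in plain PHP with no WordPress dependency. It assumes each feed item has already been reduced to an array with optional 'guid', 'link', 'author', 'description' and 'title' keys; the function names and the key format are hypothetical and are not AutoBlog's actual code. The point it illustrates is that an already-imported entry is skipped rather than treated as a signal to stop.

```php
<?php
// Hypothetical helper: build a stable identity for one feed item.
// Prefers the GUID, then the link, then a hash of the remaining fields.
function sketch_item_key(array $item): string
{
    foreach (['guid', 'link'] as $field) {
        $value = trim((string) ($item[$field] ?? ''));
        if ($value !== '') {
            return $field . ':' . $value;
        }
    }
    // Last resort: hash author, description and title together.
    $parts = [
        $item['author'] ?? '',
        $item['description'] ?? '',
        $item['title'] ?? '',
    ];
    return 'hash:' . sha1(implode("\n", $parts));
}

// Return every item whose key is not in $importedKeys, in feed order.
// The whole feed is scanned; nothing stops at the first known entry.
function sketch_pending_items(array $items, array $importedKeys): array
{
    $seen = array_fill_keys($importedKeys, true);
    $pending = [];
    foreach ($items as $item) {
        $key = sketch_item_key($item);
        if (isset($seen[$key])) {
            continue; // already imported: skip it, but keep scanning
        }
        $seen[$key] = true; // also guards against duplicates within the feed
        $pending[] = $item;
    }
    return $pending;
}

// Example: the newest and oldest entries are imported, the middle one is not.
$feed = [
    ['guid' => 'a3', 'title' => 'Newest'],
    ['guid' => 'a2', 'title' => 'Middle'],
    ['guid' => 'a1', 'title' => 'Oldest'],
];
$pending = sketch_pending_items($feed, ['guid:a3', 'guid:a1']);
echo count($pending), ' pending: ', $pending[0]['title'], PHP_EOL;
```

With a "stop at the first imported entry" rule, the middle entry in this example would never be picked up, which matches the symptom described in this thread.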

    Using RSS 2.0 standards, enclosure == featured image
    Following the RSS 2.0 spec, include the enclosure as the featured image by default. Or even better, add an option "Set enclosure as featured image?" and additionally store this information as metadata. So far the only thing I've found that deals with enclosures is a user's contribution, seen here: https://premium.wpmudev.org/forums/topic/media-enclosure-addon-for-autoblog. It works (with a few other add-ons enabled), but when the plugin is updated all custom add-ons are removed. Perhaps make the update only overwrite changed files?
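One way the enclosure idea could look, as a rough sketch rather than anything AutoBlog ships. It assumes $item is a SimplePie item as returned by fetch_feed(), that the post already exists, and that the installed WordPress version supports the 'id' return mode of media_sideload_image(). The function name and the 'sketch_enclosure_url' meta key are made up for illustration.

```php
<?php
// Sketch only: $item is expected to be a SimplePie item, $post_id an existing post ID.
function sketch_set_enclosure_as_thumbnail($item, int $post_id): bool
{
    $enclosure = $item->get_enclosure();
    $url = $enclosure ? $enclosure->get_link() : null;
    if (empty($url)) {
        return false; // the item carries no enclosure
    }
    if (!function_exists('media_sideload_image')) {
        // These admin includes are needed when running outside wp-admin.
        require_once ABSPATH . 'wp-admin/includes/media.php';
        require_once ABSPATH . 'wp-admin/includes/file.php';
        require_once ABSPATH . 'wp-admin/includes/image.php';
    }
    // Download the image into the media library; 'id' asks for the attachment ID.
    $attachment_id = media_sideload_image($url, $post_id, null, 'id');
    if (is_wp_error($attachment_id)) {
        return false; // download or import failed
    }
    // Hypothetical meta key: keep the original enclosure URL alongside the post.
    update_post_meta($post_id, 'sketch_enclosure_url', $url);
    return (bool) set_post_thumbnail($post_id, $attachment_id);
}
```

Importing the image locally this way also covers the request to avoid hotlinking the file from the feed's server.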

    I'm sure I can come up with a lot more, including a few bugs; e.g. the "AutoRepair" feature is broken. It does not follow the new guidelines for $wpdb->prepare(), which now needs 2 arguments.
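For reference, a small sketch of a $wpdb->prepare() call in the two-argument form: a query with placeholders, followed by the values to substitute. The function name and the 'sketch_source_link' meta key are hypothetical; this is not the AutoRepair code, only an illustration of the calling convention.

```php
<?php
// Sketch: look up a post by the feed link stored in post meta.
// 'sketch_source_link' is a made-up meta key, not necessarily what AutoBlog uses.
function sketch_find_post_by_link($wpdb, string $link): ?int
{
    // prepare() receives the query with %s placeholders plus the values for them.
    $sql = $wpdb->prepare(
        "SELECT post_id FROM {$wpdb->postmeta} WHERE meta_key = %s AND meta_value = %s LIMIT 1",
        'sketch_source_link',
        $link
    );
    $post_id = $wpdb->get_var($sql); // null when no row matches
    return $post_id === null ? null : (int) $post_id;
}
```

Inside WordPress this would be called with the global database object, e.g. `global $wpdb; sketch_find_post_by_link($wpdb, $link);`.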

    I hope we can get this sorted, as our client is very unhappy that contributions made through the app aren't being uploaded to the website. We're checking the feed every 30 minutes (more options for the cron job, like every 5 minutes, would be good). Also, I believe this might be related to caching? See http://codex.wordpress.org/Plugin_API/Filter_Reference/wp_feed_cache_transient_lifetime
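If caching is part of the problem, the filter linked above can shorten the feed cache lifetime, and the core cron_schedules filter can register a shorter WP-Cron interval. The sketch below shows both; the function names, the 'sketch_five_minutes' schedule name and the 300-second values are assumptions, and whether AutoBlog honours a custom schedule is not confirmed anywhere in this thread.

```php
<?php
// Shorten how long WordPress caches fetched feeds (the value is in seconds).
function sketch_feed_cache_lifetime($seconds)
{
    return 300; // assumption: 5 minutes instead of the core default
}

// Register an extra WP-Cron interval under a hypothetical name.
function sketch_add_five_minute_schedule($schedules)
{
    $schedules['sketch_five_minutes'] = [
        'interval' => 300, // seconds between runs
        'display'  => 'Every 5 minutes',
    ];
    return $schedules;
}

// Only hook in when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('wp_feed_cache_transient_lifetime', 'sketch_feed_cache_lifetime');
    add_filter('cron_schedules', 'sketch_add_five_minute_schedule');
}
```

Note that WP-Cron only fires when the site receives a visit, so on a quiet site a 5-minute schedule may still run less often than that.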

    Best regards,
    NPP

  • SOUL NPP

    We could really use an update regarding this issue to let us know if we need to look for another solution or if this plugin will be fixed to function as intended?

    If a fix will come shortly, we could inform the client that the problem has been recognized and a fix is on the way. If not, we'll have to inform them that we relied on a plugin that did not function, and so we have to either develop our own or keep looking for another.

    We feel that this plugin is the right choice if it would function as expected, i.e. without losing posts.

    @Barry

  • Barry

    I'm working on updates and fixes for autoblog at the moment - but in the meantime

    http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fapps.impera.se%2Fthisisorebro%2FParseRSSFeed.aspx

    The Autoblog plugin uses the SimplePie class, which relies on a feed being valid in order for it to be processed. The only thing it can do with invalid entries in a feed is ignore them, so this may be causing some issues with your feed.

  • SOUL NPP

    In our case <author> is used to hold a string, not an email. The fact that it doesn't validate because an email is expected shouldn't really affect the plugin, though? As we see it, an author might not have an email, so we use the field for either "anonymous" or the user's name.

    The UTF-8 issue is being looked into, as well as the rel="self" and atom:link, whatever that is. The WP function fetch_feed(), which uses SimplePie, has no issues grabbing the feed. The issue we are experiencing is missing posts for unknown reasons, and the lack of error handling doesn't help us narrow it down.

    If the plugin could search for a given guid tag mainly for uniqueness, then fall back onto link/author/description/title, it would defeat the duplication issues? Anyway, we tried another plugin while this is getting looked at. They provide an alternative to the pseudo WP cron: Unix cron. This would also be a good option for AutoBlog, since every 30 minutes might sometimes be too big of a gap, especially seeing as it only takes the latest post within the 30-minute span.

    Looking forward to seeing the changes to the plugin! Can you explain what is being looked at and what kind of updates we might be seeing in future releases?

    Best regards,
    NPP

  • Barry

    "If the plugin could search for a given guid tag mainly for uniqueness, then fall back onto link/author/description/title, it would defeat the duplication issues?"

    It does check the link at the moment, which is usually the only thing it needs to check, as it is very rare that you would want to import the same content (which would be at the same URL) more than once into the same site.

  • SOUL NPP

    Yes we're adjusting the feed as we speak - some errors are yet to be fixed. But as earlier mentioned, SimplePie didn't have any issues fetching the feed (manually using fetch_feed()) but we're looking into validating the feed so that it meets the requirements, as in a successful feed according to the validator. It should then work as intended?

    So when I tried processing the feed manually I made use of the is_wp_error() function, like:

    $rss = fetch_feed( 'url' );
    if ( ! is_wp_error( $rss ) ) {
        // do stuff with $rss here
    }

    And it "did stuff", meaning fetch_feed(), which is built on SimplePie, returned a feed object rather than an error.

    But as earlier mentioned, we're fixing the feed to meet the requirements!

    I'll post any updates here. However, if there are any issues with the feed perhaps the plugin could notify the user that it has issues processing instead of simply ignoring entries? It's rather odd that it can post a couple of entries but not all? Trying to understand the logic behind it all but it's quite difficult!

    Thanks again,
    NPP

  • SOUL NPP

    Ah, good to know. Thanks! I was searching for any type of documentation for the plugin but couldn't find any: things like how it processes feeds, what tags it looks for and how it functions. Perhaps this is on the way? I saw a "documentation" button on the home page of wpmudev but it doesn't lead anywhere (yet?). How many entries does it try to process since the last run? Does it scan for entries older than the last check if they haven't been added? It appears to only check the most recent one, but maybe that is false?

    Thanks for looking into this. We're going to try it out again once the feed validates and see if it skips any entries. Overall this plugin is amazing, and it would get even more downloads if it were clear what it does and what it doesn't do.

    A feature addition that I think would benefit the plugin a lot is to check for enclosure or media tags and the option to have this posted as a featured image (preferably with importing the image locally as well).

    Thanks again Barry!